.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

===========================
BPF_PROG_TYPE_CGROUP_SYSCTL
===========================

This document describes the ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program type, which
provides a cgroup-bpf hook for sysctl.

The hook has to be attached to a cgroup and is called every time a process
inside that cgroup tries to read from or write to a sysctl knob in proc.

1. Attach type
**************

The ``BPF_CGROUP_SYSCTL`` attach type has to be used to attach a
``BPF_PROG_TYPE_CGROUP_SYSCTL`` program to a cgroup.

2. Context
**********

``BPF_PROG_TYPE_CGROUP_SYSCTL`` provides access to the following context from
a BPF program::

    struct bpf_sysctl {
        __u32 write;
        __u32 file_pos;
    };

* ``write`` indicates whether the sysctl value is being read (``0``) or written
  (``1``). This field is read-only.

* ``file_pos`` indicates the file position at which the sysctl is being
  accessed, for either a read or a write. This field is read-write. Writing to
  the field sets the starting position in the sysctl proc file that ``read(2)``
  will read from or ``write(2)`` will write to. Writing zero to the field can
  be used, e.g., to override the whole sysctl value with
  ``bpf_sysctl_set_new_value()`` on ``write(2)`` even when user space called it
  at ``file_pos > 0``. Writing a non-zero value to the field can be used to
  access part of the sysctl value starting from the specified ``file_pos``. Not
  all sysctls support access with ``file_pos != 0``; e.g., writes to numeric
  sysctl entries must always be at file position ``0``. See also the
  ``kernel.sysctl_writes_strict`` sysctl.

See `linux/bpf.h`_ for more details on how the context fields can be accessed.

3. Return code
**************

A ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program must return one of the following
return codes:

* ``0`` means "reject access to sysctl";
* ``1`` means "proceed with access".

If the program returns ``0``, user space will get ``-1`` from ``read(2)`` or
``write(2)`` and ``errno`` will be set to ``EPERM``.

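The return-code convention can be modelled in plain user-space C. This is only
a sketch under stated assumptions: ``struct sysctl_ctx_model``, the
``SYSCTL_REJECT`` / ``SYSCTL_PROCEED`` names and ``sysctl_read_only_policy()``
are hypothetical and merely mirror the documented ``write`` and ``file_pos``
fields. An actual program would receive a ``struct bpf_sysctl *`` and be
compiled as a BPF object.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for the two documented context fields. */
struct sysctl_ctx_model {
    uint32_t write;    /* 0 = value is being read, 1 = value is being written */
    uint32_t file_pos; /* file position the access starts at */
};

/* Illustrative names for the two documented return codes. */
enum {
    SYSCTL_REJECT  = 0, /* user space sees -1 with errno set to EPERM */
    SYSCTL_PROCEED = 1, /* the read or write goes ahead */
};

/* Example policy (an assumption, not from the source): allow every read,
 * reject every write. */
static int sysctl_read_only_policy(const struct sysctl_ctx_model *ctx)
{
    if (ctx->write)
        return SYSCTL_REJECT;
    return SYSCTL_PROCEED;
}
```

With this policy a process in the cgroup could still inspect every knob, while
any ``write(2)`` to a sysctl file would fail with ``EPERM``.
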
4. Helpers
**********

Since a sysctl knob is represented by a name and a value, sysctl-specific BPF
helpers focus on providing access to these properties:

* ``bpf_sysctl_get_name()`` to get the sysctl name, as it is visible in
  ``/proc/sys``, into a buffer provided by the BPF program;

* ``bpf_sysctl_get_current_value()`` to get the string value currently held by
  the sysctl into a buffer provided by the BPF program. This helper is
  available on both ``read(2)`` from and ``write(2)`` to the sysctl;

* ``bpf_sysctl_get_new_value()`` to get the new string value being written to
  the sysctl before the actual write happens. This helper can be used only
  when ``ctx->write == 1``;

* ``bpf_sysctl_set_new_value()`` to override the new string value being
  written to the sysctl before the actual write happens. The sysctl value will
  be overridden starting from the current ``ctx->file_pos``. If the whole value
  has to be overridden, the BPF program can set ``file_pos`` to zero before
  calling the helper. This helper can be used only when ``ctx->write == 1``.
  The new string value set by the helper is treated and verified by the kernel
  in the same way as an equivalent string passed by user space.

A BPF program sees the sysctl value the same way as user space does in the proc
filesystem, i.e. as a string. Since many sysctl values represent an integer or
a vector of integers, the following helpers can be used to get a numeric value
from the string:

* ``bpf_strtol()`` to convert the initial part of the string to a long integer,
  similar to user space `strtol(3)`_;
* ``bpf_strtoul()`` to convert the initial part of the string to an unsigned
  long integer, similar to user space `strtoul(3)`_.

See `linux/bpf.h`_ for more details on the helpers described here.

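Parsing a vector of integers usually means converting one number, advancing by
the number of characters consumed, and repeating. The sketch below models that
loop in user space. ``model_strtol()`` is a hypothetical stand-in built on
`strtol(3)`_ that approximates the ``bpf_strtol()`` contract (characters
consumed on success, a negative error otherwise); it is not the kernel helper,
and ``parse_longs()`` is likewise an illustrative name.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space approximation of bpf_strtol(): parse the initial
 * part of buf (buf_len bytes, not necessarily NUL-terminated), store the
 * number in *res and return how many characters were consumed. */
static long model_strtol(const char *buf, size_t buf_len, unsigned long flags,
                         long *res)
{
    char tmp[64];
    char *end;
    long val;
    int base = (int)(flags & 0x1f); /* low five bits carry the base */

    if (base != 0 && base != 8 && base != 10 && base != 16)
        return -EINVAL;
    if (buf_len == 0 || buf_len >= sizeof(tmp))
        return -EINVAL; /* limitation of this model only */

    memcpy(tmp, buf, buf_len);
    tmp[buf_len] = '\0';

    errno = 0;
    val = strtol(tmp, &end, base);
    if (end == tmp)
        return -EINVAL; /* no digits found */
    if (errno == ERANGE)
        return -ERANGE;

    *res = val;
    return (long)(end - tmp);
}

/* Parse up to max whitespace-separated integers; return how many were read. */
static int parse_longs(const char *value, size_t len, long *out, int max)
{
    size_t off = 0;
    int n = 0;

    while (n < max && off < len) {
        long ret = model_strtol(value + off, len - off, 10, &out[n]);

        if (ret <= 0)
            break; /* stop at the first token that is not a number */
        off += (size_t)ret;
        n++;
    }
    return n;
}
```

For example, calling ``parse_longs()`` on the 18-byte string
``4096 87380 6291456`` with room for three values fills the array with the
three numbers and returns ``3``.
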
5. Examples
***********

See `test_sysctl_prog.c`_ for an example of a BPF program in C that accesses
the sysctl name and value, parses the string value to get a vector of integers,
and uses the result to decide whether to allow or deny access to the sysctl.

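The same kind of decision flow (match the name, parse the value, return ``0``
or ``1``) can be sketched in user-space C. This is a hypothetical policy for
illustration, not the code of the selftest: ``tcp_mem_policy()`` is an invented
name, the strings stand in for what ``bpf_sysctl_get_name()`` and
``bpf_sysctl_get_new_value()`` would copy into program buffers, and the
"strictly increasing" rule is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical policy: a write to net/ipv4/tcp_mem is allowed only if the new
 * value holds three strictly increasing integers. Reads, and accesses to any
 * other knob, always proceed.
 * Returns 1 for "proceed with access" and 0 for "reject access". */
static int tcp_mem_policy(const char *name, int is_write, const char *new_value)
{
    long v[3];

    if (strcmp(name, "net/ipv4/tcp_mem") != 0)
        return 1; /* not the knob this policy cares about */
    if (!is_write)
        return 1; /* reads are only observed, never blocked */

    /* The value arrives as a string, exactly as user space wrote it. */
    if (sscanf(new_value, "%ld %ld %ld", &v[0], &v[1], &v[2]) != 3)
        return 0; /* malformed vector: reject */

    return v[0] < v[1] && v[1] < v[2];
}
```

Under this sketch, writing ``4096 87380 6291456`` to ``net/ipv4/tcp_mem`` would
proceed, while writing ``9 5 1`` would be rejected.
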
6. Notes
********

``BPF_PROG_TYPE_CGROUP_SYSCTL`` is intended to be used in a **trusted** root
environment, for example to monitor sysctl usage or to catch unreasonable
values that an application, running as root in a separate cgroup, is trying to
set.

Since ``task_dfl_cgroup(current)`` is called at ``sys_read`` / ``sys_write``
time, it may return results different from those at ``sys_open`` time, i.e. the
process that opened the sysctl file in the proc filesystem may differ from the
process that is trying to read from / write to it, and two such processes may
run in different cgroups. This means ``BPF_PROG_TYPE_CGROUP_SYSCTL`` should not
be used as a security mechanism to limit sysctl usage.

As with any cgroup-bpf program, additional care should be taken if an
application running as root in a cgroup should not be allowed to
detach/replace a BPF program attached by an administrator.

.. Links
.. _linux/bpf.h: ../../include/uapi/linux/bpf.h
.. _strtol(3): http://man7.org/linux/man-pages/man3/strtol.3p.html
.. _strtoul(3): http://man7.org/linux/man-pages/man3/strtoul.3p.html
.. _test_sysctl_prog.c:
   ../../tools/testing/selftests/bpf/progs/test_sysctl_prog.c