.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

===========================
BPF_PROG_TYPE_CGROUP_SYSCTL
===========================

This document describes the ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program type, which
provides a cgroup-bpf hook for sysctl.

The hook has to be attached to a cgroup and is called every time a process
inside that cgroup tries to read from or write to a sysctl knob in proc.

1. Attach type
**************

The ``BPF_CGROUP_SYSCTL`` attach type has to be used to attach a
``BPF_PROG_TYPE_CGROUP_SYSCTL`` program to a cgroup.

2. Context
**********

``BPF_PROG_TYPE_CGROUP_SYSCTL`` provides access to the following context from
a BPF program::

    struct bpf_sysctl {
        __u32 write;
        __u32 file_pos;
    };

* ``write`` indicates whether the sysctl value is being read (``0``) or written
  (``1``). This field is read-only.

* ``file_pos`` indicates the file position at which the sysctl is being
  accessed, whether read or written. This field is read-write.
  Writing to the field sets the starting position in the sysctl proc file that
  ``read(2)`` will read from or ``write(2)`` will write to. Writing zero to the
  field can be used, e.g., to override the whole sysctl value with
  ``bpf_sysctl_set_new_value()`` on ``write(2)``, even when user space calls it
  with ``file_pos > 0``. Writing a non-zero value to the field can be used to
  access part of the sysctl value starting from the specified ``file_pos``.
  Not all sysctls support access with ``file_pos != 0``, e.g. writes to
  numeric sysctl entries must always be at file position ``0``. See also the
  ``kernel.sysctl_writes_strict`` sysctl.

See `linux/bpf.h`_ for more details on how the context fields can be accessed.

3. Return code
**************

A ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program must return one of the following
return codes:

* ``0`` means "reject access to sysctl";
* ``1`` means "proceed with access".

If the program returns ``0``, user space will get ``-1`` from ``read(2)`` or
``write(2)`` and ``errno`` will be set to ``EPERM``.

4. Helpers
**********

Since a sysctl knob is represented by a name and a value, sysctl-specific BPF
helpers focus on providing access to these properties:

* ``bpf_sysctl_get_name()`` to get the sysctl name, as it is visible in
  ``/proc/sys``, into a buffer provided by the BPF program;

* ``bpf_sysctl_get_current_value()`` to get the string value currently held by
  the sysctl into a buffer provided by the BPF program. This helper is
  available on both ``read(2)`` from and ``write(2)`` to sysctl;

* ``bpf_sysctl_get_new_value()`` to get the new string value currently being
  written to the sysctl before the actual write happens. This helper can be
  used only on ``ctx->write == 1``;

* ``bpf_sysctl_set_new_value()`` to override the new string value currently
  being written to the sysctl before the actual write happens. The sysctl
  value will be overridden starting from the current ``ctx->file_pos``. If the
  whole value has to be overridden, the BPF program can set ``file_pos`` to
  zero before calling the helper. This helper can be used only on
  ``ctx->write == 1``. The new string value set by the helper is treated and
  verified by the kernel in the same way as an equivalent string passed by
  user space.

A BPF program sees the sysctl value the same way user space does in the proc
filesystem, i.e. as a string.
Since many sysctl values represent an integer or a vector
of integers, the following helpers can be used to get a numeric value from the
string:

* ``bpf_strtol()`` to convert the initial part of the string to a long integer,
  similar to user space `strtol(3)`_;
* ``bpf_strtoul()`` to convert the initial part of the string to an unsigned
  long integer, similar to user space `strtoul(3)`_.

See `linux/bpf.h`_ for more details on the helpers described here.

5. Examples
***********

See `test_sysctl_prog.c`_ for an example of a BPF program in C that accesses
the sysctl name and value, parses the string value to get a vector of integers
and uses the result to decide whether to allow or deny access to the sysctl.

6. Notes
********

``BPF_PROG_TYPE_CGROUP_SYSCTL`` is intended to be used in a **trusted** root
environment, for example to monitor sysctl usage or to catch unreasonable
values that an application, running as root in a separate cgroup, is trying to
set.

Since ``task_dfl_cgroup(current)`` is called at ``sys_read`` / ``sys_write``
time, it may return results different from those at ``sys_open`` time, i.e.
the process that opened the sysctl file in the proc filesystem may differ from
the process that is trying to read from / write to it, and two such processes
may run in different cgroups, which means ``BPF_PROG_TYPE_CGROUP_SYSCTL``
should not be used as a security mechanism to limit sysctl usage.

As with any cgroup-bpf program, additional care should be taken if an
application running as root in a cgroup should not be allowed to
detach/replace a BPF program attached by the administrator.

.. Links
.. _linux/bpf.h: ../../include/uapi/linux/bpf.h
.. _strtol(3): http://man7.org/linux/man-pages/man3/strtol.3p.html
.. _strtoul(3): http://man7.org/linux/man-pages/man3/strtoul.3p.html
.. _test_sysctl_prog.c:
   ../../tools/testing/selftests/bpf/progs/test_sysctl_prog.c
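
The string-to-integer-vector pattern mentioned in the Helpers and Examples
sections can be tried out in user space. The following is only a minimal
sketch under stated assumptions, not the selftest's code: it uses
``strtoul(3)`` where a BPF program would call ``bpf_strtoul()`` (whose
signature differs, see ``linux/bpf.h``), and the function names
``parse_ulong_vector()`` and ``allow_ascending_triple()`` as well as the
"three strictly ascending values" policy are hypothetical, chosen purely for
illustration. The return value follows the hook's convention: ``1`` to
proceed, ``0`` to reject.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse whitespace-separated unsigned integers
 * from a sysctl-style string into out[0..max-1].
 * Returns the number of values parsed, or -1 if the input is malformed,
 * overflows, or holds more than max values.
 */
static int parse_ulong_vector(const char *s, unsigned long *out, int max)
{
    int n = 0;

    while (n < max) {
        char *end;

        while (isspace((unsigned char)*s))
            s++;
        if (*s == '\0')
            break;
        /* strtoul(3) would accept a sign; reject anything but a digit. */
        if (!isdigit((unsigned char)*s))
            return -1;
        errno = 0;
        out[n] = strtoul(s, &end, 10);
        if (errno == ERANGE)
            return -1;
        s = end;
        n++;
    }

    /* Anything other than trailing whitespace is treated as an error. */
    while (isspace((unsigned char)*s))
        s++;
    if (*s != '\0')
        return -1;

    return n;
}

/*
 * Hypothetical policy: allow (1) only if the value holds exactly three
 * strictly ascending integers, otherwise reject (0).
 */
static int allow_ascending_triple(const char *value)
{
    unsigned long v[3];

    if (parse_ulong_vector(value, v, 3) != 3)
        return 0;
    return v[0] < v[1] && v[1] < v[2];
}
```

For instance, ``allow_ascending_triple("4096 87380 6291456\n")`` returns
``1``, while ``"3 2 1"``, ``"1 2"`` and ``"1 x 3"`` all yield ``0``. In a real
BPF program the string would come from ``bpf_sysctl_get_current_value()`` or
``bpf_sysctl_get_new_value()`` rather than from a literal.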