#
79d71974 |
| 21-Apr-2014 |
Aristeu Rozanski <aris@redhat.com> |
device_cgroup: rework device access check and exception checking
Whenever a device file is opened and checked against current device cgroup rules, it uses the same function (may_access()) as when a
device_cgroup: rework device access check and exception checking
Whenever a device file is opened and checked against current device cgroup rules, it uses the same function (may_access()) as when a new exception rule is added by writing devices.{allow,deny}. And in both cases, the algorithm is the same, doesn't matter the behavior.
First problem is having device access to be considered the same as rule checking. Consider the following structure:
A (default behavior: allow, exceptions disallow access) \ B (default behavior: allow, exceptions disallow access)
A new exception is added to B by writing devices.deny:
c 12:34 rw
When checking if that exception is allowed in may_access():
if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW) { if (behavior == DEVCG_DEFAULT_ALLOW) { /* the exception will deny access to certain devices */ return true;
Which is ok, since B is not getting more privileges than A, it doesn't matter and the rule is accepted
Now, consider it's a device file open check and the process belongs to cgroup B. The access will be generated as:
behavior: allow exception: c 12:34 rw
The very same chunk of code will allow it, even if there's an explicit exception telling to do otherwise.
A simple test case:
# mkdir new_group # cd new_group # echo $$ >tasks # echo "c 1:3 w" >devices.deny # echo >/dev/null # echo $? 0
This is a serious bug and was introduced on
c39a2a3018f8 devcg: prepare may_access() for hierarchy support
To solve this problem, the device file open function was split from the new exception check.
Second problem is how exceptions are processed by may_access(). The first part of the said function tries to match fully with an existing exception:
list_for_each_entry_rcu(ex, &dev_cgroup->exceptions, list) { if ((refex->type & DEV_BLOCK) && !(ex->type & DEV_BLOCK)) continue; if ((refex->type & DEV_CHAR) && !(ex->type & DEV_CHAR)) continue; if (ex->major != ~0 && ex->major != refex->major) continue; if (ex->minor != ~0 && ex->minor != refex->minor) continue; if (refex->access & (~ex->access)) continue; match = true; break; }
That means the new exception should be contained into an existing one to be considered a match:
New exception Existing match? notes b 12:34 rwm b 12:34 rwm yes b 12:34 r b *:34 rw yes b 12:34 rw b 12:34 w no extra "r" b *:34 rw b 12:34 rw no too broad "*" b *:34 rw b *:34 rwm yes
Which is fine in some cases. Consider:
A (default behavior: deny, exceptions allow access) \ B (default behavior: deny, exceptions allow access)
In this case the full match makes sense, the new exception cannot add more access than the parent allows
But this doesn't always work, consider:
A (default behavior: allow, exceptions disallow access) \ B (default behavior: deny, exceptions allow access)
In this case, a new exception in B shouldn't match any of the exceptions in A, after all you can't allow something that was forbidden by A. But consider this scenario:
New exception Existing in A match? outcome b 12:34 rw b 12:34 r no exception is accepted
Because the new exception has "w" as extra, it doesn't match, so it'll be added to B's exception list.
The same problem can happen during a file access check. Consider a cgroup with allow as default behavior:
Access Exception match? b 12:34 rw b 12:34 r no
In this case, the access didn't match any of the exceptions in the cgroup, which is required since exceptions will disallow access.
To solve this problem, two new functions were created to match an exception either fully or partially. In the example above, a partial check will be performed and it'll produce a match since at least "b 12:34 r" from "b 12:34 rw" access matches.
Cc: cgroups@vger.kernel.org Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Li Zefan <lizefan@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Aristeu Rozanski <arozansk@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
show more ...
|
Revision tags: v3.15-rc2, v3.15-rc1, v3.14, v3.14-rc8 |
|
#
4d3bb511 |
| 19-Mar-2014 |
Tejun Heo <tj@kernel.org> |
cgroup: drop const from @buffer of cftype->write_string()
cftype->write_string() just passes on the writeable buffer from kernfs and there's no reason to add const restriction on the buffer. The on
cgroup: drop const from @buffer of cftype->write_string()
cftype->write_string() just passes on the writeable buffer from kernfs and there's no reason to add const restriction on the buffer. The only thing const achieves is unnecessarily complicating parsing of the buffer. Drop const from @buffer.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
show more ...
|
Revision tags: v3.14-rc7, v3.14-rc6, v3.14-rc5, v3.14-rc4, v3.14-rc3, v3.14-rc2 |
|
#
073219e9 |
| 08-Feb-2014 |
Tejun Heo <tj@kernel.org> |
cgroup: clean up cgroup_subsys names and initialization
cgroup_subsys is a bit messier than it needs to be.
* The name of a subsys can be different from its internal identifier defined in cgroup_
cgroup: clean up cgroup_subsys names and initialization
cgroup_subsys is a bit messier than it needs to be.
* The name of a subsys can be different from its internal identifier defined in cgroup_subsys.h. Most subsystems use the matching name but three - cpu, memory and perf_event - use different ones.
* cgroup_subsys_id enums are postfixed with _subsys_id and each cgroup_subsys is postfixed with _subsys. cgroup.h is widely included throughout various subsystems, it doesn't and shouldn't have claim on such generic names which don't have any qualifier indicating that they belong to cgroup.
* cgroup_subsys->subsys_id should always equal the matching cgroup_subsys_id enum; however, we require each controller to initialize it and then BUG if they don't match, which is a bit silly.
This patch cleans up cgroup_subsys names and initialization by doing the followings.
* cgroup_subsys_id enums are now postfixed with _cgrp_id, and each cgroup_subsys with _cgrp_subsys.
* With the above, renaming subsys identifiers to match the userland visible names doesn't cause any naming conflicts. All non-matching identifiers are renamed to match the official names.
cpu_cgroup -> cpu mem_cgroup -> memory perf -> perf_event
* controllers no longer need to initialize ->subsys_id and ->name. They're generated in cgroup core and set automatically during boot.
* Redundant cgroup_subsys declarations removed.
* While updating BUG_ON()s in cgroup_init_early(), convert them to WARN()s. BUGging that early during boot is stupid - the kernel can't print anything, even through serial console and the trap handler doesn't even link stack frame properly for back-tracing.
This patch doesn't introduce any behavior changes.
v2: Rebased on top of fe1217c4f3f7 ("net: net_cls: move cgroupfs classid handling into core").
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: "David S. Miller" <davem@davemloft.net> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Ingo Molnar <mingo@redhat.com> Acked-by: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Thomas Graf <tgraf@suug.ch>
show more ...
|
Revision tags: v3.14-rc1, v3.13, v3.13-rc8, v3.13-rc7, v3.13-rc6, v3.13-rc5, v3.13-rc4, v3.13-rc3 |
|
#
2da8ca82 |
| 05-Dec-2013 |
Tejun Heo <tj@kernel.org> |
cgroup: replace cftype->read_seq_string() with cftype->seq_show()
In preparation of conversion to kernfs, cgroup file handling is updated so that it can be easily mapped to kernfs. This patch repla
cgroup: replace cftype->read_seq_string() with cftype->seq_show()
In preparation of conversion to kernfs, cgroup file handling is updated so that it can be easily mapped to kernfs. This patch replaces cftype->read_seq_string() with cftype->seq_show() which is not limited to single_open() operation and will map directcly to kernfs seq_file interface.
The conversions are mechanical. As ->seq_show() doesn't have @css and @cft, the functions which make use of them are converted to use seq_css() and seq_cft() respectively. In several occassions, e.f. if it has seq_string in its name, the function name is updated to fit the new method better.
This patch does not introduce any behavior changes.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Aristeu Rozanski <arozansk@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Li Zefan <lizefan@huawei.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Neil Horman <nhorman@tuxdriver.com>
show more ...
|
Revision tags: v3.13-rc2, v3.13-rc1, v3.12, v3.12-rc7 |
|
#
73ba3534 |
| 22-Oct-2013 |
Serge Hallyn <serge.hallyn@ubuntu.com> |
device_cgroup: remove can_attach
It is really only wanting to duplicate a check which is already done by the cgroup subsystem.
With this patch, user jdoe still cannot move pid 1 into a devices cgro
device_cgroup: remove can_attach
It is really only wanting to duplicate a check which is already done by the cgroup subsystem.
With this patch, user jdoe still cannot move pid 1 into a devices cgroup he owns, but now he can move his own other tasks into devices cgroups.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Aristeu Rozanski <aris@redhat.com>
show more ...
|
Revision tags: v3.12-rc6, v3.12-rc5, v3.12-rc4, v3.12-rc3, v3.12-rc2, v3.12-rc1, v3.11, v3.11-rc7, v3.11-rc6, v3.11-rc5 |
|
#
bd8815a6 |
| 08-Aug-2013 |
Tejun Heo <tj@kernel.org> |
cgroup: make css_for_each_descendant() and friends include the origin css in the iteration
Previously, all css descendant iterators didn't include the origin (root of subtree) css in the iteration.
cgroup: make css_for_each_descendant() and friends include the origin css in the iteration
Previously, all css descendant iterators didn't include the origin (root of subtree) css in the iteration. The reasons were maintaining consistency with css_for_each_child() and that at the time of introduction more use cases needed skipping the origin anyway; however, given that css_is_descendant() considers self to be a descendant, omitting the origin css has become more confusing and looking at the accumulated use cases rather clearly indicates that including origin would result in simpler code overall.
While this is a change which can easily lead to subtle bugs, cgroup API including the iterators has recently gone through major restructuring and no out-of-tree changes will be applicable without adjustments making this a relatively acceptable opportunity for this type of change.
The conversions are mostly straight-forward. If the iteration block had explicit origin handling before or after, it's moved inside the iteration. If not, if (pos == origin) continue; is added. Some conversions add extra reference get/put around origin handling by consolidating origin handling and the rest. While the extra ref operations aren't strictly necessary, this shouldn't cause any noticeable difference.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com>
show more ...
|
#
492eb21b |
| 08-Aug-2013 |
Tejun Heo <tj@kernel.org> |
cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup
cgroup is currently in the process of transitioning to using css (cgroup_subsys_state) as the primary handle instead
cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup
cgroup is currently in the process of transitioning to using css (cgroup_subsys_state) as the primary handle instead of cgroup in subsystem API. For hierarchy iterators, this is beneficial because
* In most cases, css is the only thing subsystems care about anyway.
* On the planned unified hierarchy, iterations for different subsystems will need to skip over different subtrees of the hierarchy depending on which subsystems are enabled on each cgroup. Passing around css makes it unnecessary to explicitly specify the subsystem in question as css is intersection between cgroup and subsystem
* For the planned unified hierarchy, css's would need to be created and destroyed dynamically independent from cgroup hierarchy. Having cgroup core manage css iteration makes enforcing deref rules a lot easier.
Most subsystem conversions are straight-forward. Noteworthy changes are
* blkio: cgroup_to_blkcg() is no longer used. Removed.
* freezer: cgroup_freezer() is no longer used. Removed.
* devices: cgroup_to_devcgroup() is no longer used. Removed.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
182446d0 |
| 08-Aug-2013 |
Tejun Heo <tj@kernel.org> |
cgroup: pass around cgroup_subsys_state instead of cgroup in file methods
cgroup is currently in the process of transitioning to using struct cgroup_subsys_state * as the primary handle instead of s
cgroup: pass around cgroup_subsys_state instead of cgroup in file methods
cgroup is currently in the process of transitioning to using struct cgroup_subsys_state * as the primary handle instead of struct cgroup. Please see the previous commit which converts the subsystem methods for rationale.
This patch converts all cftype file operations to take @css instead of @cgroup. cftypes for the cgroup core files don't have their subsytem pointer set. These will automatically use the dummy_css added by the previous patch and can be converted the same way.
Most subsystem conversions are straight forwards but there are some interesting ones.
* freezer: update_if_frozen() is also converted to take @css instead of @cgroup for consistency. This will make the code look simpler too once iterators are converted to use css.
* memory/vmpressure: mem_cgroup_from_css() needs to be exported to vmpressure while mem_cgroup_from_cont() can be made static. Updated accordingly.
* cpu: cgroup_tg() doesn't have any user left. Removed.
* cpuacct: cgroup_ca() doesn't have any user left. Removed.
* hugetlb: hugetlb_cgroup_form_cgroup() doesn't have any user left. Removed.
* net_cls: cgrp_cls_state() doesn't have any user left. Removed.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Steven Rostedt <rostedt@goodmis.org>
show more ...
|
#
eb95419b |
| 08-Aug-2013 |
Tejun Heo <tj@kernel.org> |
cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
cgroup is currently in the process of transitioning to using struct cgroup_subsys_state * as the primary handle instead
cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
cgroup is currently in the process of transitioning to using struct cgroup_subsys_state * as the primary handle instead of struct cgroup * in subsystem implementations for the following reasons.
* With unified hierarchy, subsystems will be dynamically bound and unbound from cgroups and thus css's (cgroup_subsys_state) may be created and destroyed dynamically over the lifetime of a cgroup, which is different from the current state where all css's are allocated and destroyed together with the associated cgroup. This in turn means that cgroup_css() should be synchronized and may return NULL, making it more cumbersome to use.
* Differing levels of per-subsystem granularity in the unified hierarchy means that the task and descendant iterators should behave differently depending on the specific subsystem the iteration is being performed for.
* In majority of the cases, subsystems only care about its part in the cgroup hierarchy - ie. the hierarchy of css's. Subsystem methods often obtain the matching css pointer from the cgroup and don't bother with the cgroup pointer itself. Passing around css fits much better.
This patch converts all cgroup_subsys methods to take @css instead of @cgroup. The conversions are mostly straight-forward. A few noteworthy changes are
* ->css_alloc() now takes css of the parent cgroup rather than the pointer to the new cgroup as the css for the new cgroup doesn't exist yet. Knowing the parent css is enough for all the existing subsystems.
* In kernel/cgroup.c::offline_css(), unnecessary open coded css dereference is replaced with local variable access.
This patch shouldn't cause any behavior differences.
v2: Unnecessary explicit cgrp->subsys[] deref in css_online() replaced with local variable @css as suggested by Li Zefan.
Rebased on top of new for-3.12 which includes for-3.11-fixes so that ->css_free() invocation added by da0a12caff ("cgroup: fix a leak when percpu_ref_init() fails") is converted too. Suggested by Li Zefan.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Steven Rostedt <rostedt@goodmis.org>
show more ...
|
#
63876986 |
| 08-Aug-2013 |
Tejun Heo <tj@kernel.org> |
cgroup: add css_parent()
Currently, controllers have to explicitly follow the cgroup hierarchy to find the parent of a given css. cgroup is moving towards using cgroup_subsys_state as the main cont
cgroup: add css_parent()
Currently, controllers have to explicitly follow the cgroup hierarchy to find the parent of a given css. cgroup is moving towards using cgroup_subsys_state as the main controller interface construct, so let's provide a way to climb the hierarchy using just csses.
This patch implements css_parent() which, given a css, returns its parent. The function is guarnateed to valid non-NULL parent css as long as the target css is not at the top of the hierarchy.
freezer, cpuset, cpu, cpuacct, hugetlb, memory, net_cls and devices are converted to use css_parent() instead of accessing cgroup->parent directly.
* __parent_ca() is dropped from cpuacct and its usage is replaced with parent_ca(). The only difference between the two was NULL test on cgroup->parent which is now embedded in css_parent() making the distinction moot. Note that eventually a css->parent field will be added to css and the NULL check in css_parent() will go away.
This patch shouldn't cause any behavior differences.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
show more ...
|
#
a7c6d554 |
| 08-Aug-2013 |
Tejun Heo <tj@kernel.org> |
cgroup: add/update accessors which obtain subsys specific data from css
css (cgroup_subsys_state) is usually embedded in a subsys specific data structure. Subsystems either use container_of() direc
cgroup: add/update accessors which obtain subsys specific data from css
css (cgroup_subsys_state) is usually embedded in a subsys specific data structure. Subsystems either use container_of() directly to cast from css to such data structure or has an accessor function wrapping such cast. As cgroup as whole is moving towards using css as the main interface handle, add and update such accessors to ease dealing with css's.
All accessors explicitly handle NULL input and return NULL in those cases. While this looks like an extra branch in the code, as all controllers specific data structures have css as the first field, the casting doesn't involve any offsetting and the compiler can trivially optimize out the branch.
* blkio, freezer, cpuset, cpu, cpuacct and net_cls didn't have such accessor. Added.
* memory, hugetlb and devices already had one but didn't explicitly handle NULL input. Updated.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
show more ...
|
#
8af01f56 |
| 08-Aug-2013 |
Tejun Heo <tj@kernel.org> |
cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/
The names of the two struct cgroup_subsys_state accessors - cgroup_subsys_state() and task_subsys_state() - are somewhat awkwa
cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/
The names of the two struct cgroup_subsys_state accessors - cgroup_subsys_state() and task_subsys_state() - are somewhat awkward. The former clashes with the type name and the latter doesn't even indicate it's somehow related to cgroup.
We're about to revamp large portion of cgroup API, so, let's rename them so that they're less awkward. Most per-controller usages of the accessors are localized in accessor wrappers and given the amount of scheduled changes, this isn't gonna add any noticeable headache.
Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state() to task_css(). This patch is pure rename.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
show more ...
|
Revision tags: v3.11-rc4, v3.11-rc3, v3.11-rc2, v3.11-rc1, v3.10, v3.10-rc7, v3.10-rc6, v3.10-rc5, v3.10-rc4, v3.10-rc3 |
|
#
d591fb56 |
| 23-May-2013 |
Tejun Heo <tj@kernel.org> |
device_cgroup: simplify cgroup tree walk in propagate_exception()
During a config change, propagate_exception() needs to traverse the subtree to update config on the subtree. Because such config up
device_cgroup: simplify cgroup tree walk in propagate_exception()
During a config change, propagate_exception() needs to traverse the subtree to update config on the subtree. Because such config updates need to allocate memory, it couldn't directly use cgroup_for_each_descendant_pre() which required the whole iteration to be contained in a single RCU read critical section. To work around the limitation, propagate_exception() built a linked list of descendant cgroups while read-locking RCU and then walked the list afterwards, which is safe as the whole iteration is protected by devcgroup_mutex. This works but is cumbersome.
With the recent updates, cgroup iterators now allow dropping RCU read lock while iteration is in progress making this workaround no longer necessary. This patch replaces dev_cgroup->propagate_pending list and get_online_devcg() with direct cgroup_for_each_descendant_pre() walk.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Aristeu Rozanski <aris@redhat.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Reviewed-by: Michal Hocko <mhocko@suse.cz>
show more ...
|
Revision tags: v3.10-rc2, v3.10-rc1, v3.9, v3.9-rc8 |
|
#
e57d5cf2 |
| 16-Apr-2013 |
Rami Rosen <ramirose@gmail.com> |
devcg: remove parent_cgroup.
In devcgroup_css_alloc(), there is no longer need for parent_cgroup. bd2953ebbb("devcg: propagate local changes down the hierarchy") made the variable parent_cgroup redu
devcg: remove parent_cgroup.
In devcgroup_css_alloc(), there is no longer need for parent_cgroup. bd2953ebbb("devcg: propagate local changes down the hierarchy") made the variable parent_cgroup redundant. This patch removes parent_cgroup from devcgroup_css_alloc().
Signed-off-by: Rami Rosen <ramirose@gmail.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
show more ...
|
Revision tags: v3.9-rc7, v3.9-rc6 |
|
#
8adf12b0 |
| 07-Apr-2013 |
Tejun Heo <tj@kernel.org> |
devcg: remove broken_hierarchy tag
bd2953ebbb ("devcg: propagate local changes down the hierarchy") implemented proper hierarchy support. Remove the broken tag.
Signed-off-by: Tejun Heo <tj@kernel
devcg: remove broken_hierarchy tag
bd2953ebbb ("devcg: propagate local changes down the hierarchy") implemented proper hierarchy support. Remove the broken tag.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Aristeu Rozanski <aris@redhat.com>
show more ...
|
Revision tags: v3.9-rc5, v3.9-rc4, v3.9-rc3, v3.9-rc2, v3.9-rc1, v3.8 |
|
#
bd2953eb |
| 15-Feb-2013 |
Aristeu Rozanski <aris@redhat.com> |
devcg: propagate local changes down the hierarchy
This patch makes exception changes to propagate down in hierarchy respecting when possible local exceptions.
New exceptions allowing additional acc
devcg: propagate local changes down the hierarchy
This patch makes exception changes to propagate down in hierarchy respecting when possible local exceptions.
New exceptions allowing additional access to devices won't be propagated, but it'll be possible to add an exception to access all of part of the newly allowed device(s).
New exceptions disallowing access to devices will be propagated down and the local group's exceptions will be revalidated for the new situation. Example: A / \ B
group behavior exceptions A allow "b 8:* rwm", "c 116:1 rw" B deny "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"
If a new exception is added to group A: # echo "c 116:* r" > A/devices.deny it'll propagate down and after revalidating B's local exceptions, the exception "c 116:2 rwm" will be removed.
In case parent's exceptions change and local exceptions are not allowed anymore, they'll be deleted.
v7: - do not allow behavior change when the cgroup has children - update documentation
v6: fixed issues pointed by Serge Hallyn - only copy parent's exceptions while propagating behavior if the local behavior is different - while propagating exceptions, do not clear and copy parent's: it'd be against the premise we don't propagate access to more devices
v5: fixed issues pointed by Serge Hallyn - updated documentation - not propagating when an exception is written to devices.allow - when propagating a new behavior, clean the local exceptions list if they're for a different behavior
v4: fixed issues pointed by Tejun Heo - separated function to walk the tree and collect valid propagation targets
v3: fixed issues pointed by Tejun Heo - update documentation - move css_online/css_offline changes to a new patch - use cgroup_for_each_descendant_pre() instead of own descendant walk - move exception_copy rework to a separared patch - move exception_clean rework to a separated patch
v2: fixed issues pointed by Tejun Heo - instead of keeping the local settings that won't apply anymore, remove them
Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
show more ...
|
#
1909554c |
| 15-Feb-2013 |
Aristeu Rozanski <aris@redhat.com> |
devcg: use css_online and css_offline
Allocate resources and change behavior only when online. This is needed in order to determine if a node is suitable for hierarchy propagation or if it's being r
devcg: use css_online and css_offline
Allocate resources and change behavior only when online. This is needed in order to determine if a node is suitable for hierarchy propagation or if it's being removed.
Locking: Both functions take devcgroup_mutex to make changes to device_cgroup structure. Hierarchy propagation will also take devcgroup_mutex before walking the tree while walking the tree itself is protected by rcu lock.
Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
show more ...
|
#
c39a2a30 |
| 15-Feb-2013 |
Aristeu Rozanski <aris@redhat.com> |
devcg: prepare may_access() for hierarchy support
Currently may_access() is only able to verify if an exception is valid for the current cgroup, which has the same behavior. With hierarchy, it'll be
devcg: prepare may_access() for hierarchy support
Currently may_access() is only able to verify if an exception is valid for the current cgroup, which has the same behavior. With hierarchy, it'll be also used to verify if a cgroup local exception is valid towards its cgroup parent, which might have different behavior.
v2: - updated patch description - rebased on top of a new patch to expand the may_access() logic to make it more clear - fixed argument description order in may_access()
Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
show more ...
|
#
26898fdf |
| 15-Feb-2013 |
Aristeu Rozanski <aris@redhat.com> |
devcg: expand may_access() logic
In order to make the next patch more clear, expand may_access() logic.
v2: may_access() returns bool now
Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Serge Hallyn
devcg: expand may_access() logic
In order to make the next patch more clear, expand may_access() logic.
v2: may_access() returns bool now
Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
show more ...
|
#
53eb8c82 |
| 21-Feb-2013 |
Jerry Snitselaar <jerry.snitselaar@oracle.com> |
device_cgroup: don't grab mutex in rcu callback
Commit 103a197c0c4e ("security/device_cgroup: lock assert fails in dev_exception_clean()") grabs devcgroup_mutex to fix assert failure, but a mutex ca
device_cgroup: don't grab mutex in rcu callback
Commit 103a197c0c4e ("security/device_cgroup: lock assert fails in dev_exception_clean()") grabs devcgroup_mutex to fix assert failure, but a mutex can't be grabbed in rcu callback. Since there shouldn't be any other references when css_free is called, mutex isn't needed for list cleanup in devcgroup_css_free().
Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Aristeu Rozanski <aris@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v3.8-rc7, v3.8-rc6, v3.8-rc5, v3.8-rc4 |
|
#
103a197c |
| 17-Jan-2013 |
Jerry Snitselaar <jerry.snitselaar@oracle.com> |
security/device_cgroup: lock assert fails in dev_exception_clean()
devcgroup_css_free() calls dev_exception_clean() without the devcgroup_mutex being locked.
Shutting down a kvm virt was giving me
security/device_cgroup: lock assert fails in dev_exception_clean()
devcgroup_css_free() calls dev_exception_clean() without the devcgroup_mutex being locked.
Shutting down a kvm virt was giving me the following trace:
[36280.732764] ------------[ cut here ]------------ [36280.732778] WARNING: at /home/snits/dev/linux/security/device_cgroup.c:172 dev_exception_clean+0xa9/0xc0() [36280.732782] Hardware name: Studio XPS 8100 [36280.732785] Modules linked in: xt_REDIRECT fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc nf_conntrack_ipv4 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter it87 hwmon_vid xt_state nf_conntrack ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq coretemp snd_seq_device crc32c_intel snd_pcm snd_page_alloc snd_timer snd broadcom tg3 serio_raw i7core_edac edac_core ptp pps_core lpc_ich pcspkr mfd_core soundcore microcode i2c_i801 nfsd auth_rpcgss nfs_acl lockd vhost_net sunrpc tun macvtap macvlan kvm_intel kvm uinput binfmt_misc autofs4 usb_storage firewire_ohci firewire_core crc_itu_t radeon drm_kms_helper ttm [36280.732921] Pid: 933, comm: libvirtd Tainted: G W 3.8.0-rc3-00307-g4c217de #1 [36280.732922] Call Trace: [36280.732927] [<ffffffff81044303>] warn_slowpath_common+0x93/0xc0 [36280.732930] [<ffffffff8104434a>] warn_slowpath_null+0x1a/0x20 [36280.732932] [<ffffffff812deaf9>] dev_exception_clean+0xa9/0xc0 [36280.732934] [<ffffffff812deb2a>] devcgroup_css_free+0x1a/0x30 [36280.732938] [<ffffffff810ccd76>] cgroup_diput+0x76/0x210 [36280.732941] [<ffffffff8119eac0>] d_delete+0x120/0x180 [36280.732943] [<ffffffff81195cff>] vfs_rmdir+0xef/0x130 [36280.732945] [<ffffffff81195e47>] do_rmdir+0x107/0x1c0 [36280.732949] [<ffffffff8132d17e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [36280.732951] [<ffffffff81198646>] sys_rmdir+0x16/0x20 [36280.732954] [<ffffffff8173bd82>] system_call_fastpath+0x16/0x1b [36280.732956] ---[ end trace ca39dced899a7d9f ]---
Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com> Cc: stable@kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>
show more ...
|
Revision tags: v3.8-rc3, v3.8-rc2, v3.8-rc1, v3.7, v3.7-rc8, v3.7-rc7 |
|
#
92fb9748 |
| 19-Nov-2012 |
Tejun Heo <tj@kernel.org> |
cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()
Rename cgroup_subsys css lifetime related callbacks to better describe what their roles are. Also, upd
cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()
Rename cgroup_subsys css lifetime related callbacks to better describe what their roles are. Also, update documentation.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
show more ...
|
Revision tags: v3.7-rc6, v3.7-rc5 |
|
#
4b1c7840 |
| 06-Nov-2012 |
Tejun Heo <tj@kernel.org> |
device_cgroup: add lockdep asserts
device_cgroup uses RCU safe ->exceptions list which is write-protected by devcgroup_mutex and has had some issues using locking correctly. Add lockdep asserts to u
device_cgroup: add lockdep asserts
device_cgroup uses RCU safe ->exceptions list which is write-protected by devcgroup_mutex and has had some issues using locking correctly. Add lockdep asserts to utility functions so that future errors can be easily detected.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Li Zefan <lizefan@huawei.com>
show more ...
|
#
201e72ac |
| 06-Nov-2012 |
Tejun Heo <tj@kernel.org> |
device_cgroup: fix RCU usage
dev_cgroup->exceptions is protected with devcgroup_mutex for writes and RCU for reads; however, RCU usage isn't correct.
* dev_exception_clean() doesn't use RCU variant
device_cgroup: fix RCU usage
dev_cgroup->exceptions is protected with devcgroup_mutex for writes and RCU for reads; however, RCU usage isn't correct.
* dev_exception_clean() doesn't use RCU variant of list_del() and kfree(). The function can race with may_access() and may_access() may end up dereferencing already freed memory. Use list_del_rcu() and kfree_rcu() instead.
* may_access() may be called only with RCU read locked but doesn't use RCU safe traversal over ->exceptions. Use list_for_each_entry_rcu().
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Cc: stable@vger.kernel.org Cc: Aristeu Rozanski <aris@redhat.com> Cc: Li Zefan <lizefan@huawei.com>
show more ...
|
#
64e10477 |
| 06-Nov-2012 |
Aristeu Rozanski <aris@redhat.com> |
device_cgroup: fix unchecked cgroup parent usage
In 4cef7299b478687 ("device_cgroup: add proper checking when changing default behavior") the cgroup parent usage is unchecked. root will not have a
device_cgroup: fix unchecked cgroup parent usage
In 4cef7299b478687 ("device_cgroup: add proper checking when changing default behavior") the cgroup parent usage is unchecked. root will not have a parent and trying to use device.{allow,deny} will cause problems. For some reason my stressing scripts didn't test the root directory so I didn't catch it on my regular tests.
Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
show more ...
|