Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47 |
|
#
c005e2f6 |
| 14-Aug-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.46' into for/openbmc/dev-6.6
This is the 6.6.46 stable release
|
Revision tags: v6.6.46, v6.6.45 |
|
#
726f4c24 |
| 07-Aug-2024 |
Steven Rostedt <rostedt@goodmis.org> |
tracefs: Use generic inode RCU for synchronizing freeing
commit 0b6743bd60a56a701070b89fb80c327a44b7b3e2 upstream.
With structure layout randomization enabled for 'struct inode' we need to avoid ov
tracefs: Use generic inode RCU for synchronizing freeing
commit 0b6743bd60a56a701070b89fb80c327a44b7b3e2 upstream.
With structure layout randomization enabled for 'struct inode' we need to avoid overlapping any of the RCU-used / initialized-only-once members, e.g. i_lru or i_sb_list to not corrupt related list traversals when making use of the rcu_head.
For an unlucky structure layout of 'struct inode' we may end up with the following splat when running the ftrace selftests:
[<...>] list_del corruption, ffff888103ee2cb0->next (tracefs_inode_cache+0x0/0x4e0 [slab object]) is NULL (prev is tracefs_inode_cache+0x78/0x4e0 [slab object]) [<...>] ------------[ cut here ]------------ [<...>] kernel BUG at lib/list_debug.c:54! [<...>] invalid opcode: 0000 [#1] PREEMPT SMP KASAN [<...>] CPU: 3 PID: 2550 Comm: mount Tainted: G N 6.8.12-grsec+ #122 ed2f536ca62f28b087b90e3cc906a8d25b3ddc65 [<...>] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [<...>] RIP: 0010:[<ffffffff84656018>] __list_del_entry_valid_or_report+0x138/0x3e0 [<...>] Code: 48 b8 99 fb 65 f2 ff ff ff ff e9 03 5c d9 fc cc 48 b8 99 fb 65 f2 ff ff ff ff e9 33 5a d9 fc cc 48 b8 99 fb 65 f2 ff ff ff ff <0f> 0b 4c 89 e9 48 89 ea 48 89 ee 48 c7 c7 60 8f dd 89 31 c0 e8 2f [<...>] RSP: 0018:fffffe80416afaf0 EFLAGS: 00010283 [<...>] RAX: 0000000000000098 RBX: ffff888103ee2cb0 RCX: 0000000000000000 [<...>] RDX: ffffffff84655fe8 RSI: ffffffff89dd8b60 RDI: 0000000000000001 [<...>] RBP: ffff888103ee2cb0 R08: 0000000000000001 R09: fffffbd0082d5f25 [<...>] R10: fffffe80416af92f R11: 0000000000000001 R12: fdf99c16731d9b6d [<...>] R13: 0000000000000000 R14: ffff88819ad4b8b8 R15: 0000000000000000 [<...>] RBX: tracefs_inode_cache+0x0/0x4e0 [slab object] [<...>] RDX: __list_del_entry_valid_or_report+0x108/0x3e0 [<...>] RSI: __func__.47+0x4340/0x4400 [<...>] RBP: tracefs_inode_cache+0x0/0x4e0 [slab object] [<...>] RSP: process kstack fffffe80416afaf0+0x7af0/0x8000 [mount 2550 2550] [<...>] R09: kasan shadow of process kstack fffffe80416af928+0x7928/0x8000 [mount 2550 2550] [<...>] R10: process kstack fffffe80416af92f+0x792f/0x8000 [mount 2550 2550] [<...>] R14: tracefs_inode_cache+0x78/0x4e0 [slab object] [<...>] FS: 00006dcb380c1840(0000) GS:ffff8881e0600000(0000) knlGS:0000000000000000 [<...>] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [<...>] CR2: 000076ab72b30e84 CR3: 000000000b088004 CR4: 0000000000360ef0 shadow CR4: 0000000000360ef0 [<...>] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [<...>] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [<...>] ASID: 0003 [<...>] Stack: [<...>] ffffffff818a2315 00000000f5c856ee ffffffff896f1840 ffff888103ee2cb0 [<...>] ffff88812b6b9750 0000000079d714b6 fffffbfff1e9280b ffffffff8f49405f [<...>] 0000000000000001 0000000000000000 ffff888104457280 ffffffff8248b392 [<...>] Call Trace: [<...>] <TASK> [<...>] [<ffffffff818a2315>] ? lock_release+0x175/0x380 fffffe80416afaf0 [<...>] [<ffffffff8248b392>] list_lru_del+0x152/0x740 fffffe80416afb48 [<...>] [<ffffffff8248ba93>] list_lru_del_obj+0x113/0x280 fffffe80416afb88 [<...>] [<ffffffff8940fd19>] ? _atomic_dec_and_lock+0x119/0x200 fffffe80416afb90 [<...>] [<ffffffff8295b244>] iput_final+0x1c4/0x9a0 fffffe80416afbb8 [<...>] [<ffffffff8293a52b>] dentry_unlink_inode+0x44b/0xaa0 fffffe80416afbf8 [<...>] [<ffffffff8293fefc>] __dentry_kill+0x23c/0xf00 fffffe80416afc40 [<...>] [<ffffffff8953a85f>] ? __this_cpu_preempt_check+0x1f/0xa0 fffffe80416afc48 [<...>] [<ffffffff82949ce5>] ? shrink_dentry_list+0x1c5/0x760 fffffe80416afc70 [<...>] [<ffffffff82949b71>] ? shrink_dentry_list+0x51/0x760 fffffe80416afc78 [<...>] [<ffffffff82949da8>] shrink_dentry_list+0x288/0x760 fffffe80416afc80 [<...>] [<ffffffff8294ae75>] shrink_dcache_sb+0x155/0x420 fffffe80416afcc8 [<...>] [<ffffffff8953a7c3>] ? debug_smp_processor_id+0x23/0xa0 fffffe80416afce0 [<...>] [<ffffffff8294ad20>] ? do_one_tree+0x140/0x140 fffffe80416afcf8 [<...>] [<ffffffff82997349>] ? do_remount+0x329/0xa00 fffffe80416afd18 [<...>] [<ffffffff83ebf7a1>] ? security_sb_remount+0x81/0x1c0 fffffe80416afd38 [<...>] [<ffffffff82892096>] reconfigure_super+0x856/0x14e0 fffffe80416afd70 [<...>] [<ffffffff815d1327>] ? ns_capable_common+0xe7/0x2a0 fffffe80416afd90 [<...>] [<ffffffff82997436>] do_remount+0x416/0xa00 fffffe80416afdd0 [<...>] [<ffffffff829b2ba4>] path_mount+0x5c4/0x900 fffffe80416afe28 [<...>] [<ffffffff829b25e0>] ? finish_automount+0x13a0/0x13a0 fffffe80416afe60 [<...>] [<ffffffff82903812>] ? user_path_at_empty+0xb2/0x140 fffffe80416afe88 [<...>] [<ffffffff829b2ff5>] do_mount+0x115/0x1c0 fffffe80416afeb8 [<...>] [<ffffffff829b2ee0>] ? path_mount+0x900/0x900 fffffe80416afed8 [<...>] [<ffffffff8272461c>] ? __kasan_check_write+0x1c/0xa0 fffffe80416afee0 [<...>] [<ffffffff829b31cf>] __do_sys_mount+0x12f/0x280 fffffe80416aff30 [<...>] [<ffffffff829b36cd>] __x64_sys_mount+0xcd/0x2e0 fffffe80416aff70 [<...>] [<ffffffff819f8818>] ? syscall_trace_enter+0x218/0x380 fffffe80416aff88 [<...>] [<ffffffff8111655e>] x64_sys_call+0x5d5e/0x6720 fffffe80416affa8 [<...>] [<ffffffff8952756d>] do_syscall_64+0xcd/0x3c0 fffffe80416affb8 [<...>] [<ffffffff8100119b>] entry_SYSCALL_64_safe_stack+0x4c/0x87 fffffe80416affe8 [<...>] </TASK> [<...>] <PTREGS> [<...>] RIP: 0033:[<00006dcb382ff66a>] vm_area_struct[mount 2550 2550 file 6dcb38225000-6dcb3837e000 22 55(read|exec|mayread|mayexec)]+0x0/0xb8 [userland map] [<...>] Code: 48 8b 0d 29 18 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f6 17 0d 00 f7 d8 64 89 01 48 [<...>] RSP: 002b:0000763d68192558 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [<...>] RAX: ffffffffffffffda RBX: 00006dcb38433264 RCX: 00006dcb382ff66a [<...>] RDX: 000017c3e0d11210 RSI: 000017c3e0d1a5a0 RDI: 000017c3e0d1ae70 [<...>] RBP: 000017c3e0d10fb0 R08: 000017c3e0d11260 R09: 00006dcb383d1be0 [<...>] R10: 000000000020002e R11: 0000000000000246 R12: 0000000000000000 [<...>] R13: 000017c3e0d1ae70 R14: 000017c3e0d11210 R15: 000017c3e0d10fb0 [<...>] RBX: vm_area_struct[mount 2550 2550 file 6dcb38433000-6dcb38434000 5b 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RCX: vm_area_struct[mount 2550 2550 file 6dcb38225000-6dcb3837e000 22 55(read|exec|mayread|mayexec)]+0x0/0xb8 [userland map] [<...>] RDX: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RSI: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RDI: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RBP: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RSP: vm_area_struct[mount 2550 2550 anon 763d68173000-763d68195000 7ffffffdd 100133(read|write|mayread|maywrite|growsdown|account)]+0x0/0xb8 [userland map] [<...>] R08: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] R09: vm_area_struct[mount 2550 2550 file 6dcb383d1000-6dcb383d3000 1cd 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] R13: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] R14: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] R15: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] </PTREGS> [<...>] Modules linked in: [<...>] ---[ end trace 0000000000000000 ]---
The list debug message as well as RBX's symbolic value point out that the object in question was allocated from 'tracefs_inode_cache' and that the list's '->next' member is at offset 0. Dumping the layout of the relevant parts of 'struct tracefs_inode' gives the following:
struct tracefs_inode { union { struct inode { struct list_head { struct list_head * next; /* 0 8 */ struct list_head * prev; /* 8 8 */ } i_lru; [...] } vfs_inode; struct callback_head { void (*func)(struct callback_head *); /* 0 8 */ struct callback_head * next; /* 8 8 */ } rcu; }; [...] };
Above shows that 'vfs_inode.i_lru' overlaps with 'rcu' which will destroy the 'i_lru' list as soon as the 'rcu' member gets used, e.g. in call_rcu() or later when calling the RCU callback. This will disturb concurrent list traversals as well as object reuse which assumes these list heads will keep their integrity.
For reproduction, the following diff manually overlays 'i_lru' with 'rcu' as, otherwise, one would require some good portion of luck for gambling an unlucky RANDSTRUCT seed:
--- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -629,6 +629,7 @@ struct inode { umode_t i_mode; unsigned short i_opflags; kuid_t i_uid; + struct list_head i_lru; /* inode LRU list */ kgid_t i_gid; unsigned int i_flags;
@@ -690,7 +691,6 @@ struct inode { u16 i_wb_frn_avg_time; u16 i_wb_frn_history; #endif - struct list_head i_lru; /* inode LRU list */ struct list_head i_sb_list; struct list_head i_wb_list; /* backing dev writeback list */ union {
The tracefs inode does not need to supply its own RCU delayed destruction of its inode. The inode code itself offers both a "destroy_inode()" callback that gets called when the last reference of the inode is released, and the "free_inode()" which is called after a RCU synchronization period from the "destroy_inode()".
The tracefs code can unlink the inode from its list in the destroy_inode() callback, and the simply free it from the free_inode() callback. This should provide the same protection.
Link: https://lore.kernel.org/all/20240807115143.45927-3-minipli@grsecurity.net/
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Ilkka =?utf-8?b?TmF1bGFww6TDpA==?= <digirigawa@gmail.com> Link: https://lore.kernel.org/20240807185402.61410544@gandalf.local.home Fixes: baa23a8d4360 ("tracefs: Reset permissions on remount if permissions are options") Reported-by: Mathias Krause <minipli@grsecurity.net> Reported-by: Brad Spengler <spender@grsecurity.net> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34 |
|
#
b181f702 |
| 12-Jun-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.33' into dev-6.6
This is the 6.6.33 stable release
|
Revision tags: v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16 |
|
#
e5c80b23 |
| 01-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Create eventfs_root_inode to store dentry
[ Upstream commit c3137ab6318d56370dd5541ebf027ddfc0c8557c ]
Only the root "events" directory stores a dentry. There's no reason to hold a dentry
eventfs: Create eventfs_root_inode to store dentry
[ Upstream commit c3137ab6318d56370dd5541ebf027ddfc0c8557c ]
Only the root "events" directory stores a dentry. There's no reason to hold a dentry pointer for every eventfs_inode as it is never set except for the root "events" eventfs_inode.
Create a eventfs_root_inode structure that holds the events_dir dentry. The "events" eventfs_inode *is* special, let it have its own descriptor.
Link: https://lore.kernel.org/linux-trace-kernel/20240201161617.658992558@goodmis.org
Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Stable-dep-of: b63db58e2fa5 ("eventfs/tracing: Add callback for release of an eventfs_inode") Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
e26405d5 |
| 02-May-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Do not differentiate the toplevel events directory
[ Upstream commit d53891d348ac3eceaf48f4732a1f4f5c0e0a55ce ]
The toplevel events directory is really no different than the events directo
eventfs: Do not differentiate the toplevel events directory
[ Upstream commit d53891d348ac3eceaf48f4732a1f4f5c0e0a55ce ]
The toplevel events directory is really no different than the events directory of instances. Having the two be different caused inconsistencies and made it harder to fix the permissions bugs.
Make all events directories act the same.
Link: https://lore.kernel.org/linux-trace-kernel/20240502200905.846448710@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: 8186fff7ab649 ("tracefs/eventfs: Use root and instance inodes as default ownership") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
e0d77d0f |
| 19-May-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.31' into dev-6.6
This is the 6.6.31 stable release
|
#
5f91fc82 |
| 02-May-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
tracefs: Reset permissions on remount if permissions are options
commit baa23a8d4360d981a49913841a726edede5cdd54 upstream.
There's an inconsistency with the way permissions are handled in tracefs.
tracefs: Reset permissions on remount if permissions are options
commit baa23a8d4360d981a49913841a726edede5cdd54 upstream.
There's an inconsistency with the way permissions are handled in tracefs. Because the permissions are generated when accessed, they default to the root inode's permission if they were never set by the user. If the user sets the permissions, then a flag is set and the permissions are saved via the inode (for tracefs files) or an internal attribute field (for eventfs).
But if a remount happens that specify the permissions, all the files that were not changed by the user gets updated, but the ones that were are not. If the user were to remount the file system with a given permission, then all files and directories within that file system should be updated.
This can cause security issues if a file's permission was updated but the admin forgot about it. They could incorrectly think that remounting with permissions set would update all files, but miss some.
For example:
# cd /sys/kernel/tracing # chgrp 1002 current_tracer # ls -l [..] -rw-r----- 1 root root 0 May 1 21:25 buffer_size_kb -rw-r----- 1 root root 0 May 1 21:25 buffer_subbuf_size_kb -r--r----- 1 root root 0 May 1 21:25 buffer_total_size_kb -rw-r----- 1 root lkp 0 May 1 21:25 current_tracer -rw-r----- 1 root root 0 May 1 21:25 dynamic_events -r--r----- 1 root root 0 May 1 21:25 dyn_ftrace_total_info -r--r----- 1 root root 0 May 1 21:25 enabled_functions
Where current_tracer now has group "lkp".
# mount -o remount,gid=1001 . # ls -l -rw-r----- 1 root tracing 0 May 1 21:25 buffer_size_kb -rw-r----- 1 root tracing 0 May 1 21:25 buffer_subbuf_size_kb -r--r----- 1 root tracing 0 May 1 21:25 buffer_total_size_kb -rw-r----- 1 root lkp 0 May 1 21:25 current_tracer -rw-r----- 1 root tracing 0 May 1 21:25 dynamic_events -r--r----- 1 root tracing 0 May 1 21:25 dyn_ftrace_total_info -r--r----- 1 root tracing 0 May 1 21:25 enabled_functions
Everything changed but the "current_tracer".
Add a new link list that keeps track of all the tracefs_inodes which has the permission flags that tell if the file/dir should use the root inode's permission or not. Then on remount, clear all the flags so that the default behavior of using the root inode's permission is done for all files and directories.
Link: https://lore.kernel.org/linux-trace-kernel/20240502200905.529542160@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: 8186fff7ab649 ("tracefs/eventfs: Use root and instance inodes as default ownership") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
c595db6d |
| 13-Mar-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.18' into dev-6.6
This is the 6.6.18 stable release
|
#
ed823ca4 |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Restructure eventfs_inode structure to be more condensed
commit 264424dfdd5cbd92bc5b5ddf93944929fc877fac upstream.
Some of the eventfs_inode structure has holes in it. Rework the structure
eventfs: Restructure eventfs_inode structure to be more condensed
commit 264424dfdd5cbd92bc5b5ddf93944929fc877fac upstream.
Some of the eventfs_inode structure has holes in it. Rework the structure to be a bit more condensed, and also remove the no longer used llist field.
Link: https://lore.kernel.org/linux-trace-kernel/20240201161617.002321438@goodmis.org
Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
5dfb0410 |
| 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
eventfs: Get rid of dentry pointers without refcounts
commit 43aa6f97c2d03a52c1ddb86768575fc84344bdbb upstream.
The eventfs inode had pointers to dentries (and child dentries) without actually hold
eventfs: Get rid of dentry pointers without refcounts
commit 43aa6f97c2d03a52c1ddb86768575fc84344bdbb upstream.
The eventfs inode had pointers to dentries (and child dentries) without actually holding a refcount on said pointer. That is fundamentally broken, and while eventfs tried to then maintain coherence with dentries going away by hooking into the '.d_iput' callback, that doesn't actually work since it's not ordered wrt lookups.
There were two reasonms why eventfs tried to keep a pointer to a dentry:
- the creation of a 'events' directory would actually have a stable dentry pointer that it created with tracefs_start_creating().
And it needed that dentry when tearing it all down again in eventfs_remove_events_dir().
This use is actually ok, because the special top-level events directory dentries are actually stable, not just a temporary cache of the eventfs data structures.
- the 'eventfs_inode' (aka ei) needs to stay around as long as there are dentries that refer to it.
It then used these dentry pointers as a replacement for doing reference counting: it would try to make sure that there was only ever one dentry associated with an event_inode, and keep a child dentry array around to see which dentries might still refer to the parent ei.
This gets rid of the invalid dentry pointer use, and renames the one valid case to a different name to make it clear that it's not just any random dentry.
The magic child dentry array that is kind of a "reverse reference list" is simply replaced by having child dentries take a ref to the ei. As does the directory dentries. That makes the broken use case go away.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131185513.280463000@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
c4619205 |
| 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
eventfs: Clean up dentry ops and add revalidate function
commit 8dce06e98c70a7fcbb4bca7d90faf40522e65c58 upstream.
In order for the dentries to stay up-to-date with the eventfs changes, just add a
eventfs: Clean up dentry ops and add revalidate function
commit 8dce06e98c70a7fcbb4bca7d90faf40522e65c58 upstream.
In order for the dentries to stay up-to-date with the eventfs changes, just add a 'd_revalidate' function that checks the 'is_freed' bit.
Also, clean up the dentry release to actually use d_release() rather than the slightly odd d_iput() function. We don't care about the inode, all we want to do is to get rid of the refcount to the eventfs data added by dentry->d_fsdata.
It would probably be cleaner to make eventfs its own filesystem, or at least set its own dentry ops when looking up eventfs files. But as it is, only eventfs dentries use d_fsdata, so we don't really need to split these things up by use.
Another thing that might be worth doing is to make all eventfs lookups mark their dentries as not worth caching. We could do that with d_delete(), but the DCACHE_DONTCACHE flag would likely be even better.
As it is, the dentries are all freeable, but they only tend to get freed at memory pressure rather than more proactively. But that's a separate issue.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131185513.124644253@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
ca2d3b2c |
| 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
eventfs: Remove unused d_parent pointer field
commit 408600be78cdb8c650a97ecc7ff411cb216811b5 upstream.
It's never used
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-olive
eventfs: Remove unused d_parent pointer field
commit 408600be78cdb8c650a97ecc7ff411cb216811b5 upstream.
It's never used
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131185512.961772428@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
21faa3de |
| 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
tracefs: dentry lookup crapectomy
commit 49304c2b93e4f7468b51ef717cbe637981397115 upstream.
The dentry lookup for eventfs files was very broken, and had lots of signs of the old situation where the
tracefs: dentry lookup crapectomy
commit 49304c2b93e4f7468b51ef717cbe637981397115 upstream.
The dentry lookup for eventfs files was very broken, and had lots of signs of the old situation where the filesystem names were all created statically in the dentry tree, rather than being looked up dynamically based on the eventfs data structures.
You could see it in the naming - how it claimed to "create" dentries rather than just look up the dentries that were given it.
You could see it in various nonsensical and very incorrect operations, like using "simple_lookup()" on the dentries that were passed in, which only results in those dentries becoming negative dentries. Which meant that any other lookup would possibly return ENOENT if it saw that negative dentry before the data was then later filled in.
You could see it in the immense amount of nonsensical code that didn't actually just do lookups.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131233227.73db55e1@gandalf.local.home
Cc: stable@vger.kernel.org Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
f0686a19 |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
tracefs: Zero out the tracefs_inode when allocating it
commit d81786f53aec14fd4d56263145a0635afbc64617 upstream.
eventfs uses the tracefs_inode and assumes that it's already initialized to zero. Th
tracefs: Zero out the tracefs_inode when allocating it
commit d81786f53aec14fd4d56263145a0635afbc64617 upstream.
eventfs uses the tracefs_inode and assumes that it's already initialized to zero. That is, it doesn't set fields to zero (like ti->private) after getting its tracefs_inode. This causes bugs due to stale values.
Just initialize the entire structure to zero on allocation so there isn't any more surprises.
This is a partial fix to access to ti->private. The assignment still needs to be made before the dentry is instantiated.
Link: https://lore.kernel.org/linux-trace-kernel/20240131185512.315825944@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401291043.e62e89dc-oliver.sang@intel.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
35e219f7 |
| 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
tracefs: remove stale update_gid code
commit 29142dc92c37d3259a33aef15b03e6ee25b0d188 upstream.
The 'eventfs_update_gid()' function is no longer called, so remove it (and the helper function it use
tracefs: remove stale update_gid code
commit 29142dc92c37d3259a33aef15b03e6ee25b0d188 upstream.
The 'eventfs_update_gid()' function is no longer called, so remove it (and the helper function it uses).
Link: https://lore.kernel.org/all/CAHk-=wj+DsZZ=2iTUkJ-Nojs9fjYMvPs1NuoM3yK7aTDtJfPYQ@mail.gmail.com/
Fixes: 8186fff7ab64 ("tracefs/eventfs: Use root and instance inodes as default ownership") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
9a187657 |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Save directory inodes in the eventfs_inode structure
commit 834bf76add3e6168038150f162cbccf1fd492a67 upstream.
The eventfs inodes and directories are allocated when referenced. But this le
eventfs: Save directory inodes in the eventfs_inode structure
commit 834bf76add3e6168038150f162cbccf1fd492a67 upstream.
The eventfs inodes and directories are allocated when referenced. But this leaves the issue of keeping consistent inode numbers and the number is only saved in the inode structure itself. When the inode is no longer referenced, it can be freed. When the file that the inode was representing is referenced again, the inode is once again created, but the inode number needs to be the same as it was before.
Just making the inode numbers the same for all files is fine, but that does not work with directories. The find command will check for loops via the inode number and having the same inode number for directories triggers:
# find /sys/kernel/tracing find: File system loop detected; '/sys/kernel/debug/tracing/events/initcall/initcall_finish' is part of the same file system loop as '/sys/kernel/debug/tracing/events/initcall'. [..]
Linus pointed out that the eventfs_inode structure ends with a single 32bit int, and on 64 bit machines, there's likely a 4 byte hole due to alignment. We can use this hole to store the inode number for the eventfs_inode. All directories in eventfs are represented by an eventfs_inode and that data structure can hold its inode number.
That last int was also purposely placed at the end of the structure to prevent holes from within. Now that there's a 4 byte number to hold the inode, both the inode number and the last integer can be moved up in the structure for better cache locality, where the llist and rcu fields can be moved to the end as they are only used when the eventfs_inode is being deleted.
Link: https://lore.kernel.org/all/CAMuHMdXKiorg-jiuKoZpfZyDJ3Ynrfb8=X+c7x0Eewxn-YRdCA@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240122152748.46897388@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: 53c41052ba31 ("eventfs: Have the inodes all for files and directories all be the same") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
628adb84 |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
tracefs/eventfs: Use root and instance inodes as default ownership
commit 8186fff7ab649085e2c60d032d9a20a85af1d87c upstream.
Instead of walking the dentries on mount/remount to update the gid value
tracefs/eventfs: Use root and instance inodes as default ownership
commit 8186fff7ab649085e2c60d032d9a20a85af1d87c upstream.
Instead of walking the dentries on mount/remount to update the gid values of all the dentries if a gid option is specified on mount, just update the root inode. Add .getattr, .setattr, and .permissions on the tracefs inode operations to update the permissions of the files and directories.
For all files and directories in the top level instance:
/sys/kernel/tracing/*
It will use the root inode as the default permissions. The inode that represents: /sys/kernel/tracing (or wherever it is mounted).
When an instance is created:
mkdir /sys/kernel/tracing/instance/foo
The directory "foo" and all its files and directories underneath will use the default of what foo is when it was created. A remount of tracefs will not affect it.
If a user were to modify the permissions of any file or directory in tracefs, it will also no longer be modified by a change in ownership of a remount.
The events directory, if it is in the top level instance, will use the tracefs root inode as the default ownership for itself and all the files and directories below it.
For the events directory in an instance ("foo"), it will keep the ownership of what it was when it was created, and that will be used as the default ownership for the files and directories beneath it.
Link: https://lore.kernel.org/linux-trace-kernel/CAHk-=wjVdGkjDXBbvLn2wbZnqP4UsH46E3gqJ9m7UG6DpX2+WA@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240103215016.1e0c9811@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
ee3bde69 |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Fix bitwise fields for "is_events"
commit fd56cd5f6d76e93356d9520cf9dabffe1e3d1aa0 upstream.
A flag was needed to denote which eventfs_inode was the "events" directory, so a bit was taken
eventfs: Fix bitwise fields for "is_events"
commit fd56cd5f6d76e93356d9520cf9dabffe1e3d1aa0 upstream.
A flag was needed to denote which eventfs_inode was the "events" directory, so a bit was taken from the "nr_entries" field, as there's not that many entries, and 2^30 is plenty. But the bit number for nr_entries was not updated to reflect the bit taken from it, which would add an unnecessary integer to the structure.
Link: https://lore.kernel.org/linux-trace-kernel/20240102151832.7ca87275@gandalf.local.home
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 7e8358edf503e ("eventfs: Fix file and directory uid and gid ownership") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
1b4dfdb3 |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Fix file and directory uid and gid ownership
commit 7e8358edf503e87236c8d07f69ef0ed846dd5112 upstream.
It was reported that when mounting the tracefs file system with a gid other than root
eventfs: Fix file and directory uid and gid ownership
commit 7e8358edf503e87236c8d07f69ef0ed846dd5112 upstream.
It was reported that when mounting the tracefs file system with a gid other than root, the ownership did not carry down to the eventfs directory due to the dynamic nature of it.
A fix was done to solve this, but it had two issues.
(a) if the attr passed into update_inode_attr() was NULL, it didn't do anything. This is true for files that have not had a chown or chgrp done to itself or any of its sibling files, as the attr is allocated for all children when any one needs it.
# umount /sys/kernel/tracing # mount -o rw,seclabel,relatime,gid=1000 -t tracefs nodev /mnt
# ls -ld /mnt/events/sched drwxr-xr-x 28 root rostedt 0 Dec 21 13:12 /mnt/events/sched/
# ls -ld /mnt/events/sched/sched_switch drwxr-xr-x 2 root rostedt 0 Dec 21 13:12 /mnt/events/sched/sched_switch/
But when checking the files:
# ls -l /mnt/events/sched/sched_switch total 0 -rw-r----- 1 root root 0 Dec 21 13:12 enable -rw-r----- 1 root root 0 Dec 21 13:12 filter -r--r----- 1 root root 0 Dec 21 13:12 format -r--r----- 1 root root 0 Dec 21 13:12 hist -r--r----- 1 root root 0 Dec 21 13:12 id -rw-r----- 1 root root 0 Dec 21 13:12 trigger
(b) When the attr does not denote the UID or GID, it defaulted to using the parent uid or gid. This is incorrect as changing the parent uid or gid will automatically change all its children.
# chgrp tracing /mnt/events/timer
# ls -ld /mnt/events/timer drwxr-xr-x 2 root tracing 0 Dec 21 14:34 /mnt/events/timer
# ls -l /mnt/events/timer total 0 -rw-r----- 1 root root 0 Dec 21 14:35 enable -rw-r----- 1 root root 0 Dec 21 14:35 filter drwxr-xr-x 2 root tracing 0 Dec 21 14:35 hrtimer_cancel drwxr-xr-x 2 root tracing 0 Dec 21 14:35 hrtimer_expire_entry drwxr-xr-x 2 root tracing 0 Dec 21 14:35 hrtimer_expire_exit drwxr-xr-x 2 root tracing 0 Dec 21 14:35 hrtimer_init drwxr-xr-x 2 root tracing 0 Dec 21 14:35 hrtimer_start drwxr-xr-x 2 root tracing 0 Dec 21 14:35 itimer_expire drwxr-xr-x 2 root tracing 0 Dec 21 14:35 itimer_state drwxr-xr-x 2 root tracing 0 Dec 21 14:35 tick_stop drwxr-xr-x 2 root tracing 0 Dec 21 14:35 timer_cancel drwxr-xr-x 2 root tracing 0 Dec 21 14:35 timer_expire_entry drwxr-xr-x 2 root tracing 0 Dec 21 14:35 timer_expire_exit drwxr-xr-x 2 root tracing 0 Dec 21 14:35 timer_init drwxr-xr-x 2 root tracing 0 Dec 21 14:35 timer_start
At first it was thought that this could be easily fixed by just making the default ownership of the superblock when it was mounted. But this does not handle the case of:
# chgrp tracing instances # mkdir instances/foo
If the superblock was used, then the group ownership would be that of what it was when it was mounted, when it should instead be "tracing".
Instead, set a flag for the top level eventfs directory ("events") to flag which eventfs_inode belongs to it.
Since the "events" directory's dentry and inode are never freed, it does not need to use its attr field to restore its mode and ownership. Use the this eventfs_inode's attr as the default ownership for all the files and directories underneath it.
When the events eventfs_inode is created, it sets its ownership to its parent uid and gid. As the events directory is created at boot up before it gets mounted, this will always be uid=0 and gid=0. If it's created via an instance, then it will take the ownership of the instance directory.
When the file system is mounted, it will update all the gids if one is specified. This will have a callback to update the events evenfs_inode's default entries.
When a file or directory is created under the events directory, it will walk the ei->dentry parents until it finds the evenfs_inode that belongs to the events directory to retrieve the default uid and gid values.
Link: https://lore.kernel.org/all/CAHk-=wiwQtUHvzwyZucDq8=Gtw+AnwScyLhpFswrQ84PjhoGsg@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20231221190757.7eddbca9@gandalf.local.home
Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Dongliang Cui <cuidongliang390@gmail.com> Cc: Hongyu Jin <hongyu.jin@unisoc.com> Fixes: 0dfc852b6fe3 ("eventfs: Have event files and directories default to parent uid and gid") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Tested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
74a9e56b |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Use simple_recursive_removal() to clean up dentries
commit 407c6726ca71b33330d2d6345d9ea7ebc02575e9 upstream.
Looking at how dentry is removed via the tracefs system, I found that eventfs
eventfs: Use simple_recursive_removal() to clean up dentries
commit 407c6726ca71b33330d2d6345d9ea7ebc02575e9 upstream.
Looking at how dentry is removed via the tracefs system, I found that eventfs does not do everything that it did under tracefs. The tracefs removal of a dentry calls simple_recursive_removal() that does a lot more than a simple d_invalidate().
As it should be a requirement that any eventfs_inode that has a dentry, so does its parent. When removing a eventfs_inode, if it has a dentry, a call to simple_recursive_removal() on that dentry should clean up all the dentries underneath it.
Add WARN_ON_ONCE() to check for the parent having a dentry if any children do.
Link: https://lore.kernel.org/all/20231101022553.GE1957730@ZenIV/ Link: https://lkml.kernel.org/r/20231101172650.552471568@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: 5bdcd5f5331a2 ("eventfs: Implement removal of meta data from eventfs") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
c58673ca |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Delete eventfs_inode when the last dentry is freed
commit 020010fbfa202aa528a52743eba4ab0da3400a4e upstream.
There exists a race between holding a reference of an eventfs_inode dentry and
eventfs: Delete eventfs_inode when the last dentry is freed
commit 020010fbfa202aa528a52743eba4ab0da3400a4e upstream.
There exists a race between holding a reference of an eventfs_inode dentry and the freeing of the eventfs_inode. If user space has a dentry held long enough, it may still be able to access the dentry's eventfs_inode after it has been freed.
To prevent this, have he eventfs_inode freed via the last dput() (or via RCU if the eventfs_inode does not have a dentry).
This means reintroducing the eventfs_inode del_list field at a temporary place to put the eventfs_inode. It needs to mark it as freed (via the list) but also must invalidate the dentry immediately as the return from eventfs_remove_dir() expects that they are. But the dentry invalidation must not be called under the eventfs_mutex, so it must be done after the eventfs_inode is marked as free (put on a deletion list).
Link: https://lkml.kernel.org/r/20231101172650.123479767@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ajay Kaher <akaher@vmware.com> Fixes: 5bdcd5f5331a2 ("eventfs: Implement removal of meta data from eventfs") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
32f4c167 |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Save ownership and mode
commit 28e12c09f5aa081b2d13d1340e3610070b6c624d upstream.
Now that inodes and dentries are created on the fly, they are also reclaimed on memory pressure. Since the
eventfs: Save ownership and mode
commit 28e12c09f5aa081b2d13d1340e3610070b6c624d upstream.
Now that inodes and dentries are created on the fly, they are also reclaimed on memory pressure. Since the ownership and file mode are saved in the inode, if they are freed, any changes to the ownership and mode will be lost.
To counter this, if the user changes the permissions or ownership, save them, and when creating the inodes again, restore those changes.
Link: https://lkml.kernel.org/r/20231101172649.691841445@goodmis.org
Cc: stable@vger.kernel.org Cc: Ajay Kaher <akaher@vmware.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
d2a632ae |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Test for ei->is_freed when accessing ei->dentry
commit 77a06c33a22d13f3a6e31f06f6ee6bca666e6898 upstream.
The eventfs_inode (ei) is protected by SRCU, but the ei->dentry is not. It is prot
eventfs: Test for ei->is_freed when accessing ei->dentry
commit 77a06c33a22d13f3a6e31f06f6ee6bca666e6898 upstream.
The eventfs_inode (ei) is protected by SRCU, but the ei->dentry is not. It is protected by the eventfs_mutex. Anytime the eventfs_mutex is released, and access to the ei->dentry needs to be done, it should first check if ei->is_freed is set under the eventfs_mutex. If it is, then the ei->dentry is invalid and must not be used. The ei->dentry must only be accessed under the eventfs_mutex and after checking if ei->is_freed is set.
When the ei is being freed, it will (under the eventfs_mutex) set is_freed and at the same time move the dentry to a free list to be cleared after the eventfs_mutex is released. This means that any access to the ei->dentry must check first if ei->is_freed is set, because if it is, then the dentry is on its way to be freed.
Also add comments to describe this better.
Link: https://lore.kernel.org/all/CA+G9fYt6pY+tMZEOg=SoEywQOe19fGP3uR15SGowkdK+_X85Cg@mail.gmail.com/ Link: https://lore.kernel.org/all/CA+G9fYuDP3hVQ3t7FfrBAjd_WFVSurMgCepTxunSJf=MTe=6aA@mail.gmail.com/ Link: https://lkml.kernel.org/r/20231101172649.477608228@goodmis.org
Cc: Ajay Kaher <akaher@vmware.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Beau Belgrave <beaub@linux.microsoft.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Tested-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
4bb123ce |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Remove "is_freed" union with rcu head
commit f2f496370afcbc5227d7002da28c74b91fed12ff upstream.
The eventfs_inode->is_freed was a union with the rcu_head with the assumption that when it w
eventfs: Remove "is_freed" union with rcu head
commit f2f496370afcbc5227d7002da28c74b91fed12ff upstream.
The eventfs_inode->is_freed was a union with the rcu_head with the assumption that when it was on the srcu list the head would contain a pointer which would make "is_freed" true. But that was a wrong assumption as the rcu head is a single link list where the last element is NULL.
Instead, split the nr_entries integer so that "is_freed" is one bit and the nr_entries is the next 31 bits. As there shouldn't be more than 10 (currently there's at most 5 to 7 depending on the config), this should not be a problem.
Link: https://lkml.kernel.org/r/20231101172649.049758712@goodmis.org
Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ajay Kaher <akaher@vmware.com> Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
7cac392f |
| 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Fix typo in eventfs_inode union comment
commit 29e06c10702e81a7d0b75020ca514d2f2962704a upstream.
It's eventfs_inode not eventfs_indoe. There's no deer involved!
Link: https://lore.kernel
eventfs: Fix typo in eventfs_inode union comment
commit 29e06c10702e81a7d0b75020ca514d2f2962704a upstream.
It's eventfs_inode not eventfs_indoe. There's no deer involved!
Link: https://lore.kernel.org/linux-trace-kernel/20231024131024.5634c743@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|