cc6f5aa1 | 02-May-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Do not treat events directory different than other directories
commit 22e61e15af731dbe46704c775d2335e56fcef4e9 upstream.
Treat the events directory the same as other directories when it co
eventfs: Do not treat events directory different than other directories
commit 22e61e15af731dbe46704c775d2335e56fcef4e9 upstream.
Treat the events directory the same as other directories when it comes to permissions. The events directory was considered different because it's dentry is persistent, whereas the other directory dentries are created when accessed. But the way tracefs now does its ownership by using the root dentry's permissions as the default permissions, the events directory can get out of sync when a remount is performed setting the group and user permissions.
Remove the special case for the events directory on setting the attributes. This allows the updates caused by remount to work properly as well as simplifies the code.
Link: https://lore.kernel.org/linux-trace-kernel/20240502200906.002923579@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: 8186fff7ab649 ("tracefs/eventfs: Use root and instance inodes as default ownership") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
0c56915c | 02-May-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
tracefs: Still use mount point as default permissions for instances
commit 6599bd5517be66c8344f869f3ca3a91bc10f2b9e upstream.
If the instances directory's permissions were never change, then have i
tracefs: Still use mount point as default permissions for instances
commit 6599bd5517be66c8344f869f3ca3a91bc10f2b9e upstream.
If the instances directory's permissions were never change, then have it and its children use the mount point permissions as the default.
Currently, the permissions of instance directories are determined by the instance directory's permissions itself. But if the tracefs file system is remounted and changes the permissions, the instance directory and its children should use the new permission.
But because both the instance directory and its children use the instance directory's inode for permissions, it misses the update.
To demonstrate this:
# cd /sys/kernel/tracing/ # mkdir instances/foo # ls -ld instances/foo drwxr-x--- 5 root root 0 May 1 19:07 instances/foo # ls -ld instances drwxr-x--- 3 root root 0 May 1 18:57 instances # ls -ld current_tracer -rw-r----- 1 root root 0 May 1 18:57 current_tracer
# mount -o remount,gid=1002 . # ls -ld instances drwxr-x--- 3 root root 0 May 1 18:57 instances # ls -ld instances/foo/ drwxr-x--- 5 root root 0 May 1 19:07 instances/foo/ # ls -ld current_tracer -rw-r----- 1 root lkp 0 May 1 18:57 current_tracer
Notice that changing the group id to that of "lkp" did not affect the instances directory nor its children. It should have been:
# ls -ld current_tracer -rw-r----- 1 root root 0 May 1 19:19 current_tracer # ls -ld instances/foo/ drwxr-x--- 5 root root 0 May 1 19:25 instances/foo/ # ls -ld instances drwxr-x--- 3 root root 0 May 1 19:19 instances
# mount -o remount,gid=1002 . # ls -ld current_tracer -rw-r----- 1 root lkp 0 May 1 19:19 current_tracer # ls -ld instances drwxr-x--- 3 root lkp 0 May 1 19:19 instances # ls -ld instances/foo/ drwxr-x--- 5 root lkp 0 May 1 19:25 instances/foo/
Where all files were updated by the remount gid update.
Link: https://lore.kernel.org/linux-trace-kernel/20240502200905.686838327@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: 8186fff7ab649 ("tracefs/eventfs: Use root and instance inodes as default ownership") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
a49e9c72 | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Keep all directory links at 1
commit ca185770db914869ff9fe773bac5e0e5e4165b83 upstream.
The directory link count in eventfs was somewhat bogus. It was only being updated when a directory c
eventfs: Keep all directory links at 1
commit ca185770db914869ff9fe773bac5e0e5e4165b83 upstream.
The directory link count in eventfs was somewhat bogus. It was only being updated when a directory child was being looked up and not on creation.
One solution would be to update in get_attr() the link count by iterating the ei->children list and then adding 2. But that could slow down simple stat() calls, especially if it's done on all directories in eventfs.
Another solution would be to add a parent pointer to the eventfs_inode and keep track of the number of sub directories it has on creation. But this adds overhead for something not really worthwhile.
The solution decided upon is to keep all directory links in eventfs as 1. This tells user space not to rely on the hard links of directories. Which in this case it shouldn't.
Link: https://lore.kernel.org/linux-trace-kernel/20240201002719.GS2087318@ZenIV/ Link: https://lore.kernel.org/linux-trace-kernel/20240201161617.339968298@goodmis.org
Cc: stable@vger.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
9bb8131a | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Remove fsnotify*() functions from lookup()
commit 12d823b31fadf47c8f36ecada7abac5f903cac33 upstream.
The dentries and inodes are created when referenced in the lookup code. There's no reas
eventfs: Remove fsnotify*() functions from lookup()
commit 12d823b31fadf47c8f36ecada7abac5f903cac33 upstream.
The dentries and inodes are created when referenced in the lookup code. There's no reason to call fsnotify_*() functions when they are created by a reference. It doesn't make any sense.
Link: https://lore.kernel.org/linux-trace-kernel/20240201002719.GS2087318@ZenIV/ Link: https://lore.kernel.org/linux-trace-kernel/20240201161617.166973329@goodmis.org
Cc: stable@vger.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Fixes: a376007917776 ("eventfs: Implement functions to create files and dirs when accessed"); Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
ed823ca4 | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Restructure eventfs_inode structure to be more condensed
commit 264424dfdd5cbd92bc5b5ddf93944929fc877fac upstream.
Some of the eventfs_inode structure has holes in it. Rework the structure
eventfs: Restructure eventfs_inode structure to be more condensed
commit 264424dfdd5cbd92bc5b5ddf93944929fc877fac upstream.
Some of the eventfs_inode structure has holes in it. Rework the structure to be a bit more condensed, and also remove the no longer used llist field.
Link: https://lore.kernel.org/linux-trace-kernel/20240201161617.002321438@goodmis.org
Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
5c3ea7df | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Warn if an eventfs_inode is freed without is_freed being set
commit 5a49f996046ba947466bc7461e4b19c4d1daf978 upstream.
There should never be a case where an evenfs_inode is being freed wit
eventfs: Warn if an eventfs_inode is freed without is_freed being set
commit 5a49f996046ba947466bc7461e4b19c4d1daf978 upstream.
There should never be a case where an evenfs_inode is being freed without is_freed being set. Add a WARN_ON_ONCE() if it ever happens. That would mean there was one too many put_ei()s.
Link: https://lore.kernel.org/linux-trace-kernel/20240201161616.843551963@goodmis.org
Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
5dfb0410 | 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
eventfs: Get rid of dentry pointers without refcounts
commit 43aa6f97c2d03a52c1ddb86768575fc84344bdbb upstream.
The eventfs inode had pointers to dentries (and child dentries) without actually hold
eventfs: Get rid of dentry pointers without refcounts
commit 43aa6f97c2d03a52c1ddb86768575fc84344bdbb upstream.
The eventfs inode had pointers to dentries (and child dentries) without actually holding a refcount on said pointer. That is fundamentally broken, and while eventfs tried to then maintain coherence with dentries going away by hooking into the '.d_iput' callback, that doesn't actually work since it's not ordered wrt lookups.
There were two reasonms why eventfs tried to keep a pointer to a dentry:
- the creation of a 'events' directory would actually have a stable dentry pointer that it created with tracefs_start_creating().
And it needed that dentry when tearing it all down again in eventfs_remove_events_dir().
This use is actually ok, because the special top-level events directory dentries are actually stable, not just a temporary cache of the eventfs data structures.
- the 'eventfs_inode' (aka ei) needs to stay around as long as there are dentries that refer to it.
It then used these dentry pointers as a replacement for doing reference counting: it would try to make sure that there was only ever one dentry associated with an event_inode, and keep a child dentry array around to see which dentries might still refer to the parent ei.
This gets rid of the invalid dentry pointer use, and renames the one valid case to a different name to make it clear that it's not just any random dentry.
The magic child dentry array that is kind of a "reverse reference list" is simply replaced by having child dentries take a ref to the ei. As does the directory dentries. That makes the broken use case go away.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131185513.280463000@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
c4619205 | 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
eventfs: Clean up dentry ops and add revalidate function
commit 8dce06e98c70a7fcbb4bca7d90faf40522e65c58 upstream.
In order for the dentries to stay up-to-date with the eventfs changes, just add a
eventfs: Clean up dentry ops and add revalidate function
commit 8dce06e98c70a7fcbb4bca7d90faf40522e65c58 upstream.
In order for the dentries to stay up-to-date with the eventfs changes, just add a 'd_revalidate' function that checks the 'is_freed' bit.
Also, clean up the dentry release to actually use d_release() rather than the slightly odd d_iput() function. We don't care about the inode, all we want to do is to get rid of the refcount to the eventfs data added by dentry->d_fsdata.
It would probably be cleaner to make eventfs its own filesystem, or at least set its own dentry ops when looking up eventfs files. But as it is, only eventfs dentries use d_fsdata, so we don't really need to split these things up by use.
Another thing that might be worth doing is to make all eventfs lookups mark their dentries as not worth caching. We could do that with d_delete(), but the DCACHE_DONTCACHE flag would likely be even better.
As it is, the dentries are all freeable, but they only tend to get freed at memory pressure rather than more proactively. But that's a separate issue.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131185513.124644253@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
ca2d3b2c | 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
eventfs: Remove unused d_parent pointer field
commit 408600be78cdb8c650a97ecc7ff411cb216811b5 upstream.
It's never used
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-olive
eventfs: Remove unused d_parent pointer field
commit 408600be78cdb8c650a97ecc7ff411cb216811b5 upstream.
It's never used
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131185512.961772428@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
21faa3de | 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
tracefs: dentry lookup crapectomy
commit 49304c2b93e4f7468b51ef717cbe637981397115 upstream.
The dentry lookup for eventfs files was very broken, and had lots of signs of the old situation where the
tracefs: dentry lookup crapectomy
commit 49304c2b93e4f7468b51ef717cbe637981397115 upstream.
The dentry lookup for eventfs files was very broken, and had lots of signs of the old situation where the filesystem names were all created statically in the dentry tree, rather than being looked up dynamically based on the eventfs data structures.
You could see it in the naming - how it claimed to "create" dentries rather than just look up the dentries that were given it.
You could see it in various nonsensical and very incorrect operations, like using "simple_lookup()" on the dentries that were passed in, which only results in those dentries becoming negative dentries. Which meant that any other lookup would possibly return ENOENT if it saw that negative dentry before the data was then later filled in.
You could see it in the immense amount of nonsensical code that didn't actually just do lookups.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131233227.73db55e1@gandalf.local.home
Cc: stable@vger.kernel.org Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
4928d0e3 | 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
tracefs: Avoid using the ei->dentry pointer unnecessarily
commit 99c001cb617df409dac275a059d6c3f187a2da7a upstream.
The eventfs_find_events() code tries to walk up the tree to find the event direct
tracefs: Avoid using the ei->dentry pointer unnecessarily
commit 99c001cb617df409dac275a059d6c3f187a2da7a upstream.
The eventfs_find_events() code tries to walk up the tree to find the event directory that a dentry belongs to, in order to then find the eventfs inode that is associated with that event directory.
However, it uses an odd combination of walking the dentry parent, looking up the eventfs inode associated with that, and then looking up the dentry from there. Repeat.
But the code shouldn't have back-pointers to dentries in the first place, and it should just walk the dentry parenthood chain directly.
Similarly, 'set_top_events_ownership()' looks up the dentry from the eventfs inode, but the only reason it wants a dentry is to look up the superblock in order to look up the root dentry.
But it already has the real filesystem inode, which has that same superblock pointer. So just pass in the superblock pointer using the information that's already there, instead of looking up extraneous data that is irrelevant.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.sang@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240131185512.638645365@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
d1bcde94 | 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
eventfs: Initialize the tracefs inode properly
commit 4fa4b010b83fb2f837b5ef79e38072a79e96e4f1 upstream.
The tracefs-specific fields in the inode were not initialized before the inode was exposed t
eventfs: Initialize the tracefs inode properly
commit 4fa4b010b83fb2f837b5ef79e38072a79e96e4f1 upstream.
The tracefs-specific fields in the inode were not initialized before the inode was exposed to others through the dentry with 'd_instantiate()'.
Move the field initializations up to before the d_instantiate.
Link: https://lore.kernel.org/linux-trace-kernel/20240131185512.478449628@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401291043.e62e89dc-oliver.sang@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
f0686a19 | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
tracefs: Zero out the tracefs_inode when allocating it
commit d81786f53aec14fd4d56263145a0635afbc64617 upstream.
eventfs uses the tracefs_inode and assumes that it's already initialized to zero. Th
tracefs: Zero out the tracefs_inode when allocating it
commit d81786f53aec14fd4d56263145a0635afbc64617 upstream.
eventfs uses the tracefs_inode and assumes that it's already initialized to zero. That is, it doesn't set fields to zero (like ti->private) after getting its tracefs_inode. This causes bugs due to stale values.
Just initialize the entire structure to zero on allocation so there isn't any more surprises.
This is a partial fix to access to ti->private. The assignment still needs to be made before the dentry is instantiated.
Link: https://lore.kernel.org/linux-trace-kernel/20240131185512.315825944@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401291043.e62e89dc-oliver.sang@intel.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
35e219f7 | 06-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
tracefs: remove stale update_gid code
commit 29142dc92c37d3259a33aef15b03e6ee25b0d188 upstream.
The 'eventfs_update_gid()' function is no longer called, so remove it (and the helper function it use
tracefs: remove stale update_gid code
commit 29142dc92c37d3259a33aef15b03e6ee25b0d188 upstream.
The 'eventfs_update_gid()' function is no longer called, so remove it (and the helper function it uses).
Link: https://lore.kernel.org/all/CAHk-=wj+DsZZ=2iTUkJ-Nojs9fjYMvPs1NuoM3yK7aTDtJfPYQ@mail.gmail.com/
Fixes: 8186fff7ab64 ("tracefs/eventfs: Use root and instance inodes as default ownership") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
9a187657 | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Save directory inodes in the eventfs_inode structure
commit 834bf76add3e6168038150f162cbccf1fd492a67 upstream.
The eventfs inodes and directories are allocated when referenced. But this le
eventfs: Save directory inodes in the eventfs_inode structure
commit 834bf76add3e6168038150f162cbccf1fd492a67 upstream.
The eventfs inodes and directories are allocated when referenced. But this leaves the issue of keeping consistent inode numbers and the number is only saved in the inode structure itself. When the inode is no longer referenced, it can be freed. When the file that the inode was representing is referenced again, the inode is once again created, but the inode number needs to be the same as it was before.
Just making the inode numbers the same for all files is fine, but that does not work with directories. The find command will check for loops via the inode number and having the same inode number for directories triggers:
# find /sys/kernel/tracing find: File system loop detected; '/sys/kernel/debug/tracing/events/initcall/initcall_finish' is part of the same file system loop as '/sys/kernel/debug/tracing/events/initcall'. [..]
Linus pointed out that the eventfs_inode structure ends with a single 32bit int, and on 64 bit machines, there's likely a 4 byte hole due to alignment. We can use this hole to store the inode number for the eventfs_inode. All directories in eventfs are represented by an eventfs_inode and that data structure can hold its inode number.
That last int was also purposely placed at the end of the structure to prevent holes from within. Now that there's a 4 byte number to hold the inode, both the inode number and the last integer can be moved up in the structure for better cache locality, where the llist and rcu fields can be moved to the end as they are only used when the eventfs_inode is being deleted.
Link: https://lore.kernel.org/all/CAMuHMdXKiorg-jiuKoZpfZyDJ3Ynrfb8=X+c7x0Eewxn-YRdCA@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240122152748.46897388@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: 53c41052ba31 ("eventfs: Have the inodes all for files and directories all be the same") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
1f20155a | 06-Feb-2024 |
Erick Archer <erick.archer@gmx.com> |
eventfs: Use kcalloc() instead of kzalloc()
commit 1057066009c4325bb1d8430c9274894d0860e7c3 upstream.
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documen
eventfs: Use kcalloc() instead of kzalloc()
commit 1057066009c4325bb1d8430c9274894d0860e7c3 upstream.
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors.
So, use the purpose specific kcalloc() function instead of the argument size * count in the kzalloc() function.
[1] https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
Link: https://lore.kernel.org/linux-trace-kernel/20240115181658.4562-1-erick.archer@gmx.com
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://github.com/KSPP/linux/issues/162 Signed-off-by: Erick Archer <erick.archer@gmx.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
c55d11ea | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Do not create dentries nor inodes in iterate_shared
commit 852e46e239ee6db3cd220614cf8bce96e79227c2 upstream.
The original eventfs code added a wrapper around the dcache_readdir open callb
eventfs: Do not create dentries nor inodes in iterate_shared
commit 852e46e239ee6db3cd220614cf8bce96e79227c2 upstream.
The original eventfs code added a wrapper around the dcache_readdir open callback and created all the dentries and inodes at open, and increment their ref count. A wrapper was added around the dcache_readdir release function to decrement all the ref counts of those created inodes and dentries. But this proved to be buggy[1] for when a kprobe was created during a dir read, it would create a dentry between the open and the release, and because the release would decrement all ref counts of all files and directories, that would include the kprobe directory that was not there to have its ref count incremented in open. This would cause the ref count to go to negative and later crash the kernel.
To solve this, the dentries and inodes that were created and had their ref count upped in open needed to be saved. That list needed to be passed from the open to the release, so that the release would only decrement the ref counts of the entries that were incremented in the open.
Unfortunately, the dcache_readdir logic was already using the file->private_data, which is the only field that can be used to pass information from the open to the release. What was done was the eventfs created another descriptor that had a void pointer to save the dcache_readdir pointer, and it wrapped all the callbacks, so that it could save the list of entries that had their ref counts incremented in the open, and pass it to the release. The wrapped callbacks would just put back the dcache_readdir pointer and call the functions it used so it could still use its data[2].
But Linus had an issue with the "hijacking" of the file->private_data (unfortunately this discussion was on a security list, so no public link). Which we finally agreed on doing everything within the iterate_shared callback and leave the dcache_readdir out of it[3]. All the information needed for the getents() could be created then.
But this ended up being buggy too[4]. The iterate_shared callback was not the right place to create the dentries and inodes. Even Christian Brauner had issues with that[5].
An attempt was to go back to creating the inodes and dentries at the open, create an array to store the information in the file->private_data, and pass that information to the other callbacks.[6]
The difference between that and the original method, is that it does not use dcache_readdir. It also does not up the ref counts of the dentries and pass them. Instead, it creates an array of a structure that saves the dentry's name and inode number. That information is used in the iterate_shared callback, and the array is freed in the dir release. The dentries and inodes created in the open are not used for the iterate_share or release callbacks. Just their names and inode numbers.
Linus did not like that either[7] and just wanted to remove the dentries being created in iterate_shared and use the hard coded inode numbers.
[ All this while Linus enjoyed an unexpected vacation during the merge window due to lack of power. ]
[1] https://lore.kernel.org/linux-trace-kernel/20230919211804.230edf1e@gandalf.local.home/ [2] https://lore.kernel.org/linux-trace-kernel/20230922163446.1431d4fa@gandalf.local.home/ [3] https://lore.kernel.org/linux-trace-kernel/20240104015435.682218477@goodmis.org/ [4] https://lore.kernel.org/all/202401152142.bfc28861-oliver.sang@intel.com/ [5] https://lore.kernel.org/all/20240111-unzahl-gefegt-433acb8a841d@brauner/ [6] https://lore.kernel.org/all/20240116114711.7e8637be@gandalf.local.home/ [7] https://lore.kernel.org/all/20240116170154.5bf0a250@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20240116211353.573784051@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Fixes: 493ec81a8fb8 ("eventfs: Stop using dcache_readdir() for getdents()") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401152142.bfc28861-oliver.sang@intel.com Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
4e8731d2 | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Have the inodes all for files and directories all be the same
commit 53c41052ba3121761e6f62a813961164532a214f upstream.
The dentries and inodes are created in the readdir for the sole purp
eventfs: Have the inodes all for files and directories all be the same
commit 53c41052ba3121761e6f62a813961164532a214f upstream.
The dentries and inodes are created in the readdir for the sole purpose of getting a consistent inode number. Linus stated that is unnecessary, and that all inodes can have the same inode number. For a virtual file system they are pretty meaningless.
Instead use a single unique inode number for all files and one for all directories.
Link: https://lore.kernel.org/all/20240116133753.2808d45e@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20240116211353.412180363@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
e638899f | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Shortcut eventfs_iterate() by skipping entries already read
commit 1de94b52d5e8d8b32f0252f14fad1f1edc2e71f1 upstream.
As the ei->entries array is fixed for the duration of the eventfs_inod
eventfs: Shortcut eventfs_iterate() by skipping entries already read
commit 1de94b52d5e8d8b32f0252f14fad1f1edc2e71f1 upstream.
As the ei->entries array is fixed for the duration of the eventfs_inode, it can be used to skip over already read entries in eventfs_iterate().
That is, if ctx->pos is greater than zero, there's no reason in doing the loop across the ei->entries array for the entries less than ctx->pos. Instead, start the lookup of the entries at the current ctx->pos.
Link: https://lore.kernel.org/all/CAHk-=wiKwDUDv3+jCsv-uacDcHDVTYsXtBR9=6sGM5mqX+DhOg@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240104220048.494956957@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
f3f41f44 | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Read ei->entries before ei->children in eventfs_iterate()
commit 704f960dbee2f1634f4b4e16f208cb16eaf41c1e upstream.
In order to apply a shortcut to skip over the current ctx->pos immediate
eventfs: Read ei->entries before ei->children in eventfs_iterate()
commit 704f960dbee2f1634f4b4e16f208cb16eaf41c1e upstream.
In order to apply a shortcut to skip over the current ctx->pos immediately, by using the ei->entries array, the reading of that array should be first. Moving the array reading before the linked list reading will make the shortcut change diff nicer to read.
Link: https://lore.kernel.org/all/CAHk-=wiKwDUDv3+jCsv-uacDcHDVTYsXtBR9=6sGM5mqX+DhOg@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240104220048.333115095@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
82820a2d | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Do ctx->pos update for all iterations in eventfs_iterate()
commit 1e4624eb5a0ecaae0d2c4e3019bece119725bb98 upstream.
The ctx->pos was only updated when it added an entry, but the "skip to
eventfs: Do ctx->pos update for all iterations in eventfs_iterate()
commit 1e4624eb5a0ecaae0d2c4e3019bece119725bb98 upstream.
The ctx->pos was only updated when it added an entry, but the "skip to current pos" check (c--) happened for every loop regardless of if the entry was added or not. This inconsistency caused readdir to be incorrect.
It was due to:
for (i = 0; i < ei->nr_entries; i++) {
if (c > 0) { c--; continue; }
mutex_lock(&eventfs_mutex); /* If ei->is_freed then just bail here, nothing more to do */ if (ei->is_freed) { mutex_unlock(&eventfs_mutex); goto out; } r = entry->callback(name, &mode, &cdata, &fops); mutex_unlock(&eventfs_mutex);
[..] ctx->pos++; }
But this can cause the iterator to return a file that was already read. That's because of the way the callback() works. Some events may not have all files, and the callback can return 0 to tell eventfs to skip the file for this directory.
for instance, we have:
# ls /sys/kernel/tracing/events/ftrace/function format hist hist_debug id inject
and
# ls /sys/kernel/tracing/events/sched/sched_switch/ enable filter format hist hist_debug id inject trigger
Where the function directory is missing "enable", "filter" and "trigger". That's because the callback() for events has:
static int event_callback(const char *name, umode_t *mode, void **data, const struct file_operations **fops) { struct trace_event_file *file = *data; struct trace_event_call *call = file->event_call;
[..]
/* * Only event directories that can be enabled should have * triggers or filters, with the exception of the "print" * event that can have a "trigger" file. */ if (!(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE)) { if (call->class->reg && strcmp(name, "enable") == 0) { *mode = TRACE_MODE_WRITE; *fops = &ftrace_enable_fops; return 1; }
if (strcmp(name, "filter") == 0) { *mode = TRACE_MODE_WRITE; *fops = &ftrace_event_filter_fops; return 1; } }
if (!(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE) || strcmp(trace_event_name(call), "print") == 0) { if (strcmp(name, "trigger") == 0) { *mode = TRACE_MODE_WRITE; *fops = &event_trigger_fops; return 1; } } [..] return 0; }
Where the function event has the TRACE_EVENT_FL_IGNORE_ENABLE set.
This means that the entries array elements for "enable", "filter" and "trigger" when called on the function event will have the callback return 0 and not 1, to tell eventfs to skip these files for it.
Because the "skip to current ctx->pos" check happened for all entries, but the ctx->pos++ only happened to entries that exist, it would confuse the reading of a directory. Which would cause:
# ls /sys/kernel/tracing/events/ftrace/function/ format hist hist hist_debug hist_debug id inject inject
The missing "enable", "filter" and "trigger" caused ls to show "hist", "hist_debug" and "inject" twice.
Update the ctx->pos for every iteration to keep its update and the "skip" update consistent. This also means that on error, the ctx->pos needs to be decremented if it was incremented without adding something.
Link: https://lore.kernel.org/all/20240104150500.38b15a62@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20240104220048.172295263@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: 493ec81a8fb8e ("eventfs: Stop using dcache_readdir() for getdents()") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
98102764 | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Have eventfs_iterate() stop immediately if ei->is_freed is set
commit e109deadb73318cf4a3bd61287d969f705df278f upstream.
If ei->is_freed is set in eventfs_iterate(), it means that the dire
eventfs: Have eventfs_iterate() stop immediately if ei->is_freed is set
commit e109deadb73318cf4a3bd61287d969f705df278f upstream.
If ei->is_freed is set in eventfs_iterate(), it means that the directory that is being iterated on is in the process of being freed. Just exit the loop immediately when that is ever detected, and separate out the return of the entry->callback() from ei->is_freed.
Link: https://lore.kernel.org/linux-trace-kernel/20240104220048.016261289@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
628adb84 | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
tracefs/eventfs: Use root and instance inodes as default ownership
commit 8186fff7ab649085e2c60d032d9a20a85af1d87c upstream.
Instead of walking the dentries on mount/remount to update the gid value
tracefs/eventfs: Use root and instance inodes as default ownership
commit 8186fff7ab649085e2c60d032d9a20a85af1d87c upstream.
Instead of walking the dentries on mount/remount to update the gid values of all the dentries if a gid option is specified on mount, just update the root inode. Add .getattr, .setattr, and .permissions on the tracefs inode operations to update the permissions of the files and directories.
For all files and directories in the top level instance:
/sys/kernel/tracing/*
It will use the root inode as the default permissions. The inode that represents: /sys/kernel/tracing (or wherever it is mounted).
When an instance is created:
mkdir /sys/kernel/tracing/instance/foo
The directory "foo" and all its files and directories underneath will use the default of what foo is when it was created. A remount of tracefs will not affect it.
If a user were to modify the permissions of any file or directory in tracefs, it will also no longer be modified by a change in ownership of a remount.
The events directory, if it is in the top level instance, will use the tracefs root inode as the default ownership for itself and all the files and directories below it.
For the events directory in an instance ("foo"), it will keep the ownership of what it was when it was created, and that will be used as the default ownership for the files and directories beneath it.
Link: https://lore.kernel.org/linux-trace-kernel/CAHk-=wjVdGkjDXBbvLn2wbZnqP4UsH46E3gqJ9m7UG6DpX2+WA@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240103215016.1e0c9811@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
1bfdd54a | 06-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
eventfs: Stop using dcache_readdir() for getdents()
commit 493ec81a8fb8e4ada6f223b8b73791a1280d4774 upstream.
The eventfs creates dynamically allocated dentries and inodes. Using the dcache_readdir
eventfs: Stop using dcache_readdir() for getdents()
commit 493ec81a8fb8e4ada6f223b8b73791a1280d4774 upstream.
The eventfs creates dynamically allocated dentries and inodes. Using the dcache_readdir() logic for its own directory lookups requires hiding the cursor of the dcache logic and playing games to allow the dcache_readdir() to still have access to the cursor while the eventfs saved what it created and what it needs to release.
Instead, just have eventfs have its own iterate_shared callback function that will fill in the dent entries. This simplifies the code quite a bit.
Link: https://lore.kernel.org/linux-trace-kernel/20240104015435.682218477@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ajay Kaher <akaher@vmware.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|