#
84158b7f |
| 25-Jan-2022 |
Jann Horn <jannh@google.com> |
coredump: Also dump first pages of non-executable ELF libraries
When I rewrote the VMA dumping logic for coredumps, I changed it to recognize ELF library mappings based on the file being executable
coredump: Also dump first pages of non-executable ELF libraries
When I rewrote the VMA dumping logic for coredumps, I changed it to recognize ELF library mappings based on the file being executable instead of the mapping having an ELF header. But turns out, distros ship many ELF libraries as non-executable, so the heuristic goes wrong...
Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of any offset-0 readable mapping that starts with the ELF magic.
This fix is technically layer-breaking a bit, because it checks for something ELF-specific in fs/coredump.c; but since we probably want to share this between standard ELF and FDPIC ELF anyway, I guess it's fine? And this also keeps the change small for backporting.
Cc: stable@vger.kernel.org Fixes: 429a22e776a2 ("coredump: rework elf/elf_fdpic vma_dump_size() into common helper") Reported-by: Bill Messmer <wmessmer@microsoft.com> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220126025739.2014888-1-jannh@google.com
show more ...
|
#
f0bc21b2 |
| 22-Jan-2022 |
Xiaoming Ni <nixiaoming@huawei.com> |
fs/coredump: move coredump sysctls into its own file
This moves the fs/coredump.c respective sysctls to its own file.
Link: https://lkml.kernel.org/r/20211129211943.640266-6-mcgrof@kernel.org Signe
fs/coredump: move coredump sysctls into its own file
This moves the fs/coredump.c respective sysctls to its own file.
Link: https://lkml.kernel.org/r/20211129211943.640266-6-mcgrof@kernel.org Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Antti Palosaari <crope@iki.fi> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Lukas Middendorf <kernel@tuxforce.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Stephen Kitt <steve@sk2.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
49697335 |
| 24-Jun-2021 |
Eric W. Biederman <ebiederm@xmission.com> |
signal: Remove the helper signal_group_exit
This helper is misleading. It tests for an ongoing exec as well as the process having received a fatal signal.
Sometimes it is appropriate to treat an o
signal: Remove the helper signal_group_exit
This helper is misleading. It tests for an ongoing exec as well as the process having received a fatal signal.
Sometimes it is appropriate to treat an on-going exec differently than a process that is shutting down due to a fatal signal. In particular taking the fast path out of exit_signals instead of retargeting signals is not appropriate during exec, and not changing the the exit code in do_group_exit during exec.
Removing the helper makes it more obvious what is going on as both cases must be coded for explicitly.
While removing the helper fix the two cases where I have observed using signal_group_exit resulted in the wrong result.
In exit_signals only test for SIGNAL_GROUP_EXIT so that signals are retargetted during an exec.
In do_group_exit use 0 as the exit code during an exec as de_thread does not set group_exit_code. As best as I can determine group_exit_code has been is set to 0 most of the time during de_thread. During a thread group stop group_exit_code is set to the stop signal and when the thread group receives SIGCONT group_exit_code is reset to 0.
Link: https://lkml.kernel.org/r/20211213225350.27481-8-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
show more ...
|
#
6ac79ec5 |
| 19-Nov-2021 |
Eric W. Biederman <ebiederm@xmission.com> |
coredump: Stop setting signal->group_exit_task
Currently the coredump code sets group_exit_task so that signal_group_exit() will return true during a coredump. Now that the coredump code always set
coredump: Stop setting signal->group_exit_task
Currently the coredump code sets group_exit_task so that signal_group_exit() will return true during a coredump. Now that the coredump code always sets SIGNAL_GROUP_EXIT there is no longer a need to set signal->group_exit_task.
Link: https://lkml.kernel.org/r/20211213225350.27481-6-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
show more ...
|
#
2f824d4d |
| 08-Jan-2022 |
Eric W. Biederman <ebiederm@xmission.com> |
signal: Remove SIGNAL_GROUP_COREDUMP
After the previous cleanups "signal->core_state" is set whenever SIGNAL_GROUP_COREDUMP is set and "signal->core_state" is tested whenver the code wants to know i
signal: Remove SIGNAL_GROUP_COREDUMP
After the previous cleanups "signal->core_state" is set whenever SIGNAL_GROUP_COREDUMP is set and "signal->core_state" is tested whenver the code wants to know if a coredump is in progress. The remaining tests of SIGNAL_GROUP_COREDUMP also test to see if SIGNAL_GROUP_EXIT is set. Similarly the only place that sets SIGNAL_GROUP_COREDUMP also sets SIGNAL_GROUP_EXIT.
Which makes SIGNAL_GROUP_COREDUMP unecessary and redundant. So stop setting SIGNAL_GROUP_COREDUMP, stop testing SIGNAL_GROUP_COREDUMP, and remove it's definition.
With the setting of SIGNAL_GROUP_COREDUMP gone, coredump_finish no longer needs to clear SIGNAL_GROUP_COREDUMP out of signal->flags by setting SIGNAL_GROUP_EXIT.
Link: https://lkml.kernel.org/r/20211213225350.27481-5-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
show more ...
|
#
752dc970 |
| 08-Jan-2022 |
Eric W. Biederman <ebiederm@xmission.com> |
signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
There are only a few places that test SIGNAL_GROUP_EXIT and are not also already testing SIGNAL_GROUP_COREDUMP.
This will not affect th
signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
There are only a few places that test SIGNAL_GROUP_EXIT and are not also already testing SIGNAL_GROUP_COREDUMP.
This will not affect the callers of signal_group_exit as zap_process also sets group_exit_task so signal_group_exit will continue to return true at the same times.
This does not affect wait_task_zombie as the none of the threads wind up in EXIT_ZOMBIE state during a coredump.
This does not affect oom_kill.c:__task_will_free_mem as sig->core_state is tested and handled before SIGNAL_GROUP_EXIT is tested for.
This does not affect complete_signal as signal->core_state is tested for to ensure the coredump case is handled appropriately.
Link: https://lkml.kernel.org/r/20211213225350.27481-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
show more ...
|
#
0258b5fd |
| 22-Sep-2021 |
Eric W. Biederman <ebiederm@xmission.com> |
coredump: Limit coredumps to a single thread group
Today when a signal is delivered with a handler of SIG_DFL whose default behavior is to generate a core dump not only that process but every proces
coredump: Limit coredumps to a single thread group
Today when a signal is delivered with a handler of SIG_DFL whose default behavior is to generate a core dump not only that process but every process that shares the mm is killed.
In the case of vfork this looks like a real world problem. Consider the following well defined sequence.
if (vfork() == 0) { execve(...); _exit(EXIT_FAILURE); }
If a signal that generates a core dump is received after vfork but before the execve changes the mm the process that called vfork will also be killed (as the mm is shared).
Similarly if the execve fails after the point of no return the kernel delivers SIGSEGV which will kill both the exec'ing process and because the mm is shared the process that called vfork as well.
As far as I can tell this behavior is a violation of people's reasonable expectations, POSIX, and is unnecessarily fragile when the system is low on memory.
Solve this by making a userspace visible change to only kill a single process/thread group. This is possible because Jann Horn recently modified[1] the coredump code so that the mm can safely be modified while the coredump is happening. With LinuxThreads long gone I don't expect anyone to have a notice this behavior change in practice.
To accomplish this move the core_state pointer from mm_struct to signal_struct, which allows different thread groups to coredump simultatenously.
In zap_threads remove the work to kill anything except for the current thread group.
v2: Remove core_state from the VM_BUG_ON_MM print to fix compile failure when CONFIG_DEBUG_VM is enabled. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
[1] a07279c9a8cd ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Fixes: d89f3847def4 ("[PATCH] thread-aware coredumps, 2.5.43-C3") History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Link: https://lkml.kernel.org/r/87y27mvnke.fsf@disp2133 Link: https://lkml.kernel.org/r/20211007144701.67592574@canb.auug.org.au Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
show more ...
|
#
92307383 |
| 01-Sep-2021 |
Eric W. Biederman <ebiederm@xmission.com> |
coredump: Don't perform any cleanups before dumping core
Rename coredump_exit_mm to coredump_task_exit and call it from do_exit before PTRACE_EVENT_EXIT, and before any cleanup work for a task happ
coredump: Don't perform any cleanups before dumping core
Rename coredump_exit_mm to coredump_task_exit and call it from do_exit before PTRACE_EVENT_EXIT, and before any cleanup work for a task happens. This ensures that an accurate copy of the process can be captured in the coredump as no cleanup for the process happens before the coredump completes. This also ensures that PTRACE_EVENT_EXIT will not be visited by any thread until the coredump is complete.
Add a new flag PF_POSTCOREDUMP so that tasks that have passed through coredump_task_exit can be recognized and ignored in zap_process.
Now that all of the coredumping happens before exit_mm remove code to test for a coredump in progress from mm_release.
Replace "may_ptrace_stop()" with a simple test of "current->ptrace". The other tests in may_ptrace_stop all concern avoiding stopping during a coredump. These tests are no longer necessary as it is now guaranteed that fatal_signal_pending will be set if the code enters ptrace_stop during a coredump. The code in ptrace_stop is guaranteed not to stop if fatal_signal_pending returns true.
Until this change "ptrace_event(PTRACE_EVENT_EXIT)" could call ptrace_stop without fatal_signal_pending being true, as signals are dequeued in get_signal before calling do_exit. This is no longer an issue as "ptrace_event(PTRACE_EVENT_EXIT)" is no longer reached until after the coredump completes.
Link: https://lkml.kernel.org/r/874kaax26c.fsf@disp2133 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
show more ...
|
#
d67e03e3 |
| 01-Sep-2021 |
Eric W. Biederman <ebiederm@xmission.com> |
exit: Factor coredump_exit_mm out of exit_mm
Separate the coredump logic from the ordinary exit_mm logic by moving the coredump logic out of exit_mm into it's own function coredump_exit_mm.
Link: h
exit: Factor coredump_exit_mm out of exit_mm
Separate the coredump logic from the ordinary exit_mm logic by moving the coredump logic out of exit_mm into it's own function coredump_exit_mm.
Link: https://lkml.kernel.org/r/87a6k2x277.fsf@disp2133 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
show more ...
|
#
39fd0cc0 |
| 08-Mar-2022 |
Eric W. Biederman <ebiederm@xmission.com> |
coredump: Use the vma snapshot in fill_files_note
commit 390031c942116d4733310f0684beb8db19885fe6 upstream.
Matthew Wilcox reported that there is a missing mmap_lock in file_files_note that could p
coredump: Use the vma snapshot in fill_files_note
commit 390031c942116d4733310f0684beb8db19885fe6 upstream.
Matthew Wilcox reported that there is a missing mmap_lock in file_files_note that could possibly lead to a user after free.
Solve this by using the existing vma snapshot for consistency and to avoid the need to take the mmap_lock anywhere in the coredump code except for dump_vma_snapshot.
Update the dump_vma_snapshot to capture vm_pgoff and vm_file that are neeeded by fill_files_note.
Add free_vma_snapshot to free the captured values of vm_file.
Reported-by: Matthew Wilcox <willy@infradead.org> Link: https://lkml.kernel.org/r/20220131153740.2396974-1-willy@infradead.org Cc: stable@vger.kernel.org Fixes: a07279c9a8cd ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Fixes: 2aa362c49c31 ("coredump: extend core dump note section to contain file names of mapped files") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
7ba958df |
| 08-Mar-2022 |
Eric W. Biederman <ebiederm@xmission.com> |
coredump: Remove the WARN_ON in dump_vma_snapshot
commit 49c1866348f364478a0c4d3dd13fd08bb82d3a5b upstream.
The condition is impossible and to the best of my knowledge has never triggered.
We are
coredump: Remove the WARN_ON in dump_vma_snapshot
commit 49c1866348f364478a0c4d3dd13fd08bb82d3a5b upstream.
The condition is impossible and to the best of my knowledge has never triggered.
We are in deep trouble if that conditions happens and we walk past the end of our allocated array.
So delete the WARN_ON and the code that makes it look like the kernel can handle the case of walking past the end of it's vma_meta array.
Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
f6ca8628 |
| 08-Mar-2022 |
Eric W. Biederman <ebiederm@xmission.com> |
coredump: Snapshot the vmas in do_coredump
commit 95c5436a4883841588dae86fb0b9325f47ba5ad3 upstream.
Move the call of dump_vma_snapshot and kvfree(vma_meta) out of the individual coredump routines
coredump: Snapshot the vmas in do_coredump
commit 95c5436a4883841588dae86fb0b9325f47ba5ad3 upstream.
Move the call of dump_vma_snapshot and kvfree(vma_meta) out of the individual coredump routines into do_coredump itself. This makes the code less error prone and easier to maintain.
Make the vma snapshot available to the coredump routines in struct coredump_params. This makes it easier to change and update what is captures in the vma snapshot and will be needed for fixing fill_file_notes.
Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
6cdb84dd |
| 25-Jan-2022 |
Jann Horn <jannh@google.com> |
coredump: Also dump first pages of non-executable ELF libraries
commit 84158b7f6a0624b81800b4e7c90f7fb7fdecf66c upstream.
When I rewrote the VMA dumping logic for coredumps, I changed it to recogni
coredump: Also dump first pages of non-executable ELF libraries
commit 84158b7f6a0624b81800b4e7c90f7fb7fdecf66c upstream.
When I rewrote the VMA dumping logic for coredumps, I changed it to recognize ELF library mappings based on the file being executable instead of the mapping having an ELF header. But turns out, distros ship many ELF libraries as non-executable, so the heuristic goes wrong...
Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of any offset-0 readable mapping that starts with the ELF magic.
This fix is technically layer-breaking a bit, because it checks for something ELF-specific in fs/coredump.c; but since we probably want to share this between standard ELF and FDPIC ELF anyway, I guess it's fine? And this also keeps the change small for backporting.
Cc: stable@vger.kernel.org Fixes: 429a22e776a2 ("coredump: rework elf/elf_fdpic vma_dump_size() into common helper") Reported-by: Bill Messmer <wmessmer@microsoft.com> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220126025739.2014888-1-jannh@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
6fcac87e |
| 07-Sep-2021 |
QiuXi <qiuxi1@huawei.com> |
coredump: fix memleak in dump_vma_snapshot()
dump_vma_snapshot() allocs memory for *vma_meta, when dump_vma_snapshot() returns -EFAULT, the memory will be leaked, so we free it correctly.
Link: htt
coredump: fix memleak in dump_vma_snapshot()
dump_vma_snapshot() allocs memory for *vma_meta, when dump_vma_snapshot() returns -EFAULT, the memory will be leaked, so we free it correctly.
Link: https://lkml.kernel.org/r/20210810020441.62806-1-qiuxi1@huawei.com Fixes: a07279c9a8cd7 ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Signed-off-by: QiuXi <qiuxi1@huawei.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jann Horn <jannh@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
dbd9d6f8 |
| 07-Sep-2021 |
David Oberhollenzer <david.oberhollenzer@sigma-star.at> |
fs/coredump.c: log if a core dump is aborted due to changed file permissions
For obvious security reasons, a core dump is aborted if the filesystem cannot preserve ownership or permissions of the du
fs/coredump.c: log if a core dump is aborted due to changed file permissions
For obvious security reasons, a core dump is aborted if the filesystem cannot preserve ownership or permissions of the dump file.
This affects filesystems like e.g. vfat, but also something like a 9pfs share in a Qemu test setup, running as a regular user, depending on the security model used. In those cases, the result is an empty core file and a confused user.
To hopefully save other people a lot of time figuring out the cause, this patch adds a simple log message for those specific cases.
[akpm@linux-foundation.org: s/|%s/%s/ in printk text]
Link: https://lkml.kernel.org/r/20210701233151.102720-1-david.oberhollenzer@sigma-star.at Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
06af8679 |
| 10-Jun-2021 |
Eric W. Biederman <ebiederm@xmission.com> |
coredump: Limit what can interrupt coredumps
Olivier Langlois has been struggling with coredumps being incompletely written in processes using io_uring.
Olivier Langlois <olivier@trillion01.com> wr
coredump: Limit what can interrupt coredumps
Olivier Langlois has been struggling with coredumps being incompletely written in processes using io_uring.
Olivier Langlois <olivier@trillion01.com> writes: > io_uring is a big user of task_work and any event that io_uring made a > task waiting for that occurs during the core dump generation will > generate a TIF_NOTIFY_SIGNAL. > > Here are the detailed steps of the problem: > 1. io_uring calls vfs_poll() to install a task to a file wait queue > with io_async_wake() as the wakeup function cb from io_arm_poll_handler() > 2. wakeup function ends up calling task_work_add() with TWA_SIGNAL > 3. task_work_add() sets the TIF_NOTIFY_SIGNAL bit by calling > set_notify_signal()
The coredump code deliberately supports being interrupted by SIGKILL, and depends upon prepare_signal to filter out all other signals. Now that signal_pending includes wake ups for TIF_NOTIFY_SIGNAL this hack in dump_emitted by the coredump code no longer works.
Make the coredump code more robust by explicitly testing for all of the wakeup conditions the coredump code supports. This prevents new wakeup conditions from breaking the coredump code, as well as fixing the current issue.
The filesystem code that the coredump code uses already limits itself to only aborting on fatal_signal_pending. So it should not develop surprising wake-up reasons either.
v2: Don't remove the now unnecessary code in prepare_signal.
Cc: stable@vger.kernel.org Fixes: 12db8b690010 ("entry: Add support for TIF_NOTIFY_SIGNAL") Reported-by: Olivier Langlois <olivier@trillion01.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
ffb37ca3 |
| 01-Apr-2021 |
Al Viro <viro@zeniv.linux.org.uk> |
switch file_open_root() to struct path
... and provide file_open_root_mnt(), using the root of given mount.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
Revision tags: v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25 |
|
#
d0f1088b |
| 08-Mar-2020 |
Al Viro <viro@zeniv.linux.org.uk> |
coredump: don't bother with do_truncate()
have dump_skip() just remember how much needs to be skipped, leave actual seeks/writing zeroes to the next dump_emit() or the end of coredump output, whiche
coredump: don't bother with do_truncate()
have dump_skip() just remember how much needs to be skipped, leave actual seeks/writing zeroes to the next dump_emit() or the end of coredump output, whichever comes first. And instead of playing with do_truncate() in the end, just write one NUL at the end of the last gap (if any).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
3159ed57 |
| 25-Feb-2021 |
Ira Weiny <ira.weiny@intel.com> |
fs/coredump: use kmap_local_page()
In dump_user_range() there is no reason for the mapping to be global. Use kmap_local_page() rather than kmap.
Link: https://lkml.kernel.org/r/20210203223328.5589
fs/coredump: use kmap_local_page()
In dump_user_range() there is no reason for the mapping to be global. Use kmap_local_page() rather than kmap.
Link: https://lkml.kernel.org/r/20210203223328.558945-1-ira.weiny@intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
643fe55a |
| 21-Jan-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
open: handle idmapped mounts in do_truncate()
When truncating files the vfs will verify that the caller is privileged over the inode. Extend it to handle idmapped mounts. If the inode is accessed th
open: handle idmapped mounts in do_truncate()
When truncating files the vfs will verify that the caller is privileged over the inode. Extend it to handle idmapped mounts. If the inode is accessed through an idmapped mount it is mapped according to the mount's user namespace. Afterwards the permissions checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before.
Link: https://lore.kernel.org/r/20210121131959.646623-16-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
show more ...
|
#
c39ab6de |
| 25-Nov-2020 |
Eric W. Biederman <ebiederm@xmission.com> |
coredump: Document coredump code exclusively used by cell spufs
Oleg Nesterov recently asked[1] why is there an unshare_files in do_coredump. After digging through all of the callers of lookup_fd i
coredump: Document coredump code exclusively used by cell spufs
Oleg Nesterov recently asked[1] why is there an unshare_files in do_coredump. After digging through all of the callers of lookup_fd it turns out that it is arch/powerpc/platforms/cell/spufs/coredump.c:coredump_next_context that needs the unshare_files in do_coredump.
Looking at the history[2] this code was also the only piece of coredump code that required the unshare_files when the unshare_files was added.
Looking at that code it turns out that cell is also the only architecture that implements elf_coredump_extra_notes_size and elf_coredump_extra_notes_write.
I looked at the gdb repo[3] support for cell has been removed[4] in binutils 2.34. Geoff Levand reports he is still getting questions on how to run modern kernels on the PS3, from people using 3rd party firmware so this code is not dead. According to Wikipedia the last PS3 shipped in Japan sometime in 2017. So it will probably be a little while before everyone's hardware dies.
Add some comments briefly documenting the coredump code that exists only to support cell spufs to make it easier to understand the coredump code. Eventually the hardware will be dead, or their won't be userspace tools, or the coredump code will be refactored and it will be too difficult to update a dead architecture and these comments make it easy to tell where to pull to remove cell spufs support.
[1] https://lkml.kernel.org/r/20201123175052.GA20279@redhat.com [2] 179e037fc137 ("do_coredump(): make sure that descriptor table isn't shared") [3] git://sourceware.org/git/binutils-gdb.git [4] abf516c6931a ("Remove Cell Broadband Engine debugging support"). Link: https://lkml.kernel.org/r/87h7pdnlzv.fsf_-_@x220.int.ebiederm.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
show more ...
|
#
1f702603 |
| 20-Nov-2020 |
Eric W. Biederman <ebiederm@xmission.com> |
exec: Simplify unshare_files
Now that exec no longer needs to return the unshared files to their previous value there is no reason to return displaced.
Instead when unshare_fd creates a copy of the
exec: Simplify unshare_files
Now that exec no longer needs to return the unshared files to their previous value there is no reason to return displaced.
Instead when unshare_fd creates a copy of the file table, call put_files_struct before returning from unshare_files.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com> v1: https://lkml.kernel.org/r/20200817220425.9389-2-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20201120231441.29911-2-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
show more ...
|
#
2bf509d9 |
| 06-Dec-2020 |
Menglong Dong <dong.menglong@zte.com.cn> |
coredump: fix core_pattern parse error
'format_corename()' will splite 'core_pattern' on spaces when it is in pipe mode, and take helper_argv[0] as the path to usermode executable. It works fine in
coredump: fix core_pattern parse error
'format_corename()' will splite 'core_pattern' on spaces when it is in pipe mode, and take helper_argv[0] as the path to usermode executable. It works fine in most cases.
However, if there is a space between '|' and '/file/path', such as '| /usr/lib/systemd/systemd-coredump %P %u %g', then helper_argv[0] will be parsed as '', and users will get a 'Core dump to | disabled'.
It is not friendly to users, as the pattern above was valid previously. Fix this by ignoring the spaces between '|' and '/file/path'.
Fixes: 315c69261dd3 ("coredump: split pipe command whitespace before expanding template") Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Wise <pabs3@bonedaddy.net> Cc: Jakub Wilk <jwilk@jwilk.net> [https://bugs.debian.org/924398] Cc: Neil Horman <nhorman@tuxdriver.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/5fb62870.1c69fb81.8ef5d.af76@mx.google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
a07279c9 |
| 15-Oct-2020 |
Jann Horn <jannh@google.com> |
binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot
In both binfmt_elf and binfmt_elf_fdpic, use a new helper dump_vma_snapshot() to take a snapshot of the VMA list (including the gate VMA, if we
binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot
In both binfmt_elf and binfmt_elf_fdpic, use a new helper dump_vma_snapshot() to take a snapshot of the VMA list (including the gate VMA, if we have one) while protected by the mmap_lock, and then use that snapshot instead of walking the VMA list without locking.
An alternative approach would be to keep the mmap_lock held across the entire core dumping operation; however, keeping the mmap_lock locked while we may be blocked for an unbounded amount of time (e.g. because we're dumping to a FUSE filesystem or so) isn't really optimal; the mmap_lock blocks things like the ->release handler of userfaultfd, and we don't really want critical system daemons to grind to a halt just because someone "gifted" them SCM_RIGHTS to an eternally-locked userfaultfd, or something like that.
Since both the normal ELF code and the FDPIC ELF code need this functionality (and if any other binfmt wants to add coredump support in the future, they'd probably need it, too), implement this with a common helper in fs/coredump.c.
A downside of this approach is that we now need a bigger amount of kernel memory per userspace VMA in the normal ELF case, and that we need O(n) kernel memory in the FDPIC ELF case at all; but 40 bytes per VMA shouldn't be terribly bad.
There currently is a data race between stack expansion and anything that reads ->vm_start or ->vm_end under the mmap_lock held in read mode; to mitigate that for core dumping, take the mmap_lock in write mode when taking a snapshot of the VMA hierarchy. (If we only took the mmap_lock in read mode, we could end up with a corrupted core dump if someone does get_user_pages_remote() concurrently. Not really a major problem, but taking the mmap_lock either way works here, so we might as well avoid the issue.) (This doesn't do anything about the existing data races with stack expansion in other mm code.)
Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-6-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
429a22e7 |
| 15-Oct-2020 |
Jann Horn <jannh@google.com> |
coredump: rework elf/elf_fdpic vma_dump_size() into common helper
At the moment, the binfmt_elf and binfmt_elf_fdpic code have slightly different code to figure out which VMAs should be dumped, and
coredump: rework elf/elf_fdpic vma_dump_size() into common helper
At the moment, the binfmt_elf and binfmt_elf_fdpic code have slightly different code to figure out which VMAs should be dumped, and if so, whether the dump should contain the entire VMA or just its first page.
Eliminate duplicate code by reworking the binfmt_elf version into a generic core dumping helper in coredump.c.
As part of that, change the heuristic for detecting executable/library header pages to check whether the inode is executable instead of looking at the file mode.
This is less problematic in terms of locking because it lets us avoid get_user() under the mmap_sem. (And arguably it looks nicer and makes more sense in generic code.)
Adjust a little bit based on the binfmt_elf_fdpic version: ->anon_vma is only meaningful under CONFIG_MMU, otherwise we have to assume that the VMA has been written to.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-5-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|