Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50 |
|
#
ca96b162 |
| 30-Aug-2023 |
Mateusz Guzik <mjguzik@gmail.com> |
x86: bring back rep movsq for user access on CPUs without ERMS
Intel CPUs ship with ERMS for over a decade, but this is not true for AMD. In particular one reasonably recent uarch (EPYC 7R13) does
x86: bring back rep movsq for user access on CPUs without ERMS
Intel CPUs ship with ERMS for over a decade, but this is not true for AMD. In particular one reasonably recent uarch (EPYC 7R13) does not have it (or at least the bit is inactive when running on the Amazon EC2 cloud -- I found rather conflicting information about AMD CPUs vs the extension).
Hand-rolled mov loops executing in this case are quite pessimal compared to rep movsq for bigger sizes. While the upper limit depends on uarch, everyone is well south of 1KB AFAICS and sizes bigger than that are common.
While technically ancient CPUs may be suffering from rep usage, gcc has been emitting it for years all over kernel code, so I don't think this is a legitimate concern.
Sample result from read1_processes from will-it-scale (4KB reads/s):
before: 1507021 after: 1721828 (+14%)
Note that the cutoff point for rep usage is set to 64 bytes, which is way too conservative but I'm sticking to what was done in 47ee3f1dd93b ("x86: re-introduce support for ERMS copies for user space accesses"). That is to say *some* copies will now go slower, which is fixable but beyond the scope of this patch.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31 |
|
#
47ee3f1d |
| 26-May-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
x86: re-introduce support for ERMS copies for user space accesses
I tried to streamline our user memory copy code fairly aggressively in commit adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for use
x86: re-introduce support for ERMS copies for user space accesses
I tried to streamline our user memory copy code fairly aggressively in commit adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for user memory copies"), in order to then be able to clean up the code and inline the modern FSRM case in commit 577e6a7fd50d ("x86: inline the 'rep movs' in user copies for the FSRM case").
We had reports [1] of that causing regressions earlier with blogbench, but that turned out to be a horrible benchmark for that case, and not a sufficient reason for re-instating "rep movsb" on older machines.
However, now Eric Dumazet reported [2] a regression in performance that seems to be a rather more real benchmark, where due to the removal of "rep movs" a TCP stream over a 100Gbps network no longer reaches line speed.
And it turns out that with the simplified the calling convention for the non-FSRM case in commit 427fda2c8a49 ("x86: improve on the non-rep 'copy_user' function"), re-introducing the ERMS case is actually fairly simple.
Of course, that "fairly simple" is glossing over several missteps due to having to fight our assembler alternative code. This code really wanted to rewrite a conditional branch to have two different targets, but that made objtool sufficiently unhappy that this instead just ended up doing a choice between "jump to the unrolled loop, or use 'rep movsb' directly".
Let's see if somebody finds a case where the kernel memory copies also care (see commit 68674f94ffc9: "x86: don't use REP_GOOD or ERMS for small memory copies"). But Eric does argue that the user copies are special because networking tries to copy up to 32KB at a time, if order-3 pages allocations are possible.
In-kernel memory copies are typically small, unless they are the special "copy pages at a time" kind that still use "rep movs".
Link: https://lore.kernel.org/lkml/202305041446.71d46724-yujie.liu@intel.com/ [1] Link: https://lore.kernel.org/lkml/CANn89iKUbyrJ=r2+_kK+sb2ZSSHifFZ7QkPLDpAtkJ8v4WUumA@mail.gmail.com/ [2] Reported-and-tested-by: Eric Dumazet <edumazet@google.com> Fixes: adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for user memory copies") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3 |
|
#
034ff37d |
| 20-Apr-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
x86: rewrite '__copy_user_nocache' function
I didn't really want to do this, but as part of all the other changes to the user copy loops, I've been looking at this horror.
I tried to clean it up mu
x86: rewrite '__copy_user_nocache' function
I didn't really want to do this, but as part of all the other changes to the user copy loops, I've been looking at this horror.
I tried to clean it up multiple times, but every time I just found more problems, and the way it's written, it's just too hard to fix them.
For example, the code is written to do quad-word alignment, and will use regular byte accesses to get to that point. That's fairly simple, but it means that any initial 8-byte alignment will be done with cached copies.
However, the code then is very careful to do any 4-byte _tail_ accesses using an uncached 4-byte write, and that was claimed to be relevant in commit a82eee742452 ("x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()").
So if you do a 4-byte copy using that function, it carefully uses a 4-byte 'movnti' for the destination. But if you were to do a 12-byte copy that is 4-byte aligned, it would _not_ do a 4-byte 'movnti' followed by a 8-byte 'movnti' to keep it all uncached.
Instead, it would align the destination to 8 bytes using a byte-at-a-time loop, and then do a 8-byte 'movnti' for the final 8 bytes.
The main caller that cares is __copy_user_flushcache(), which knows about this insanity, and has odd cases for it all. But I just can't deal with looking at this kind of "it does one case right, and another related case entirely wrong".
And the code really wasn't fixable without hard drugs, which I try to avoid.
So instead, rewrite it in a form that hopefully not only gets this right, but is a bit more maintainable. Knock wood.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v6.1.25 |
|
#
e1f2750e |
| 19-Apr-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
x86: remove 'zerorest' argument from __copy_user_nocache()
Every caller passes in zero, meaning they don't want any partial copy to zero the remainder of the destination buffer.
Which is just as we
x86: remove 'zerorest' argument from __copy_user_nocache()
Every caller passes in zero, meaning they don't want any partial copy to zero the remainder of the destination buffer.
Which is just as well, because the implementation of that function didn't actually even look at that argument, and wasn't even aware it existed, although some misleading comments did mention it still.
The 'zerorest' thing is a historical artifact of how "copy_from_user()" worked, in that it would zero the rest of the kernel buffer that it copied into.
That zeroing still exists, but it's long since been moved to generic code, and the raw architecture-specific code doesn't do it. See _copy_from_user() in lib/usercopy.c for this all.
However, while __copy_user_nocache() shares some history and superficial other similarities with copy_from_user(), it is in many ways also very different.
In particular, while the code makes it *look* similar to the generic user copy functions that can copy both to and from user space, and take faults on both reads and writes as a result, __copy_user_nocache() does no such thing at all.
__copy_user_nocache() always copies to kernel space, and will never take a page fault on the destination. What *can* happen, though, is that the non-temporal stores take a machine check because one of the use cases is for writing to stable memory, and any memory errors would then take synchronous faults.
So __copy_user_nocache() does look a lot like copy_from_user(), but has faulting behavior that is more akin to our old copy_in_user() (which no longer exists, but copied from user space to user space and could fault on both source and destination).
And it very much does not have the "zero the end of the destination buffer", since a problem with the destination buffer is very possibly the very source of the partial copy.
So this whole thing was just a confusing historical artifact from having shared some code with a completely different function with completely different use cases.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
427fda2c |
| 16-Apr-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
x86: improve on the non-rep 'copy_user' function
The old 'copy_user_generic_unrolled' function was oddly implemented for largely historical reasons: it had been largely based on the uncached copy ca
x86: improve on the non-rep 'copy_user' function
The old 'copy_user_generic_unrolled' function was oddly implemented for largely historical reasons: it had been largely based on the uncached copy case, which has some other concerns.
For example, the __copy_user_nocache() function uses 'movnti' for the destination stores, and those want the destination to be aligned. In contrast, the regular copy function doesn't really care, and trying to align things only complicates matters.
Also, like the clear_user function, the copy function had some odd handling of the repeat counts, complicating the exception handling for no really good reason. So as with clear_user, just write it to keep all the byte counts in the %rcx register, exactly like the 'rep movs' functionality that this replaces.
Unlike a real 'rep movs', we do allow for this to trash a few temporary registers to not have to unnecessarily save/restore registers on the stack.
And like the clearing case, rename this to what it now clearly is: 'rep_movs_alternative', and make it one coherent function, so that it shows up as such in profiles (instead of the odd split between "copy_user_generic_unrolled" and "copy_user_short_string", the latter of which was not about strings at all, and which was shared with the uncached case).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
577e6a7f |
| 15-Apr-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
x86: inline the 'rep movs' in user copies for the FSRM case
This does the same thing for the user copies as commit 0db7058e8e23 ("x86/clear_user: Make it faster") did for clear_user(). In other wor
x86: inline the 'rep movs' in user copies for the FSRM case
This does the same thing for the user copies as commit 0db7058e8e23 ("x86/clear_user: Make it faster") did for clear_user(). In other words, it inlines the "rep movs" case when X86_FEATURE_FSRM is set, avoiding the function call entirely.
In order to do that, it makes the calling convention for the out-of-line case ("copy_user_generic_unrolled") match the 'rep movs' calling convention, although it does also end up clobbering a number of additional registers.
Also, to simplify code sharing in the low-level assembly with the __copy_user_nocache() function (that uses the normal C calling convention), we end up with a kind of mixed return value for the low-level asm code: it will return the result in both %rcx (to work as an alternative for the 'rep movs' case), _and_ in %rax (for the nocache case).
We could avoid this by wrapping __copy_user_nocache() callers in an inline asm, but since the cost is just an extra register copy, it's probably not worth it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
3639a535 |
| 15-Apr-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
x86: move stac/clac from user copy routines into callers
This is preparatory work for inlining the 'rep movs' case, but also a cleanup. The __copy_user_nocache() function was mis-used by the rdma c
x86: move stac/clac from user copy routines into callers
This is preparatory work for inlining the 'rep movs' case, but also a cleanup. The __copy_user_nocache() function was mis-used by the rdma code to do uncached kernel copies that don't actually want user copies at all, and as a result doesn't want the stac/clac either.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
adfcf423 |
| 15-Apr-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
x86: don't use REP_GOOD or ERMS for user memory copies
The modern target to use is FSRM (Fast Short REP MOVS), and the other cases should only be used for bigger areas (ie mainly things like page cl
x86: don't use REP_GOOD or ERMS for user memory copies
The modern target to use is FSRM (Fast Short REP MOVS), and the other cases should only be used for bigger areas (ie mainly things like page clearing).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34 |
|
#
02041b32 |
| 11-Apr-2022 |
Josh Poimboeuf <jpoimboe@redhat.com> |
x86/uaccess: Don't jump between functions
For unwinding sanity, a function shouldn't jump to the middle of another function. Move the short string user copy code out to a separate non-function code
x86/uaccess: Don't jump between functions
For unwinding sanity, a function shouldn't jump to the middle of another function. Move the short string user copy code out to a separate non-function code snippet.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/9519e4853148b765e047967708f2b61e56c93186.1649718562.git.jpoimboe@redhat.com
show more ...
|
Revision tags: v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16 |
|
#
33761363 |
| 23-Dec-2021 |
Youquan Song <youquan.song@intel.com> |
x86/mce: Reduce number of machine checks taken during recovery
When any of the copy functions in arch/x86/lib/copy_user_64.S take a fault, the fixup code copies the remaining byte count from %ecx to
x86/mce: Reduce number of machine checks taken during recovery
When any of the copy functions in arch/x86/lib/copy_user_64.S take a fault, the fixup code copies the remaining byte count from %ecx to %edx and unconditionally jumps to .Lcopy_user_handle_tail to continue the copy in case any more bytes can be copied.
If the fault was #PF this may copy more bytes (because the page fault handler might have fixed the fault). But when the fault is a machine check the original copy code will have copied all the way to the poisoned cache line. So .Lcopy_user_handle_tail will just take another machine check for no good reason.
Every code path to .Lcopy_user_handle_tail comes from an exception fixup path, so add a check there to check the trap type (in %eax) and simply return the count of remaining bytes if the trap was a machine check.
Doing this reduces the number of machine checks taken during synthetic tests from four to three.
As well as reducing the number of machine checks, this also allows Skylake generation Xeons to recover some cases that currently fail. The is because REP; MOVSB is only recoverable when source and destination are well aligned and the byte count is large. That useless call to .Lcopy_user_handle_tail may violate one or more of these conditions and generate a fatal machine check.
[ Tony: Add more details to commit message. ] [ bp: Fixup comment. Also, another tip patchset which is adding straight-line speculation mitigation changes the "ret" instruction to an all-caps macro "RET". But, since gas is case-insensitive, use "RET" in the newly added asm block already in order to simplify tip branch merging on its way upstream. ]
Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/YcTW5dh8yTGucDd+@agluck-desk2.amr.corp.intel.com
show more ...
|
Revision tags: v5.15.10 |
|
#
244122b4 |
| 16-Dec-2021 |
Tony Luck <tony.luck@intel.com> |
x86/lib: Add fast-short-rep-movs check to copy_user_enhanced_fast_string()
Commit
f444a5ff95dc ("x86/cpufeatures: Add support for fast short REP; MOVSB")
fixed memmove() with an ALTERNATIVE that
x86/lib: Add fast-short-rep-movs check to copy_user_enhanced_fast_string()
Commit
f444a5ff95dc ("x86/cpufeatures: Add support for fast short REP; MOVSB")
fixed memmove() with an ALTERNATIVE that will use REP MOVSB for all string lengths.
copy_user_enhanced_fast_string() has a similar run time check to avoid using REP MOVSB for copies less that 64 bytes.
Add an ALTERNATIVE to patch out the short length check and always use REP MOVSB on X86_FEATURE_FSRM CPUs.
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216172431.1396371-1-tony.luck@intel.com
show more ...
|
Revision tags: v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2 |
|
#
acba44d2 |
| 10-Nov-2021 |
Peter Zijlstra <peterz@infradead.org> |
x86/copy_user_64: Remove .fixup usage
Place the anonymous .fixup code at the tail of the regular functions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josh Poimboeuf
x86/copy_user_64: Remove .fixup usage
Place the anonymous .fixup code at the tail of the regular functions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211110101325.068505810@infradead.org
show more ...
|
#
f94909ce |
| 04-Dec-2021 |
Peter Zijlstra <peterz@infradead.org> |
x86: Prepare asm files for straight-line-speculation
Replace all ret/retq instructions with RET in preparation of making RET a macro. Since AS is case insensitive it's a big no-op without RET define
x86: Prepare asm files for straight-line-speculation
Replace all ret/retq instructions with RET in preparation of making RET a macro. Since AS is case insensitive it's a big no-op without RET defined.
find arch/x86/ -name \*.S | while read file do sed -i 's/\<ret[q]*\>/RET/' $file done
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211204134907.905503893@infradead.org
show more ...
|
Revision tags: v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60 |
|
#
69065847 |
| 17-Aug-2021 |
Tony Luck <tony.luck@intel.com> |
x86/mce: Drop copyin special case for #MC
Fixes to the iterator code to handle faults that are not on page boundaries mean that the special case for machine check during copy from user is no longer
x86/mce: Drop copyin special case for #MC
Fixes to the iterator code to handle faults that are not on page boundaries mean that the special case for machine check during copy from user is no longer needed.
For a full list of those fixes, see the output of:
git log --oneline v5.14 ^v5.13 -- lib/iov_iter.c
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210818002942.1607544-4-tony.luck@intel.com
show more ...
|
#
14b476e0 |
| 04-Dec-2021 |
Peter Zijlstra <peterz@infradead.org> |
x86: Prepare asm files for straight-line-speculation
[ Upstream commit f94909ceb1ed4bfdb2ada72f93236305e6d6951f ]
Replace all ret/retq instructions with RET in preparation of making RET a macro. Si
x86: Prepare asm files for straight-line-speculation
[ Upstream commit f94909ceb1ed4bfdb2ada72f93236305e6d6951f ]
Replace all ret/retq instructions with RET in preparation of making RET a macro. Since AS is case insensitive it's a big no-op without RET defined.
find arch/x86/ -name \*.S | while read file do sed -i 's/\<ret[q]*\>/RET/' $file done
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211204134907.905503893@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24 |
|
#
5e21a3ec |
| 11-Mar-2021 |
Juergen Gross <jgross@suse.com> |
x86/alternative: Merge include files
Merge arch/x86/include/asm/alternative-asm.h into arch/x86/include/asm/alternative.h in order to make it easier to use common definitions later.
Signed-off-by:
x86/alternative: Merge include files
Merge arch/x86/include/asm/alternative-asm.h into arch/x86/include/asm/alternative.h in order to make it easier to use common definitions later.
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-2-jgross@suse.com
show more ...
|
Revision tags: v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14 |
|
#
a2f73400 |
| 06-Oct-2020 |
Tony Luck <tony.luck@intel.com> |
x86/mce: Avoid tail copy when machine check terminated a copy from user
In the page fault case it is ok to see if a few more unaligned bytes can be copied from the source address. Worst case is that
x86/mce: Avoid tail copy when machine check terminated a copy from user
In the page fault case it is ok to see if a few more unaligned bytes can be copied from the source address. Worst case is that the page fault will be triggered again.
Machine checks are more serious. Just give up at the point where the main copy loop triggered the #MC and return from the copy code as if the copy succeeded. The machine check handler will use task_work_add() to make sure that the task is sent a SIGBUS.
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-5-tony.luck@intel.com
show more ...
|
#
278b917f |
| 06-Oct-2020 |
Youquan Song <youquan.song@intel.com> |
x86/mce: Add _ASM_EXTABLE_CPY for copy user access
_ASM_EXTABLE_UA is a general exception entry to record the exception fixup for all exception spots between kernel and user space access.
To enable
x86/mce: Add _ASM_EXTABLE_CPY for copy user access
_ASM_EXTABLE_UA is a general exception entry to record the exception fixup for all exception spots between kernel and user space access.
To enable recovery from machine checks while coping data from user addresses it is necessary to be able to distinguish the places that are looping copying data from those that copy a single byte/word/etc.
Add a new macro _ASM_EXTABLE_CPY and use it in place of _ASM_EXTABLE_UA in the copy functions.
Record the exception reason number to regs->ax at ex_handler_uaccess which is used to check MCE triggered.
The new fixup routine ex_handler_copy() is almost an exact copy of ex_handler_uaccess() The difference is that it sets regs->ax to the trap number. Following patches use this to avoid trying to copy remaining bytes from the tail of the copy and possibly hitting the poison again.
New mce.kflags bit MCE_IN_KERNEL_COPYIN will be used by mce_severity() calculation to indicate that a machine check is recoverable because the kernel was copying from user space.
Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-4-tony.luck@intel.com
show more ...
|
Revision tags: v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6 |
|
#
6dcc5627 |
| 11-Oct-2019 |
Jiri Slaby <jslaby@suse.cz> |
x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*
These are all functions which are invoked from elsewhere, so annotate them as global using the new SYM_FUNC_START and their ENDPROC's by SYM_FUNC_END.
x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*
These are all functions which are invoked from elsewhere, so annotate them as global using the new SYM_FUNC_START and their ENDPROC's by SYM_FUNC_END.
Make sure ENTRY/ENDPROC is not defined on X86_64, given these were the last users.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [hibernate] Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [xen bits] Acked-by: Herbert Xu <herbert@gondor.apana.org.au> [crypto] Cc: Allison Randal <allison@lohutok.net> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Armijn Hemel <armijn@tjaldur.nl> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Enrico Weigelt <info@metux.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: linux-arch@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-efi <linux-efi@vger.kernel.org> Cc: linux-efi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: platform-driver-x86@vger.kernel.org Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Cc: xen-devel@lists.xenproject.org Cc: Xiaoyao Li <xiaoyao.li@linux.intel.com> Link: https://lkml.kernel.org/r/20191011115108.12392-25-jslaby@suse.cz
show more ...
|
#
fa972201 |
| 11-Oct-2019 |
Jiri Slaby <jslaby@suse.cz> |
x86/uaccess: Annotate local function
.Lcopy_user_handle_tail is a self-standing local function, annotate it as such using SYM_CODE_START_LOCAL.
Again, no functional change, just documentation.
Sig
x86/uaccess: Annotate local function
.Lcopy_user_handle_tail is a self-standing local function, annotate it as such using SYM_CODE_START_LOCAL.
Again, no functional change, just documentation.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-arch@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191011115108.12392-9-jslaby@suse.cz
show more ...
|
Revision tags: v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12 |
|
#
98ededb6 |
| 06-Sep-2019 |
Jiri Slaby <jslaby@suse.cz> |
x86/asm: Make some functions local labels
Boris suggests to make a local label (prepend ".L") to these functions to eliminate them from the symbol table. These are functions with very local names an
x86/asm: Make some functions local labels
Boris suggests to make a local label (prepend ".L") to these functions to eliminate them from the symbol table. These are functions with very local names and really should not be visible anywhere.
Note that objtool won't see these functions anymore (to generate ORC debug info). But all the functions are not annotated with ENDPROC, so they won't have objtool's attention anyway.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve Winslow <swinslow@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Cc: Xiaoyao Li <xiaoyao.li@linux.intel.com> Link: https://lkml.kernel.org/r/20190906075550.23435-2-jslaby@suse.cz
show more ...
|
Revision tags: v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2 |
|
#
3a6ab4bc |
| 17-Jul-2019 |
Josh Poimboeuf <jpoimboe@redhat.com> |
x86/uaccess: Remove ELF function annotation from copy_user_handle_tail()
After an objtool improvement, it's complaining about the CLAC in copy_user_handle_tail():
arch/x86/lib/copy_user_64.o: war
x86/uaccess: Remove ELF function annotation from copy_user_handle_tail()
After an objtool improvement, it's complaining about the CLAC in copy_user_handle_tail():
arch/x86/lib/copy_user_64.o: warning: objtool: .altinstr_replacement+0x12: redundant UACCESS disable arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x6: (alt) arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x2: (alt) arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x0: <=== (func)
copy_user_handle_tail() is incorrectly marked as a callable function, so objtool is rightfully concerned about the CLAC with no corresponding STAC.
Remove the ELF function annotation. The copy_user_handle_tail() code path is already verified by objtool because it's jumped to by other callable asm code (which does the corresponding STAC).
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/6b6e436774678b4b9873811ff023bd29935bee5b.1563413318.git.jpoimboe@redhat.com
show more ...
|
Revision tags: v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6 |
|
#
3fc21751 |
| 28-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 224
Based on 1 normalized pattern(s):
subject to the gnu public license v2
extracted by the scancode license scanner the SPDX licen
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 224
Based on 1 normalized pattern(s):
subject to the gnu public license v2
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 1 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171440.222651153@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v5.1.5, v5.1.4, v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0 |
|
#
3693ca81 |
| 01-Mar-2019 |
Peter Zijlstra <peterz@infradead.org> |
x86/uaccess: Move copy_user_handle_tail() into asm
By writing the function in asm we avoid cross object code flow and objtool no longer gets confused about a 'stray' CLAC.
Also; the asm version is
x86/uaccess: Move copy_user_handle_tail() into asm
By writing the function in asm we avoid cross object code flow and objtool no longer gets confused about a 'stray' CLAC.
Also; the asm version is actually _simpler_.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20, v4.19.19, v4.19.18, v4.19.17, v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10, v4.18.9, v4.18.7, v4.18.6 |
|
#
75045f77 |
| 28-Aug-2018 |
Jann Horn <jannh@google.com> |
x86/extable: Introduce _ASM_EXTABLE_UA for uaccess fixups
Currently, most fixups for attempting to access userspace memory are handled using _ASM_EXTABLE, which is also used for various other types
x86/extable: Introduce _ASM_EXTABLE_UA for uaccess fixups
Currently, most fixups for attempting to access userspace memory are handled using _ASM_EXTABLE, which is also used for various other types of fixups (e.g. safe MSR access, IRET failures, and a bunch of other things). In order to make it possible to add special safety checks to uaccess fixups (in particular, checking whether the fault address is actually in userspace), introduce a new exception table handler ex_handler_uaccess() and wire it up to all the user access fixups (excluding ones that already use _ASM_EXTABLE_EX).
Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: dvyukov@google.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20180828201421.157735-5-jannh@google.com
show more ...
|