/openbmc/linux/arch/x86/lib/ |
copy_user_uncached_64.S | 034ff37d34071ff3f48755f728cd229e42a4f15d Thu Apr 20 17:13:50 CDT 2023 Linus Torvalds <torvalds@linux-foundation.org> x86: rewrite '__copy_user_nocache' function
I didn't really want to do this, but as part of all the other changes to the user copy loops, I've been looking at this horror.
I tried to clean it up multiple times, but every time I just found more problems, and the way it's written, it's just too hard to fix them.
For example, the code is written to do quad-word alignment, and will use regular byte accesses to get to that point. That's fairly simple, but it means that any initial 8-byte alignment will be done with cached copies.
However, the code then is very careful to do any 4-byte _tail_ accesses using an uncached 4-byte write, and that was claimed to be relevant in commit a82eee742452 ("x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()").
So if you do a 4-byte copy using that function, it carefully uses a 4-byte 'movnti' for the destination. But if you were to do a 12-byte copy that is 4-byte aligned, it would _not_ do a 4-byte 'movnti' followed by an 8-byte 'movnti' to keep it all uncached.
Instead, it would align the destination to 8 bytes using a byte-at-a-time loop, and then do an 8-byte 'movnti' for the final 8 bytes.
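The inconsistency described above can be made concrete with a small user-space model. This is only a sketch and not the kernel's assembly: the names `struct copy_plan`, `plan_old()` and `plan_uncached()` are hypothetical, and the two functions merely count how many destination bytes would be written with ordinary (cached) stores versus non-temporal 'movnti' stores. `plan_old()` follows the old behaviour as the commit message describes it, with two modelling assumptions: the byte-wise alignment loop only runs for copies of at least 8 bytes, and the 4-byte tail store requires a 4-byte-aligned destination. `plan_uncached()` follows the sequence the message says would keep the copy uncached (a 4-byte 'movnti' for alignment, then 8-byte 'movnti' stores); it is not claimed to match the rewritten function line for line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping type: how many destination bytes each kind of store covers. */
struct copy_plan {
	size_t cached_bytes;	/* written with ordinary stores */
	size_t nt_bytes;	/* written with non-temporal (movnti-style) stores */
};

/* Model of the old strategy as described in the commit message. */
struct copy_plan plan_old(uintptr_t dst, size_t len)
{
	struct copy_plan p = { 0, 0 };

	/* Assumption: alignment is only attempted for copies of 8 bytes or more. */
	if (len >= 8) {
		while (dst & 7) {		/* byte-at-a-time, cached */
			p.cached_bytes++;
			dst++;
			len--;
		}
	}
	while (len >= 8) {			/* 8-byte non-temporal stores */
		p.nt_bytes += 8;
		dst += 8;
		len -= 8;
	}
	/* Assumption: the 4-byte tail is non-temporal only if dst is 4-byte aligned. */
	if (len >= 4 && (dst & 3) == 0) {
		p.nt_bytes += 4;
		dst += 4;
		len -= 4;
	}
	p.cached_bytes += len;			/* whatever is left uses cached stores */
	return p;
}

/* Model of a sequence that aligns with a 4-byte non-temporal store where possible. */
struct copy_plan plan_uncached(uintptr_t dst, size_t len)
{
	struct copy_plan p = { 0, 0 };

	if ((dst & 1) && len >= 1) {		/* 1-byte alignment step, cached */
		p.cached_bytes += 1; dst += 1; len -= 1;
	}
	if ((dst & 2) && len >= 2) {		/* 2-byte alignment step, cached */
		p.cached_bytes += 2; dst += 2; len -= 2;
	}
	if ((dst & 4) && len >= 4) {		/* 4-byte alignment step, non-temporal */
		p.nt_bytes += 4; dst += 4; len -= 4;
	}
	while (len >= 8) {			/* 8-byte non-temporal stores */
		p.nt_bytes += 8; dst += 8; len -= 8;
	}
	if (len >= 4) {				/* 4-byte non-temporal tail */
		p.nt_bytes += 4; dst += 4; len -= 4;
	}
	p.cached_bytes += len;			/* sub-4-byte remainder, cached */
	return p;
}

/* Worked example: a 12-byte copy to a destination that is 4-byte but not 8-byte aligned. */
void copy_plan_demo(void)
{
	struct copy_plan old_plan = plan_old(0x1004, 12);
	struct copy_plan new_plan = plan_uncached(0x1004, 12);

	assert(old_plan.cached_bytes == 4 && old_plan.nt_bytes == 8);
	assert(new_plan.cached_bytes == 0 && new_plan.nt_bytes == 12);
}
```

In this model a 4-byte copy to a 4-byte-aligned destination is fully non-temporal under both plans, while the 12-byte copy at the same alignment leaves its first 4 bytes cached under `plan_old()`. That is the "one case right, another related case wrong" asymmetry.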
The main caller that cares is __copy_user_flushcache(), which knows about this insanity, and has odd cases for it all. But I just can't deal with looking at this kind of "it does one case right, and another related case entirely wrong".
And the code really wasn't fixable without hard drugs, which I try to avoid.
So instead, rewrite it in a form that hopefully not only gets this right, but is a bit more maintainable. Knock wood.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
copy_user_64.S | diff 034ff37d34071ff3f48755f728cd229e42a4f15d Thu Apr 20 17:13:50 CDT 2023 Linus Torvalds <torvalds@linux-foundation.org> x86: rewrite '__copy_user_nocache' function
|
Makefile | diff 034ff37d34071ff3f48755f728cd229e42a4f15d Thu Apr 20 17:13:50 CDT 2023 Linus Torvalds <torvalds@linux-foundation.org> x86: rewrite '__copy_user_nocache' function
|