5bf196f1 | 10-Jun-2024 |
John David Anglin <dave@parisc-linux.org> |
parisc: Try to fix random segmentation faults in package builds
commit 72d95924ee35c8cd16ef52f912483ee938a34d49 upstream.
PA-RISC systems with PA8800 and PA8900 processors have had problems with random segmentation faults for many years. Systems with earlier processors are much more stable.
Systems with PA8800 and PA8900 processors have a large L2 cache which needs per page flushing for decent performance when a large range is flushed. The combined cache in these systems is also more sensitive to non-equivalent aliases than the caches in earlier systems.
The majority of random segmentation faults that I have looked at appear to be memory corruption in memory allocated using mmap and malloc.
My first attempt at fixing the random faults didn't work. On reviewing the cache code, I realized that there were two issues which the existing code didn't handle correctly. Both relate to cache move-in. Another issue is that the present bit in PTEs is racy.
1) PA-RISC caches have a mind of their own and they can speculatively load data and instructions for a page as long as there is an entry in the TLB for the page which allows move-in. TLBs are local to each CPU. Thus, the TLB entry for a page must be purged before flushing the page. This is particularly important on SMP systems.
In some of the flush routines, the flush routine would be called and then the TLB entry would be purged. This was because the flush routine needed the TLB entry to do the flush.
2) My initial approach to fixing the random faults was to try to use flush_cache_page_if_present for all flush operations. This actually made things worse and led to a couple of hardware lockups. It finally dawned on me that some lines weren't being flushed because the pte check code was racy. This resulted in random inequivalent mappings to physical pages.
The __flush_cache_page tmpalias flush sets up its own TLB entry and it doesn't need the existing TLB entry. As long as we can find the pte pointer for the vm page, we can get the pfn and physical address of the page. We can also purge the TLB entry for the page before doing the flush. Further, __flush_cache_page uses a special TLB entry that inhibits cache move-in.
When switching page mappings, we need to ensure that lines are removed from the cache. It is not sufficient to just flush the lines to memory as they may come back.
This made it clear that we needed to implement all the required flush operations using tmpalias routines. This includes flushes for user and kernel pages.
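The ordering argument in point 1 can be illustrated with a toy model. This is only a sketch under strong assumptions: the names (page_model, speculative_movein, flush_then_purge, purge_then_flush) are hypothetical, a whole page is reduced to two flags, and none of it is the kernel's cache code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one page on one CPU; hypothetical, not kernel code. */
struct page_model {
	bool tlb_allows_movein;  /* a TLB entry permitting move-in exists */
	bool line_cached;        /* at least one line of the page is cached */
};

/* The hardware may pull lines in whenever a TLB entry allows it. */
static void speculative_movein(struct page_model *p)
{
	if (p->tlb_allows_movein)
		p->line_cached = true;
}

static void purge_tlb(struct page_model *p)   { p->tlb_allows_movein = false; }
static void flush_cache(struct page_model *p) { p->line_cached = false; }

/* Flush first, purge afterwards. Returns true if lines remain cached. */
static bool flush_then_purge(void)
{
	struct page_model p = { true, true };

	flush_cache(&p);
	speculative_movein(&p);  /* window: the TLB entry is still valid */
	purge_tlb(&p);
	return p.line_cached;
}

/* Purge first, flush afterwards. Returns true if lines remain cached. */
static bool purge_then_flush(void)
{
	struct page_model p = { true, true };

	purge_tlb(&p);
	speculative_movein(&p);  /* no effect: move-in is no longer allowed */
	flush_cache(&p);
	return p.line_cached;
}
```

In this model only the purge-first ordering guarantees that no line survives the flush, which is the property the tmpalias routines rely on.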
After modifying the code to use tmpalias flushes, it became clear that the random segmentation faults were not fully resolved. The frequency of faults was worse on systems with a 64 MB L2 (PA8900) and systems with more CPUs (rp4440).
The warning that I added to flush_cache_page_if_present to detect pages that couldn't be flushed triggered frequently on some systems.
Helge and I looked at the pages that couldn't be flushed and found that the PTE was either cleared or for a swap page. Ignoring pages that were swapped out seemed okay but pages with cleared PTEs seemed problematic.
I looked at routines related to pte_clear and noticed ptep_clear_flush. The default implementation just flushes the TLB entry. However, it was obvious that on parisc we need to flush the cache page as well. If we don't flush the cache page, stale lines will be left in the cache and cause random corruption. Once a PTE is cleared, there is no way to find the physical address associated with the PTE and flush the associated page at a later time.
I implemented an updated change with a parisc specific version of ptep_clear_flush. It fixed the random data corruption on Helge's rp4440 and rp3440, as well as on my c8000.
At this point, I realized that I could restore the code where we only flush in flush_cache_page_if_present if the page has been accessed. However, for this, we also need to flush the cache when the accessed bit is cleared in ptep_clear_flush_young to keep things synchronized. The default implementation only flushes the TLB entry.
Other changes in this version are:
1) Implement parisc specific version of ptep_get. It's identical to default but needed in arch/parisc/include/asm/pgtable.h.
2) Revise parisc implementation of ptep_test_and_clear_young to use ptep_get (READ_ONCE).
3) Drop parisc implementation of ptep_get_and_clear. We can use default.
4) Revise flush_kernel_vmap_range and invalidate_kernel_vmap_range to use full data cache flush.
5) Move flush_cache_vmap and flush_cache_vunmap to cache.c. Handle VM_IOREMAP case in flush_cache_vmap.
At this time, I don't know whether it is better to always flush when the PTE present bit is set or when both the accessed and present bits are set. The latter saves flushing pages that haven't been accessed, but we need to flush in ptep_clear_flush_young. It also needs a page table lookup to find the PTE pointer. The lpa instruction only needs a page table lookup when the PTE entry isn't in the TLB.
We don't atomically handle setting and clearing the _PAGE_ACCESSED bit. If we miss an update, we may miss a flush and the cache may get corrupted. Whether the current code is effectively atomic depends on process control.
When CONFIG_FLUSH_PAGE_ACCESSED is set to zero, the page will eventually be flushed when the PTE is cleared or in flush_cache_page_if_present. The _PAGE_ACCESSED bit is not used, so the problem is avoided.
The flush method can be selected using the CONFIG_FLUSH_PAGE_ACCESSED define in cache.c. The default is 0. I didn't see a large difference in performance.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v6.6+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
b2b685c7 | 27-Apr-2024 |
Helge Deller <deller@kernel.org> |
parisc: Define sigset_t in parisc uapi header
commit 487fa28fa8b60417642ac58e8beda6e2509d18f9 upstream.
The util-linux Debian package fails to build on parisc because sigset_t isn't defined in asm/signal.h when included from userspace. Move the sigset_t type from the internal header to the uapi header to fix the build.
Link: https://buildd.debian.org/status/fetch.php?pkg=util-linux&arch=hppa&ver=2.40-7&stamp=1714163443&raw=0
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
644604e6 | 27-Feb-2024 |
Guenter Roeck <linux@roeck-us.net> |
parisc: Strip upper 32 bit of sum in csum_ipv6_magic for 64-bit builds
[ Upstream commit 0568b6f0d863643db2edcc7be31165740c89fa82 ]
IPv6 checksum tests with unaligned addresses on 64-bit builds result in unexpected failures.
    Expected expected == csum_result, but
        expected == 46591 (0xb5ff)
        csum_result == 46381 (0xb52d)
    with alignment offset 1
Oddly enough, the problem disappeared after adding test code into the beginning of csum_ipv6_magic().
As it turns out, the 'sum' parameter of csum_ipv6_magic() is declared as __wsum, which is a 32-bit variable. However, it is treated as a 64-bit variable in the 64-bit assembler code. Tests showed that the upper 32 bits of the register used to pass the variable are _not_ cleared when entering the function. This can result in checksum calculation errors.
Clearing the upper 32 bits of 'sum' as the first operation in the assembler code fixes the problem.
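The effect can be modelled in portable C. This is a rough sketch, not the parisc assembler: the helper names (fold16, csum_with_initial_sum) and the 'strip' switch are hypothetical, and the register is modelled as a plain uint64_t whose upper half may contain leftover bits.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce any value to a 16-bit one's-complement sum (2^16 == 1 mod 0xffff). */
static uint16_t fold16(uint64_t v)
{
	while (v >> 16)
		v = (v & 0xffff) + (v >> 16);
	return (uint16_t)v;
}

/*
 * Model of a routine that receives a 32-bit 'sum' in a 64-bit register.
 * 'sum_reg' is the raw register content; 'strip' selects whether the
 * upper half is cleared before the value is used.
 */
static uint16_t csum_with_initial_sum(uint64_t data_acc, uint64_t sum_reg,
				      int strip)
{
	uint64_t s;

	if (strip)
		sum_reg &= 0xffffffffull;  /* keep only the declared 32 bits */

	s = data_acc + sum_reg;
	s += (s < data_acc);               /* end-around carry */
	return fold16(s);
}
```

With a register content of 0x0000000100001234, the stripped variant yields 0x1234 while the unstripped one yields 0x1235, because the stray upper bit is folded into the sum.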
Acked-by: Helge Deller <deller@gmx.de>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
d4b71ff9 | 13-Feb-2024 |
Guenter Roeck <linux@roeck-us.net> |
parisc: Fix csum_ipv6_magic on 64-bit systems
[ Upstream commit 4b75b12d70506e31fc02356bbca60f8d5ca012d0 ]
hppa 64-bit systems calculate the IPv6 checksum using 64-bit add operations. The last add folds protocol and length fields into the 64-bit result. While unlikely, this operation can overflow. The overflow can be triggered with a code sequence such as the following.
	/* try to trigger massive overflows */
	memset(tmp_buf, 0xff, sizeof(struct in6_addr));
	csum_result = csum_ipv6_magic((struct in6_addr *)tmp_buf,
				      (struct in6_addr *)tmp_buf,
				      0xffff, 0xff, 0xffffffff);
Fix the problem by adding any overflows from the final add operation into the calculated checksum. Fortunately, we can do this without additional cost by replacing the add operation used to fold the checksum into 32 bits with "add,dc" to add in the missing carry.
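A small C model of the final add may help to see why the carry matters. It is a sketch only: final_add_model, fold16 and the 'keep_carry' switch are hypothetical names, and the carry that "add,dc" picks up from the previous instruction is modelled with an explicit comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce any value to a 16-bit one's-complement sum. */
static uint16_t fold16(uint64_t v)
{
	while (v >> 16)
		v = (v & 0xffff) + (v >> 16);
	return (uint16_t)v;
}

/*
 * Add the proto+len term to a 64-bit accumulator, then fold the result
 * to 32 bits. 'keep_carry' selects whether the carry out of the 64-bit
 * add is included in the fold (new behaviour) or dropped (old behaviour).
 */
static uint16_t final_add_model(uint64_t acc, uint64_t proto_len,
				int keep_carry)
{
	uint64_t s = acc + proto_len;        /* may wrap around */
	uint64_t carry = (s < acc);          /* carry out of bit 63 */
	uint64_t folded = (s >> 32) + (s & 0xffffffffull);

	if (keep_carry)
		folded += carry;
	return fold16(folded);
}
```

With an all-ones accumulator and a proto+len term of 2, keeping the carry gives 2 and dropping it gives 1, which is the kind of off-by-one error an overflowing final add would produce.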
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
cabe3343 | 10-Feb-2024 |
Guenter Roeck <linux@roeck-us.net> |
parisc: Fix csum_ipv6_magic on 32-bit systems
[ Upstream commit 4408ba75e4ba80c91fde7e10bccccf388f5c09be ]
Calculating the IPv6 checksum on 32-bit systems missed overflows when adding the proto+len fields into the checksum. This results in the following unit test failure.
    # test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:506
    Expected ( u64)csum_result == ( u64)expected, but
        ( u64)csum_result == 46722 (0xb682)
        ( u64)expected == 46721 (0xb681)
    not ok 5 test_csum_ipv6_magic
This is probably rarely seen in the real world because proto+len are usually small values which will rarely result in overflows when calculating the checksum. However, the unit test code uses large values for the length field, causing the test to fail.
Fix the problem by adding the missing carry into the final checksum.
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
d31c8d0c | 10-Feb-2024 |
Guenter Roeck <linux@roeck-us.net> |
parisc: Fix ip_fast_csum
[ Upstream commit a2abae8f0b638c31bb9799d9dd847306e0d005bd ]
IP checksum unit tests report the following error when run on hppa/hppa64.
    # test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:463
    Expected ( u64)csum_result == ( u64)expected, but
        ( u64)csum_result == 33754 (0x83da)
        ( u64)expected == 10946 (0x2ac2)
    not ok 4 test_ip_fast_csum
0x83da is the expected result if the IP header length is 20 bytes. 0x2ac2 is the expected result if the IP header length is 24 bytes. The test fails with an IP header length of 24 bytes. It appears that ip_fast_csum() always returns the checksum for a 20-byte header, no matter how long the header actually is.
Code analysis shows a suspicious assembler sequence in ip_fast_csum().
 "      addc            %0, %3, %0\n"
 "1:    ldws,ma         4(%1), %3\n"
 "      addib,<         0, %2, 1b\n"  <---
While my understanding of HPPA assembler is limited, it does not seem to make much sense to subtract 0 from a register and to expect the result to ever be negative. Subtracting 1 from the length parameter makes more sense. On top of that, the operation should be repeated if and only if the result is still > 0, so change the suspicious instruction to
 "      addib,>         -1, %2, 1b\n"
The IP checksum unit test passes after this change.
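The loop-control problem can be reproduced with a C model. This is a sketch under assumptions, not a translation of the assembler: ip_csum_model, ip_csum_ref and the 'fixed' switch are hypothetical names, the header is summed as big-endian 16-bit halves for portability, and the model assumes (as the description above implies) that the first four words are summed before the loop and one word per iteration afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Add one 32-bit header word as two big-endian 16-bit halves. */
static uint32_t add_word(uint32_t sum, const uint8_t *p)
{
	sum += ((uint32_t)p[0] << 8) | p[1];
	sum += ((uint32_t)p[2] << 8) | p[3];
	return sum;
}

/* Fold to 16 bits and complement, as the IP header checksum requires. */
static uint16_t finish(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Straightforward reference: sum all 'ihl' 32-bit words. */
static uint16_t ip_csum_ref(const uint8_t *hdr, int ihl)
{
	uint32_t sum = 0;
	int i;

	for (i = 0; i < ihl; i++)
		sum = add_word(sum, hdr + 4 * i);
	return finish(sum);
}

/*
 * Model of the loop control only. 'fixed' selects the counter update:
 *   0: add 0, repeat while the counter is < 0   (old behaviour)
 *   1: add -1, repeat while the counter is > 0  (new behaviour)
 */
static uint16_t ip_csum_model(const uint8_t *hdr, int ihl, int fixed)
{
	uint32_t sum = 0;
	int remaining = ihl - 4;
	const uint8_t *p = hdr;
	int i;

	assert(ihl >= 5);  /* an IPv4 header has at least five words */
	for (i = 0; i < 4; i++, p += 4)
		sum = add_word(sum, p);

	for (;;) {
		sum = add_word(sum, p);
		p += 4;
		remaining += fixed ? -1 : 0;
		if (fixed ? !(remaining > 0) : !(remaining < 0))
			break;
	}
	return finish(sum);
}
```

For a 24-byte header (ihl of 6), the old loop control stops after the fifth word and returns the checksum of a 20-byte header, while the new one matches the reference.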
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
aaff74d8 | 09-Feb-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
work around gcc bugs with 'asm goto' with outputs
commit 4356e9f841f7fbb945521cef3577ba394c65f3fc upstream.
We've had issues with gcc and 'asm goto' before, and we created an 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional").
Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around.
Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case.
It looks like there are at least two separate issues that all hit in this area:
(a) some versions of gcc don't mark the asm goto as 'volatile' when it has outputs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420
which is easy to work around by just adding the 'volatile' by hand.
(b) Internal compiler errors:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422
which are worked around by adding the extra empty 'asm' as a barrier, as in the original workaround.
The problem Sean sees may be a third thing, though, since it involves bad code generation (not an ICE) even with the manually added 'volatile'.
The same old workaround works for this case too, even if this feels a bit like voodoo programming and may only be hiding the issue.
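The shape of the workaround described in (a) and (b) can be sketched as a macro that adds 'volatile' by hand and follows the statement with an empty asm. The macro name asm_goto_output_model and the pass_through() helper are hypothetical and only illustrate the pattern; the sketch assumes a compiler that accepts 'asm goto' with outputs (for example GCC 11 or newer) and uses an empty asm template so that it does not depend on any instruction set.

```c
#include <assert.h>

/*
 * Hypothetical model of the workaround: force 'volatile' on the asm goto
 * and follow it with an empty asm statement acting as a barrier.
 */
#define asm_goto_output_model(...) \
	do { __asm__ __volatile__ goto(__VA_ARGS__); __asm__ (""); } while (0)

/*
 * An asm goto with one in/out operand and one label. The template is
 * empty, so 'v' passes through unchanged and the label is never taken.
 */
static int pass_through(int v)
{
	asm_goto_output_model("" : "+r" (v) : : : fail);
	return v;
fail:
	return -1;
}
```

Plain 'asm goto' statements without outputs would not need the extra barrier in this scheme, which matches the intent of limiting the workaround to the less common case.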
Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
5bdd0ced | 27-Nov-2023 |
Helge Deller <deller@gmx.de> |
parisc: Fix asm operand number out of range build error in bug table
[ Upstream commit 487635756198cad563feb47539c6a37ea57f1dae ]
Build is broken if CONFIG_DEBUG_BUGVERBOSE=n. Fix it by using the correct asm operand number.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Fixes: fe76a1349f23 ("parisc: Use natural CPU alignment for bug_table")
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
405870b8 | 20-Nov-2023 |
Helge Deller <deller@gmx.de> |
parisc: Mark jump_table naturally aligned
commit 07eecff8ae78df7f28800484d31337e1f9bfca3a upstream.
The jump_table stores two 32-bit words and one 32- (on 32-bit kernel) or one 64-bit word (on 64-bit kernel). Ensure that the last word is always 64-bit aligned on a 64-bit kernel by aligning the whole structure on sizeof(long).
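The layout can be pictured with a small C sketch. The struct name jump_entry_model and the helpers are hypothetical; the sketch only models where the last word lands when entries are appended to a section at a given offset, with and without aligning each entry on sizeof(long).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one entry: two 32-bit words plus one long. */
struct jump_entry_model {
	int32_t code;
	int32_t target;
	long key;
};

/* Round 'off' up to the next multiple of 'align' (a power of two). */
static size_t align_up(size_t off, size_t align)
{
	return (off + align - 1) & ~(align - 1);
}

/*
 * Section offset of the 'key' word for an entry emitted at
 * 'section_off'. 'aligned' selects whether the entry start is first
 * aligned on sizeof(long).
 */
static size_t key_offset(size_t section_off, int aligned)
{
	size_t start = aligned ? align_up(section_off, sizeof(long))
			       : section_off;

	return start + offsetof(struct jump_entry_model, key);
}
```

On a 64-bit build, an entry emitted at offset 4 without alignment would place 'key' at offset 12, which is not 8-byte aligned; aligning the entry start first moves it to offset 16.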
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
e0837630 | 23-Nov-2023 |
Helge Deller <deller@gmx.de> |
parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
commit e5f3e299a2b1e9c3ece24a38adfc089aef307e8a upstream.
Those return codes are only defined for the parisc architecture and are leftovers from when we wanted to be HP-UX compatible.
They are not returned by any Linux kernel syscall but do trigger problems with the glibc strerrorname_np() and strerror() functions as reported in glibc issue #31080.
There is no need to keep them, so simply remove them.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: Bruno Haible <bruno@clisp.org>
Closes: https://sourceware.org/bugzilla/show_bug.cgi?id=31080
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
fb997f16 | 25-Nov-2023 |
Helge Deller <deller@gmx.de> |
parisc: Mark lock_aligned variables 16-byte aligned on SMP
commit b28fc0d8739c03e7b6c44914a9d00d4c6dddc0ea upstream.
On parisc we need 16-byte alignment for variables which are used for locking. Mark the __lock_aligned attribute accordingly so that the .data..lock_aligned section will get that alignment in the generated object files.
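As a rough illustration, the alignment request can be expressed in C with the GCC/Clang 'aligned' attribute. The names lock_aligned_model, demo_lock and is_16_byte_aligned are hypothetical, and the sketch leaves out the section placement that the real attribute also performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an alignment-only lock attribute. */
#define lock_aligned_model __attribute__((aligned(16)))

/* A lock-sized object that must start on a 16-byte boundary. */
static volatile unsigned int demo_lock[4] lock_aligned_model;

/* Check whether an address is a multiple of 16. */
static int is_16_byte_aligned(const volatile void *p)
{
	return ((uintptr_t)p & 15) == 0;
}
```

A check such as is_16_byte_aligned(demo_lock) is expected to hold for any object declared this way, whereas a plain unsigned int array only guarantees the alignment of its element type.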
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
d68cafc2 | 20-Nov-2023 |
Helge Deller <deller@gmx.de> |
parisc: Use natural CPU alignment for bug_table
commit fe76a1349f235969381832c83d703bc911021eb6 upstream.
Make sure that the __bug_table section gets 32- or 64-bit aligned, depending on whether a 32- or 64-bit kernel is being built. Mark it non-writeable and use .blockz instead of the .org assembler directive to pad the struct.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
b8ee167d | 20-Nov-2023 |
Helge Deller <deller@gmx.de> |
parisc: Mark ex_table entries 32-bit aligned in uaccess.h
commit a80aeb86542a50aa8521729ea4cc731ee7174f03 upstream.
Add an align statement to tell the linker that all ex_table entries, and as such the whole ex_table section, should be 32-bit aligned in vmlinux and modules.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|