Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32 |
|
#
febe950d |
| 31-May-2023 |
Peter Zijlstra <peterz@infradead.org> |
arch: Remove cmpxchg_double
No moar users, remove the monster.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mar
arch: Remove cmpxchg_double
No moar users, remove the monster.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.991907085@infradead.org
show more ...
|
#
b23e139d |
| 31-May-2023 |
Peter Zijlstra <peterz@infradead.org> |
arch: Introduce arch_{,try_}_cmpxchg128{,_local}()
For all architectures that currently support cmpxchg_double() implement the cmpxchg128() family of functions that is basically the same but with a
arch: Introduce arch_{,try_}_cmpxchg128{,_local}()
For all architectures that currently support cmpxchg_double() implement the cmpxchg128() family of functions that is basically the same but with a saner interface.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.452120708@infradead.org
show more ...
|
Revision tags: v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8 |
|
#
06855063 |
| 18-Jan-2023 |
Andrzej Hajda <andrzej.hajda@intel.com> |
locking/arch: Rename all internal __xchg() names to __arch_xchg()
Decrease the probability of this internal facility to be used by driver code.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com
locking/arch: Rename all internal __xchg() names to __arch_xchg()
Decrease the probability of this internal facility to be used by driver code.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Palmer Dabbelt <palmer@rivosinc.com> [riscv] Link: https://lore.kernel.org/r/20230118154450.73842-1-andrzej.hajda@intel.com Cc: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15 |
|
#
3364c6ce |
| 12-Jan-2022 |
Kees Cook <keescook@chromium.org> |
arm64: atomics: lse: Dereference matching size
When building with -Warray-bounds, the following warning is generated:
In file included from ./arch/arm64/include/asm/lse.h:16, from
arm64: atomics: lse: Dereference matching size
When building with -Warray-bounds, the following warning is generated:
In file included from ./arch/arm64/include/asm/lse.h:16, from ./arch/arm64/include/asm/cmpxchg.h:14, from ./arch/arm64/include/asm/atomic.h:16, from ./include/linux/atomic.h:7, from ./include/asm-generic/bitops/atomic.h:5, from ./arch/arm64/include/asm/bitops.h:25, from ./include/linux/bitops.h:33, from ./include/linux/kernel.h:22, from kernel/printk/printk.c:22: ./arch/arm64/include/asm/atomic_lse.h:247:9: warning: array subscript 'long unsigned int[0]' is partly outside array bounds of 'atomic_t[1]' [-Warray-bounds] 247 | asm volatile( \ | ^~~ ./arch/arm64/include/asm/atomic_lse.h:266:1: note: in expansion of macro '__CMPXCHG_CASE' 266 | __CMPXCHG_CASE(w, , acq_, 32, a, "memory") | ^~~~~~~~~~~~~~ kernel/printk/printk.c:3606:17: note: while referencing 'printk_cpulock_owner' 3606 | static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1); | ^~~~~~~~~~~~~~~~~~~~
This is due to the compiler seeing an unsigned long * cast against something (atomic_t) that is int sized. Replace the cast with the matching size cast. This results in no change in binary output.
Note that __ll_sc__cmpxchg_case_##name##sz already uses the same constraint:
[v] "+Q" (*(u##sz *)ptr
Which is why only the LSE form needs updating and not the LL/SC form, so this change is unlikely to be problematic.
Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220112202259.3950286-1-keescook@chromium.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
Revision tags: v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3 |
|
#
920fdab7 |
| 10-Sep-2019 |
Arnd Bergmann <arnd@arndb.de> |
arm64: fix unreachable code issue with cmpxchg
On arm64 build with clang, sometimes the __cmpxchg_mb is not inlined when CONFIG_OPTIMIZE_INLINING is set. Clang then fails a compile-time assertion, b
arm64: fix unreachable code issue with cmpxchg
On arm64 build with clang, sometimes the __cmpxchg_mb is not inlined when CONFIG_OPTIMIZE_INLINING is set. Clang then fails a compile-time assertion, because it cannot tell at compile time what the size of the argument is:
mm/memcontrol.o: In function `__cmpxchg_mb': memcontrol.c:(.text+0x1a4c): undefined reference to `__compiletime_assert_175' memcontrol.c:(.text+0x1a4c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `__compiletime_assert_175'
Mark all of the cmpxchg() style functions as __always_inline to ensure that the compiler can see the result.
Acked-by: Nick Desaulniers <ndesaulniers@google.com> Reported-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://github.com/ClangBuiltLinux/linux/issues/648 Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Tested-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
Revision tags: v5.2.14, v5.3-rc8, v5.2.13, v5.2.12 |
|
#
5aad6cda |
| 29-Aug-2019 |
Will Deacon <will@kernel.org> |
arm64: atomics: Undefine internal macros after use
We use a bunch of internal macros when constructing our atomic and cmpxchg routines in order to save on boilerplate. Avoid exposing these directly
arm64: atomics: Undefine internal macros after use
We use a bunch of internal macros when constructing our atomic and cmpxchg routines in order to save on boilerplate. Avoid exposing these directly to users of the header files.
Reviewed-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
#
0533f97b |
| 29-Aug-2019 |
Will Deacon <will@kernel.org> |
arm64: asm: Kill 'asm/atomic_arch.h'
The contents of 'asm/atomic_arch.h' can be split across some of our other 'asm/' headers. Remove it.
Reviewed-by: Andrew Murray <andrew.murray@arm.com> Signed-o
arm64: asm: Kill 'asm/atomic_arch.h'
The contents of 'asm/atomic_arch.h' can be split across some of our other 'asm/' headers. Remove it.
Reviewed-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
Revision tags: v5.2.11 |
|
#
addfc386 |
| 28-Aug-2019 |
Andrew Murray <andrew.murray@arm.com> |
arm64: atomics: avoid out-of-line ll/sc atomics
When building for LSE atomics (CONFIG_ARM64_LSE_ATOMICS), if the hardware or toolchain doesn't support it the existing code will fallback to ll/sc ato
arm64: atomics: avoid out-of-line ll/sc atomics
When building for LSE atomics (CONFIG_ARM64_LSE_ATOMICS), if the hardware or toolchain doesn't support it the existing code will fallback to ll/sc atomics. It achieves this by branching from inline assembly to a function that is built with special compile flags. Further this results in the clobbering of registers even when the fallback isn't used increasing register pressure.
Improve this by providing inline implementations of both LSE and ll/sc and use a static key to select between them, which allows for the compiler to generate better atomics code. Put the LL/SC fallback atomics in their own subsection to improve icache performance.
Signed-off-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
Revision tags: v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7 |
|
#
caab277b |
| 03-Jun-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of th
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 503 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v5.1.6, v5.1.5, v5.1.4, v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20, v4.19.19, v4.19.18, v4.19.17, v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10, v4.18.9 |
|
#
b4f9209b |
| 13-Sep-2018 |
Will Deacon <will.deacon@arm.com> |
arm64: Avoid masking "old" for LSE cmpxchg() implementation
The CAS instructions implicitly access only the relevant bits of the "old" argument, so there is no need for explicit masking via type-cas
arm64: Avoid masking "old" for LSE cmpxchg() implementation
The CAS instructions implicitly access only the relevant bits of the "old" argument, so there is no need for explicit masking via type-casting as there is in the LL/SC implementation.
Move the casting into the LL/SC code and remove it altogether for the LSE implementation.
Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
#
5ef3fe4c |
| 13-Sep-2018 |
Will Deacon <will.deacon@arm.com> |
arm64: Avoid redundant type conversions in xchg() and cmpxchg()
Our atomic instructions (either LSE atomics of LDXR/STXR sequences) natively support byte, half-word, word and double-word memory acce
arm64: Avoid redundant type conversions in xchg() and cmpxchg()
Our atomic instructions (either LSE atomics of LDXR/STXR sequences) natively support byte, half-word, word and double-word memory accesses so there is no need to mask the data register prior to being stored.
Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
Revision tags: v4.18.7, v4.18.6 |
|
#
c0df1081 |
| 04-Sep-2018 |
Mark Rutland <mark.rutland@arm.com> |
arm64, locking/atomics: Use instrumented atomics
Now that the generic atomic headers provide instrumented wrappers of all the atomics implemented by arm64, let's migrate arm64 over to these.
The ad
arm64, locking/atomics: Use instrumented atomics
Now that the generic atomic headers provide instrumented wrappers of all the atomics implemented by arm64, let's migrate arm64 over to these.
The additional instrumentation will help to find bugs (e.g. when fuzzing with Syzkaller).
Mostly this change involves adding an arch_ prefix to a number of function names and macro definitions. When LSE atomics are used, the out-of-line LL/SC atomics will be named __ll_sc_arch_atomic_${OP}.
Adding the arch_ prefix requires some whitespace fixups to keep things aligned. Some other unusual whitespace is fixed up at the same time (e.g. in the cmpxchg wrappers).
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linuxdrivers@attotech.com Cc: dvyukov@google.com Cc: boqun.feng@gmail.com Cc: arnd@arndb.de Cc: aryabinin@virtuozzo.com Cc: glider@google.com Link: http://lkml.kernel.org/r/20180904104830.2975-7-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.18.5, v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16, v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12, v4.17.11, v4.17.10, v4.17.9, v4.17.8, v4.17.7, v4.17.6, v4.17.5, v4.17.4, v4.17.3, v4.17.2, v4.17.1, v4.17 |
|
#
1cfc63b5 |
| 30-Apr-2018 |
Will Deacon <will.deacon@arm.com> |
arm64: cmpwait: Clear event register before arming exclusive monitor
When waiting for a cacheline to change state in cmpwait, we may immediately wake-up the first time around the outer loop if the e
arm64: cmpwait: Clear event register before arming exclusive monitor
When waiting for a cacheline to change state in cmpwait, we may immediately wake-up the first time around the outer loop if the event register was already set (for example, because of the event stream).
Avoid these spurious wakeups by explicitly clearing the event register before loading the cacheline and setting the exclusive monitor.
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
Revision tags: v4.16 |
|
#
2a58fca9 |
| 27-Feb-2018 |
Will Deacon <will.deacon@arm.com> |
arm64: cmpxchg: Include linux/compiler.h in asm/cmpxchg.h
We need linux/compiler.h for unreachable(), so #include it here.
Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deaco
arm64: cmpxchg: Include linux/compiler.h in asm/cmpxchg.h
We need linux/compiler.h for unreachable(), so #include it here.
Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
#
c9406e51 |
| 27-Feb-2018 |
Will Deacon <will.deacon@arm.com> |
arm64: move percpu cmpxchg implementation from cmpxchg.h to percpu.h
We want to avoid pulling linux/preempt.h into cmpxchg.h, since that can introduce a circular dependency on linux/bitops.h. linux/
arm64: move percpu cmpxchg implementation from cmpxchg.h to percpu.h
We want to avoid pulling linux/preempt.h into cmpxchg.h, since that can introduce a circular dependency on linux/bitops.h. linux/preempt.h is only needed by the per-cpu cmpxchg implementation, which is better off alongside the per-cpu xchg implementation in percpu.h, so move it there and add the missing #include.
Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
#
e8a2d040 |
| 19-Feb-2018 |
Will Deacon <will.deacon@arm.com> |
arm64: cmpxchg: Include build_bug.h instead of bug.h for BUILD_BUG
Having asm/cmpxchg.h pull in linux/bug.h is problematic because this ends up pulling in the atomic bitops which themselves may be b
arm64: cmpxchg: Include build_bug.h instead of bug.h for BUILD_BUG
Having asm/cmpxchg.h pull in linux/bug.h is problematic because this ends up pulling in the atomic bitops which themselves may be built on top of atomic.h and cmpxchg.h.
Instead, just include build_bug.h for the definition of BUILD_BUG.
Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
Revision tags: v4.15, v4.13.16, v4.14, v4.13.5, v4.13, v4.12, v4.10.17, v4.10.16, v4.10.15, v4.10.14 |
|
#
fee960be |
| 03-May-2017 |
Mark Rutland <mark.rutland@arm.com> |
arm64: xchg: hazard against entire exchange variable
The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the p
arm64: xchg: hazard against entire exchange variable
The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a u8 pointer, and thus the hazard only applies to the first byte of the location.
GCC can take advantage of this, assuming that other portions of the location are unchanged, as demonstrated with the following test case:
union u { unsigned long l; unsigned int i[2]; };
unsigned long update_char_hazard(union u *u) { unsigned int a, b;
a = u->i[1]; asm ("str %1, %0" : "+Q" (*(char *)&u->l) : "r" (0UL)); b = u->i[1];
return a ^ b; }
unsigned long update_long_hazard(union u *u) { unsigned int a, b;
a = u->i[1]; asm ("str %1, %0" : "+Q" (*(long *)&u->l) : "r" (0UL)); b = u->i[1];
return a ^ b; }
The linaro 15.08 GCC 5.1.1 toolchain compiles the above as follows when using -O2 or above:
0000000000000000 <update_char_hazard>: 0: d2800001 mov x1, #0x0 // #0 4: f9000001 str x1, [x0] 8: d2800000 mov x0, #0x0 // #0 c: d65f03c0 ret
0000000000000010 <update_long_hazard>: 10: b9400401 ldr w1, [x0,#4] 14: d2800002 mov x2, #0x0 // #0 18: f9000002 str x2, [x0] 1c: b9400400 ldr w0, [x0,#4] 20: 4a000020 eor w0, w1, w0 24: d65f03c0 ret
This patch fixes the issue by passing an unsigned long pointer into the +Q constraint, as we do for our cmpxchg code. This may hazard against more than is necessary, but this is better than missing a necessary hazard.
Fixes: 305d454aaa29 ("arm64: atomics: implement native {relaxed, acquire, release} atomics") Cc: <stable@vger.kernel.org> # 4.4.x- Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
Revision tags: v4.10.13, v4.10.12, v4.10.11, v4.10.10, v4.10.9, v4.10.8, v4.10.7, v4.10.6, v4.10.5, v4.10.4, v4.10.3, v4.10.2, v4.10.1, v4.10, v4.9, openbmc-4.4-20161121-1, v4.4.33, v4.4.32, v4.4.31, v4.4.30, v4.4.29, v4.4.28, v4.4.27, v4.7.10, openbmc-4.4-20161021-1, v4.7.9, v4.4.26, v4.7.8, v4.4.25, v4.4.24, v4.7.7, v4.8, v4.4.23, v4.7.6, v4.7.5, v4.4.22, v4.4.21, v4.7.4, v4.7.3, v4.4.20 |
|
#
05492f2f |
| 06-Sep-2016 |
Will Deacon <will.deacon@arm.com> |
arm64: lse: convert lse alternatives NOP padding to use __nops
The LSE atomics are implemented using alternative code sequences of different lengths, and explicit NOP padding is used to ensure the p
arm64: lse: convert lse alternatives NOP padding to use __nops
The LSE atomics are implemented using alternative code sequences of different lengths, and explicit NOP padding is used to ensure the patching works correctly.
This patch converts the bulk of the LSE code over to using the __nops macro, which makes it slightly clearer as to what is going on and also consolidates all of the padding at the end of the various sequences.
Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
Revision tags: v4.7.2, v4.4.19, openbmc-4.4-20160819-1, v4.7.1, v4.4.18, v4.4.17, openbmc-4.4-20160804-1, v4.4.16, v4.7, openbmc-4.4-20160722-1, openbmc-20160722-1, openbmc-20160713-1, v4.4.15, v4.6.4 |
|
#
03e3c2b7 |
| 27-Jun-2016 |
Will Deacon <will.deacon@arm.com> |
locking/barriers, arch/arm64: Implement LDXR+WFE based smp_cond_load_acquire()
smp_cond_load_acquire() is used to spin on a variable until some expression involving that variable becomes true.
On a
locking/barriers, arch/arm64: Implement LDXR+WFE based smp_cond_load_acquire()
smp_cond_load_acquire() is used to spin on a variable until some expression involving that variable becomes true.
On arm64, we can build this using the LDXR and WFE instructions, since clearing of the exclusive monitor as a result of the variable being changed by another CPU generates an event, which will wake us up out of WFE.
This patch implements smp_cond_load_acquire() using LDXR and WFE, which themselves are contained in an internal __cmpwait() function.
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: catalin.marinas@arm.com Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1467049434-30451-1-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.6.3, v4.4.14, v4.6.2, v4.4.13, openbmc-20160606-1, v4.6.1, v4.4.12, openbmc-20160521-1, v4.4.11, openbmc-20160518-1, v4.6, v4.4.10, openbmc-20160511-1, openbmc-20160505-1, v4.4.9, v4.4.8, v4.4.7, openbmc-20160329-2, openbmc-20160329-1, openbmc-20160321-1, v4.4.6, v4.5, v4.4.5, v4.4.4, v4.4.3, openbmc-20160222-1, v4.4.2, openbmc-20160212-1, openbmc-20160210-1, openbmc-20160202-2, openbmc-20160202-1, v4.4.1, openbmc-20160127-1, openbmc-20160120-1, v4.4, openbmc-20151217-1 |
|
#
4a6ccf30 |
| 10-Dec-2015 |
Mark Brown <broonie@kernel.org> |
arm64: cmpxchg: Don't incldue linux/mmdebug.h
The arm64 asm/cmpxchg.h includes linux/mmdebug.h but doesn't so far as I can tell actually use anything from it. Removing the inclusion reduces spuriou
arm64: cmpxchg: Don't incldue linux/mmdebug.h
The arm64 asm/cmpxchg.h includes linux/mmdebug.h but doesn't so far as I can tell actually use anything from it. Removing the inclusion reduces spurious header dependency rebuilds and also avoids issues with recursive inclusions of headers causing build breaks due to attempts to use things before they are defined if linux/mmdebug.h starts pulling in more low level headers.
Such errors have happened in -next recently, for example:
In file included from include/linux/completion.h:11:0, from include/linux/rcupdate.h:43, from include/linux/tracepoint.h:19, from include/linux/mmdebug.h:6, from ./arch/arm64/include/asm/cmpxchg.h:22, from ./arch/arm64/include/asm/atomic.h:41, from include/linux/atomic.h:4, from include/linux/spinlock.h:406, from include/linux/seqlock.h:35, from include/linux/time.h:5, from include/uapi/linux/timex.h:56, from include/linux/timex.h:56, from include/linux/sched.h:19, from arch/arm64/kernel/asm-offsets.c:21: include/linux/wait.h: In function 'wait_on_atomic_t': include/linux/wait.h:1218:2: error: implicit declaration of function 'atomic_read' [-Werror=implicit-function-declaration] if (atomic_read(val) == 0)
Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
Revision tags: openbmc-20151210-1, openbmc-20151202-1, openbmc-20151123-1, openbmc-20151118-1, openbmc-20151104-1, v4.3, openbmc-20151102-1, openbmc-20151028-1 |
|
#
305d454a |
| 08-Oct-2015 |
Will Deacon <will.deacon@arm.com> |
arm64: atomics: implement native {relaxed, acquire, release} atomics
Commit 654672d4ba1a ("locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operation") introduced a relaxed
arm64: atomics: implement native {relaxed, acquire, release} atomics
Commit 654672d4ba1a ("locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operation") introduced a relaxed atomic API to Linux that maps nicely onto the arm64 memory model, including the new ARMv8.1 atomic instructions.
This patch hooks up the API to our relaxed atomic instructions, rather than have them all expand to the full-barrier variants as they do currently.
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
Revision tags: v4.3-rc1, v4.2, v4.2-rc8, v4.2-rc7, v4.2-rc6, v4.2-rc5 |
|
#
a14949e0 |
| 30-Jul-2015 |
Will Deacon <will.deacon@arm.com> |
arm64: cmpxchg: truncate sub-word signed types before comparison
When performing a cmpxchg operation on a signed sub-word type (e.g. s8), we need to ensure that the upper register bits of the "old"
arm64: cmpxchg: truncate sub-word signed types before comparison
When performing a cmpxchg operation on a signed sub-word type (e.g. s8), we need to ensure that the upper register bits of the "old" value used for comparison are zeroed, otherwise we may erroneously fail the cmpxchg which may even be interpreted as success by the caller (if the compiler performs the truncation as part of its check). This has been observed in mod_state, where negative values where causing problems with this_cpu_cmpxchg.
This patch fixes the issue by explicitly casting 8-bit and 16-bit "old" values using unsigned types in our cmpxchg wrappers. 32-bit types can be left alone, since the underlying asm makes use of W registers in this case.
Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
Revision tags: v4.2-rc4, v4.2-rc3, v4.2-rc2, v4.2-rc1, v4.1, v4.1-rc8, v4.1-rc7, v4.1-rc6 |
|
#
0ea366f5 |
| 29-May-2015 |
Will Deacon <will.deacon@arm.com> |
arm64: atomics: prefetch the destination word for write prior to stxr
The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an excl
arm64: atomics: prefetch the destination word for write prior to stxr
The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction.
This patch makes use of prfm to prefetch cachelines for write prior to ldxr/stxr loops when using the ll/sc atomic routines.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
Revision tags: v4.1-rc5, v4.1-rc4 |
|
#
e9a4b795 |
| 14-May-2015 |
Will Deacon <will.deacon@arm.com> |
arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc seq
arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences.
This patch introduces runtime patching of our cmpxchg_double primitives so that the LSE casp instruction is used instead.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|
Revision tags: v4.1-rc3, v4.1-rc2, v4.1-rc1 |
|
#
c342f782 |
| 23-Apr-2015 |
Will Deacon <will.deacon@arm.com> |
arm64: cmpxchg: patch in lse instructions when supported by the CPU
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequenc
arm64: cmpxchg: patch in lse instructions when supported by the CPU
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences.
This patch introduces runtime patching of our cmpxchg primitives so that the LSE cas instruction is used instead.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
show more ...
|