Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69 |
|
#
3cea8d47 |
| 17-Sep-2022 |
Yury Norov <yury.norov@gmail.com> |
lib: add find_nth{,_and,_andnot}_bit()
Kernel lacks for a function that searches for Nth bit in a bitmap. Usually people do it like this: for_each_set_bit(bit, mask, size) if (n-- == 0) return
lib: add find_nth{,_and,_andnot}_bit()
Kernel lacks for a function that searches for Nth bit in a bitmap. Usually people do it like this: for_each_set_bit(bit, mask, size) if (n-- == 0) return bit;
We can do it more efficiently, if we: 1. find a word containing Nth bit, using hweight(); and 2. find the bit, using a helper fns(), that works similarly to __ffs() and ffz().
fns() is implemented as a simple loop. For x86_64, there's PDEP instruction to do that: ret = clz(pdep(1 << idx, num)). However, for large bitmaps the most of improvement comes from using hweight(), so I kept fns() simple.
New find_nth_bit() is ~70 times faster on x86_64/kvm in find_bit benchmark: find_nth_bit: 7154190 ns, 16411 iterations for_each_bit: 505493126 ns, 16315 iterations
With all that, a family of 3 new functions is added, and used where appropriate in the following patches.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
show more ...
|
Revision tags: v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63 |
|
#
e1d7c760 |
| 22-Aug-2022 |
Uros Bizjak <ubizjak@gmail.com> |
bitops: use try_cmpxchg in set_mask_bits and bit_clear_unless
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in set_mask_bits and bit_clear_unless. x86 CMPXCHG instruction returns succe
bitops: use try_cmpxchg in set_mask_bits and bit_clear_unless
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in set_mask_bits and bit_clear_unless. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg).
Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg fails, enabling further code simplifications.
Link: https://lkml.kernel.org/r/20220822143851.3290-1-ubizjak@gmail.com Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
#
8238b457 |
| 26-Aug-2022 |
Mikulas Patocka <mpatocka@redhat.com> |
wait_on_bit: add an acquire memory barrier
There are several places in the kernel where wait_on_bit is not followed by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read).
On architec
wait_on_bit: add an acquire memory barrier
There are several places in the kernel where wait_on_bit is not followed by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read).
On architectures with weak memory ordering, it may happen that memory accesses that follow wait_on_bit are reordered before wait_on_bit and they may return invalid data.
Fix this class of bugs by introducing a new function "test_bit_acquire" that works like test_bit, but has acquire memory ordering semantics.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50 |
|
#
b03fc117 |
| 24-Jun-2022 |
Alexander Lobakin <alexandr.lobakin@intel.com> |
bitops: let optimize out non-atomic bitops on compile-time constants
Currently, many architecture-specific non-atomic bitop implementations use inline asm or other hacks which are faster or more rob
bitops: let optimize out non-atomic bitops on compile-time constants
Currently, many architecture-specific non-atomic bitop implementations use inline asm or other hacks which are faster or more robust when working with "real" variables (i.e. fields from the structures etc.), but the compilers have no clue how to optimize them out when called on compile-time constants. That said, the following code:
DECLARE_BITMAP(foo, BITS_PER_LONG) = { }; // -> unsigned long foo[1]; unsigned long bar = BIT(BAR_BIT); unsigned long baz = 0;
__set_bit(FOO_BIT, foo); baz |= BIT(BAZ_BIT);
BUILD_BUG_ON(!__builtin_constant_p(test_bit(FOO_BIT, foo)); BUILD_BUG_ON(!__builtin_constant_p(bar & BAR_BIT)); BUILD_BUG_ON(!__builtin_constant_p(baz & BAZ_BIT));
triggers the first assertion on x86_64, which means that the compiler is unable to evaluate it to a compile-time initializer when the architecture-specific bitop is used even if it's obvious. In order to let the compiler optimize out such cases, expand the bitop() macro to use the "constant" C non-atomic bitop implementations when all of the arguments passed are compile-time constants, which means that the result will be a compile-time constant as well, so that it produces more efficient and simple code in 100% cases, comparing to the architecture-specific counterparts.
The savings are architecture, compiler and compiler flags dependent, for example, on x86_64 -O2:
GCC 12: add/remove: 78/29 grow/shrink: 332/525 up/down: 31325/-61560 (-30235) LLVM 13: add/remove: 79/76 grow/shrink: 184/537 up/down: 55076/-141892 (-86816) LLVM 14: add/remove: 10/3 grow/shrink: 93/138 up/down: 3705/-6992 (-3287)
and ARM64 (courtesy of Mark):
GCC 11: add/remove: 92/29 grow/shrink: 933/2766 up/down: 39340/-82580 (-43240) LLVM 14: add/remove: 21/11 grow/shrink: 620/651 up/down: 12060/-15824 (-3764)
Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Marco Elver <elver@google.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
show more ...
|
#
e69eb9c4 |
| 24-Jun-2022 |
Alexander Lobakin <alexandr.lobakin@intel.com> |
bitops: wrap non-atomic bitops with a transparent macro
In preparation for altering the non-atomic bitops with a macro, wrap them in a transparent definition. This requires prepending one more '_' t
bitops: wrap non-atomic bitops with a transparent macro
In preparation for altering the non-atomic bitops with a macro, wrap them in a transparent definition. This requires prepending one more '_' to their names in order to be able to do that seamlessly. It is a simple change, given that all the non-prefixed definitions are now in asm-generic. sparc32 already has several triple-underscored functions, so I had to rename them ('___' -> 'sp32_').
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
show more ...
|
#
bb7379bf |
| 24-Jun-2022 |
Alexander Lobakin <alexandr.lobakin@intel.com> |
bitops: define const_*() versions of the non-atomics
Define const_*() variants of the non-atomic bitops to be used when the input arguments are compile-time constants, so that the compiler will be a
bitops: define const_*() versions of the non-atomics
Define const_*() variants of the non-atomic bitops to be used when the input arguments are compile-time constants, so that the compiler will be always able to resolve those to compile-time constants as well. Those are mostly direct aliases for generic_*() with one exception for const_test_bit(): the original one is declared atomic-safe and thus doesn't discard the `volatile` qualifier, so in order to let optimize code, define it separately disregarding the qualifier. Add them to the compile-time type checks as well just in case.
Suggested-by: Marco Elver <elver@google.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
show more ...
|
#
0e862838 |
| 24-Jun-2022 |
Alexander Lobakin <alexandr.lobakin@intel.com> |
bitops: unify non-atomic bitops prototypes across architectures
Currently, there is a mess with the prototypes of the non-atomic bitops across the different architectures:
ret bool, int, unsigned l
bitops: unify non-atomic bitops prototypes across architectures
Currently, there is a mess with the prototypes of the non-atomic bitops across the different architectures:
ret bool, int, unsigned long nr int, long, unsigned int, unsigned long addr volatile unsigned long *, volatile void *
Thankfully, it doesn't provoke any bugs, but can sometimes make the compiler angry when it's not handy at all. Adjust all the prototypes to the following standard:
ret bool retval can be only 0 or 1 nr unsigned long native; signed makes no sense addr volatile unsigned long * bitmaps are arrays of ulongs
Next, some architectures don't define 'arch_' versions as they don't support instrumentation, others do. To make sure there is always the same set of callables present and to ease any potential future changes, make them all follow the rule: * architecture-specific files define only 'arch_' versions; * non-prefixed versions can be defined only in asm-generic files; and place the non-prefixed definitions into a new file in asm-generic to be included by non-instrumented architectures.
Finally, add some static assertions in order to prevent people from making a mess in this room again. I also used the %__always_inline attribute consistently, so that they always get resolved to the actual operations.
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
show more ...
|
Revision tags: v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60 |
|
#
bc9d6635 |
| 14-Aug-2021 |
Yury Norov <yury.norov@gmail.com> |
include/linux: move for_each_bit() macros from bitops.h to find.h
for_each_bit() macros depend on find_bit() machinery, and so the proper place for them is the find.h header.
Signed-off-by: Yury No
include/linux: move for_each_bit() macros from bitops.h to find.h
for_each_bit() macros depend on find_bit() machinery, and so the proper place for them is the find.h header.
Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
show more ...
|
Revision tags: v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49 |
|
#
cb0f8003 |
| 02-Jul-2021 |
Kumar Kartikeya Dwivedi <memxor@gmail.com> |
bitops: Add non-atomic bitops for pointers
cpumap needs to set, clear, and test the lowest bit in skb pointer in various places. To make these checks less noisy, add pointer friendly bitop macros th
bitops: Add non-atomic bitops for pointers
cpumap needs to set, clear, and test the lowest bit in skb pointer in various places. To make these checks less noisy, add pointer friendly bitop macros that also do some typechecking to sanitize the argument.
These wrap the non-atomic bitops __set_bit, __clear_bit, and test_bit but for pointer arguments. Pointer's address has to be passed in and it is treated as an unsigned long *, since width and representation of pointer and unsigned long match on targets Linux supports. They are prefixed with double underscore to indicate lack of atomicity.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210702111825.491065-3-memxor@gmail.com
show more ...
|
Revision tags: v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35 |
|
#
2cc7b6a4 |
| 06-May-2021 |
Yury Norov <yury.norov@gmail.com> |
lib: add fast path for find_first_*_bit() and find_last_bit()
Similarly to bitmap functions, users would benefit if we'll handle a case of small-size bitmaps that fit into a single word.
While here
lib: add fast path for find_first_*_bit() and find_last_bit()
Similarly to bitmap functions, users would benefit if we'll handle a case of small-size bitmaps that fit into a single word.
While here, move the find_last_bit() declaration to bitops/find.h where other find_*_bit() functions sit.
Link: https://lkml.kernel.org/r/20210401003153.97325-11-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101 |
|
#
4945cca2 |
| 25-Feb-2021 |
Geert Uytterhoeven <geert+renesas@glider.be> |
include/linux/bitops.h: spelling s/synomyn/synonym/
Fix a misspelling of "synonym".
Link: https://lkml.kernel.org/r/20210108105305.2028120-1-geert+renesas@glider.be Signed-off-by: Geert Uytterhoeve
include/linux/bitops.h: spelling s/synomyn/synonym/
Fix a misspelling of "synonym".
Link: https://lkml.kernel.org/r/20210108105305.2028120-1-geert+renesas@glider.be Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14 |
|
#
aa6159ab |
| 15-Dec-2020 |
Andy Shevchenko <andriy.shevchenko@linux.intel.com> |
kernel.h: split out mathematical helpers
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out mathematical helpers.
kernel.h: split out mathematical helpers
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out mathematical helpers.
At the same time convert users in header and lib folder to use new header. Though for time being include new header back to kernel.h to avoid twisted indirected includes for existing users.
[sfr@canb.auug.org.au: fix powerpc build] Link: https://lkml.kernel.org/r/20201029150809.13059608@canb.auug.org.au
Link: https://lkml.kernel.org/r/20201028173212.41768-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.10, v5.8.17, v5.8.16 |
|
#
004fba1a |
| 15-Oct-2020 |
Wei Yang <richard.weiyang@linux.alibaba.com> |
bitops: use the same mechanism for get_count_order[_long]
These two functions share the same logic.
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@li
bitops: use the same mechanism for get_count_order[_long]
These two functions share the same logic.
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lkml.kernel.org/r/20200807085837.11697-3-richard.weiyang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
a9eb6370 |
| 15-Oct-2020 |
Wei Yang <richard.weiyang@linux.alibaba.com> |
bitops: simplify get_count_order_long()
These two cases could be unified into one.
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.or
bitops: simplify get_count_order_long()
These two cases could be unified into one.
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lkml.kernel.org/r/20200807085837.11697-2-richard.weiyang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1 |
|
#
bd93f003 |
| 04-Jun-2020 |
Arnd Bergmann <arnd@arndb.de> |
include/linux/bitops.h: avoid clang shift-count-overflow warnings
Clang normally does not warn about certain issues in inline functions when it only happens in an eliminated code path. However if so
include/linux/bitops.h: avoid clang shift-count-overflow warnings
Clang normally does not warn about certain issues in inline functions when it only happens in an eliminated code path. However if something else goes wrong, it does tend to complain about the definition of hweight_long() on 32-bit targets:
include/linux/bitops.h:75:41: error: shift count >= width of type [-Werror,-Wshift-count-overflow] return sizeof(w) == 4 ? hweight32(w) : hweight64(w); ^~~~~~~~~~~~ include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64' define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w)) ^~~~~~~~~~~~~~~~~~~~ include/asm-generic/bitops/const_hweight.h:21:76: note: expanded from macro '__const_hweight64' define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32)) ^ ~~ include/asm-generic/bitops/const_hweight.h:20:49: note: expanded from macro '__const_hweight32' define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16)) ^ include/asm-generic/bitops/const_hweight.h:19:72: note: expanded from macro '__const_hweight16' define __const_hweight16(w) (__const_hweight8(w) + __const_hweight8((w) >> 8 )) ^ include/asm-generic/bitops/const_hweight.h:12:9: note: expanded from macro '__const_hweight8' (!!((w) & (1ULL << 2))) + \
Adding an explicit cast to __u64 avoids that warning and makes it easier to read other output.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: http://lkml.kernel.org/r/20200505135513.65265-1-arnd@arndb.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31 |
|
#
f80ac98a |
| 06-Apr-2020 |
Josh Poimboeuf <jpoimboe@redhat.com> |
bitops: always inline sign extension helpers
With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b
bitops: always inline sign extension helpers
With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr() from the user_access_begin/end critical region (i.e, with SMAP disabled).
While it's probably harmless in this case, in general we like to avoid extra function calls in SMAP-disabled regions because it can open up inadvertent security holes.
Fix the warning by changing the sign extension helpers to __always_inline. This convinces GCC to inline gen8_canonical_addr().
The sign extension functions are trivial anyway, so it makes sense to always inline them. With my test optimize-for-size-based config, this actually shrinks the text size of i915_gem_execbuffer.o by 45 bytes -- and no change for vmlinux.
Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://lkml.kernel.org/r/740179324b2b18b750b16295c48357f00b5fa9ed.1582982020.git.jpoimboe@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18 |
|
#
0bddc1bd |
| 03-Feb-2020 |
Yury Norov <yury.norov@gmail.com> |
bitops: more BITS_TO_* macros
Introduce BITS_TO_U64, BITS_TO_U32 and BITS_TO_BYTES as they are handy in the following patches (BITS_TO_U32 specifically). Reimplement tools/ version of the macros ac
bitops: more BITS_TO_* macros
Introduce BITS_TO_U64, BITS_TO_U32 and BITS_TO_BYTES as they are handy in the following patches (BITS_TO_U32 specifically). Reimplement tools/ version of the macros according to the kernel implementation.
Also fix indentation for BITS_PER_TYPE definition.
Link: http://lkml.kernel.org/r/20200102043031.30357-3-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Amritha Nambiar <amritha.nambiar@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "Tobin C . Harding" <tobin@kernel.org> Cc: Vineet Gupta <vineet.gupta1@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.4.17 |
|
#
dd3e7cba |
| 31-Jan-2020 |
Andy Shevchenko <andriy.shevchenko@linux.intel.com> |
ocfs2/dlm: move BITS_TO_BYTES() to bitops.h for wider use
There are users already and will be more of BITS_TO_BYTES() macro. Move it to bitops.h for wider use.
In the case of ocfs2 the replacement
ocfs2/dlm: move BITS_TO_BYTES() to bitops.h for wider use
There are users already and will be more of BITS_TO_BYTES() macro. Move it to bitops.h for wider use.
In the case of ocfs2 the replacement is identical.
As for bnx2x, there are two places where floor version is used. In the first case to calculate the amount of structures that can fit one memory page. In this case obviously the ceiling variant is correct and original code might have a potential bug, if amount of bits % 8 is not 0. In the second case the macro is used to calculate bytes transmitted in one microsecond. This will work for all speeds which is multiply of 1Gbps without any change, for the rest new code will give ceiling value, for instance 100Mbps will give 13 bytes, while old code gives 12 bytes and the arithmetically correct one is 12.5 bytes. Further the value is used to setup timer threshold which in any case has its own margins due to certain resolution. I don't see here an issue with slightly shifting thresholds for low speed connections, the card is supposed to utilize highest available rate, which is usually 10Gbps.
Link: http://lkml.kernel.org/r/20200108121316.22411-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3 |
|
#
169c474f |
| 04-Dec-2019 |
William Breathitt Gray <vilhelm.gray@gmail.com> |
bitops: introduce the for_each_set_clump8 macro
Pach series "Introduce the for_each_set_clump8 macro", v18.
While adding GPIO get_multiple/set_multiple callback support for various drivers, I notic
bitops: introduce the for_each_set_clump8 macro
Pach series "Introduce the for_each_set_clump8 macro", v18.
While adding GPIO get_multiple/set_multiple callback support for various drivers, I noticed a pattern of looping manifesting that would be useful standardized as a macro.
This patchset introduces the for_each_set_clump8 macro and utilizes it in several GPIO drivers. The for_each_set_clump macro8 facilitates a for-loop syntax that iterates over a memory region entire groups of set bits at a time.
For example, suppose you would like to iterate over a 32-bit integer 8 bits at a time, skipping over 8-bit groups with no set bit, where XXXXXXXX represents the current 8-bit group:
Example: 10111110 00000000 11111111 00110011 First loop: 10111110 00000000 11111111 XXXXXXXX Second loop: 10111110 00000000 XXXXXXXX 00110011 Third loop: XXXXXXXX 00000000 11111111 00110011
Each iteration of the loop returns the next 8-bit group that has at least one set bit.
The for_each_set_clump8 macro has four parameters:
* start: set to the bit offset of the current clump * clump: set to the current clump value * bits: bitmap to search within * size: bitmap size in number of bits
In this version of the patchset, the for_each_set_clump macro has been reimplemented and simplified based on the suggestions provided by Rasmus Villemoes and Andy Shevchenko in the version 4 submission.
In particular, the function of the for_each_set_clump macro has been restricted to handle only 8-bit clumps; the drivers that use the for_each_set_clump macro only handle 8-bit ports so a generic for_each_set_clump implementation is not necessary. Thus, a solution for large clumps (i.e. those larger than the width of a bitmap word) can be postponed until a driver appears that actually requires such a generic for_each_set_clump implementation.
For what it's worth, a semi-generic for_each_set_clump (i.e. for clumps smaller than the width of a bitmap word) can be implemented by simply replacing the hardcoded '8' and '0xFF' instances with respective variables. I have not yet had a need for such an implementation, and since it falls short of a true generic for_each_set_clump function, I have decided to forgo such an implementation for now.
In addition, the bitmap_get_value8 and bitmap_set_value8 functions are introduced to get and set 8-bit values respectively. Their use is based on the behavior suggested in the patchset version 4 review.
This patch (of 14):
This macro iterates for each 8-bit group of bits (clump) with set bits, within a bitmap memory region. For each iteration, "start" is set to the bit offset of the found clump, while the respective clump value is stored to the location pointed by "clump". Additionally, the bitmap_get_value8 and bitmap_set_value8 functions are introduced to respectively get and set an 8-bit value in a bitmap memory region.
[gustavo@embeddedor.com: fix potential sign-extension overflow] Link: http://lkml.kernel.org/r/20191015184657.GA26541@embeddedor [akpm@linux-foundation.org: s/ULL/UL/, per Joe] [vilhelm.gray@gmail.com: add for_each_set_clump8 documentation] Link: http://lkml.kernel.org/r/20191016161825.301082-1-vilhelm.gray@gmail.com Link: http://lkml.kernel.org/r/893c3b4f03266c9496137cc98ac2b1bd27f92c73.1570641097.git.vilhelm.gray@gmail.com Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Suggested-by: Lukas Wunner <lukas@wunner.de> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Phil Reid <preid@electromag.com.au> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Mathias Duckeck <m.duckeck@kunbus.de> Cc: Morten Hein Tiljeset <morten.tiljeset@prevas.dk> Cc: Sean Nyekjaer <sean.nyekjaer@prevas.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2 |
|
#
f5a1a536 |
| 30-Sep-2019 |
Aleksa Sarai <cyphar@cyphar.com> |
lib: introduce copy_struct_from_user() helper
A common pattern for syscall extensions is increasing the size of a struct passed from userspace, such that the zero-value of the new fields result in t
lib: introduce copy_struct_from_user() helper
A common pattern for syscall extensions is increasing the size of a struct passed from userspace, such that the zero-value of the new fields result in the old kernel behaviour (allowing for a mix of userspace and kernel vintages to operate on one another in most cases).
While this interface exists for communication in both directions, only one interface is straightforward to have reasonable semantics for (userspace passing a struct to the kernel). For kernel returns to userspace, what the correct semantics are (whether there should be an error if userspace is unaware of a new extension) is very syscall-dependent and thus probably cannot be unified between syscalls (a good example of this problem is [1]).
Previously there was no common lib/ function that implemented the necessary extension-checking semantics (and different syscalls implemented them slightly differently or incompletely[2]). Future patches replace common uses of this pattern to make use of copy_struct_from_user().
Some in-kernel selftests that insure that the handling of alignment and various byte patterns are all handled identically to memchr_inv() usage.
[1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code")
[2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do similar checks to copy_struct_from_user() while rt_sigprocmask(2) always rejects differently-sized struct arguments.
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191001011055.19283-2-cyphar@cyphar.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
show more ...
|
Revision tags: v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6, v5.1.5, v5.1.4, v5.1.3 |
|
#
ef4d6f6b |
| 14-May-2019 |
Rasmus Villemoes <linux@rasmusvillemoes.dk> |
include/linux/bitops.h: sanitize rotate primitives
The ror32 implementation (word >> shift) | (word << (32 - shift) has undefined behaviour if shift is outside the [1, 31] range. Similarly for the
include/linux/bitops.h: sanitize rotate primitives
The ror32 implementation (word >> shift) | (word << (32 - shift) has undefined behaviour if shift is outside the [1, 31] range. Similarly for the 64 bit variants. Most callers pass a compile-time constant (naturally in that range), but there's an UBSAN report that these may actually be called with a shift count of 0.
Instead of special-casing that, we can make them DTRT for all values of shift while also avoiding UB. For some reason, this was already partly done for rol32 (which was well-defined for [0, 31]). gcc 8 recognizes these patterns as rotates, so for example
__u32 rol32(__u32 word, unsigned int shift) { return (word << (shift & 31)) | (word >> ((-shift) & 31)); }
compiles to
0000000000000020 <rol32>: 20: 89 f8 mov %edi,%eax 22: 89 f1 mov %esi,%ecx 24: d3 c0 rol %cl,%eax 26: c3 retq
Older compilers unfortunately do not do as well, but this only affects the small minority of users that don't pass constants.
Due to integer promotions, ro[lr]8 were already well-defined for shifts in [0, 8], and ro[lr]16 were mostly well-defined for shifts in [0, 16] (only mostly - u16 gets promoted to _signed_ int, so if bit 15 is set, word << 16 is undefined). For consistency, update those as well.
Link: http://lkml.kernel.org/r/20190410211906.2190-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reported-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Cc: Vadim Pasternak <vadimp@mellanox.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1 |
|
#
1db604f6 |
| 07-Mar-2019 |
Vineet Gupta <vineet.gupta1@synopsys.com> |
include/linux/bitops.h: set_mask_bits() to return old value
| > Also, set_mask_bits is used in fs quite a bit and we can possibly come up | > with a generic llsc based implementation (w/o the cmpxch
include/linux/bitops.h: set_mask_bits() to return old value
| > Also, set_mask_bits is used in fs quite a bit and we can possibly come up | > with a generic llsc based implementation (w/o the cmpxchg loop) | | May I also suggest changing the return value of set_mask_bits() to old. | | You can compute the new value given old, but you cannot compute the old | value given new, therefore old is the better return value. Also, no | current user seems to use the return value, so changing it is without | risk.
Link: http://lkml.kernel.org/g/20150807110955.GH16853@twins.programming.kicks-ass.net Link: http://lkml.kernel.org/r/1548275584-18096-4-git-send-email-vgupta@synopsys.com Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20, v4.19.19, v4.19.18, v4.19.17, v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15 |
|
#
edfa8728 |
| 15-Oct-2018 |
Miklos Szeredi <mszeredi@redhat.com> |
bitops: protect variables in bit_clear_unless() macro
Unprotected naming of local variables within bit_clear_unless() can easily lead to using the wrong scope.
Noticed this by code review after hav
bitops: protect variables in bit_clear_unless() macro
Unprotected naming of local variables within bit_clear_unless() can easily lead to using the wrong scope.
Noticed this by code review after having hit this issue in set_mask_bits()
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 85ad1d13ee9b ("md: set MD_CHANGE_PENDING in a atomic region") Cc: Guoqing Jiang <gqjiang@suse.com>
show more ...
|
#
18127429 |
| 15-Oct-2018 |
Miklos Szeredi <mszeredi@redhat.com> |
bitops: protect variables in set_mask_bits() macro
Unprotected naming of local variables within the set_mask_bits() can easily lead to using the wrong scope.
Noticed this when "set_mask_bits(&foo->
bitops: protect variables in set_mask_bits() macro
Unprotected naming of local variables within the set_mask_bits() can easily lead to using the wrong scope.
Noticed this when "set_mask_bits(&foo->bar, 0, mask)" behaved as no-op.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 00a1a053ebe5 ("ext4: atomically set inode->i_flags in ext4_set_inode_flags()") Cc: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10, v4.18.9, v4.18.7, v4.18.6, v4.18.5, v4.17.18, v4.18.4 |
|
#
9144d75e |
| 21-Aug-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
include/linux/bitops.h: introduce BITS_PER_TYPE
net_dim.h has a rather useful extension to BITS_PER_BYTE to compute the number of bits in a type (BITS_PER_BYTE * sizeof(T)), so promote the macro to
include/linux/bitops.h: introduce BITS_PER_TYPE
net_dim.h has a rather useful extension to BITS_PER_BYTE to compute the number of bits in a type (BITS_PER_BYTE * sizeof(T)), so promote the macro to bitops.h, alongside BITS_PER_BYTE, for wider usage.
Link: http://lkml.kernel.org/r/20180706094458.14116-1-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Andy Gospodarek <gospo@broadcom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|