History log of /openbmc/linux/arch/x86/crypto/sm4-aesni-avx2-asm_64.S (Results 1 – 6 of 6)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80
# 2d203c46 18-Nov-2022 Eric Biggers <ebiggers@google.com>

crypto: x86/sm4 - fix crash with CFI enabled

sm4_aesni_avx_ctr_enc_blk8(), sm4_aesni_avx_cbc_dec_blk8(),
sm4_aesni_avx_cfb_dec_blk8(), sm4_aesni_avx2_ctr_enc_blk16(),
sm4_aesni_avx2_cbc_dec_blk16(),

crypto: x86/sm4 - fix crash with CFI enabled

sm4_aesni_avx_ctr_enc_blk8(), sm4_aesni_avx_cbc_dec_blk8(),
sm4_aesni_avx_cfb_dec_blk8(), sm4_aesni_avx2_ctr_enc_blk16(),
sm4_aesni_avx2_cbc_dec_blk16(), and sm4_aesni_avx2_cfb_dec_blk16() are
called via indirect function calls. Therefore they need to use
SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
Otherwise, the code crashes with a CFI failure.

(Or at least that should be the case. For some reason the CFI checks in
sm4_avx_cbc_decrypt(), sm4_avx_cfb_decrypt(), and sm4_avx_ctr_crypt()
are not always being generated, using current tip-of-tree clang.
Anyway, this patch is a good idea anyway.)

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


Revision tags: v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69
# 2f93238b 15-Sep-2022 Thomas Gleixner <tglx@linutronix.de>

crypto: x86/sm[34]: Remove redundant alignments

SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call A

crypto: x86/sm[34]: Remove redundant alignments

SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.

Also, with having pushed the function alignment to 16 bytes, this
custom alignment is completely superfluous.

( this code couldn't seem to make up it's mind about what alignment it
actually wanted, randomly mixing 8 and 16 bytes )

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.868540856@infradead.org

show more ...


Revision tags: v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7
# f94909ce 04-Dec-2021 Peter Zijlstra <peterz@infradead.org>

x86: Prepare asm files for straight-line-speculation

Replace all ret/retq instructions with RET in preparation of making
RET a macro. Since AS is case insensitive it's a big no-op without
RET define

x86: Prepare asm files for straight-line-speculation

Replace all ret/retq instructions with RET in preparation of making
RET a macro. Since AS is case insensitive it's a big no-op without
RET defined.

find arch/x86/ -name \*.S | while read file
do
sed -i 's/\<ret[q]*\>/RET/' $file
done

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211204134907.905503893@infradead.org

show more ...


# 14b476e0 04-Dec-2021 Peter Zijlstra <peterz@infradead.org>

x86: Prepare asm files for straight-line-speculation

[ Upstream commit f94909ceb1ed4bfdb2ada72f93236305e6d6951f ]

Replace all ret/retq instructions with RET in preparation of making
RET a macro. Si

x86: Prepare asm files for straight-line-speculation

[ Upstream commit f94909ceb1ed4bfdb2ada72f93236305e6d6951f ]

Replace all ret/retq instructions with RET in preparation of making
RET a macro. Since AS is case insensitive it's a big no-op without
RET defined.

find arch/x86/ -name \*.S | while read file
do
sed -i 's/\<ret[q]*\>/RET/' $file
done

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211204134907.905503893@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13
# f8690a4b 14-Oct-2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>

crypto: x86/sm4 - Fix invalid section entry size

This fixes the following warning:

vmlinux.o: warning: objtool: elf_update: invalid section entry size

The size of the rodata section is 164 bytes

crypto: x86/sm4 - Fix invalid section entry size

This fixes the following warning:

vmlinux.o: warning: objtool: elf_update: invalid section entry size

The size of the rodata section is 164 bytes, directly using the
entry_size of 164 bytes will cause errors in some versions of the
gcc compiler, while using 16 bytes directly will cause errors in
the clang compiler. This patch correct it by filling the size of
rodata to a 16-byte boundary.

Fixes: a7ee22ee1445 ("crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation")
Fixes: 5b2efa2bb865 ("crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Tested-by: Heyuan Shi <heyuan@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


Revision tags: v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60
# 5b2efa2b 17-Aug-2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>

crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation

Like the implementation of AESNI/AVX, this patch adds an accelerated
implementation of AESNI/AVX2. In terms of code implementation, by
reusing

crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation

Like the implementation of AESNI/AVX, this patch adds an accelerated
implementation of AESNI/AVX2. In terms of code implementation, by
reusing AESNI/AVX mode-related codes, the amount of code is greatly
reduced. From the benchmark data, it can be seen that when the block
size is 1024, compared to AVX acceleration, the performance achieved
by AVX2 has increased by about 70%, it is also 7.7 times of the pure
software implementation of sm4-generic.

The main algorithm implementation comes from SM4 AES-NI work by
libgcrypt and Markku-Juhani O. Saarinen at:
https://github.com/mjosaarinen/sm4ni

This optimization supports the four modes of SM4, ECB, CBC, CFB,
and CTR. Since CBC and CFB do not support multiple block parallel
encryption, the optimization effect is not obvious.

Benchmark on Intel i5-6200U 2.30GHz, performance data of three
implementation methods, pure software sm4-generic, aesni/avx
acceleration, and aesni/avx2 acceleration, the data comes from
the 218 mode and 518 mode of tcrypt. The abscissas are blocks of
different lengths. The data is tabulated and the unit is Mb/s:

block-size | 16 64 128 256 1024 1420 4096
sm4-generic
ECB enc | 60.94 70.41 72.27 73.02 73.87 73.58 73.59
ECB dec | 61.87 70.53 72.15 73.09 73.89 73.92 73.86
CBC enc | 56.71 66.31 68.05 69.84 70.02 70.12 70.24
CBC dec | 54.54 65.91 68.22 69.51 70.63 70.79 70.82
CFB enc | 57.21 67.24 69.10 70.25 70.73 70.52 71.42
CFB dec | 57.22 64.74 66.31 67.24 67.40 67.64 67.58
CTR enc | 59.47 68.64 69.91 71.02 71.86 71.61 71.95
CTR dec | 59.94 68.77 69.95 71.00 71.84 71.55 71.95
sm4-aesni-avx
ECB enc | 44.95 177.35 292.06 316.98 339.48 322.27 330.59
ECB dec | 45.28 178.66 292.31 317.52 339.59 322.52 331.16
CBC enc | 57.75 67.68 69.72 70.60 71.48 71.63 71.74
CBC dec | 44.32 176.83 284.32 307.24 328.61 312.61 325.82
CFB enc | 57.81 67.64 69.63 70.55 71.40 71.35 71.70
CFB dec | 43.14 167.78 282.03 307.20 328.35 318.24 325.95
CTR enc | 42.35 163.32 279.11 302.93 320.86 310.56 317.93
CTR dec | 42.39 162.81 278.49 302.37 321.11 310.33 318.37
sm4-aesni-avx2
ECB enc | 45.19 177.41 292.42 316.12 339.90 322.53 330.54
ECB dec | 44.83 178.90 291.45 317.31 339.85 322.55 331.07
CBC enc | 57.66 67.62 69.73 70.55 71.58 71.66 71.77
CBC dec | 44.34 176.86 286.10 501.68 559.58 483.87 527.46
CFB enc | 57.43 67.60 69.61 70.52 71.43 71.28 71.65
CFB dec | 43.12 167.75 268.09 499.33 558.35 490.36 524.73
CTR enc | 42.42 163.39 256.17 493.95 552.45 481.58 517.19
CTR dec | 42.49 163.11 256.36 493.34 552.62 481.49 516.83

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...