Searched hist:fc074e130051015e39245a4241956ff122e2f465 (Results 1 – 3 of 3) sorted by relevance
/openbmc/linux/arch/arm64/crypto/
aes-neonbs-glue.c | diff fc074e130051015e39245a4241956ff122e2f465 | Thu Jan 27 05:35:44 CST 2022 | Ard Biesheuvel <ardb@kernel.org>
crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk
Instead of processing the entire input with the 8-way bit-sliced algorithm, which is sub-optimal for inputs that are not a multiple of 128 bytes in size, invoke the plain NEON version of CTR for the remainder of the input after processing the bulk in 128-byte strides.
This allows us to greatly simplify the asm code that implements CTR and to get rid of all the branches and special code paths. It also gains us a couple of percent in performance.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
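The change this commit describes is a glue-layer splitting pattern: process as many whole 8-block (128-byte) strides as possible with the bit-sliced code, then hand whatever is left, including a final partial block, to the plain NEON CTR routine. Below is a minimal user-space sketch of that splitting logic in C. The functions bitsliced_ctr_8blocks and plain_neon_ctr are hypothetical stand-ins (they only XOR a dummy keystream) for the kernel's bit-sliced and plain NEON assembly; the real dispatch lives in the aes-neonbs-glue.c change listed above.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define AES_BLOCK_SIZE 16
#define BS_STRIDE      (8 * AES_BLOCK_SIZE)  /* 128 bytes: one 8-block bit-sliced stride */

/* Hypothetical stand-in for the 8-way bit-sliced CTR routine: it is only
 * ever handed whole 8-block strides.  Here it just XORs a dummy keystream
 * so the splitting logic can be exercised in user space. */
static void bitsliced_ctr_8blocks(uint8_t *dst, const uint8_t *src,
                                  size_t blocks, uint64_t *ctr)
{
    for (size_t i = 0; i < blocks * AES_BLOCK_SIZE; i++)
        dst[i] = src[i] ^ (uint8_t)((*ctr + i / AES_BLOCK_SIZE) & 0xff);
    *ctr += blocks;
}

/* Hypothetical stand-in for the plain NEON CTR routine used for the tail;
 * it can handle any byte count, including a final partial block. */
static void plain_neon_ctr(uint8_t *dst, const uint8_t *src,
                           size_t nbytes, uint64_t *ctr)
{
    for (size_t i = 0; i < nbytes; i++)
        dst[i] = src[i] ^ (uint8_t)((*ctr + i / AES_BLOCK_SIZE) & 0xff);
    *ctr += (nbytes + AES_BLOCK_SIZE - 1) / AES_BLOCK_SIZE;
}

/* The splitting pattern from the commit message: the bulk is processed in
 * 128-byte strides via the bit-sliced path, the remainder via the plain
 * NEON path, so the bit-sliced core needs no tail handling of its own. */
static void ctr_encrypt(uint8_t *dst, const uint8_t *src,
                        size_t len, uint64_t *ctr)
{
    size_t bulk = len & ~(size_t)(BS_STRIDE - 1);  /* largest multiple of 128 bytes */
    size_t tail = len - bulk;

    if (bulk)
        bitsliced_ctr_8blocks(dst, src, bulk / AES_BLOCK_SIZE, ctr);
    if (tail)
        plain_neon_ctr(dst + bulk, src + bulk, tail, ctr);
}

int main(void)
{
    uint8_t in[300], out[300];
    uint64_t ctr = 0;

    memset(in, 0xab, sizeof(in));
    ctr_encrypt(out, in, sizeof(in), &ctr);  /* 256 bulk bytes + a 44-byte tail */
    printf("processed %zu bytes, counter now %llu\n",
           sizeof(in), (unsigned long long)ctr);
    return 0;
}

In the kernel the same split is driven from the skcipher walk loop rather than over a flat buffer, but the effect described in the commit message is the same: the bit-sliced core only ever sees whole 128-byte strides, so its asm no longer needs branches or special code paths for ragged tails.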
aes-neonbs-core.S | diff fc074e130051015e39245a4241956ff122e2f465 | Thu Jan 27 05:35:44 CST 2022 | Ard Biesheuvel <ardb@kernel.org>
crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk (same commit message as the first result)
aes-glue.c | diff fc074e130051015e39245a4241956ff122e2f465 | Thu Jan 27 05:35:44 CST 2022 | Ard Biesheuvel <ardb@kernel.org>
crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk (same commit message as the first result)