Commits · 1f4b0d5596d2e3ea8e953d578ab8444ce860d35d · jan.koester / Linux

Aug 04, 2017

crypto: arm/aes - avoid expanded lookup tables in the final round · 0d149ce6

Ard Biesheuvel authored Jul 24, 2017

For the final round, avoid the expanded and padded lookup tables
exported by the generic AES driver. Instead, for encryption, we can
perform byte loads from the same table we used for the inner rounds,
which will still be hot in the caches. For decryption, use the inverse
AES Sbox directly, which is 4x smaller than the inverse lookup table
exported by the generic driver.

This should significantly reduce the Dcache footprint of our code,
which makes the code more robust against timing attacks. It does not
introduce any additional module dependencies, given that we already
rely on the core AES module for the shared key expansion routines.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0d149ce6

crypto: arm/ghash - add NEON accelerated fallback for vmull.p64 · 3759ee05

Ard Biesheuvel authored Jul 24, 2017

Implement a NEON fallback for systems that do support NEON but have
no support for the optional 64x64->128 polynomial multiplication
instruction that is part of the ARMv8 Crypto Extensions. It is based
on the paper "Fast Software Polynomial Multiplication on ARM Processors
Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and
Ricardo Dahab (https://hal.inria.fr/hal-01506572

)

On a 32-bit guest executing under KVM on a Cortex-A57, the new code is
not only 4x faster than the generic table based GHASH driver, it is also
time invariant. (Note that the existing vmull.p64 code is 16x faster on
this core).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

3759ee05

crypto: algapi - make crypto_xor() take separate dst and src arguments · 45fe93df

Ard Biesheuvel authored Jul 24, 2017



There are quite a number of occurrences in the kernel of the pattern

  if (dst != src)
          memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
  crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);

or

  crypto_xor(keystream, src, nbytes);
  memcpy(dst, keystream, nbytes);

where crypto_xor() is preceded or followed by a memcpy() invocation
that is only there because crypto_xor() uses its output parameter as
one of the inputs. To avoid having to add new instances of this pattern
in the arm64 code, which will be refactored to implement non-SIMD
fallbacks, add an alternative implementation called crypto_xor_cpy(),
taking separate input and output arguments. This removes the need for
the separate memcpy().

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

45fe93df

Jun 01, 2017

crypto: arm/crc32 - enable module autoloading based on CPU feature bits · 2a9faf8b

Ard Biesheuvel authored May 21, 2017



Make the module autoloadable by tying it to the CPU feature bits that
describe whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

2a9faf8b

crypto: arm/sha2-ce - enable module autoloading based on CPU feature bits · a83ff88b

Ard Biesheuvel authored May 21, 2017



Make the module autoloadable by tying it to the CPU feature bit that
describes whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

a83ff88b

crypto: arm/sha1-ce - enable module autoloading based on CPU feature bits · bd56f95e

Ard Biesheuvel authored May 21, 2017



Make the module autoloadable by tying it to the CPU feature bit that
describes whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

bd56f95e

crypto: arm/ghash-ce - enable module autoloading based on CPU feature bits · c9d9f608

Ard Biesheuvel authored May 21, 2017



Make the module autoloadable by tying it to the CPU feature bit that
describes whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c9d9f608

crypto: arm/aes-ce - enable module autoloading based on CPU feature bits · 4d8061a5

Ard Biesheuvel authored May 21, 2017



Make the module autoloadable by tying it to the CPU feature bit that
describes whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

4d8061a5

Mar 09, 2017

crypto: arm/aes-neonbs - resolve fallback cipher at runtime · b56f5cbc

Ard Biesheuvel authored Feb 14, 2017



Currently, the bit sliced NEON AES code for ARM has a link time
dependency on the scalar ARM asm implementation, which it uses as a
fallback to perform CBC encryption and the encryption of the initial
XTS tweak.

The bit sliced NEON code is both fast and time invariant, which makes
it a reasonable default on hardware that supports it. However, the
ARM asm code it pulls in is not time invariant, and due to the way it
is linked in, cannot be overridden by the new generic time invariant
driver. In fact, it will not be used at all, given that the ARM asm
code registers itself as a cipher with a priority that exceeds the
priority of the fixed time cipher.

So remove the link time dependency, and allocate the fallback cipher
via the crypto API. Note that this requires this driver's module_init
call to be replaced with late_initcall, so that the (possibly generic)
fallback cipher is guaranteed to be available when the builtin test
is performed at registration time.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

b56f5cbc

Mar 01, 2017

crypto: arm/crc32 - add build time test for CRC instruction support · efa7cebd

Ard Biesheuvel authored Feb 28, 2017



The accelerated CRC32 module for ARM may use either the scalar CRC32
instructions, the NEON 64x64 to 128 bit polynomial multiplication
(vmull.p64) instruction, or both, depending on what the current CPU
supports.

However, this also requires support in binutils, and as it turns out,
versions of binutils exist that support the vmull.p64 instruction but
not the crc32 instructions.

So refactor the Makefile logic so that this module only gets built if
binutils has support for both.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

efa7cebd

crypto: arm/crc32 - fix build error with outdated binutils · 1fb1683c

Ard Biesheuvel authored Feb 28, 2017



Annotate a vmov instruction with an explicit element size of 32 bits.
This is inferred by recent toolchains, but apparently, older versions
need some help figuring this out.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

1fb1683c

Feb 03, 2017

crypto: arm/aes - don't use IV buffer to return final keystream block · 1a20b966

Ard Biesheuvel authored Feb 02, 2017

The ARM bit sliced AES core code uses the IV buffer to pass the final
keystream block back to the glue code if the input is not a multiple of
the block size, so that the asm code does not have to deal with anything
except 16 byte blocks. This is done under the assumption that the outgoing
IV is meaningless anyway in this case, given that chaining is no longer
possible under these circumstances.

However, as it turns out, the CCM driver does expect the IV to retain
a value that is equal to the original IV except for the counter value,
and even interprets byte zero as a length indicator, which may result
in memory corruption if the IV is overwritten with something else.

So use a separate buffer to return the final keystream block.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

1a20b966

crypto: arm/chacha20 - remove cra_alignmask · 4a70b526

Ard Biesheuvel authored Jan 28, 2017



Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

4a70b526

crypto: arm/aes-ce - remove cra_alignmask · 1465fb13

Ard Biesheuvel authored Jan 28, 2017



Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

1465fb13

Jan 23, 2017

crypto: arm/aes-neonbs - fix issue with v2.22 and older assembler · 13954e78

Ard Biesheuvel authored Jan 19, 2017



The GNU assembler for ARM version 2.22 or older fails to infer the
element size from the vmov instructions, and aborts the build in
the following way;

.../aes-neonbs-core.S: Assembler messages:
.../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1h[1],r10'
.../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1h[0],r9'
.../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1l[1],r8'
.../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1l[0],r7'
.../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2h[1],r10'
.../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2h[0],r9'
.../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2l[1],r8'
.../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2l[0],r7'

Fix this by setting the element size explicitly, by replacing vmov with
vmov.32.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

13954e78

Jan 13, 2017

crypto: arm/aes - avoid reserved 'tt' mnemonic in asm code · 658fa754

Ard Biesheuvel authored Jan 13, 2017

The ARMv8-M architecture introduces 'tt' and 'ttt' instructions,
which means we can no longer use 'tt' as a register alias on recent
versions of binutils for ARM. So replace the alias with 'ttab'.

Fixes: 81edb426 ("crypto: arm/aes - replace scalar AES cipher")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

658fa754

crypto: arm/aes - replace bit-sliced OpenSSL NEON code · cc477bf6

Ard Biesheuvel authored Jan 11, 2017



This replaces the unwieldy generated implementation of bit-sliced AES
in CBC/CTR/XTS modes that originated in the OpenSSL project with a
new version that is heavily based on the OpenSSL implementation, but
has a number of advantages over the old version:
- it does not rely on the scalar AES cipher that also originated in the
  OpenSSL project and contains redundant lookup tables and key schedule
  generation routines (which we already have in crypto/aes_generic.)
- it uses the same expanded key schedule for encryption and decryption,
  reducing the size of the per-key data structure by 1696 bytes
- it adds an implementation of AES in ECB mode, which can be wrapped by
  other generic chaining mode implementations
- it moves the handling of corner cases that are non critical to performance
  to the glue layer written in C
- it was written directly in assembler rather than generated from a Perl
  script

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

cc477bf6

Jan 12, 2017

crypto: arm/aes - replace scalar AES cipher · 81edb426

Ard Biesheuvel authored Jan 11, 2017



This replaces the scalar AES cipher that originates in the OpenSSL project
with a new implementation that is ~15% (*) faster (on modern cores), and
reuses the lookup tables and the key schedule generation routines from the
generic C implementation (which is usually compiled in anyway due to
networking and other subsystems depending on it).

Note that the bit sliced NEON code for AES still depends on the scalar cipher
that this patch replaces, so it is not removed entirely yet.

* On Cortex-A57, the performance increases from 17.0 to 14.9 cycles per byte
  for 128-bit keys.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

81edb426

crypto: arm/chacha20 - implement NEON version based on SSE3 code · afaf712e

Ard Biesheuvel authored Jan 11, 2017

This is a straight port to ARM/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher. It uses the new skcipher walksize
attribute to process the input in strides of 4x the block size.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

afaf712e

Dec 28, 2016

Revert "crypto: arm64/ARM: NEON accelerated ChaCha20" · 5386e5d1

Herbert Xu authored Dec 28, 2016



This patch reverts the following commits:

8621caa0
80966672

I should not have applied them because they had already been
obsoleted by a subsequent patch series.  They also cause a build
failure because of the subsequent commit 9ae433bc.

Fixes: 9ae433bc ("crypto: chacha20 - convert generic and...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

5386e5d1

Dec 27, 2016

crypto: arm/chacha20 - implement NEON version based on SSE3 code · 80966672

Ard Biesheuvel authored Dec 08, 2016



This is a straight port to ARM/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

80966672

Dec 07, 2016

crypto: arm/crc32 - accelerated support based on x86 SSE implementation · d0a3431a

Ard Biesheuvel authored Dec 05, 2016

This is a combination of the the Intel algorithm implemented using SSE
and PCLMULQDQ instructions from arch/x86/crypto/crc32-pclmul_asm.S, and
the new CRC32 extensions introduced for both 32-bit and 64-bit ARM in
version 8 of the architecture. Two versions of the above combo are
provided, one for CRC32 and one for CRC32C.

The PMULL/NEON algorithm is faster, but operates on blocks of at least
64 bytes, and on multiples of 16 bytes only. For the remaining input,
or for all input on systems that lack the PMULL 64x64->128 instructions,
the CRC32 instructions will be used.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

d0a3431a

crypto: arm/crct10dif - port x86 SSE implementation to ARM · 1d481f1c

Ard Biesheuvel authored Dec 05, 2016



This is a transliteration of the Intel algorithm implemented
using SSE and PCLMULQDQ instructions that resides in the file
arch/x86/crypto/crct10dif-pcl-asm_64.S, but simplified to only
operate on buffers that are 16 byte aligned (but of any size)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

1d481f1c

Dec 01, 2016

crypto: aes-ce - Make aes_simd_algs static · efad2b61

Herbert Xu authored Dec 01, 2016



The variable aes_simd_algs should be static.  In fact if it isn't
it causes build errors when multiple copies of aes-ce-glue.c are
built into the kernel.

Fixes: da40e7a4 ("crypto: aes-ce - Convert to skcipher")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

efad2b61

Nov 30, 2016

crypto: arm/aesbs - fix brokenness after skcipher conversion · 81126d1a

Ard Biesheuvel authored Nov 29, 2016



The CBC encryption routine should use the encryption round keys, not
the decryption round keys.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

81126d1a

crypto: arm/aes - Add missing SIMD select for aesbs · 6fdf436f

Herbert Xu authored Nov 29, 2016



This patch adds one more missing SIMD select for AES_ARM_BS.  It
also changes selects on ALGAPI to BLKCIPHER.

Fixes: 211f41af ("crypto: aesbs - Convert to skcipher")
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

6fdf436f

Nov 29, 2016

crypto: arm/aes - Select SIMD in Kconfig · 585b5fa6

Herbert Xu authored Nov 29, 2016



The skcipher conversion for ARM missed the select on CRYPTO_SIMD,
causing build failures if SIMD was not otherwise enabled.

Fixes: da40e7a4 ("crypto: aes-ce - Convert to skcipher")
Fixes: 211f41af ("crypto: aesbs - Convert to skcipher")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

585b5fa6

Nov 28, 2016

crypto: aesbs - Convert to skcipher · 211f41af

Herbert Xu authored Nov 22, 2016



This patch converts aesbs over to the skcipher interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

211f41af

crypto: aes-ce - Convert to skcipher · da40e7a4

Herbert Xu authored Nov 22, 2016



This patch converts aes-ce over to the skcipher interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

da40e7a4

Oct 21, 2016

crypto: arm/aes-ce - fix for big endian · 58010fa6

Ard Biesheuvel authored Oct 11, 2016

The AES key schedule generation is mostly endian agnostic, with the
exception of the rotation and the incorporation of the round constant
at the start of each round. So implement a big endian specific version
of that part to make the whole routine big endian compatible.

Fixes: 86464859 ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

58010fa6

Sep 13, 2016

crypto: arm/aes-ctr - fix NULL dereference in tail processing · f82e90b2

Ard Biesheuvel authored Sep 13, 2016



The AES-CTR glue code avoids calling into the blkcipher API for the
tail portion of the walk, by comparing the remainder of walk.nbytes
modulo AES_BLOCK_SIZE with the residual nbytes, and jumping straight
into the tail processing block if they are equal. This tail processing
block checks whether nbytes != 0, and does nothing otherwise.

However, in case of an allocation failure in the blkcipher layer, we
may enter this code with walk.nbytes == 0, while nbytes > 0. In this
case, we should not dereference the source and destination pointers,
since they may be NULL. So instead of checking for nbytes != 0, check
for (walk.nbytes % AES_BLOCK_SIZE) != 0, which implies the former in
non-error conditions.

Fixes: 86464859 ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions")
Cc: stable@vger.kernel.org
Reported-by: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

f82e90b2

Sep 07, 2016

crypto: arm/ghash - change internal cra_name to "__ghash" · 71f89917

Ard Biesheuvel authored Sep 05, 2016



The fact that the internal synchrous hash implementation is called
"ghash" like the publicly visible one is causing the testmgr code
to misidentify it as an algorithm that requires testing at boottime.
So rename it to "__ghash" to prevent this.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

71f89917

crypto: arm/ghash-ce - add missing async import/export · ed4767d6

Ard Biesheuvel authored Sep 01, 2016



Since commit 8996eafd ("crypto: ahash - ensure statesize is non-zero"),
all ahash drivers are required to implement import()/export(), and must have
a non-zero statesize. Fix this for the ARM Crypto Extensions GHASH
implementation.

Fixes: 8996eafd ("crypto: ahash - ensure statesize is non-zero")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

ed4767d6

crypto: arm/sha1-neon - add support for building in Thumb2 mode · 2117eaa6

Ard Biesheuvel authored Sep 01, 2016



The ARMv7 NEON module is explicitly built in ARM mode, which is not
supported by the Thumb2 kernel. So remove the explicit override, and
leave it up to the build environment to decide whether the core SHA1
routines are assembled as ARM or as Thumb2 code.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

2117eaa6

Jun 23, 2016

crypto: ghash-ce - Fix cryptd reordering · 820573eb

Herbert Xu authored Jun 21, 2016



This patch fixes an old bug where requests can be reordered because
some are processed by cryptd while others are processed directly
in softirq context.

The fix is to always postpone to cryptd if there are currently
requests outstanding from the same tfm.

This patch also removes the redundant use of cryptd in the async
init function as init never touches the FPU.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

820573eb

Feb 17, 2016

crypto: xts - fix compile errors · 49abc0d2

Stephan Mueller authored Feb 17, 2016



Commit 28856a9e missed the addition of the crypto/xts.h include file
for different architecture-specific AES implementations.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

49abc0d2

Feb 16, 2016

crypto: xts - consolidate sanity check for keys · 28856a9e

Stephan Mueller authored Feb 09, 2016

The patch centralizes the XTS key check logic into the service function
xts_check_key which is invoked from the different XTS implementations.
With this, the XTS implementations in ARM, ARM64, PPC and S390 have now
a sanity check for the XTS keys similar to the other arches.

In addition, this service function received a check to ensure that the
key != the tweak key which is mandated by FIPS 140-2 IG A.9. As the
check is not present in the standards defining XTS, it is only enforced
in FIPS mode of the kernel.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

28856a9e

Feb 15, 2016

arm/arm64: crypto: assure that ECB modes don't require an IV · bee038a4

Jeremy Linton authored Feb 12, 2016



ECB modes don't use an initialization vector. The kernel
/proc/crypto interface doesn't reflect this properly.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

bee038a4

Jul 06, 2015

crypto: arm - ignore generated SHA2 assembly files · 4d666dbe

Baruch Siach authored Jul 06, 2015



These files are generated since commits f2f770d7 (crypto: arm/sha256 - Add
optimized SHA-256/224, 2015-04-03) and c80ae7ca (crypto: arm/sha512 -
accelerated SHA-512 using ARM generic ASM and NEON, 2015-05-08).

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

4d666dbe

May 11, 2015

crypto: arm/aes - streamline AES-192 code path · 6499e8cf

Ard Biesheuvel authored May 08, 2015



This trims off a couple of instructions of the total size of the
core AES transform by reordering the final branch in the AES-192
code path with the rounds that are performed regardless of whether
the branch is taken or not. Other than the slight size reduction,
this has no performance benefit.

Fix up a comment regarding the prototype of this function while
we're at it.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

6499e8cf