i.MX6ULL issue with DCP for SHA-256

andreiv
Contributor I

i.MX6ULL comes with a Data Co-Processor (DCP) to accelerate AES-128, SHA-1, and SHA-256.  When testing SHA-256 I am not seeing acceleration taking place.

For the sake of simplicity, the testing here is done on the i.MX6ULL-EVK running the standard image for that EVK:

  1. Power up the EVK and log in
  2. Run OpenSSL speed test without cryptodev to get baseline performance
    root@imx6ull14x14evk:~# openssl speed sha256
    Doing sha256 for 3s on 16 size blocks: 635770 sha256's in 3.00s
    Doing sha256 for 3s on 64 size blocks: 364232 sha256's in 3.00s
    Doing sha256 for 3s on 256 size blocks: 161926 sha256's in 3.00s
    Doing sha256 for 3s on 1024 size blocks: 50296 sha256's in 2.99s
    Doing sha256 for 3s on 8192 size blocks: 6761 sha256's in 3.00s
    OpenSSL 1.0.2h 3 May 2016
    built on: reproducible build, date unspecified
    options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
    compiler: arm-poky-linux-gnueabi-gcc -march=armv7ve -mfpu=neon -mfloat-abi=hard -mcpu=cortex-a7 --sysroot=/home/bamboo/build/4.1.X-2.0.0_ga/fsl-imx-x11/temp_build_dir/build_fsl-imx-x11/tmp/sysroots/imx6ul7d -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -pipe -g -feliminate-unused-debug-types -fdebug-prefix-map=/home/bamboo/build/4.1.X-2.0.0_ga/fsl-imx-x11/temp_build_dir/build_fsl-imx-x11/tmp/work/cortexa7hf-neon-poky-linux-gnueabi/openssl/1.0.2h-r0=/usr/src/debug/openssl/1.0.2h-r0 -fdebug-prefix-map=/home/bamboo/build/4.1.X-2.0.0_ga/fsl-imx-x11/temp_build_dir/build_fsl-imx-x11/tmp/sysroots/x86_64-linux= -fdebug-prefix-map=/home/bamboo/build/4.1.X-2.0.0_ga/fsl-imx-x11/temp_build_dir/build_fsl-imx-x11/tmp/sysroots/imx6ul7d= -Wall -Wa,--noexecstack -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    sha256           3390.77k     7770.28k    13817.69k    17225.12k    18462.04k
  3. Load cryptodev driver:
    root@imx6ull14x14evk:~# modprobe cryptodev
    cryptodev: driver 1.8 loaded.
  4. Now run OpenSSL speed test with the cryptodev engine
    root@imx6ull14x14evk:~# openssl speed sha256 -engine cryptodev
    engine "cryptodev" set.
    Doing sha256 for 3s on 16 size blocks: 641864 sha256's in 3.00s
    Doing sha256 for 3s on 64 size blocks: 364203 sha256's in 3.00s
    Doing sha256 for 3s on 256 size blocks: 161940 sha256's in 3.00s
    Doing sha256 for 3s on 1024 size blocks: 50285 sha256's in 3.00s
    Doing sha256 for 3s on 8192 size blocks: 6762 sha256's in 3.00s
    OpenSSL 1.0.2h 3 May 2016
    built on: reproducible build, date unspecified
    options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
    compiler: arm-poky-linux-gnueabi-gcc -march=armv7ve -mfpu=neon -mfloat-abi=hard -mcpu=cortex-a7 --sysroot=/home/bamboo/build/4.1.X-2.0.0_ga/fsl-imx-x11/temp_build_dir/build_fsl-imx-x11/tmp/sysroots/imx6ul7d -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -pipe -g -feliminate-unused-debug-types -fdebug-prefix-map=/home/bamboo/build/4.1.X-2.0.0_ga/fsl-imx-x11/temp_build_dir/build_fsl-imx-x11/tmp/work/cortexa7hf-neon-poky-linux-gnueabi/openssl/1.0.2h-r0=/usr/src/debug/openssl/1.0.2h-r0 -fdebug-prefix-map=/home/bamboo/build/4.1.X-2.0.0_ga/fsl-imx-x11/temp_build_dir/build_fsl-imx-x11/tmp/sysroots/x86_64-linux= -fdebug-prefix-map=/home/bamboo/build/4.1.X-2.0.0_ga/fsl-imx-x11/temp_build_dir/build_fsl-imx-x11/tmp/sysroots/imx6ul7d= -Wall -Wa,--noexecstack -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    sha256           3423.27k     7769.66k    13818.88k    17163.95k    18464.77k
  5. Note that the results in steps 2 and 4 are nearly identical. Also check the DCP interrupts: the dcp-irq count is abnormally low, only 65.
    root@imx6ull14x14evk:~# cat /proc/interrupts | grep dcp
    236: 0 GPC 46 Level dcp-vmi-irq
    237: 65 GPC 47 Level dcp-irq
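An absolute count is hard to judge on its own, so it can help to record the `dcp-irq` counter immediately before and after a single benchmark run and look at the difference. The sketch below is one way to do that, assuming Python is available on the target (nothing in this test requires it). The helper names are made up for illustration, and the sample text is the `/proc/interrupts` output from step 5.

```python
def irq_count(interrupts_text, name="dcp-irq"):
    """Return the summed per-CPU count for the IRQ whose last field is `name`."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[-1] != name:
            continue
        total = 0
        # fields[0] is the IRQ number ("237:"); the per-CPU counters follow
        # and end at the first non-numeric field (the interrupt controller).
        for field in fields[1:]:
            if not field.isdigit():
                break
            total += int(field)
        return total
    return None  # no such interrupt line


def read_irq_count(name="dcp-irq", path="/proc/interrupts"):
    """Read the live counter; call once before and once after a benchmark."""
    with open(path) as f:
        return irq_count(f.read(), name)


# Sample taken from the output above (single-core i.MX6ULL, one count column).
sample = """\
236: 0 GPC 46 Level dcp-vmi-irq
237: 65 GPC 47 Level dcp-irq
"""
print(irq_count(sample))                 # count for dcp-irq
print(irq_count(sample, "dcp-vmi-irq"))  # count for dcp-vmi-irq
```

On the board, calling `read_irq_count()` before and after `openssl speed sha256 -engine cryptodev` gives the number of DCP interrupts attributable to that run. A difference near zero would suggest that the digest requests never reached the DCP.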

If I perform the same test for AES-128 or SHA-1, I get reasonable results. For example, SHA-1 produces:

  • Software mode (no cryptodev):
    root@imx6ull14x14evk:~# openssl speed sha1
    Doing sha1 for 3s on 16 size blocks: 489675 sha1's in 3.00s
    Doing sha1 for 3s on 64 size blocks: 371930 sha1's in 2.99s
    Doing sha1 for 3s on 256 size blocks: 220473 sha1's in 3.00s
    Doing sha1 for 3s on 1024 size blocks: 82921 sha1's in 3.00s
    Doing sha1 for 3s on 8192 size blocks: 12197 sha1's in 3.00s
    OpenSSL 1.0.2h 3 May 2016
    ...

    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    sha1             2611.60k     7961.04k    18813.70k    28303.70k    33305.94k
  • With cryptodev:
    root@imx6ull14x14evk:~# modprobe cryptodev
    cryptodev: driver 1.8 loaded.
    root@imx6ull14x14evk:~# openssl speed sha1 -engine cryptodev
    engine "cryptodev" set.
    Doing sha1 for 3s on 16 size blocks: 16345 sha1's in 0.25s
    Doing sha1 for 3s on 64 size blocks: 19929 sha1's in 0.35s
    Doing sha1 for 3s on 256 size blocks: 18929 sha1's in 0.28s
    Doing sha1 for 3s on 1024 size blocks: 14448 sha1's in 0.30s
    Doing sha1 for 3s on 8192 size blocks: 7307 sha1's in 0.26s
    OpenSSL 1.0.2h 3 May 2016
    ...
    The 'numbers' are in 1000s of bytes per second processed.

    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    sha1             1046.08k     3644.16k    17306.51k    49315.84k   230226.71k
    Note the performance gains for the 1024- and 8192-byte block sizes. Also check the DCP interrupt count and observe that it is reasonable:
    root@imx6ull14x14evk:~# cat /proc/interrupts | grep dcp
    236: 0 GPC 46 Level dcp-vmi-irq
    237: 84330 GPC 47 Level dcp-irq
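One caveat when comparing these figures: without the `-elapsed` option, `openssl speed` divides by user CPU time rather than wall-clock time, which is why the cryptodev runs above report about 0.3s for a nominal 3s test. While the process waits on the hardware it accumulates almost no CPU time, so the reported throughput can overstate what was actually hashed per second. The sketch below reproduces the reported figures from the counts in the output and estimates the wall-clock rate. It assumes each run really lasted the nominal 3 seconds, which is an assumption and not something the log shows. `throughput_k` is an illustrative helper, not part of OpenSSL.

```python
def throughput_k(count, block_size, seconds):
    """Thousands of bytes per second, as printed by `openssl speed`."""
    return count * block_size / seconds / 1000.0


NOMINAL_SECONDS = 3.0  # assumption: wall-clock length of each run

# (blocks hashed, block size, time reported by openssl) for 8192-byte blocks
software = (12197, 8192, 3.00)   # sha1, no cryptodev
cryptodev = (7307, 8192, 0.26)   # sha1, -engine cryptodev

for label, (count, size, reported_s) in (("software", software),
                                         ("cryptodev", cryptodev)):
    reported = throughput_k(count, size, reported_s)
    wall = throughput_k(count, size, NOMINAL_SECONDS)
    print(f"{label:10s} reported {reported:10.2f}k  wall-clock estimate {wall:10.2f}k")
```

Under that assumption the 8192-byte cryptodev run works out to roughly 20 MB/s of wall-clock throughput, below the 33 MB/s software figure, although it leaves the CPU almost idle. Rerunning both tests with `openssl speed -elapsed` would give directly comparable wall-clock numbers.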

What's going on?

kef2
Senior Contributor IV

Hi

Not much is said on the NXP forums about the DCP, or even about cryptodev. Here are my findings:

I don't see speed issues using a newer OpenSSL on the i.MX6ULL, but I do see issues with both SHA1 and SHA256.

1) With HMAC support added to cryptodev:

  • openssl dgst -sha1 --- works well
  • openssl dgst -sha256  --- works well
  • openssl dgst -sha1 -hmac --- produces a wrong digest, different from the one calculated when cryptodev is not loaded
  • openssl dgst -sha256 -hmac --- produces a wrong digest, different from the one calculated when cryptodev is not loaded

Not a big deal: just don't enable HMAC in cryptodev. But it's weird that there's no hmac(sha1) listed in /proc/crypto. Does Linux ignore the lack of HMAC support in the driver and assume that all sha1/sha256 drivers support it? Yes, the driver lacks setkey() for the SHA algorithms.
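To tell which of two differing HMAC values is the wrong one, a reference computed on a separate machine, away from cryptodev and the DCP, is useful. A minimal sketch, assuming Python 3 on a host PC: the key and message below are placeholders, and `matches_openssl` simply takes the hex digest from the end of an `openssl dgst` output line (the exact line format is assumed, not taken from this thread).

```python
import hashlib
import hmac


def reference_hmac(key: bytes, data: bytes, digest: str = "sha256") -> str:
    """Software HMAC as a lowercase hex string (digest: "sha1" or "sha256")."""
    return hmac.new(key, data, getattr(hashlib, digest)).hexdigest()


def matches_openssl(output_line: str, key: bytes, data: bytes,
                    digest: str = "sha256") -> bool:
    """Compare the hex digest at the end of an `openssl dgst` output line."""
    reported = output_line.split()[-1].lower()
    return hmac.compare_digest(reported, reference_hmac(key, data, digest))


key = b"placeholder-key"          # hypothetical key
data = b"placeholder message\n"   # hypothetical file contents
print(reference_hmac(key, data, "sha1"))
print(reference_hmac(key, data, "sha256"))
```

The values printed for a given key and file can be compared with the target's `openssl dgst -sha1 -hmac` and `openssl dgst -sha256 -hmac` output, with and without cryptodev loaded.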

2) cryptodev + openssl is OK for AES ciphers like -aes-128-cbc, provided you supply a separate raw key (-K) and IV (-iv). But if you supply a passphrase and let openssl calculate the key and IV (PBKDF2), then encryption / decryption again produces odd data.

It is possible to disable SHA in cryptodev (see ioctl.c; you need to suppress the corresponding `case CRYPTO_SHAxxx` cases), but it would be best to fix them.

3) Using the SoftEther VPN server with its "AES128-SHA" cipher, it is impossible for a client to connect until I disable both SHA1 and SHA256 in cryptodev.

Perhaps the mxs-dcp driver signals SHA completion too early? Or is it a hardware issue?

Edward

 

Yuri
NXP Employee

Hello,

The NXP Linux BSP supports only CAAM-based hardware acceleration via the CryptoDev interface.

The i.MX 6ULL does not have a CAAM.

Regards,

Yuri.

changbaoma
Contributor III

What should I do if I want this hardware acceleration on the i.MX6ULL? Is there a corresponding driver?

Yuri
NXP Employee

@changbaoma 
Hello,

   Customers can try the following:

https://github.com/f-secure-foundry/mxs-dcp

 

Regards,
Yuri.