Why is AES on CAAM slower than OpenSSL w/o CAAM?


6,696 views
hhan
Contributor I

The kernel used in my tests is imx_3.0.35_4.1.0 with the latest patches. The test board is Boundary Devices' iMX6Q Sabre Lite.

With caam enabled in the kernel command line parameters, dmesg shows that caam is loaded:

caam caam.0: device ID = 0x0a16010000000000

caam caam.0: job rings = 2, qi = 0

In addition, /proc/crypto also confirms that the caam drivers have been enabled.
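For anyone who wants to repeat this check, here is a minimal sketch that lists the CAAM-backed entries in /proc/crypto. The sample text is illustrative only (it was not captured from the board in this thread), and the helper names are made up for this example. The driver name cbc-aes-caam is the one mentioned later in the thread.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per algorithm entry."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def caam_entries(entries):
    """Keep only entries whose driver name mentions CAAM."""
    return [e for e in entries if "caam" in e.get("driver", "")]


# Illustrative sample, NOT real output from this board.
# On a target, use open("/proc/crypto").read() instead.
SAMPLE = """\
name         : cbc(aes)
driver       : cbc-aes-caam
module       : kernel
priority     : 3000
type         : ablkcipher

name         : cbc(aes)
driver       : cbc(aes-generic)
module       : kernel
priority     : 100
type         : blkcipher
"""

for e in caam_entries(parse_proc_crypto(SAMPLE)):
    print(e.get("name"), e.get("driver"), e.get("priority"))
```

When several drivers register the same algorithm name, the kernel normally picks the one with the highest priority, so the priority field is worth checking alongside the driver name.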

TEST 1: With caam, I use tcrypt.ko to test AES encryption/decryption


# insmod tcrypt.ko sec=1 mode=200

testing speed of cbc(aes) encryption

test 0 (128 bit key, 16 byte blocks): 170772 operations in 1 seconds (2732352 B/s)

test 1 (128 bit key, 64 byte blocks): 121409 operations in 1 seconds (7770176 B/s)

test 2 (128 bit key, 256 byte blocks): 56503 operations in 1 seconds (14464768 B/s)

test 3 (128 bit key, 1024 byte blocks): 18017 operations in 1 seconds (18449408 B/s)

test 4 (128 bit key, 8192 byte blocks): 2390 operations in 1 seconds (19578880 B/s)

testing speed of cbc(aes) decryption

test 0 (128 bit key, 16 byte blocks): 169842 operations in 1 seconds (2717472 B/s)

test 1 (128 bit key, 64 byte blocks): 121802 operations in 1 seconds (7795328 B/s)

test 2 (128 bit key, 256 byte blocks): 57603 operations in 1 seconds (14746368 B/s)

test 3 (128 bit key, 1024 byte blocks): 18534 operations in 1 seconds (18978816 B/s)

test 4 (128 bit key, 8192 byte blocks): 2462 operations in 1 seconds (20168704 B/s)
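One way to read these tcrypt figures is as a fixed cost per request plus a per-byte cost. The sketch below recomputes the B/s column from the operation counts and fits a straight line through the smallest and largest block sizes. The linear cost model and the two-point fit are assumptions made for illustration; they do not come from the CAAM documentation.

```python
# (block size in bytes, operations completed in 1 second), copied from the
# cbc(aes) encryption results above.
TCRYPT_ENC = [(16, 170772), (64, 121409), (256, 56503), (1024, 18017), (8192, 2390)]


def bytes_per_second(block_size, ops):
    """Throughput as tcrypt reports it: bytes processed in the 1 s window."""
    return block_size * ops


def microseconds_per_op(ops, seconds=1.0):
    """Average time spent on one request."""
    return seconds / ops * 1e6


def two_point_fit(small, large):
    """Fit time_per_op = overhead + size / bandwidth through two measurements.

    Returns (overhead in microseconds, bandwidth in bytes per second).
    """
    (s0, n0), (s1, n1) = small, large
    t0, t1 = microseconds_per_op(n0), microseconds_per_op(n1)
    us_per_byte = (t1 - t0) / (s1 - s0)
    overhead_us = t0 - s0 * us_per_byte
    return overhead_us, 1e6 / us_per_byte


for size, ops in TCRYPT_ENC:
    print(f"{size:5d} B: {bytes_per_second(size, ops):9d} B/s, "
          f"{microseconds_per_op(ops):7.1f} us per request")

overhead_us, bandwidth = two_point_fit(TCRYPT_ENC[0], TCRYPT_ENC[-1])
print(f"fixed cost ~{overhead_us:.1f} us, streaming rate ~{bandwidth / 1e6:.1f} MB/s")
```

Under this model the small-block results are dominated by the fixed cost per request, while the large-block results approach the streaming rate.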


TEST 2: Without caam, I use openssl to test AES encryption/decryption

# openssl speed -evp aes-128-cbc

Doing aes-128-cbc for 3s on 16 size blocks: 3547600 aes-128-cbc's in 2.99s

Doing aes-128-cbc for 3s on 64 size blocks: 946867 aes-128-cbc's in 3.00s

Doing aes-128-cbc for 3s on 256 size blocks: 240616 aes-128-cbc's in 3.00s

Doing aes-128-cbc for 3s on 1024 size blocks: 60033 aes-128-cbc's in 3.00s

Doing aes-128-cbc for 3s on 8192 size blocks: 7513 aes-128-cbc's in 2.99s

OpenSSL 1.0.1e 11 Feb 2013

built on: Thu Aug  8 15:44:23 EDT 2013

options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) blowfish(ptr)

compiler: armv7l-timesys-linux-gnueabi-gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/here/worl

The 'numbers' are in 1000s of bytes per second processed.

type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc    18983.81k    20199.83k    20532.57k    20491.26k    20584.11k
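To put the two tests side by side, the sketch below recomputes the OpenSSL summary row from the "Doing aes-128-cbc" lines and divides the tcrypt encryption throughput from TEST 1 by it. The two tools measure in different ways (tcrypt runs inside the kernel, openssl speed in user space), so the ratio is only a rough indication.

```python
# (block size, operations, measured seconds) from the "Doing aes-128-cbc" lines.
OPENSSL_RUNS = [
    (16, 3547600, 2.99),
    (64, 946867, 3.00),
    (256, 240616, 3.00),
    (1024, 60033, 3.00),
    (8192, 7513, 2.99),
]
# Bytes per second from the tcrypt cbc(aes) encryption test (TEST 1).
TCRYPT_BPS = {16: 2732352, 64: 7770176, 256: 14464768, 1024: 18449408, 8192: 19578880}


def openssl_kbytes_per_second(block_size, ops, seconds):
    """Reproduce the summary figure: thousands of bytes processed per second."""
    return block_size * ops / seconds / 1000.0


for size, ops, secs in OPENSSL_RUNS:
    sw = openssl_kbytes_per_second(size, ops, secs)
    hw = TCRYPT_BPS[size] / 1000.0
    print(f"{size:5d} B: openssl {sw:9.2f}k  tcrypt {hw:9.2f}k  ratio {hw / sw:.2f}")
```

The gap is largest at 16-byte blocks and almost closes at 8192 bytes.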



Question: Why is AES on CAAM slower than OpenSSL without CAAM? Is this normal, or did I make a mistake in the configuration?

Thank you!

7 replies

3,536 views
Yuri
NXP Employee

As for general CAAM considerations:

1. Please refer to the following:

“Q&A: Why is CAAM Driver Not Functioning in Linux for iMX6?”

< https://community.freescale.com/docs/DOC-95700 >

If you are using the latest BSP release available on the web (3.0.35_4.1.0), there are fixes for the CAAM driver and the black keys test application. These patches are not part of the formal BSP; they are available online in our git repository:

< http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/log/?h=imx_3.0.35_4.1.0 >

Patches related to CAAM changes are:

ENGR00290444: Need to update CAAM driver with SM patches from STC

ENGR00290448: Double instantiation of RNG in CAAM Driver

ENGR00290449: Cannot build CAAM as a loadable module

ENGR00291081 CAAM: Fix the Copyright issue introduced by commit: 2b94a4b


Have a great day,
Yuri


3,536 views
hhan
Contributor I


Yuri,

Thank you for the reply. The tests were done with all the patches you mentioned applied, but the performance was very disappointing. Any idea why? Do you have any other test applications or performance data that I could look at? Thanks again!


3,536 views
Aymen_IRT
Contributor III

Hi Hao Han,

I am using kernel_3.0.35_4.1.0 with the latest patches and my test board is the Nitrogen6x.

I got the same results as you for AES. In addition, I tested digests (md5, sha1, sha224 & sha256) and HMAC algorithms. All the results were disappointing when execution time is used as the evaluation metric.

First, I ran my tests using my own crypto modules (based on the kernel crypto API). Then I tested again using cryptodev. In both cases, I got disappointing results. In addition, I am quite sure that CAAM is working, because I watched the caam_jr counters in /proc/interrupts increase every time a crypto operation was executed. If you want, I can send you my test modules and results so that we can compare our results.
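The /proc/interrupts check described above can be scripted along these lines: take a snapshot before and after the test and compare the caam_jr counters. The two snapshots below are invented for illustration (IRQ numbers, names and counts are not from a real board), and irq_counts is a hypothetical helper.

```python
def irq_counts(text, name):
    """Sum the per-CPU counters of every /proc/interrupts line mentioning name."""
    lines = text.splitlines()
    if not lines:
        return 0
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        if name not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label ("137:"); the next n_cpus fields are counters.
        total += sum(int(f) for f in fields[1:1 + n_cpus] if f.isdigit())
    return total


# Invented snapshots, NOT real output. On a target, read /proc/interrupts
# once before and once after running the crypto test.
BEFORE = """\
           CPU0       CPU1       CPU2       CPU3
137:       1200          0          0          0       GIC  caam_jr0
138:        800          0          0          0       GIC  caam_jr1
150:      52341          0          0          0       GIC  enet
"""
AFTER = """\
           CPU0       CPU1       CPU2       CPU3
137:       6200          0          0          0       GIC  caam_jr0
138:       5800          0          0          0       GIC  caam_jr1
150:      52399          0          0          0       GIC  enet
"""

delta = irq_counts(AFTER, "caam_jr") - irq_counts(BEFORE, "caam_jr")
print("caam_jr interrupts during the test:", delta)
```

A delta close to the number of submitted requests suggests the job rings really handled them; a delta of zero suggests the requests went to a software implementation.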

Best regards,

Aymen


3,536 views
thomasschuster
Contributor I

Hello again!

I have done a second experiment using the ablkcipher API (via tcrypt.ko).

To make this work I followed the procedure described here:

LKML: Stephan Mueller: [PATCH] kernel crypto API interface specification

Blue plot: reference results from synchronous blkcipher API using the SW AES implementation from the kernel (aes-generic).

Green plot: results for asynchronous ablkcipher API using CAAM (cbc-aes-caam)

[Figure: caam_results.png]

(The numbers refer to 10,000 rounds of encryption and decryption of data packages. The size of a single package is given on the x-axis.)

What I see here confirms the previous observations from OpenSSL + AF_ALG.

For small package sizes the performance seems very poor. At approximately 8k we have break-even.

It would be really great to learn whether this is the expected behaviour of the device.

Any feedback welcome!

Cheers,

Thomas


3,536 views
Aymen_IRT
Contributor III

Hi Thomas,

Sorry for the late answer.

With kernel version 3.0.35_4.1.0 and the kernel crypto API (asynchronous interface), I got the following results. Compared to SW, CAAM elapsed time only becomes competitive for input lengths >= 1MB; for inputs of 1KB, 10KB or 100KB, CAAM performs poorly. By elapsed time, I mean the wall-clock time measured with do_gettimeofday(). However, CAAM CPU time is smaller than SW CPU time for input lengths >= 100KB. CPU time is measured with the CPU cycle counters, which are enabled by setting the kernel option CONFIG_HW_PERF_EVENTS.

I think that CAAM is intended for secure boot, software updates and application layer cryptography, that is, for applications that process large inputs. It is not suitable for network or transport layer encryption, where inputs are smaller.
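The distinction between elapsed time and CPU time can be illustrated in user space. This is a sketch only: it does not touch CAAM, and the two stand-in functions (a busy loop for a software cipher, a sleep for an offloaded request) are assumptions chosen to show how the two clocks diverge.

```python
import time


def measure(fn):
    """Return (elapsed wall-clock seconds, CPU seconds used by this process)."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    fn()
    return time.perf_counter() - wall0, time.process_time() - cpu0


def busy_software(duration=0.2):
    """Stand-in for a software cipher: keeps the CPU busy for the whole call."""
    end = time.perf_counter() + duration
    while time.perf_counter() < end:
        pass


def waiting_on_hardware(duration=0.2):
    """Stand-in for an offloaded request: the caller sleeps until completion."""
    time.sleep(duration)


for label, fn in (("software", busy_software), ("offload", waiting_on_hardware)):
    wall, cpu = measure(fn)
    print(f"{label:8s} wall {wall:.3f} s, cpu {cpu:.3f} s")
```

Both stand-ins take about the same wall-clock time, but only the busy loop accumulates CPU time. An offload engine can therefore look slow on elapsed time and still free the CPU for other work.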

In addition, I ran some tests with kernel version 3.10.17 and noticed that CAAM performance improved for 1KB and 10KB inputs. However, it is still poor compared to SW.

Best regards,

Aymen


3,536 views
thomasschuster
Contributor I

Hello Aymen,

Thanks for sharing! That's fully in line with our observations.

Cheers,

Thomas


3,536 views
thomasschuster
Contributor I

Dear all,

I have also taken a look at CAAM. Find below the results from OpenSSL.

The first block shows the software computed reference for AES encryption:


AES-128-CBC in SW

openssl speed -evp aes-128-cbc

Doing aes-128-cbc for 3s on 16 size blocks: 1109305 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 64 size blocks: 314532 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 80732 aes-128-cbc's in 2.97s
Doing aes-128-cbc for 3s on 1024 size blocks: 20438 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 8192 size blocks: 2565 aes-128-cbc's in 3.00s

And here is what I get when using CAAM via AF_ALG. It can be seen that the processor load goes down to almost zero, but the overall performance is very poor.

AES-128-CBC in HW

openssl speed -evp aes-128-cbc -engine af_alg

Doing aes-256-cbc for 3s on 16 size blocks: 2533 aes-256-cbc's in 0.02s
Doing aes-256-cbc for 3s on 64 size blocks: 2525 aes-256-cbc's in 0.01s
Doing aes-256-cbc for 3s on 256 size blocks: 2493 aes-256-cbc's in 0.01s
Doing aes-256-cbc for 3s on 1024 size blocks: 2387 aes-256-cbc's in 0.04s
Doing aes-256-cbc for 3s on 8192 size blocks: 1795 aes-256-cbc's in 0.00s
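The times in the HW run (0.00s to 0.04s) are far below the 3 s measurement window. That is consistent with openssl speed counting CPU time by default; the -elapsed option makes it report wall-clock time instead. The sketch below estimates wall-clock throughput from the operation counts, assuming every measurement really ran for the nominal 3 s. It takes the counts at face value, even though the HW output lines say aes-256-cbc while the command names aes-128-cbc.

```python
# Operation counts from the two openssl runs above (block size -> operations).
SW_OPS = {16: 1109305, 64: 314532, 256: 80732, 1024: 20438, 8192: 2565}
HW_OPS = {16: 2533, 64: 2525, 256: 2493, 1024: 2387, 8192: 1795}
WINDOW_S = 3.0  # assumption: each measurement ran for the nominal 3 s of wall-clock time


def wall_clock_rate(block_size, ops, seconds=WINDOW_S):
    """Bytes per second over the wall-clock window rather than over CPU time."""
    return block_size * ops / seconds


for size in sorted(SW_OPS):
    sw = wall_clock_rate(size, SW_OPS[size])
    hw = wall_clock_rate(size, HW_OPS[size])
    print(f"{size:5d} B: SW {sw / 1e6:6.2f} MB/s  HW {hw / 1e6:6.2f} MB/s  ratio {hw / sw:.3f}")
```

On this estimate the HW path completes roughly 2,500 requests in 3 s for every block size up to 1024 bytes, and it only gets within reach of SW at 8192 bytes.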

Can anyone confirm these numbers?

I'm using kernel version 3.0.35 with all CAAM related patches applied.

Cheers,

Thomas
