LS1021A CAAM errors when using aes


tuomasph
Contributor I

I'm getting CAAM crypto hardware errors when using an IPsec VPN with AES encryption:

caam_jr 1720000.jr: 40002d1c: DECO: desc idx 45: DECO Watchdog timer timeout error
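For anyone reading the log line above, here is a minimal sketch of how the status word 40002d1c can be split into fields. It assumes the layout the Linux caam driver appears to use for DECO errors (top nibble = status source, bits 15:8 = descriptor index, bits 7:0 = error code); the function name and dictionary keys are my own and not part of any driver API.

```python
def decode_caam_status(status: int) -> dict:
    """Split a CAAM job ring status word into fields (assumed DECO layout)."""
    return {
        "source": (status >> 28) & 0xF,    # 0x4 is reported as "DECO" in the log
        "desc_idx": (status >> 8) & 0xFF,  # index of the failing descriptor word
        "error_code": status & 0xFF,       # 0x1c is printed as the watchdog timeout
    }


fields = decode_caam_status(0x40002D1C)
print(fields)
```

With the value from the log, the descriptor index comes out as 0x2d, which is the "desc idx 45" in the kernel message.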

Different IPsec clients have been tested with the same result. AES is the only cipher that causes this; other ciphers such as 3DES work.

My setup:

  • LS1021A fls-sdk v2.0-1701 (kernel 4.1.35)
  • lan interface: eth2
  • wan interfaces: eth1 and wlan0 (pcie1)

The VPN client is configured to tunnel traffic from the eth2 interface. The error only occurs when pinging from the device itself with the eth2 address as the ping source:

ping -I <eth2 address> <internet ping target>

Traffic from the eth2 interface is encrypted and goes out through the wlan0 interface. Only pings that originate from the LS1021A itself with the eth2 address as source cause errors and packet loss.

Pings and traffic work if the eth1 WAN is used.

Also, if hardware crypto is disabled, there are no errors and everything works as it should.

Something is happening to the traffic originating from the LS1021A device that causes crypto hardware errors.

Attached is the debug output from CAAM. The interesting part is at the beginning of each debug file, where the aead_encrypt_done function is first called. After the encryption fails, CAAM starts encrypting again, but cryptlen has increased by 80. This repeats, with cryptlen growing by 80 after each failed attempt, and fails 13 times before giving up.

The same thing happens with the eth1 WAN when pinging with a packet size greater than 1400:

ping -s 1400 -I <eth2 address> <internet ping target>

This should be easy to test with any setup with IPsec and CAAM hardware.

2 Replies

horiageanta
NXP Employee

1. The "DECO Watchdog timer timeout error" might be caused by a timing issue in the CAAM descriptor used for IPsec encryption.

A fix is available here (currently under review):

crypto: caam - fix concurrency issue in givencrypt descriptor - Patchwork

2. With regard to "After failing at encryption, CAAM starts to encrypt again but cryptlen has increased by 80 and continues to fail 13 times before giving up":

The root cause seems to be the following: on failure, the CAAM driver does not return the correct error code to the networking stack. This causes the networking stack to encapsulate (IPsec ESP) the resulting packet, which is bigger than the original one, again and again until it exceeds the MTU, at which point xfrm gives up.

More exactly: the resume path (from crypto to networking stack) is esp_output_done() -> xfrm_output_resume() -> xfrm_output_one(..., err). Since err is incorrect (a positive number representing the CAAM HW status instead of a negative errno, e.g. -EINVAL), xfrm_output_one() does not jump to the "resume" label and re-encapsulates the packet.
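To make the effect of the sign of err concrete, here is a toy model of the loop described above. It is not kernel code: the function names, the starting packet length and the MTU are hypothetical, and only the 80-byte growth per attempt and the status value are taken from this thread.

```python
CAAM_STATUS = 0x40002D1C  # raw hardware status from the log in this thread
EINVAL = 22               # standard Linux errno value
GROWTH = 80               # bytes added per attempt, as reported in this thread


def crypto_result(driver_is_fixed: bool) -> int:
    """Hypothetical stand-in for what a failing completion callback passes up."""
    return -EINVAL if driver_is_fixed else CAAM_STATUS


def count_encapsulations(pkt_len: int, mtu: int, driver_is_fixed: bool) -> tuple:
    """Return (attempts, final length) for one packet whose encryption always fails."""
    attempts = 0
    while pkt_len + GROWTH <= mtu:      # stop once the packet no longer fits
        pkt_len += GROWTH               # ESP encapsulation grows the packet
        attempts += 1
        err = crypto_result(driver_is_fixed)
        if err <= 0:                    # negative errno: the error is handled, no retry
            break
        # positive err: mistaken for "keep going", so the packet is wrapped again
    return attempts, pkt_len


print(count_encapsulations(100, 1500, driver_is_fixed=False))
print(count_encapsulations(100, 1500, driver_is_fixed=True))
```

With a positive status the model keeps wrapping the packet until it no longer fits in the MTU, while a negative errno stops after the first attempt. The exact attempt count depends on the assumed starting length, so it is not expected to reproduce the 13 retries seen in the debug output.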

A fix is available here (also under review):

[v3,02/14] crypto: caam - fix return code in completion callbacks - Patchwork 

tuomasph
Contributor I

I tested a workaround: I removed the authenc drivers that use AES from caamalg.c. Pings and traffic now work normally with AES encryption as well.

Normally hardware crypto (CAAM) uses the aead_givencrypt(), aead_encrypt() and aead_decrypt() functions. If caamalg.c is modified so that all authenc drivers with AES are removed, CAAM uses ablkcipher and ahash instead of aead, and everything works correctly. 3DES and aes-gcm are still handled by the aead functions.

Driver registered for aes in /proc/crypto is:

driver       : authenc(hmac-sha512-caam,cbc-aes-caam)

So aead for AES does not work in some situations, but ablkcipher with ahash does. I have not noticed any difference in performance with the workaround.
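To check which authenc drivers CAAM has registered before and after such a change, something like the following sketch can be run against the contents of /proc/crypto. The helper names are my own, and the SAMPLE text is only an illustrative excerpt built around the driver line quoted above, not real output from my board.

```python
import re


def parse_proc_crypto(text: str) -> list:
    """Parse /proc/crypto-style text into one dict per algorithm entry."""
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def caam_authenc_drivers(text: str) -> list:
    """Return driver names that are authenc() templates backed by CAAM."""
    return [
        e["driver"]
        for e in parse_proc_crypto(text)
        if e.get("driver", "").startswith("authenc(") and "caam" in e["driver"]
    ]


# Illustrative excerpt only (assumed field values apart from the driver line).
SAMPLE = """\
name         : authenc(hmac(sha512),cbc(aes))
driver       : authenc(hmac-sha512-caam,cbc-aes-caam)
type         : aead

name         : cbc(aes)
driver       : cbc-aes-caam
type         : ablkcipher
"""

print(caam_authenc_drivers(SAMPLE))
```

On the target, the same function can be fed the real file, for example caam_authenc_drivers(open("/proc/crypto").read()). An empty result for the AES entries would indicate that the workaround is active.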

Has somebody noticed similar issues with aead and CAAM hardware?
