Warm reboot stability issue on i.MX8MP after enabling CAAM + trusted keys


DADAXIN
Contributor III

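Hello NXP community engineers,

       While running system stability tests on the i.MX8MP platform, I hit a CAAM-related issue where a warm reboot hangs with low probability. I would like to ask whether anyone has relevant experience or a recommended configuration.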

Test environment

  • SoC: i.MX8MP

  • OS: Yocto Linux (init boot method)

  • Kernel version: 6.6.52

  • Storage: eMMC

  • Test method: after the system boots, a script runs a continuous reboot stress test

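Observed behavior

  1. With CAAM-related features disabled

    • More than 500 consecutive reboots tested

    • The system booted normally every time

    • No hangs or boot anomalies occurred

  2. With CAAM and the trusted/encrypted keys features enabled

    • Roughly 1 in 100 to 1 in 200 reboots fails

    • The system hangs during boot

    • The serial console stops at early initialization logs

  3. Logs like the following can be observed when the failure occurs

         <1> Kernel log from a hung boot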

[    2.159151] caam 30900000.crypto: caam pkc algorithms registered in /proc/crypto
[    2.166628] caam 30900000.crypto: rng crypto API alg registered prng-caam
[    2.173441] caam 30900000.crypto: registering rng-caam
[    2.180670] Executing RNG SELF-TEST with wait
[    2.259240] mmc2: new HS400 Enhanced strobe MMC card at address 0001
[    2.266720] mmcblk2: mmc2:0001 DV4032 29.1 GiB
[    2.274171]  mmcblk2: p1 p2 p3
[    2.279584] mmcblk2boot0: mmc2:0001 DV4032 4.00 MiB
[    2.286069] mmcblk2boot1: mmc2:0001 DV4032 4.00 MiB
[    2.293040] mmcblk2rpmb: mmc2:0001 DV4032 16.0 MiB, chardev (234:0)
[   60.410412] imx-sdma 30bd0000.dma-controller: Direct firmware load for imx/sdma/sdma-imx7d.bin failed with error -2
[   60.420874] imx-sdma 30bd0000.dma-controller: Falling back to sysfs fallback for: imx/sdma/sdma-imx7d.bin
[  121.822422] imx-sdma 30bd0000.dma-controller: external firmware not found, using ROM firmware
[  123.010384] random: crng init done
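
The stall in log <1> can be located mechanically by looking for large jumps between consecutive printk timestamps. The following is a minimal Python sketch, not part of the original post; `find_stalls` and the 5-second threshold are hypothetical choices for illustration, and the sample is an excerpt of the log above.

```python
import re

# Matches a printk timestamp prefix such as "[   60.410412] message".
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def find_stalls(lines, threshold=5.0):
    """Return (gap_seconds, previous_message, next_message) for each
    gap between consecutive timestamps that exceeds `threshold`."""
    stalls = []
    prev_ts, prev_msg = None, None
    for line in lines:
        m = TS_RE.match(line.strip())
        if not m:
            continue  # skip lines that carry no timestamp
        ts, msg = float(m.group(1)), m.group(2)
        if prev_ts is not None and ts - prev_ts > threshold:
            stalls.append((round(ts - prev_ts, 3), prev_msg, msg))
        prev_ts, prev_msg = ts, msg
    return stalls


# Excerpt of the log posted above (assumption: captured from the serial console).
LOG = """\
[    2.180670] Executing RNG SELF-TEST with wait
[    2.259240] mmc2: new HS400 Enhanced strobe MMC card at address 0001
[    2.293040] mmcblk2rpmb: mmc2:0001 DV4032 16.0 MiB, chardev (234:0)
[   60.410412] imx-sdma 30bd0000.dma-controller: Direct firmware load for imx/sdma/sdma-imx7d.bin failed with error -2
[   60.420874] imx-sdma 30bd0000.dma-controller: Falling back to sysfs fallback for: imx/sdma/sdma-imx7d.bin
[  121.822422] imx-sdma 30bd0000.dma-controller: external firmware not found, using ROM firmware
[  123.010384] random: crng init done
"""

if __name__ == "__main__":
    for gap, before, after in find_stalls(LOG.splitlines()):
        print(f"{gap:8.3f} s stall between '{before}' and '{after}'")
```

On this excerpt the check reports two gaps of roughly 58 s and 61 s, both adjacent to the imx-sdma firmware messages. The post does not include a log from a successful boot, so whether similar gaps appear there is unknown; running the same check on a good boot would be a useful comparison.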

     <2> During reboot, the following is also seen occasionally:

caam_jr ... Device is busy

     <3> Relevant kernel configuration

CONFIG_CRYPTO=y
CONFIG_CRYPTO_DEV_FSL_CAAM=y
CONFIG_CRYPTO_DEV_FSL_CAAM_JR=y
CONFIG_CRYPTO_DEV_FSL_CAAM_RNG_API=y
CONFIG_TRUSTED_KEYS=y
CONFIG_TRUSTED_KEYS_CAAM=y
CONFIG_ENCRYPTED_KEYS=y
CONFIG_DM_CRYPT=y

With CAAM (or trusted/encrypted keys) disabled, warm reboot stability returns to normal.
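
As a rough check on how strongly a clean run of reboots rules out a low-probability hang, the arithmetic can be sketched in Python. This is an illustration only: it assumes every boot fails independently with the same probability, which the post does not establish, and the function names are hypothetical.

```python
import math


def p_zero_failures(p_fail, n_boots):
    """Probability of seeing no failure in n_boots independent boots."""
    return (1.0 - p_fail) ** n_boots


def boots_needed(p_fail, confidence=0.95):
    """Smallest number of clean boots after which a failure rate of
    p_fail would have shown up at least once with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    for p in (1 / 100, 1 / 200):
        print(f"p_fail={p:.4f}: "
              f"P(no failure in 500 boots)={p_zero_failures(p, 500):.4f}, "
              f"boots for 95% confidence={boots_needed(p)}")
```

Under these assumptions, a 1/200 failure rate would still produce 500 clean reboots about 8% of the time, and roughly 600 clean reboots are needed for 95% confidence. The 500-reboot baseline is therefore strong evidence at the 1/100 rate but less conclusive at 1/200; a longer run with CAAM disabled would tighten the comparison.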

Bio_TICFSL
NXP TechSupport

Hello,

On i.MX8MP with kernel 6.6.52, a hang around the CAAM RNG self-test like the one in your log is typically a chain reaction involving residual RNG initialization state, a Job Ring permission lockout, or an SDMA firmware loading timeout. Analysis and suggested configuration changes follow:

1. Core Cause Analysis

CAAM State Machine Not Reset: Warm reboot does not reset all SoC registers. If CAAM was busy before reboot (e.g., "caam_jr ... Device is busy" in the log), the RNG's hardware state machine may be in an indeterminate intermediate state after reboot.

Job Ring Permission Issue: If High Assurance Boot (HAB) or OP-TEE is used, the BootROM or ATF may lock certain Job Rings (JRs). During warm reboot, if the JR registers are not properly released, the Linux kernel will wait indefinitely during the "Executing RNG SELF-TEST with wait" phase because it cannot obtain a hardware response.

2. Suggested Optimizations and Solutions

A. Kernel Configuration and Driver Adjustment

Disable Synchronous Self-Check: Try disabling the forced wait mechanism in CONFIG_CRYPTO_DEV_FSL_CAAM_RNG_API, or pre-initialize the RNG in U-Boot.

Integrate SDMA Firmware: Compile the SDMA firmware into the kernel (CONFIG_EXTRA_FIRMWARE="imx/sdma/sdma-imx7d.bin") to avoid the 60-second wait time caused by the file system not being mounted in the early stages of startup. This can significantly reduce timing risks during the startup process.
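
Whether the SDMA blob actually ended up built into a given kernel can be checked against that kernel's .config. The Python sketch below is illustrative only: `parse_kconfig` and `firmware_built_in` are hypothetical helper names, and the `CONFIG_EXTRA_FIRMWARE_DIR` value in the sample is an assumption that depends on where the build keeps its firmware files.

```python
def parse_kconfig(text):
    """Parse .config text into a dict of option -> value.
    Lines of the form '# CONFIG_X is not set' map to 'n'."""
    suffix = " is not set"
    opts = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            opts[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            opts[key] = value.strip('"')
    return opts


def firmware_built_in(opts, blob):
    """True if `blob` is listed in CONFIG_EXTRA_FIRMWARE,
    which holds a space-separated list of firmware paths."""
    return blob in opts.get("CONFIG_EXTRA_FIRMWARE", "").split()


# Sample fragment; the directory below is an assumed location.
SAMPLE = '''\
CONFIG_CRYPTO_DEV_FSL_CAAM=y
CONFIG_EXTRA_FIRMWARE="imx/sdma/sdma-imx7d.bin"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
'''

if __name__ == "__main__":
    opts = parse_kconfig(SAMPLE)
    blob = "imx/sdma/sdma-imx7d.bin"
    print(f"{blob} built in: {firmware_built_in(opts, blob)}")
```

In practice the text would come from the build directory's .config rather than the inline sample. The firmware file must also exist under the directory named by CONFIG_EXTRA_FIRMWARE_DIR at build time, otherwise the kernel build fails.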

B. Modify U-Boot/ATF Reset Behavior (Workaround)

Warm reset often leaves this kind of residual state on the i.MX8M series.

C. DTS Device Tree

Check the attributes of the caam_jr node. In some versions, it is necessary to ensure that the Job Ring allocation is consistent with the security mode:

&crypto {
    status = "okay";
};

&sec_jr0 {
    status = "okay";
};

 

Regards

DADAXIN
Contributor III
Thanks for the reply! I will make and verify the changes to these two parts and report back with the results.