Hello,
We are integrating an i.MX 8QuadMax and are experiencing intermittent failures of SanDisk brand 32GB microSD cards in our development units.
With a protocol analyzer attached during bootup (this microSD card is the OS boot source) and a *failed* card, I see the i.MX 8 send CMD0 (reset), then CMD8 (check supplied voltage), then CMD55 (APP_CMD) and ACMD41 (send operating condition register). CMD55 and ACMD41 are sent repeatedly until the i.MX 8 resets itself after about 300 ms. Bus analysis shows that the "Busy Status" bit in the ACMD41 response is never set, so the card has not completed initialization, and the i.MX 8 resets before it ever does.
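For anyone checking a similar capture, here is a minimal sketch of the busy-bit test, assuming the analyzer can export the 32-bit OCR value from each ACMD41 (R3) response. The bit positions follow the SD Physical Layer specification (bit 31 is the power-up status or "busy" bit, bit 30 is CCS); the helper names and the sample OCR values are hypothetical and are not taken from the trace described above.

```python
# Bit positions in the 32-bit OCR returned in the ACMD41 (R3) response.
OCR_POWER_UP_STATUS = 1 << 31  # "busy" bit: 1 once the card has finished power-up
OCR_CCS = 1 << 30              # card capacity status, valid only once bit 31 is set


def card_ready(ocr: int) -> bool:
    """Return True if the card reports that initialization is complete."""
    return bool(ocr & OCR_POWER_UP_STATUS)


def first_ready_index(ocr_values):
    """Index of the first ACMD41 response with the busy bit set, or None."""
    for i, ocr in enumerate(ocr_values):
        if card_ready(ocr):
            return i
    return None


# Hypothetical capture values for illustration only.
failed_boot = [0x00FF8000, 0x00FF8000, 0x00FF8000]
good_boot = [0x00FF8000, 0x00FF8000, 0xC0FF8000]

print(first_ready_index(failed_boot))  # None
print(first_ready_index(good_boot))    # 2
```

A result of None over the whole capture corresponds to the failure described here: the card never reports ready before the reset.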
Questions:
1. The SD Physical Layer specification says that the host should keep polling for the "Busy Status" bit for at least 1 second, yet the i.MX 8 resets before then. Why? Can this be changed?
2. Has anyone else experienced SD card failures like this using i.MX 8 processors? Our previous generation product with an i.MX 6 processor did not exhibit this issue.
Hi Rita,
We are using NXP Hardknott, kernel version 5.10. We do have another boot device - an eMMC chip, but we have yet to observe this issue with it.
I suspect there may be an issue with the microSD cards, and we're communicating with the manufacturer. However, that doesn't explain why the processor resets before the minimum time specified in the SD specification.
We have now replicated this issue with a Western Digital Industrial card.
Hello,
It's been about a week. When can I expect a response? We have replicated this issue on a Quad Max MEK.
How can we escalate this issue?
We have observed that with SanDisk Extreme, Western Digital Industrial, and other brand microSD cards, our system no longer boots up after strenuous use. Upon probing the SDIO bus we observe CMD0, CMD8, and then CMD55 and ACMD41 issued repeatedly for roughly 300 ms before a system reset occurs. I believe this is because these cards have not yet set their “Busy Status” bit and therefore their file system contents cannot yet be accessed. This system reset originates from a Watchdog strobe event on the i.MX 8 SCU_WDOG_OUT pin.
I read in the “SD Specifications Part 1 Physical Layer Simplified Specification Version 9.10” document in section 4.2.3 that “The host repeatedly issues ACMD41 for at least 1 second or until the busy bit are set to 1.” This reads to me like the i.MX 8 violates the SD specification by not allowing 1 full second after the first ACMD41 message for the card to become ready.
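To make the quoted requirement concrete, here is a rough sketch of a host-side polling loop that keeps issuing ACMD41 until the busy bit is set or at least 1 second has passed. It is only an illustration of the spec wording, not the i.MX 8 ROM's code: `send_acmd41`, the simulated card, and the simulated clock are all hypothetical stand-ins, and time is counted in integer milliseconds.

```python
import itertools

ACMD41_TIMEOUT_MS = 1000        # minimum polling window quoted from section 4.2.3
OCR_POWER_UP_STATUS = 1 << 31   # busy bit in the OCR


def wait_for_card(send_acmd41, clock, timeout_ms=ACMD41_TIMEOUT_MS):
    """Poll ACMD41 until the busy bit is set or timeout_ms has elapsed.

    send_acmd41: hypothetical callable that issues CMD55 + ACMD41, returns the OCR.
    clock: callable returning the current time in milliseconds.
    Returns the final OCR on success, or None on timeout.
    """
    deadline = clock() + timeout_ms
    while True:
        ocr = send_acmd41()
        if ocr & OCR_POWER_UP_STATUS:
            return ocr
        if clock() >= deadline:
            return None


def make_fake_clock(step_ms=10):
    """Simulated clock that advances step_ms on every read."""
    ticks = itertools.count()
    return lambda: next(ticks) * step_ms


def make_fake_card(ready_after_ms, clock):
    """Simulated card that sets the busy bit ready_after_ms after creation."""
    start = clock()

    def send_acmd41():
        return 0xC0FF8000 if clock() - start >= ready_after_ms else 0x00FF8000

    return send_acmd41


# A card needing 500 ms is fine with a 1 s window but fails with a 300 ms one.
clock = make_fake_clock()
print(wait_for_card(make_fake_card(500, clock), clock) is not None)

clock = make_fake_clock()
print(wait_for_card(make_fake_card(500, clock), clock, timeout_ms=300) is not None)
```

With the 300 ms window, the simulated card that needs 500 ms is never seen as ready, which matches the behavior we observe on the bus.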
We believe we've identified the source of this Watchdog event - the SECO hardware reset timer. The i.MX 8 Hardware Reference Manual states in section 5.7.3.3 that "There is a hardware reset timer enabled by default on the SECO side. The timeout of this timer is 300ms...". Since the firmware resides on the microSD card and the CPU/PMICs execute a system-wide power cycle before the microSD card is ready, our system gets caught in a boot-reset loop indefinitely.
Is there a way we can extend or suppress the SECO reset timer?
Sorry for the delay. The information you have given is limited and I need more detail. You designed your own imx8qm board, right? First, could you confirm that your HW design is completely correct? Then, at which stage do you get this error? It seems you are stuck in the U-Boot stage, right? For the HW WDOG, refer to the RM:
If the WDOG has been enabled, the below scenarios is when the SCU ROM will service the WDOG:
• After loading the Image Container Set0 header.
• After loading the Image Container Set1 header, provided that the secondary boot has not been disabled by fuse.
• After loading an image. If the image is bigger than the size which can be loaded within the WDOG timeout, depending on the image size and the speed of the boot interface, a WDOG timeout will occur and the chip will reset.
• After entering USB boot loop
The system controller includes watchdog hardware (WDOG) to protect the system by inducing a warm reset if the SCFW is unable to periodically service the WDOG timer. By default this is enabled by the SCFW at boot with a timeout of one second.
Hi joanxie,
Yes, we did design our own imx8 board but heavily based on the MEK. The reset/watchdog topology is identical. The design was reviewed by NXP a year or two ago.
The issue precedes anything the SCU does, since we believe it is the SECO that fires the watchdog in this case (see section 5.7.3.3). We observe a watchdog event that causes the PMICs to perform a hard reset every 300 ms.
This 300ms can't be changed by the customer. I'm afraid you may be using a very old chip; in the newer one this issue has been fixed. I found an erratum for this issue, and I think it describes your problem. You can check it:
ERR050108: ROM: eMMC/SD boot failure due to ROM code timeout under certain conditions
Description
This issue is related to boot from eMMC or SD devices. On power-up, a boot monitor timer is initialized. On a successful boot, SECO firmware is loaded and run from the boot device—that is, eMMC or SD, which, after loading and verifying SCU firmware (SCFW), disables the timer. When a successful boot requires more than 300 msecs, a timeout occurs that is considered a boot failure, and therefore generates a warm reset, which results in a looping boot failure.
Typically, the initialization time for most eMMC/SD devices is about 100 to 200 msecs, however, the eMMC/SD specification allows up to 1 second for this initialization time.
Any eMMC/SD device that exceeds the timeout to initialization will fail to boot. Sudden power loss or power cycle stress testing to eMMC/SD devices can cause data corruption, which can force the eMMC/SD device to run an internal data check on the next power up, which results in a longer initialization time, forcing a timeout and a looping boot failure.
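The timing argument in the erratum description can be sketched as a simple additive model. This is only an illustration: the 300 ms timer and the 1 second allowance come from the erratum text, while `load_and_verify_ms` is a hypothetical figure for everything other than card initialization that must finish before the timer is disabled.

```python
BOOT_TIMER_MS = 300      # boot monitor timeout stated in the erratum
SPEC_MAX_INIT_MS = 1000  # initialization time the erratum says the spec allows


def boot_outcome(card_init_ms, load_and_verify_ms, timer_ms=BOOT_TIMER_MS):
    """Classify one boot attempt under a simple additive timing model.

    load_and_verify_ms is an assumed value, not a measured one.
    """
    total_ms = card_init_ms + load_and_verify_ms
    return "boots" if total_ms <= timer_ms else "warm reset"


# Typical cards (100 to 200 ms) pass; a card running an internal data check fails.
for init_ms in (100, 200, 280, 600, SPEC_MAX_INIT_MS):
    print(f"{init_ms:>5} ms init: {boot_outcome(init_ms, load_and_verify_ms=50)}")
```

If the card is just as slow on the next attempt, every attempt ends in a warm reset, which is the looping boot failure the erratum describes.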
Workaround
Generally reducing eMMC/SD initialization time under 300 msecs is the most effective way to avoid this looping boot failure.
For the data corruption case, change the boot mode to serial download mode then load and run an image via USB. This image can initialize the eMMC/SD device and exit out of the internal data check state.
In future silicon releases, ROM will consider this case and wait 1 second to avoid the boot failure.
This erratum describes our issue very well, and yet it is not present in the errata for the i.MX 8 Quad Max.
https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf
I don't know what you mean by saying it's "a very old chip" - this part is still active with many more years of expected lifecycle. What is "the newer one"?
The workaround mentioned in the errata is not feasible in our design. "reducing eMMC/SD initialization time under 300 msecs" cannot be guaranteed since it is an aspect of the SD card that cannot be controlled externally. And our design is not equipped to "change the boot mode to serial download mode then load and run an image via USB".
Yes, this erratum is for the imx8qxp B0 chip; C0 has already fixed this issue. I didn't find this issue listed for the imx8qm, and we can't reproduce it on the imx8qm, so I assume you may be using a very old imx8qm chip. If not, I suggest you check your HW again. If you can't guarantee the design, you can send the HW schematic for review, or try testing with another SD card, because you can't extend the 300ms to 1s.
I can assure you that this issue exists on our imx8qm. We've replicated it with multiple SD card makes and models, and on the imx8 MEK itself.
The SECO firmware can be loaded in under 300ms when the SD cards are fresh, but after hard use and power cycles it may take longer than 300ms and the boot loop issue presents.
How many die revisions of the imx8qm have there been, and how can we check on our parts?
I'd be happy to send the schematic to NXP for review a second time.
Again, how many die revisions of the imx8qm have there been, and how can we check on our parts?
How can I share the schematic design securely with you or another NXP employee in a non-public fashion? Or are you asking us to send our physical board?
We have confirmed the issue with at least 3 of our boards.
This isn't about how many die revisions of the imx8qm there have been; it is about which revision you are using. Do you know your chip's part number? For the HW review, you can submit a ticket to our Salesforce system, which isn't public. Click "submit a ticket".
Ok, I've submitted a ticket. Case number 00624932.
The part number is MIMX8QM6AVUFFAB.
Other markings on the package are:
SBBL2205
1N94W
KOREA SEBLSBD
The cards we're having issues with are SanDisk Extreme 32GB.
