QUAD SPI Flash and Reset Generation Unit (RGU) problem.


3,365 Views
truongnguyen94
Contributor I

Our system uses an LPC4337 MCU and an S25FL256SAGNFI000 Quad SPI Flash.

We use only the M4 core of the MCU; the M0 core is not used.

The Quad SPI Flash is used in both Memory Mode and Command Mode.

For the software, we use the SPIFI library provided by NXP, and our code follows the "lpcspifilib" example.

However, after initialization we have a problem when reading data from the SPI Flash in Command Mode (using the API "spifiRead"). Some zeros are inserted at the beginning; the remaining data is still correct, but it is shifted to the right.
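
To make the symptom measurable during testing, the read-back buffer can be compared against a known reference pattern. The following is a minimal sketch, assuming the shift is byte-granular (the thread does not say whether it is byte- or bit-granular); the function name `detect_zero_prefixed_shift` is hypothetical and not part of the SPIFI library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Compare a buffer read from flash against the expected contents.
 * Returns:
 *    0  if the data matches,
 *    n  if the data is the expected data preceded by n zero bytes
 *       (i.e. shifted right by n bytes, with the tail cut off),
 *   -1  if it is neither.
 * Assumption: the shift happens in whole bytes.
 */
static int detect_zero_prefixed_shift(const uint8_t *got,
                                      const uint8_t *expected,
                                      size_t len)
{
    assert(got != NULL && expected != NULL && len > 0);

    for (size_t shift = 0; shift < len; ++shift) {
        /* Every byte before the shift point must be zero. Earlier
         * iterations already checked got[0..shift-2]. */
        if (shift > 0 && got[shift - 1] != 0) {
            break;
        }
        if (memcmp(got + shift, expected, len - shift) == 0) {
            return (int)shift;
        }
    }
    return -1;
}
```

Logging the returned shift per board and per power cycle would show whether the number of inserted zeros is constant or varies between runs.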

While debugging, we found that before reading the SPI Flash we initialize a Timer, and this initialization calls the RGU to reset the Timer. If we do not call the RGU before reading the SPI Flash, the problem never appears. If we add a delay (~100 ms) before reading the SPI Flash, the problem does not appear either.

Calling the RGU to reset any other peripheral also causes the zero insertion and data shift.


The problem appears only on some boards; the other boards are fine (about 80 of 4000 boards, ~2%, have the problem).

On the bad boards, we tried replacing the MCU on 4 of the 80 boards, and the problem went away.

We know it is not a software or SPI Flash problem, but we do not know whether there is a problem with the MCU or why the RGU causes the SPI Flash read error.

We would appreciate your support with this problem. If you need more information, we will provide it.

Thank you.

10 Replies

3,326 Views
Harry_Zhang
NXP Employee

Hi @truongnguyen94 

I'm sorry, I was not able to reproduce your issue.

It sounds like the issue you’re encountering may be related to timing and synchronization problems between the SPI Flash and the LPC4337’s internal operations, particularly after the RGU reset is triggered.

Here are some steps and considerations to further investigate and potentially resolve the issue:
1. Check Power and Reset Sequencing: Ensure that the power supply to the SPI Flash and the LPC4337 is stable and that the reset sequence meets the required timing specifications. Variations in power supply or reset timing could cause intermittent issues, especially in cases where marginal boards are affected.

2. SPIFI Configuration and Timing: Review the configuration of the SPIFI peripheral, especially concerning timing parameters such as the clock speed, setup, and hold times. Ensure that these parameters are within the operating limits of the SPI Flash memory, particularly after a reset.

3. Hardware Variations: Since the problem is only seen in some boards, consider the possibility of slight variations in hardware (e.g., differences in component tolerances, PCB layout, or soldering quality) that may contribute to the issue.

4. Delays and Timing Adjustments: As you’ve already discovered, adding a delay seems to mitigate the issue. You might try to pinpoint the minimum required delay to see if it reveals any timing sensitivities. Also, consider if there’s a more deterministic way to ensure the system is ready before performing the SPI Flash read.
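
One possible "more deterministic" check, as mentioned in step 4, is to read a known reference pattern from the flash and retry until it verifies, instead of relying only on a fixed delay. The following is a minimal host-side sketch under stated assumptions: the hooks `read_fn` and `delay_fn`, the function `wait_until_flash_verified`, and the `fake_*` simulation are all hypothetical. On a real board the hooks would wrap the flash read (for example `spifiRead`) and a millisecond delay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hooks; on target they would wrap the flash read and a delay. */
typedef bool (*read_fn)(void *ctx, uint32_t addr, uint8_t *buf, size_t len);
typedef void (*delay_fn)(void *ctx, uint32_t ms);

/*
 * Read `len` bytes at `addr` until they match `expected`, waiting
 * `step_ms` between attempts. Returns the 1-based attempt number that
 * succeeded, or 0 if the data never verified (or len is too large).
 */
static unsigned wait_until_flash_verified(read_fn rd, delay_fn dly, void *ctx,
                                          uint32_t addr,
                                          const uint8_t *expected, size_t len,
                                          uint32_t step_ms,
                                          unsigned max_attempts)
{
    uint8_t buf[16];

    assert(rd != NULL && dly != NULL && expected != NULL);
    if (len == 0 || len > sizeof buf) {
        return 0;
    }
    for (unsigned attempt = 1; attempt <= max_attempts; ++attempt) {
        if (rd(ctx, addr, buf, len) && memcmp(buf, expected, len) == 0) {
            return attempt;
        }
        dly(ctx, step_ms);
    }
    return 0;
}

/* ---- Host-side simulation, for illustration only ---- */
struct fake_flash {
    uint32_t elapsed_ms; /* simulated time since the peripheral reset */
    uint32_t settle_ms;  /* reads are shifted until this much time passed */
};

static const uint8_t FAKE_PATTERN[4] = {0xA5, 0x5A, 0xC3, 0x3C};

static bool fake_read(void *ctx, uint32_t addr, uint8_t *buf, size_t len)
{
    const struct fake_flash *f = (const struct fake_flash *)ctx;
    (void)addr;
    if (len > sizeof FAKE_PATTERN) {
        return false;
    }
    memset(buf, 0, len);
    if (f->elapsed_ms >= f->settle_ms) {
        memcpy(buf, FAKE_PATTERN, len);         /* correct data */
    } else if (len > 1) {
        memcpy(buf + 1, FAKE_PATTERN, len - 1); /* one zero byte, then data */
    }
    return true;
}

static void fake_delay(void *ctx, uint32_t ms)
{
    ((struct fake_flash *)ctx)->elapsed_ms += ms;
}
```

A check like this only detects the failure; it does not explain it, and it assumes the reference pattern is stored at a known address in the flash.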

Hope this will help you.

BR

Hang


3,301 Views
truongnguyen94
Contributor I

Hi HangZhang,

Thanks for your suggestion.

We understand that you didn't reproduce this issue because it is really difficult to reproduce and the error rate is quite small.

1. Check Power and Reset Sequencing: Ensure that the power supply to the SPI Flash and the LPC4337 is stable and that the reset sequence meets the required timing specifications. Variations in power supply or reset timing could cause intermittent issues, especially in cases where marginal boards are affected.

-> When we started debugging, the first thing we suspected was the stability of the power supply, but our tests ruled it out: we added capacitors and the result did not change. Furthermore, we measured about 150 ms from power-on to SPI Flash initialization, much longer than the maximum Tpu (Power On Time) of ~300 us.

 

2. SPIFI Configuration and Timing: Review the configuration of the SPIFI peripheral, especially concerning timing parameters such as the clock speed, setup, and hold times. Ensure that these parameters are within the operating limits of the SPI Flash memory, particularly after a reset.

-> The SPIFI clock is about 1 MHz during SPI Flash configuration and is then changed to about 68 MHz for data transfer. The configuration is fine; we checked the waveform with an oscilloscope and it looks normal.

 

3. Hardware Variations: Since the problem is only seen in some boards, consider the possibility of slight variations in hardware (e.g., differences in component tolerances, PCB layout, or soldering quality) that may contribute to the issue.

-> The PCB has no problem (no warping or damage). We also resoldered the MCU and the SPI Flash, but the issue remained. However, when we replaced the MCU (with another LPC4337) on a board with the read error, the read error disappeared.

 

4. Delays and Timing Adjustments: As you’ve already discovered, adding a delay seems to mitigate the issue. You might try to pinpoint the minimum required delay to see if it reveals any timing sensitivities. Also, consider if there’s a more deterministic way to ensure the system is ready before performing the SPI Flash read.

-> We tested various delay times. With a delay of around 3 ms, the read error still appeared occasionally; with a delay greater than 3 ms, the read error no longer occurred.
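
A delay sweep like this can be summarized programmatically. The following is a minimal sketch, assuming each delay setting was tested for a number of trials and the failures were counted; the type `sweep_point` and the function `min_clean_delay_us` are hypothetical names, and any numbers fed to it are whatever the test rig records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One row of a delay sweep: how many reads were tried and how many failed. */
struct sweep_point {
    uint32_t delay_us;
    uint32_t trials;
    uint32_t failures;
};

/*
 * Given sweep results sorted by ascending delay, find the smallest delay
 * for which that delay and every longer delay showed zero failures.
 * Returns false if the longest tested delay still fails, was never
 * tested (trials == 0), or the table is empty.
 */
static bool min_clean_delay_us(const struct sweep_point *pts, size_t n,
                               uint32_t *out_us)
{
    bool found = false;

    assert(pts != NULL && out_us != NULL);
    /* Walk from the longest delay downward until the first failing point. */
    for (size_t i = n; i-- > 0;) {
        if (pts[i].trials == 0 || pts[i].failures != 0) {
            break;
        }
        *out_us = pts[i].delay_us;
        found = true;
    }
    return found;
}
```

A zero failure count over a finite number of trials is only evidence, not proof, so the delay actually shipped would normally be chosen with a margin above the value this returns.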


Before asking NXP for support, we had been debugging this problem for a month and held many internal meetings. We think the issue is only on the MCU side: we found that if the RGU triggers a reset of any peripheral or block during initialization, the initial SPIFI timing does not stabilize. Since some zeros are inserted at the beginning and the data is shifted to the right, we think it is related to the SPIFI buffer.

Although the issue can be fixed by replacing the MCU, the error can still occur (~2%, about 80 of 4000 boards have the issue), so we want to fix it in software by adding a 100 ms delay.

Has anyone encountered an SPI Flash issue caused by the RGU, or a similar issue, and is there a solution for it?

Before applying the delay in our software, we would like confirmation from NXP, or from anyone who has encountered a similar issue, that adding a delay solves it.

 

Best regards.

3,240 Views
Harry_Zhang
NXP Employee

Hi @truongnguyen94 

I'm sorry, I have investigated and we haven't encountered similar issues before.

BR

Hang


3,221 Views
truongnguyen94
Contributor I

Hi Hang,

What are your thoughts on the problem, and what solution do you propose?

Best regards.


3,163 Views
Harry_Zhang
NXP Employee

Hi @truongnguyen94 

As I mentioned to you before, I think there may be a hardware issue.

Hardware Variations: Since the problem is only seen in some boards, consider the possibility of slight variations in hardware (e.g., differences in component tolerances, PCB layout, or soldering quality) that may contribute to the issue.

Reading some extra data suggests that the read is happening too fast, so it may be worth increasing the delay.

BR

Hang


2,996 Views
truongnguyen94
Contributor I

Dear NXP,

We have replaced the MCU on more boards; on some of them this fixed the problem, and on others the problem remained. So replacing the MCU is clearly not the solution. However, adding a 100 ms delay resolved the problem on 100% of the boards tested (30 boards).

So we have chosen the delay as our solution. However, our customer demands a concrete reason why a 100 ms delay fixes the problem; they will not accept a solution without clear reasoning. Please help us find the answer.

Again, why is a delay needed between the timer reset and the SPI access?

Best regards.


2,977 Views
Harry_Zhang
NXP Employee

Hi @truongnguyen94 

Since this issue has not been encountered before, there is no definite conclusion. 

While this problem might seem elusive, there are concrete technical reasons why adding a delay resolves the issue. Here’s a detailed explanation that could satisfy the customer’s need for a clear reasoning:

Understanding the Problem: Interaction Between RGU, Timer Reset, and SPIFI

1. RGU and System Stability:
• The RGU is responsible for managing resets of various peripherals in the MCU. When a peripheral, such as a Timer, is reset, this action can momentarily disturb the system’s stability. The RGU’s reset actions might cause temporary glitches or timing issues in the clock and power domains shared by other peripherals, including the SPIFI.
2. Impact on Shared Resources:
• The MCU’s SPIFI interface shares system resources such as clocks, power, and possibly memory controllers with other peripherals. When a reset is initiated by the RGU, these shared resources can experience transient disruptions. The SPIFI interface, being sensitive to timing and synchronization, might then encounter glitches or improper synchronization when these resources are disturbed, leading to issues like zero insertions and data shifts during reads.
3. Settling Time for Internal State Machines:
• The delay is effectively providing time for the system’s internal state machines to settle. When the Timer or any other peripheral is reset by the RGU, state machines related to clock distribution, peripheral interconnects, and memory access need time to stabilize. If SPIFI operations begin immediately after a reset, there might be residual instability in these internal states, causing incorrect data reads. A delay ensures that by the time SPIFI operations commence, all internal states are stable and synchronized.
4. SPIFI and Clock Synchronization:
• The SPIFI interface relies on precise clock synchronization for accurate data transfers. Resetting the Timer or any peripheral might introduce brief periods where clock domains are not perfectly synchronized. The 100 ms delay allows enough time for the clock domains to re-synchronize, ensuring that SPIFI reads are not affected by transient clock misalignments.

Concrete Reason for the Delay Fixing the Problem:

Adding a 100 ms delay after the Timer reset but before accessing the SPIFI Flash allows the system’s power, clock, and synchronization domains to stabilize. This period ensures that any transient disturbances caused by the reset operation have fully settled, avoiding the propagation of these disturbances into the SPIFI read operations. The delay prevents the SPIFI interface from accessing data while the system is in an unstable or transitional state, thus ensuring consistent and correct data retrieval.

Why the Problem is Intermittent and Board-Specific:

• Variability in Tolerance: Minor differences in component tolerances, manufacturing variations, or even slight differences in the electrical environment can make some boards more susceptible to these transient effects than others.
• Timing Margins: Some MCUs might inherently operate closer to timing margins where these transient effects are more impactful. The fact that replacing the MCU sometimes resolves the issue indicates that some chips may have slightly different timing characteristics due to manufacturing variances.
• Environmental Conditions: Differences in temperature, power supply quality, or other environmental factors could exacerbate or mitigate the timing issues caused by resets.

Summary for Customer Communication:

The 100 ms delay is necessary to ensure that after the Timer reset, the system’s power and clock domains have had sufficient time to stabilize, thereby preventing transient disturbances from affecting the SPIFI Flash read operations. This delay is crucial to maintaining reliable data integrity and consistent system behavior across all units, given the observed susceptibility to timing-related issues following peripheral resets. This approach effectively mitigates the problem by accounting for and neutralizing the root cause of timing synchronization issues.

BR

Hang


2,783 Views
truongnguyen94
Contributor I

Hi Hang,

Thanks for your very specific and detailed explanation.

As we researched the matter further, we came across this:

truongnguyen94_2-1725867449355.png

 

According to the SPI Flash datasheet, the SPI Flash needs tRPH (> 35 us).

We think some delay is needed; our tests indicate that we need more than 3 ms for SPI Flash access to stabilize.
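
To put these numbers side by side, the delays can be converted to core clock cycles and compared as ratios. The following is a minimal sketch; the helper names are hypothetical, and the 204 MHz core clock used in the usage note is an assumption (the maximum rated LPC43xx core frequency), since the thread does not state the actual clock.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a delay in microseconds to core clock cycles, rounding up.
 * The 64-bit intermediate avoids overflow for long delays at high clocks.
 */
static uint64_t delay_us_to_cycles(uint32_t core_hz, uint32_t delay_us)
{
    assert(core_hz > 0);
    return ((uint64_t)core_hz * delay_us + 999999u) / 1000000u;
}

/* How many times larger the chosen delay is than a reference time
 * (integer division, rounded down). */
static uint32_t margin_factor(uint32_t chosen_us, uint32_t reference_us)
{
    assert(reference_us > 0);
    return chosen_us / reference_us;
}
```

At an assumed 204 MHz, `delay_us_to_cycles(204000000u, 35u)` gives 7140 cycles for tRPH and `delay_us_to_cycles(204000000u, 100000u)` gives 20400000 cycles for the 100 ms delay. `margin_factor(3000u, 35u)` is 85 and `margin_factor(100000u, 3000u)` is 33, so the observed 3 ms threshold is far above tRPH, and the 100 ms delay is roughly 33 times the observed threshold.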


Would you agree that a delay is needed in this case?

Best regards.


2,762 Views
Harry_Zhang
NXP Employee

Hi @truongnguyen94 

Yes, I agree with you, thank you for sharing the information.

BR

Hang


3,033 Views
truongnguyen94
Contributor I

Hi Hang,

We tried adding a 100 ms delay to the software and tested it on 30 boards; the problem no longer occurs.
Do you know why the delay helps?

Best regards.
Truong
