LS1021A: How to write an efficient QuadSPI FLASH driver?

jean-francoisri · ‎12-19-2016

The driver we are currently using seems to play it very safe by restricting the size of a FLASH command to the size of a FIFO. Furthermore, the FIFO sizes may have been inherited from an early Vybrid implementation: TX is configured to 64 bytes, while RX is configured to 128 bytes. Consider a Page Program (PP) command on a large Flash that requires a 4-byte address. The largest possible PP command you could send at a time is:

[02][a3][a2][a1][a0][59 bytes of data]

where a3-a0 represent the address in big-endian order. Since we are using a Spansion S25FL512S, we could technically program a full 512-byte page at a time, but the driver implementation restricts us to 59 bytes of data per command. We still need to stay within a page boundary in a PP command, so programming a full 512-byte page is cut into the following sequence: 59, 59, 59, 59, 59, 59, 59, 59, 40 (eight 59-byte chunks plus one 40-byte chunk). That does not seem very efficient.
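To make the chunking arithmetic concrete, here is a minimal sketch in C of how a write gets split when each PP command is limited both by a per-command data budget and by the page boundary. The function and macro names (`pp_next_chunk`, `pp_count_commands`, `PP_MAX_DATA`, and so on) are made up for illustration and are not taken from any existing driver; the sizes come from the numbers quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values from the post: 64-byte TX FIFO, 1 opcode byte + 4 address bytes. */
#define TX_FIFO_BYTES   64u
#define PP_HEADER_BYTES 5u
#define PP_MAX_DATA     (TX_FIFO_BYTES - PP_HEADER_BYTES)  /* 59 data bytes */
#define FLASH_PAGE_SIZE 512u                               /* S25FL512S page */

/*
 * Size of the next Page Program chunk starting at flash address `addr`
 * with `remaining` bytes left to write. The chunk is limited by the
 * per-command data budget and by the distance to the end of the page.
 */
static size_t pp_next_chunk(uint32_t addr, size_t remaining, size_t max_data)
{
    assert(max_data > 0);

    size_t to_page_end = FLASH_PAGE_SIZE - (addr % FLASH_PAGE_SIZE);
    size_t n = remaining;

    if (n > max_data)
        n = max_data;
    if (n > to_page_end)
        n = to_page_end;
    return n;
}

/* Number of PP commands needed to write `len` bytes starting at `addr`. */
static size_t pp_count_commands(uint32_t addr, size_t len, size_t max_data)
{
    size_t count = 0;

    while (len > 0) {
        size_t n = pp_next_chunk(addr, len, max_data);

        addr += (uint32_t)n;
        len -= n;
        count++;
    }
    return count;
}
```

For a page-aligned 512-byte write, `pp_count_commands(0, 512, PP_MAX_DATA)` works out to 9 commands (8 x 59 + 40 = 512), whereas `pp_count_commands(0, 512, FLASH_PAGE_SIZE)` gives 1 if a whole page could be sent in a single command.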

I read chapter 28 (QuadSPI) of the LS1021A reference manual multiple times, especially section 28.6.3.6 about Flash Programming, and it seems that we could indeed do better. The TX and RX buffers are not just FIFOs; they are circular FIFOs. If I understand the Flash Programming sequence in section 28.6.3.6 correctly, we could improve our driver by doing the following:

1. Ensure the TX buffer is empty, and clear it if necessary.
2. Program the address related to the command.
3. If the command fits in the FIFO, write it word-by-word into the TX Buffer Data Register (TBDR).
4. Otherwise, fill the FIFO with the first 32 bytes to be sent and write the full size of the command in IDATSZ.
5. Trigger the command by setting the appropriate LUT Sequence ID.
6. Busy-loop on the "TX full" bit: whenever the FIFO is not full and data remains to be sent, write the next word.
7. Perform a sanity check on the number of words actually written to the FIFO.
8. Wait for programming to complete.
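As a rough illustration of the sequence above, here is a sketch in C. It is not taken from the reference manual or from any existing driver: the `qspi_ops` hooks are hypothetical stand-ins for the real register accesses (TBDR, IDATSZ, the LUT Sequence ID, the "TX full" bit), and the FIFO depth of 32 words is an assumption. A small software model of the FIFO is included only so the loop can be exercised off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_FIFO_WORDS 32u  /* assumed depth: 32 entries of 4 bytes each */

/* Hypothetical hooks; a real driver would map these onto QuadSPI registers. */
struct qspi_ops {
    void   (*clear_tx)(void *ctx);                                  /* step 1 */
    void   (*set_addr)(void *ctx, uint32_t addr);                   /* step 2 */
    void   (*write_tbdr)(void *ctx, uint32_t word);                 /* steps 3, 4, 6 */
    void   (*trigger)(void *ctx, unsigned seq_id, size_t idatsz);   /* steps 4, 5 */
    int    (*tx_full)(void *ctx);                                   /* step 6 */
    size_t (*words_written)(void *ctx);                             /* step 7 */
    void   (*wait_done)(void *ctx);                                 /* step 8 */
};

/* Returns 0 on success, -1 if the word count sanity check fails. */
static int qspi_page_program(const struct qspi_ops *ops, void *ctx,
                             unsigned seq_id, uint32_t addr,
                             const uint32_t *data, size_t nwords)
{
    size_t prefill = nwords < TX_FIFO_WORDS ? nwords : TX_FIFO_WORDS;
    size_t i = 0;

    ops->clear_tx(ctx);
    ops->set_addr(ctx, addr);
    while (i < prefill)                      /* fill the FIFO before starting */
        ops->write_tbdr(ctx, data[i++]);
    ops->trigger(ctx, seq_id, nwords * 4);   /* full command size in bytes */
    while (i < nwords) {                     /* real code would need a timeout here */
        if (!ops->tx_full(ctx))
            ops->write_tbdr(ctx, data[i++]);
    }
    if (ops->words_written(ctx) != nwords)
        return -1;
    ops->wait_done(ctx);
    return 0;
}

/* Software model of the TX FIFO, for off-target testing only. */
struct mock_qspi {
    size_t   level;    /* words currently held in the FIFO */
    size_t   pushed;   /* total words written since the last clear */
    size_t   idatsz;   /* byte count passed at trigger time */
    uint32_t addr;
    int      started;
};

static void mock_clear_tx(void *c)
{
    struct mock_qspi *m = c;
    m->level = 0;
    m->pushed = 0;
}

static void mock_set_addr(void *c, uint32_t a)
{
    ((struct mock_qspi *)c)->addr = a;
}

static void mock_write_tbdr(void *c, uint32_t w)
{
    struct mock_qspi *m = c;
    (void)w;
    assert(m->level < TX_FIFO_WORDS);  /* writing into a full FIFO is a bug */
    m->level++;
    m->pushed++;
}

static void mock_trigger(void *c, unsigned seq_id, size_t idatsz)
{
    struct mock_qspi *m = c;
    (void)seq_id;
    m->idatsz = idatsz;
    m->started = 1;
}

static int mock_tx_full(void *c)
{
    struct mock_qspi *m = c;
    int full = (m->level == TX_FIFO_WORDS);
    if (m->started && m->level > 0)
        m->level--;                    /* model: one word drains per poll */
    return full;
}

static size_t mock_words_written(void *c)
{
    return ((struct mock_qspi *)c)->pushed;
}

static void mock_wait_done(void *c)
{
    ((struct mock_qspi *)c)->level = 0;
}

static const struct qspi_ops mock_ops = {
    mock_clear_tx, mock_set_addr, mock_write_tbdr, mock_trigger,
    mock_tx_full, mock_words_written, mock_wait_done
};
```

With the model, a 512-byte page is passed as 128 words: `qspi_page_program(&mock_ops, &m, 0, 0x1000, page, 128)` pre-fills 32 words, triggers with a size of 512 bytes, then streams the remaining 96 words as space becomes available. On real hardware the polling loop must keep up with the flash clock, so whether this works without underruns is something to verify on the target.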

Before jumping into implementing this, I'd like to verify two things:

Is my understanding correct?
Does anyone have a reference implementation that uses the "full power" of the QuadSPI FIFOs? We are using VxWorks, but I guess we could adapt a Linux driver without too much trouble.

Thank you!

bpe · ‎12-20-2016

Your understanding is generally correct. My only remark is that
in step 4 you can fill up to 32 x 4-byte TX buffer entries.

The QuadSPI Linux driver, if you want to use it as a reference, can be
found here:

http://git.freescale.com/git/cgit.cgi/ppc/sdk/linux.git/tree/drivers/mtd/spi-nor/fsl-quadspi.c

Additional information on the driver can be found in the SDK online
documentation:

https://freescale.sdlproducts.com/LiveContent/content/en-US/QorIQ_SDK/GUID-CBD01377-779A-479C-BE7A-2...

Have a great day,
Platon
