RX fifo overflow on MIPI-CSI2 i.MX8MQ

t_spil · ‎04-27-2020

Hi,

We are using a Variscite i.MX8MQ board attached to a custom PCB. The PCB carries a VGA image sensor that outputs RAW12 over 4 MIPI CSI lanes and is connected to the MIPI-CSI2 port of the i.MX8. We are running Linux 4.14.98.

We have successfully captured frames from the image sensor at 500fps, using 960Mbps readout speed. However, we have hit a bug where the RxFIFO overflows and cannot recover.

Bugs:
We are encountering a bug similar to erratum ERR050066 (https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf): when the AXI bridge gets busy with other activity, the RxFIFO overflows. We can induce this by running an intensive operation, such as starting Chromium or streaming from MIPI-CSI1. However, according to this thread ( https://community.nxp.com/thread/524861 ), because we are streaming into memory we should not be hitting this exact bug.

The other bug is that the RxFIFO overflow does not recover. Although the csi_error_recovery() function in mx6s_capture.c clears the BIT_RFF_OR_INT bit, the FIFO overflows again and again.

We have not been able to resolve this except by power cycling the iMX8.

We have another image sensor attached to MIPI-CSI1 that is running at 13MP @ 10fps without any trouble. Both use the same mx6s_capture.c driver. When the MIPI-CSI1 port is running, the MIPI-CSI2 port overflows even at 300Mbps on 4 lanes.

Questions:
Is there a way to prioritize the memory transfers for the RxFIFO of the CSI2 port, so that we are very unlikely to overflow? I see that a patch was once applied for the VPU (Tom Zheng's patch from this thread https://community.nxp.com/thread/476615 ); is there something similar we could do? We have attached another image sensor to the MIPI-CSI1 port, and that does not appear to be overflowing.

Is there a fix for the perpetual RxFIFO Overflow bug? Dropping a few frames in our application doesn't matter. Are there any known issues with clearing the RxFIFO and reflashing the DMA on the CSI2 port?

Thanks,

Refael and Twan

igorpadykov · ‎05-06-2020

Hi

1. The CSI bridge has a FIFO level register at offset 0x4c. Its maximum value is 255; overflow occurs when the FIFO level would exceed 255.
2. Watch this register and recover the CSI bridge when the value gets close to 255.
3. In our tests, 192 was the better threshold value.
So, for example, the M-core can poll 0x4c and, if the value is equal to or greater than 192, mask the PHY lanes completely to stop the video stream from reaching the bridge.
Attached is an example for SDK_2.6.0_EVK-MIMX8MQ (monitoring 0x4c on the M-core).
In Linux, run this command to start the M4 monitor: echo "start" > /dev/ttyRPMSG0
When the monitor starts, the M4 console prints the FIFO level value.
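
A minimal sketch of the polling step described above, in C for the M-core. It is not the attached SDK example: the names `csi_monitor_step`, `FIFO_THRESHOLD`, `FIFO_LEVEL_MASK`, and `PHY_LANES_MASK_ALL` are hypothetical, the assumption that the level occupies the low 8 bits follows only from the stated maximum of 255, and the mask value 0xff is the one quoted in later replies in this thread. The register pointers are passed in so the logic can also be exercised on a host with plain variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_THRESHOLD     192u  /* threshold suggested in this thread */
#define FIFO_LEVEL_MASK    0xFFu /* assumption: level sits in the low 8 bits */
#define PHY_LANES_MASK_ALL 0xFFu /* value this thread writes to mask the PHY lanes */

/*
 * One polling step: read the FIFO level and mask the PHY lanes when the
 * level is at or above the threshold. Returns true if the lanes were masked.
 */
bool csi_monitor_step(volatile const uint32_t *fifo_level_reg,
                      volatile uint32_t *phy_ctrl_reg)
{
    assert(fifo_level_reg != NULL && phy_ctrl_reg != NULL);

    uint32_t level = *fifo_level_reg & FIFO_LEVEL_MASK;
    if (level >= FIFO_THRESHOLD) {
        *phy_ctrl_reg = PHY_LANES_MASK_ALL; /* stop the stream reaching the bridge */
        return true;
    }
    return false;
}
```

On the target, the monitor would call this in a tight loop with the bridge's 0x4c register and the matching PHY control register, and notify Linux once it returns true.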

Best regards
igor

rabe1 · ‎10-28-2021

Hi,

Is the proposed solution of monitoring the FIFO level with the M4 still the only solution?

I got the M4 running, but I have problems applying the patch to my kernel version (5.4.85). Does a patch exist for a newer kernel version?

best regards

rabe

igorpadykov · ‎11-09-2021

>is the proposed solution to monitor the FIFO_level using the M4 still the only solution?

yes

>I got the M4 running but I have problems applying the patch to my kernel version (5.4.85).

>Does a patch exist for a newer kernel version?

Sorry, there is no such patch. Also, I saw your service request; please wait, a support engineer will answer soon.

Best regards
igor

michel_reifenra · ‎05-06-2020

Hi All,

We are facing the same issue.

Can you please share the solution here, or send it to me too, Igor?

Thanks!

Michel

rabe1 · ‎11-09-2021

Hi @igorpadykov ,

My M4 core currently does the following:

1. Read the CSI FIFO debug register CSI1_CSICR19 (0x30A9_004C).

2. If the value of CSI1_CSICR19 is larger than 80, set register CSI_PHY_CTL_REG (0x30A7_0104) to 0xf to stop CSI.

3. Set register CSI1_CSICR19 (0x30A9_004C) to 0xff.

4. Wait until the CSI1_CSICR19 value is below 16.

5. Enable the MIPI lanes again by setting register CSI_PHY_CTL_REG (0x30A7_0104) to 0x0c.

This sequence sometimes works, but sometimes it hangs in step 4 because the value of CSI1_CSICR19 does not drop below 16. Do you have any suggestions for improving my monitor application?
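
One way to keep the monitor from hanging in step 4 is to bound the wait and report a failure to the caller, which can then leave the lanes masked and fall back to restarting the capture application. The sketch below is only an illustration of that idea, not a confirmed fix: `wait_fifo_below`, `max_polls`, and the return convention are hypothetical, the low-8-bit mask is an assumption, and nothing here makes the counter itself drop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_LEVEL_MASK 0xFFu /* assumption: level sits in the low 8 bits */

/*
 * Poll the FIFO level until it falls below `limit` or until `max_polls`
 * reads have been made. Returns true if the level dropped in time,
 * false if the wait timed out.
 */
bool wait_fifo_below(volatile const uint32_t *fifo_level_reg,
                     uint32_t limit, uint32_t max_polls)
{
    assert(fifo_level_reg != NULL);

    for (uint32_t i = 0; i < max_polls; i++) {
        if ((*fifo_level_reg & FIFO_LEVEL_MASK) < limit)
            return true;
    }
    return false; /* timed out: keep the lanes masked and escalate */
}
```

With this, step 5 would run only when the function returns true; on false the monitor can signal Linux instead of spinning forever.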

t_spil · ‎05-20-2020

Hi Igor,

By streaming two image sensors at the same time, we can induce the overflow bug in under a minute on CSI2. With the M4 monitor we can close the physical lanes in time to stop the overflow from occurring. But we've found no way to properly recover from it.

At first, we just closed the video devices after the physical lanes were closed. Then we reopened the video devices and started streaming from both sensors again, and the overflow bug occurred. This happened because the overflow counter never resets. Using the M4 we tried writing to register 0x4c to reset it, running the csisw_reset register write sequence and, out of pure frustration, writing every single bit in CR1, CR3 and CR18. Nothing worked.

As I've written before, the posted patch does not apply properly to imx_4.14.98_2.0.0_ga; it depends on other changes to the mx6s_capture.c driver that we've been unable to find in the public NXP Linux git repository. Specifically, we cannot find the function overflow_check_timer, which looks to be in control of re-enabling the physical lanes. So there might be a way to properly reset the FIFO and its overflow counter that we're not aware of.

This piece of software in csi_monitor main.c seems to suggest that the fifo_level recovers by itself, but we've seen no sign of this.

#if 0 // this should be recovery by application restart.
    if ((/* *fifo_level_reg*/fifo_level <= MIN_FIFO_LEVEL) && (ctrl_status == PHY_LANES_DISABLE)) {
        __asm("NOP");
        __asm("NOP");
        *phy_ctrl_reg = 0x0;
        __asm("NOP");
        __asm("NOP");
        ctrl_status = PHY_LANES_ENABLE;
        PRINTF("\r\n re open phy : fifo_level: 0x%x, phy_ctrl_reg: 0x%x, fifo_level_reg: 0x%x\r\n", fifo_level, *phy_ctrl_reg, *fifo_level_reg);
    }
#endif

Is there a way to recover the CSI bridge after we stop the physical lanes?

Thanks,

Twan

igorpadykov · ‎05-20-2020

Hi Twan

I asked internally and below suggestion:

------------------------

This only supports monitoring the FIFO level of one channel. When both CSI1 and CSI2 are used, you should add two monitors.

// use csi1
#define FIFO_LV1_REG_ADDR (0x30a90000 + 0x4c) // csi1 bridge
#define PHY_CTL1_REG_ADDR (0x30A70000 + 0x104)
// use csi2
#define FIFO_LV2_REG_ADDR (0x30b80000 + 0x4c) // csi2 bridge
#define PHY_CTL2_REG_ADDR (0x30b60000 + 0x104)

    fifo_level = * ((u32 *) FIFO_LV1_REG_ADDR);
    PRINTF("0x%x\n", fifo_level);
    if (/* *fifo_level_reg*/fifo_level >= 200) {

.....

    fifo_level = * ((u32 *) FIFO_LV2_REG_ADDR);
    PRINTF("0x%x\n", fifo_level);

-----------------------
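
The two-monitor suggestion could be sketched as a small table that one loop walks over. This is an illustration under assumptions, not the elided part of the snippet above: the struct, the function name `csi_monitor_poll`, the low-8-bit mask, the threshold of 192 and the 0xff mask value (both mentioned elsewhere in this thread) are all illustrative choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One monitored CSI channel: FIFO level register and PHY control register. */
struct csi_channel {
    volatile uint32_t *fifo_level_reg;
    volatile uint32_t *phy_ctrl_reg;
};

#define FIFO_THRESHOLD     192u
#define FIFO_LEVEL_MASK    0xFFu /* assumption: level sits in the low 8 bits */
#define PHY_LANES_MASK_ALL 0xFFu

/*
 * Check every channel once and mask the lanes of each channel whose FIFO
 * level has reached the threshold. Returns a bitmask of the channels that
 * were masked (bit 0 = first table entry).
 */
unsigned csi_monitor_poll(const struct csi_channel *ch, size_t count)
{
    unsigned masked = 0;

    assert(ch != NULL);
    for (size_t i = 0; i < count; i++) {
        uint32_t level = *ch[i].fifo_level_reg & FIFO_LEVEL_MASK;
        if (level >= FIFO_THRESHOLD) {
            *ch[i].phy_ctrl_reg = PHY_LANES_MASK_ALL;
            masked |= 1u << i;
        }
    }
    return masked;
}
```

On the M-core, the table entries would be built from FIFO_LV1_REG_ADDR / PHY_CTL1_REG_ADDR for CSI1 and FIFO_LV2_REG_ADDR / PHY_CTL2_REG_ADDR for CSI2, cast to `volatile uint32_t *`.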

Best regards
igor

t_spil · ‎06-01-2020

Hi Igor,

We're specifically monitoring the CSI2 bridge, because that's the bridge we can reliably get to overflow on our board. It also seems that the CSI2 bridge is more likely to overflow in general. We've been able to replicate the issue using the Variscite DART-MX8M EVK and the Variscite OV5640 camera module. This is a good setup, because the camera module has two identical sensors, one connected to each MIPI CSI port. What we found is that when running both sensors at the same time using gstreamer and stressing the CPU, the CSI2 bridge overflows within a minute while the CSI1 bridge's overflow counter barely rises. This is at fairly low MIPI speeds as well: 600 Mbps on 2 lanes.

To replicate:

On a DART-MX8M running the fsl-image-gui image with linux-variscite on branch imx_4.14.98_2.0.0_ga_var01.

Run both sensors at the same time:

gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,width=2592,height=1944 ! fakesink v4l2src device=/dev/video1 ! video/x-raw,width=2592,height=1944 ! autovideosink sync=false

While stressing the cpu:

stress --cpu 8 --io 4 --vm 4

Within a minute

devmem2 0x30a9004c && devmem2 0x30b8004c

will read

/dev/mem opened.
Memory mapped at address 0xffff9d714000.
Read at address 0x30A9004C (0xffff9d71404c): 0x00000010
/dev/mem opened.
Memory mapped at address 0xffff94ccf000.
Read at address 0x30B8004C (0xffff94ccf04c): 0x000000FF
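
For reference, `devmem2` reaches these registers by mapping the page that contains the physical address from `/dev/mem` and reading at the in-page offset, which is why the virtual addresses above end in `04c`. A minimal sketch of the same read in C follows. The function names `split_phys_addr` and `read_reg32` are hypothetical, the code needs root and a kernel that permits `/dev/mem` access, and error handling is kept to the essentials.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Split a physical address into its page-aligned base and in-page offset. */
void split_phys_addr(uint64_t phys, uint64_t page_size,
                     uint64_t *base, uint64_t *offset)
{
    assert(base != NULL && offset != NULL);
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    *offset = phys & (page_size - 1);
    *base = phys - *offset;
}

/* Read one 32-bit register through /dev/mem. Returns 0 on success, -1 on error. */
int read_reg32(uint64_t phys, uint32_t *value)
{
    uint64_t base, offset;
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || value == NULL)
        return -1;
    split_phys_addr(phys, (uint64_t)page, &base, &offset);

    int fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0)
        return -1;

    void *map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, (off_t)base);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    *value = *(volatile uint32_t *)((volatile uint8_t *)map + offset);
    munmap(map, (size_t)page);
    return 0;
}
```

Calling `read_reg32(0x30B8004C, &level)` would then return the CSI2 bridge FIFO level that the second `devmem2` call prints.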

Is there a roadmap for when this issue might be fixed (hardware/software)? Or do we have to look at doing a new hardware revision where we connect our most important sensor to the CSI1 bridge? Or is there a way to reset the overflow counter in the CSI bridge after we stop the physical lanes?

Hope you can help,

Twan

igorpadykov · ‎06-02-2020

Hi Twan

I asked internally, answer below:

--------------------

There is unlikely to be a roadmap for fixing the HW issue. For SW, this M4 monitor method is the only workaround for now: watch the FIFO level of both CSI1 and CSI2, stop the physical lanes when the FIFO level reaches the threshold, and use some mechanism (maybe a V4L2 event) to inform the applications so they can re-init themselves and start the preview again.

--------------------

Best regards
igor

t_spil · ‎06-02-2020

Hi Igor,

Thanks for getting the reply to me. Could you pass the following on to Liu?

--------------------

Hi Liu,

The workaround does not work for us, because without resetting the 0x4C register, we'll run into the overflow situation eventually.

Could you supply the sequence of register writes needed to re-init the CSI bridge so that the 0x4C register count is reset? The current mx6s_capture.c driver (in Linux 4.14.98 and 4.19.35) does not re-init the 0x4C register on open/start/stop/close, and the supplied patch does not give insight into which registers need to be written to do this.

Thanks,

Twan

igorpadykov · ‎06-02-2020

Hi Twan

I asked Liu, his answer below:

--------------------

This monitor should run on the M4. The kernel is not a real-time OS, so it may miss cases.

Below is flow for M4 monitor case:

For example, create a program on the M4 core to monitor the CSI FIFO level as follows:

1) Read the CSI FIFO debug register: CSI1_CSICR19 (0x30A9_004C) for MIPI CSI1 or CSI2_CSICR19 (0x30B8_004C) for MIPI CSI2.
2) If the value of register CSIx_CSICR19 is larger than 192, set register CSI_PHY_CTL_REG (0x30A7_0104 for MIPI CSI1 or 0x30B6_0104 for MIPI CSI2) to 0xff to cut off the CSI data flow.
3) Wait for 3us, then jump to step 1).

After the M4 hits the threshold value, it stops the PHY lanes and informs the Linux kernel (rpmsg is a good method). The kernel informs the app with a V4L2 event, and the app restarts the flow by closing and opening the device again, after which it should be able to start a new capture.

--------------------

Best regards
igor

t_spil · ‎06-08-2020

Hi Igor,

Thanks for the reply, I still have one more follow up question to ask to Liu.

--------------------

Hi Liu,

Thanks for getting back to me, but I wasn't after the sequence of register read/writes for the workaround on the M4 core. We got that working.

The second part does not work for us. Closing and opening the video device does not reset the 0x4C register in the latest mx6s_capture driver. So overflow will still occur. What registers do we need to write in the mx6s_capture to get this to happen?

Currently on open the mx6s_capture driver writes the following registers in the following functions:

mx6s_csi_enable()
- csisw_reset()
  - cr18 clear BIT_CSI_ENABLE
  - cr1 clear BIT_FCC
  - cr1 set BIT_CLR_RXFIFO
  - cr3 set BIT_DMA_REFLASH_RFF and BIT_FRMCNT_RST
  - cr1 set BIT_FCC
  - cr18 set MASK_OPTION, BASEADDR_SWITCH_SEL, BASEADDR_SWITCH_EN and BIT_CSI_ENABLE
- csi_tvdec_enable() - Depends on boolean, but changes registers
  - cr1 - BIT_CCIR_MODE, BIT_SOF_POL and BIT_REDGE
  - cr18 - BIT_TVDECODER_IN_EN, BIT_BASEADDR_SWITCH_EN, BIT_BASEADDR_SWITCH_SEL and BIT_BASEADDR_CHG_ERR_EN
- csi_dmareq_rff_enable()
  - cr2 set DMA_BURST_TYPE_RFF
  - cr3 set BIT_DMA_REQ_EN_RFF, BIT_HRESP_ERR_EN, RxFF_LEVEL and BIT_TWO_8BIT_SENSOR
- csi_enable_int()
  - cr1 set BIT_SOF_INTEN, BIT_RFF_OR_INT, BIT_FB1_DMA_DONE_INTEN and BIT_FB2_DMA_DONE_INTEN
- csi_enable()
  - cr1 set BIT_CSI_ENABLE

Which additional bits in which registers need to be written to reset the count in CR19?

Thanks,

Twan

igorpadykov · ‎06-09-2020

Hi Twan

Liu answer below:

--------------------

When CSI_CSICR19 gets close to overflow (we set the threshold to 192), CSI_PHY_CTL_REG must be set to 0xff to cut the data flow from the sensor immediately.

In our tests, after setting CSI_PHY_CTL_REG = 0xff, we kill the sensor application and restart it, and the FIFO level is reset. The flow should be to close and reopen the whole pipeline: CSI bridge, CSI2 and sensor.

--------------------

Best regards
igor

t_spil · ‎05-11-2020

Hi Igor,

We're having trouble applying the patch; it seems to have been made against a different version of the Linux kernel. There are significant changes to mx6s_capture.c (there's an added thread, etc.) that can't be replicated from just the patch file. We're on branch imx_4.14.98_2.0.0_ga.

Hope you can help,

Twan

igorpadykov · ‎05-11-2020

Hi Twan

NXP has a special service for helping with porting drivers:

Commercial Support and Engineering Services | NXP

Best regards
igor

jobusch · ‎05-08-2020

Thank you for your response on how to check the FIFO level status. But for us, the root of the problem is that whether the FIFO overflows depends so heavily on whether the CPU is being used in any other way. So it seems to be a misbehaviour of the scheduler, which does not give enough priority to the FIFO read process, resulting in overflows.

Is there a way to give it the highest priority in the system, so that other tasks may slow down but the FIFO keeps being read as if no other task were running?

igorpadykov · ‎05-04-2020

Hi Twan

I checked internally and additional details were sent via mail.

Best regards
igor

santhana_kumar

Hi Igor,

We are facing a similar RXFIFO overflow issue on the i.MX8MM when streaming camera images (RAW10). Can you please support us?

Reading register CR19 returns 0xFF.

jobusch · ‎05-06-2020

Hi Igor,

We are having the same error. The only way to get rid of it is to do almost no computation on the i.MX8M, keeping CPU usage as low as possible. When CPU usage rises, it starts to fail instantly.

Could you share the results on this thread or send them to me by email, too?

Best regards,

Johannes

t_spil · ‎05-05-2020

Thanks Igor,

That'll hopefully allow us to circumvent the overflow issue.
