IMX8MP: SError how to debug source

2,845 views
TerryBarnaby1
Contributor V

We have a custom IMX8MP board, based on the IMX8MP-EVK board, that runs for many hours doing video processing work with no issues. The LPDDR4 passes the memory stress test and all the other tests we have tried.

However, occasionally when we power off or reboot the board, and sometimes at power-up, an SError is thrown. See: https://community.nxp.com/t5/i-MX-Processors/IMX8MP-Yocto-hardknott-hang-on-poweroff-reboot/td-p/154.... It's a bit random in that changing things "appears" to affect the failure. For example, adding a USB disk affects it, and simply connecting the UART2_RXD line to a USB-to-serial lead that is showing the ttymxc1 console printouts appears to eliminate it, or at least make it happen less often. (UART2_RXD is connected as on the IMX8MP-EVK board, via a voltage translator chip.)

We normally see this SError reported at the code address imx_uart_readl, although sometimes there are no kernel messages at all, just a cut-off printk message.

My question here is how to find out what caused the SError. As far as I understand it, one of the IMX8MP hardware blocks outside the ARM CPU core generated it, but how do I find out which one?

Terry

5 replies

2,825 views
TerryBarnaby1
Contributor V

Attached is the UART2 schematic (pieced together from the overall schematic). I don't think this is the cause of the problem; it just affects the issue somehow, as the problem occurs fairly randomly and appears to be affected by the timing of things, which I suspect the kernel printk's change.

However, the main thing I want to find out is how to work out what caused an SError when it occurs.


2,822 views
TerryBarnaby1
Contributor V

Actually, I note that the RXD line is pulled low rather than high. This could be generating a continuous serial line BREAK condition. Could the ttymxc1 driver be doing something with this, so that a driver timing bug somehow generates the SError?
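To make the BREAK hypothesis concrete: a receive line held low would reach the driver as receiver words with the break flag set. The following is a minimal sketch, assuming the URXD status bit positions from the mainline drivers/tty/serial/imx.c. The helpers classify_urxd and count_breaks are hypothetical names used only for illustration, and the real driver's handling may differ. It only shows how such words could be recognised, for example in an added debug print; it does not show how a BREAK could lead to an SError.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* URXD status bits, assumed to match drivers/tty/serial/imx.c. */
#define URXD_CHARRDY (1u << 15)   /* a received word is available */
#define URXD_ERR     (1u << 14)   /* one of the error flags below is set */
#define URXD_OVRRUN  (1u << 13)
#define URXD_FRMERR  (1u << 12)
#define URXD_BRK     (1u << 11)
#define URXD_PRERR   (1u << 10)

enum rx_event {
	RX_NONE, RX_CHAR, RX_BREAK, RX_PARITY_ERR, RX_FRAME_ERR, RX_OVERRUN
};

/* Hypothetical helper: classify one word read from the URXD register. */
static enum rx_event classify_urxd(uint32_t urxd)
{
	if (!(urxd & URXD_CHARRDY))
		return RX_NONE;
	if (!(urxd & URXD_ERR))
		return RX_CHAR;
	if (urxd & URXD_BRK)
		return RX_BREAK;
	if (urxd & URXD_PRERR)
		return RX_PARITY_ERR;
	if (urxd & URXD_FRMERR)
		return RX_FRAME_ERR;
	if (urxd & URXD_OVRRUN)
		return RX_OVERRUN;
	return RX_CHAR;
}

/* Hypothetical helper: count BREAK events in a captured list of URXD words. */
static size_t count_breaks(const uint32_t *words, size_t n)
{
	size_t i, breaks = 0;

	assert(words != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (classify_urxd(words[i]) == RX_BREAK)
			breaks++;
	return breaks;
}
```

A steadily growing break count while nothing is connected to UART2_RXD would be consistent with the pulled-low line being seen as a BREAK.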


2,805 views
Zhiming_Liu
NXP TechSupport

Hi @TerryBarnaby1 

An SError is normally caused by an access to the memory system. Can you add more prints to confirm which UART this error occurs on?

From the comment in the driver code below, UCR2_SRST is cached; this could cause the SError.

static u32 imx_uart_readl(struct imx_port *sport, u32 offset)
{
	switch (offset) {
	case UCR1:
		return sport->ucr1;
		break;
	case UCR2:
		/*
		 * UCR2_SRST is the only bit in the cached registers that might
		 * differ from the value that was last written. As it only
		 * automatically becomes one after being cleared, reread
		 * conditionally.
		 */
		if (!(sport->ucr2 & UCR2_SRST))
			sport->ucr2 = readl(sport->port.membase + offset);
		return sport->ucr2;
		break;
	case UCR3:
		return sport->ucr3;
		break;
	case UCR4:
		return sport->ucr4;
		break;
	/* ... remaining cases cut off in the original post ... */

Can you use the UART2 RXD test point to reproduce this issue on the EVK?


2,794 views
TerryBarnaby1
Contributor V

"An SError is normally caused by an access to the memory system. Can you add more prints to confirm which UART this error occurs on?" I am pretty sure this is UART2; no others are in use. Obviously, as it is an SError, the kernel backtrace may not be related to where the SError happened.

"From the comment, UCR2_SRST is cached; this could cause the SError." Actually, UCR2_SRST is the only bit that is not read from the in-memory cache but directly from the UART's registers. I did note that there is code in the UART driver that turns off the UART hardware clock for some reason; maybe something has switched it off?
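The caching behaviour discussed here can be modelled in user space to see which calls would actually reach the UART, and therefore which ones could fault if the block were unclocked. This is a minimal sketch, assuming the register offsets and UCR2_SRST bit from the mainline drivers/tty/serial/imx.c. The fake_uart and model_port types and the hw_reads counter are hypothetical stand-ins, and the default case (uncached offsets go straight to the hardware) is an assumption, because the quoted excerpt is cut off before that point.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and SRST bit, assumed to match drivers/tty/serial/imx.c. */
#define UCR1      0x80
#define UCR2      0x84
#define UCR3      0x88
#define UCR4      0x8c
#define USR1      0x94
#define UCR2_SRST (1u << 0)

#define REG_SPACE 0x100

/* Hypothetical stand-in for the UART block: registers plus a read counter. */
struct fake_uart {
	uint32_t regs[REG_SPACE / 4];
	unsigned int hw_reads;          /* reads that reached the "hardware" */
};

/* Hypothetical model of the shadow copies kept by the driver. */
struct model_port {
	struct fake_uart *hw;
	uint32_t ucr1, ucr2, ucr3, ucr4;
};

/* Every call here models a real bus access to the UART. */
static uint32_t hw_readl(struct fake_uart *hw, uint32_t offset)
{
	assert(offset < REG_SPACE && (offset & 3u) == 0);
	hw->hw_reads++;
	return hw->regs[offset / 4];
}

static uint32_t model_readl(struct model_port *p, uint32_t offset)
{
	switch (offset) {
	case UCR1:
		return p->ucr1;                 /* cached, no bus access */
	case UCR2:
		/* Reread only while the cached copy has SRST clear. */
		if (!(p->ucr2 & UCR2_SRST))
			p->ucr2 = hw_readl(p->hw, offset);
		return p->ucr2;
	case UCR3:
		return p->ucr3;                 /* cached, no bus access */
	case UCR4:
		return p->ucr4;                 /* cached, no bus access */
	default:
		/* ASSUMPTION: uncached offsets are read from the hardware. */
		return hw_readl(p->hw, offset);
	}
}
```

In this model, a read of USR1 always increments hw_reads, while a read of UCR2 does so only while the cached SRST bit is clear. If the model matches the driver, a backtrace ending in imx_uart_readl would not by itself single out UCR2 as the register being accessed.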

"Can you use the UART2 RXD test point to reproduce this issue on the EVK?" I guess I can remove a resistor and tie the UART2_RXD line low to test. However, it has taken me three weeks of work to get to a position where I can relatively reliably (within an hour or so) catch this fairly random fault. It obviously has a timing aspect. I think that starting over with the EVK and the EVK code could be a job of three weeks or more to get another example failure. So, as I have a reasonable way to get the failure in my environment, I would like to dig down and try to track down the source. Hence I need to find out why the SError occurred and, if it was an address access, what address.

Isn't there a standard way of determining what caused an SError interrupt on an IMX8MP?

If the SError was generated due to some memory access issue, how do I find out what caused the SError and what memory location was being accessed?
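One general starting point, sketched here under stated assumptions: the arm64 kernel's panic line of the form "SError Interrupt on CPU<n>, code 0x..." prints the ESR_EL1 syndrome value, and its fields can be split out as below. The EC, IDS and ISS positions follow the Armv8-A definition of ESR_ELx. The mapping of ISS[1:0] to decode, ECC or slave error is an assumption based on the Cortex-A53 implementation-defined syndrome format and should be checked against the Cortex-A53 TRM. The helper names are hypothetical. The syndrome describes the kind of error only; it carries no faulting address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Armv8-A ESR_ELx layout: EC in bits [31:26], ISS in bits [24:0]. */
#define ESR_EC(esr)     (((esr) >> 26) & 0x3fu)
#define ESR_ISS(esr)    ((esr) & 0x01ffffffu)
#define ESR_EC_SERROR   0x2fu        /* exception class for an SError interrupt */
#define ESR_ISS_IDS     (1u << 24)   /* set: ISS format is implementation defined */

struct serror_syndrome {
	unsigned int ec;       /* exception class */
	uint32_t iss;          /* instruction specific syndrome */
	int is_serror;         /* EC matches the SError class */
	int impl_defined;      /* SError with the IDS bit set */
};

static struct serror_syndrome decode_esr(uint32_t esr)
{
	struct serror_syndrome s;

	s.ec = ESR_EC(esr);
	s.iss = ESR_ISS(esr);
	s.is_serror = (s.ec == ESR_EC_SERROR);
	s.impl_defined = s.is_serror && (s.iss & ESR_ISS_IDS) != 0;
	return s;
}

/*
 * ASSUMPTION: for the implementation-defined format on Cortex-A53,
 * ISS[1:0] names the error source. Verify against the Cortex-A53 TRM.
 */
static const char *a53_serror_source(uint32_t iss)
{
	assert(iss & ESR_ISS_IDS);   /* only meaningful when IDS is set */

	switch (iss & 0x3u) {
	case 0x0: return "decode error";
	case 0x1: return "ECC error";
	case 0x2: return "slave error";
	default:  return "reserved";
	}
}

/* Format a one-line summary into buf; returns the snprintf() result. */
static int describe_esr(uint32_t esr, char *buf, size_t len)
{
	struct serror_syndrome s = decode_esr(esr);

	if (!s.is_serror)
		return snprintf(buf, len, "EC 0x%02x: not an SError", s.ec);
	if (!s.impl_defined)
		return snprintf(buf, len, "SError, architectural ISS 0x%07x",
				(unsigned int)s.iss);
	return snprintf(buf, len, "SError, implementation-defined ISS: %s",
			a53_serror_source(s.iss));
}
```

For a hypothetical code of 0xbf000002 (not taken from this board's logs), decode_esr reports the SError class with IDS set, and under the assumption above the low bits would indicate a slave error, that is, an error response from the addressed peripheral or bus rather than from the CPU core.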

 


2,829 views
Zhiming_Liu
NXP TechSupport

Hi @TerryBarnaby1 

Can you share your UART design?
