Shared Ethernet MAC Crash

ramkrishnan
Contributor III

We are using the T1024 processor and its DPAA capabilities. The DPAA is configured to transfer data packets from the Ethernet port to a macless interface configured in the kernel. This code worked on the T1024RDB eval board.

The target board is closely based on the eval board. When I run the DPAA software it comes up, and other packet transfers work fine. But traffic on the shared Ethernet interface causes the kernel to crash with the following dump.

-------------------------------------------------------------------------------------------------------------------------------

Unable to handle kernel paging request for data at address 0xa02d99c0
Faulting instruction address: 0xc0019a94
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=8 CoreNet Generic
Modules linked in:
CPU: 0 PID: 1263 Comm: cti_datapath Not tainted 4.1.8-rt8+gbd51baf #18
task: c4393910 ti: c5fd4000 task.ti: c4d9e000
NIP: c0019a94 LR: c0490c08 CTR: 00000009
REGS: c5fd5df0 TRAP: 0300 Not tainted (4.1.8-rt8+gbd51baf)
MSR: 00021002 <CE,ME> CR: 24002228 XER: 20000000
DEAR: a02d99c0 ESR: 00000000
GPR00: 00000000 c5fd5ea0 c4393910 c4bcd580 a02d99bc 0000004a c4bcd57c 00000009
GPR08: 00000007 00000001 00000007 000001b7 24002282 101b4a54 00000033 0000051f
GPR16: c08ddfd0 c08ddff8 c08de018 c09fd2b0 c08de034 c09fd0d0 c08de050 a02d99c0
GPR24: 00000000 0000004a a02d9000 0000004a c4bcd5ca 0000004a 00000001 e02d9a0a
NIP [c0019a94] memcpy+0x1c/0x9c
LR [c0490c08] copy_from_unmapped_area+0x128/0x140
Call Trace:
[c5fd5ea0] [c0490bf4] copy_from_unmapped_area+0x114/0x140 (unreliable)
[c5fd5ef0] [c04912c8] shared_rx_dqrr+0x268/0x490
[c5fd5f30] [c0599728] portal_isr+0x1c8/0x240
[c5fd5f60] [c0075324] handle_irq_event_percpu+0xd4/0x1c0
[c5fd5fa0] [c0075454] handle_irq_event+0x44/0x80
[c5fd5fc0] [c0078cd8] handle_fasteoi_irq+0xd8/0x1f0
[c5fd5fd0] [c007482c] generic_handle_irq+0x3c/0x70
[c5fd5fe0] [c000447c] __do_irq+0x2c/0x80
[c5fd5ff0] [c000d70c] call_do_irq+0x24/0x3c
[c4d9ff20] [c000455c] do_IRQ+0x8c/0x120
[c4d9ff40] [c000f7f0] ret_from_except+0x0/0x18
--- interrupt: 501 at 0x1002a2d8
LR = 0x1001ab1c
Instruction dump:
9c860001 4200fffc 4e800020 7c032040 418100a0 54a7e8ff 38c3fffc 3884fffc
41820028 70c00003 7ce903a6 40820054 <80e40004> 85040008 90e60004 95060008
---[ end trace e788ca880c8f1dac ]---

--------------------------------------------------------------------------------------------------------------------------------
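
One way to cross-check the dump above is to decode the faulting instruction by hand. The following is a minimal sketch, assuming the word shown in angle brackets in the instruction dump (<80e40004>) is the faulting instruction and that it uses the standard Power ISA D-form layout (6-bit primary opcode, RT, RA, 16-bit signed displacement). The names decode_dform and effective_address are made up for illustration and are not part of any kernel or SDK API.

```c
#include <stdint.h>

/* Fields of a PowerPC D-form load/store instruction word. */
struct dform {
    unsigned opcode; /* primary opcode, top 6 bits (32 = lwz) */
    unsigned rt;     /* target register number */
    unsigned ra;     /* base register number */
    int16_t d;       /* signed 16-bit displacement */
};

/* Split a 32-bit instruction word into its D-form fields. */
struct dform decode_dform(uint32_t insn)
{
    struct dform f;
    f.opcode = insn >> 26;
    f.rt = (insn >> 21) & 0x1f;
    f.ra = (insn >> 16) & 0x1f;
    f.d = (int16_t)(insn & 0xffff);
    return f;
}

/*
 * Effective address of a D-form access given the value of the base
 * register. This sketch does not handle the RA == 0 special case,
 * which does not apply to the instruction in the dump.
 */
uint32_t effective_address(uint32_t ra_value, struct dform f)
{
    return ra_value + (uint32_t)(int32_t)f.d;
}
```

Decoding 0x80e40004 this way gives opcode 32 (lwz), RT 7, RA 4 and displacement 4, that is, a load from 4(r4). With GPR4 = a02d99bc from the register dump, the effective address is a02d99c0, which matches DEAR. So the bad pointer is the source address that copy_from_unmapped_area passed to memcpy, not the destination.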

What would be a good place to start looking for the reason for this crash? All I am doing is a simple ping from the PC, and the packet size is 0x3c. The dts files are similar, with similar ports enabled for DPAA. We have two eval boards, the T1024RDB and the T4240RDB, and the same code ran fine on both.

Thank you,

Ram Krishnan 

Pavel
NXP Employee

Your message dump contains the following message:

Faulting instruction address: 0xc0019a94
Oops: Kernel access of bad area, sig: 11 [#1]

 

It looks like your code is correct, since it transfers data correctly on the T1024RDB board.

This often happens when the buffers for DPAA are allocated at an invalid address on your board. Check the T1024 LAW settings for QMAN and FMAN on your board, and compare them with the settings on the T1024RDB board.


Have a great day,
Pavel Chubakov

ramkrishnan
Contributor III

Solved the problem. 

In tlb.c I had removed the PCIe memory maps because they overlapped with our mapped devices, but the kernel dts files still had the pcie0 .. pcie3 nodes. I removed these from the dts files and now it all works fine.

Thank you again for the pointer.

Ram Krishnan

ramkrishnan
Contributor III

The associated TLB changes are pasted below. I have removed the PCI entries and added entries for the 16_BIT_PHYS area and the DISP_BASE_PHYS.

struct fsl_e_tlb_entry tlb_table[] = {
/* TLB 0 - for temp stack in cache */
SET_TLB_ENTRY(0, CONFIG_SYS_INIT_RAM_ADDR,
CONFIG_SYS_INIT_RAM_ADDR_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, 0,
0, 0, BOOKE_PAGESZ_4K, 0),
SET_TLB_ENTRY(0, CONFIG_SYS_INIT_RAM_ADDR + 4 * 1024,
CONFIG_SYS_INIT_RAM_ADDR_PHYS + 4 * 1024,
MAS3_SX|MAS3_SW|MAS3_SR, 0,
0, 0, BOOKE_PAGESZ_4K, 0),
SET_TLB_ENTRY(0, CONFIG_SYS_INIT_RAM_ADDR + 8 * 1024,
CONFIG_SYS_INIT_RAM_ADDR_PHYS + 8 * 1024,
MAS3_SX|MAS3_SW|MAS3_SR, 0,
0, 0, BOOKE_PAGESZ_4K, 0),
SET_TLB_ENTRY(0, CONFIG_SYS_INIT_RAM_ADDR + 12 * 1024,
CONFIG_SYS_INIT_RAM_ADDR_PHYS + 12 * 1024,
MAS3_SX|MAS3_SW|MAS3_SR, 0,
0, 0, BOOKE_PAGESZ_4K, 0),

/* TLB 1 */
/* *I*** - Covers boot page */
#if defined(CONFIG_SYS_RAMBOOT) && defined(CONFIG_SYS_INIT_L3_ADDR)
/*
* *I*G - L3SRAM. When L3 is used as 256K SRAM, the address of the
* SRAM is at 0xfffc0000, it covered the 0xfffff000.
*/
SET_TLB_ENTRY(1, CONFIG_SYS_INIT_L3_ADDR, CONFIG_SYS_INIT_L3_ADDR,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 0, BOOKE_PAGESZ_256K, 1),
#else
SET_TLB_ENTRY(1, 0xfffff000, 0xfffff000,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 0, BOOKE_PAGESZ_4K, 1),
#endif

/* *I*G* - CCSRBAR */
SET_TLB_ENTRY(1, CONFIG_SYS_CCSRBAR, CONFIG_SYS_CCSRBAR_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 1, BOOKE_PAGESZ_16M, 1),

/* *I*G* - Flash, localbus */
/* This will be changed to *I*G* after relocation to RAM. */
SET_TLB_ENTRY(1, CONFIG_SYS_FLASH_BASE, CONFIG_SYS_FLASH_BASE_PHYS,
MAS3_SX|MAS3_SR, MAS2_W|MAS2_G,
0, 2, BOOKE_PAGESZ_256M, 1),

#ifndef CONFIG_SPL_BUILD
#if 0 // Removed the PCI memory mapping because it was overlapping with the one needed for CPLD, NVRAM, FPGA.
/* *I*G* - PCI */
SET_TLB_ENTRY(1, CONFIG_SYS_PCIE1_MEM_VIRT, CONFIG_SYS_PCIE1_MEM_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 3, BOOKE_PAGESZ_1G, 1),

/* *I*G* - PCI I/O */
SET_TLB_ENTRY(1, CONFIG_SYS_PCIE1_IO_VIRT, CONFIG_SYS_PCIE1_IO_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 4, BOOKE_PAGESZ_256K, 1),
#else
#ifdef CONFIG_SYS_CTI_DISP_BASE
SET_TLB_ENTRY(1, CONFIG_SYS_CTI_DISP_BASE, CONFIG_SYS_CTI_DISP_BASE_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 3, BOOKE_PAGESZ_256M, 1),
#endif
#endif
/* Bman/Qman */
#ifdef CONFIG_SYS_BMAN_MEM_PHYS
SET_TLB_ENTRY(1, CONFIG_SYS_BMAN_MEM_BASE, CONFIG_SYS_BMAN_MEM_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, 0,
0, 5, BOOKE_PAGESZ_16M, 1),
SET_TLB_ENTRY(1, CONFIG_SYS_BMAN_MEM_BASE + 0x01000000,
CONFIG_SYS_BMAN_MEM_PHYS + 0x01000000,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 6, BOOKE_PAGESZ_16M, 1),
#endif
#ifdef CONFIG_SYS_QMAN_MEM_PHYS
SET_TLB_ENTRY(1, CONFIG_SYS_QMAN_MEM_BASE, CONFIG_SYS_QMAN_MEM_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, 0,
0, 7, BOOKE_PAGESZ_16M, 1),
SET_TLB_ENTRY(1, CONFIG_SYS_QMAN_MEM_BASE + 0x01000000,
CONFIG_SYS_QMAN_MEM_PHYS + 0x01000000,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 8, BOOKE_PAGESZ_16M, 1),
#endif
#endif
#ifdef CONFIG_SYS_DCSRBAR_PHYS
SET_TLB_ENTRY(1, CONFIG_SYS_DCSRBAR, CONFIG_SYS_DCSRBAR_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 9, BOOKE_PAGESZ_4M, 1),
#endif
#ifdef CONFIG_SYS_NAND_BASE
SET_TLB_ENTRY(1, CONFIG_SYS_NAND_BASE, CONFIG_SYS_NAND_BASE_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 10, BOOKE_PAGESZ_64K, 1),
#endif
#ifdef CONFIG_SYS_MG50_16BIT_BASE
SET_TLB_ENTRY(1, CONFIG_SYS_MG50_16BIT_BASE, CONFIG_SYS_MG50_16BIT_BASE_PHYS,
MAS3_SX|MAS3_SW|MAS3_SR, MAS2_I|MAS2_G,
0, 11, BOOKE_PAGESZ_256M, 1),
#endif

#if defined(CONFIG_RAMBOOT_PBL) && !defined(CONFIG_SPL_BUILD)
SET_TLB_ENTRY(1, CONFIG_SYS_DDR_SDRAM_BASE, CONFIG_SYS_DDR_SDRAM_BASE,
MAS3_SX|MAS3_SW|MAS3_SR, 0,
0, 12, BOOKE_PAGESZ_1G, 1),
SET_TLB_ENTRY(1, CONFIG_SYS_DDR_SDRAM_BASE + 0x40000000,
CONFIG_SYS_DDR_SDRAM_BASE + 0x40000000,
MAS3_SX|MAS3_SW|MAS3_SR, 0,
0, 13, BOOKE_PAGESZ_1G, 1)
#endif
/* Entries 14 and 15 are used hard-coded elsewhere and are disabled
* in cpu_init_f, so use entry 16 onward if more are needed.
*/
};

ramkrishnan
Contributor III

That was one of the things we changed, and we checked to make sure the windows do not overlap in any way. This is what we have on the modified board:

struct law_entry law_table[] = {
#ifndef CONFIG_SYS_NO_FLASH
SET_LAW(CONFIG_SYS_FLASH_BASE_PHYS, LAW_SIZE_256M, LAW_TRGT_IF_IFC),
#endif
#ifdef CONFIG_SYS_BMAN_MEM_PHYS
SET_LAW(CONFIG_SYS_BMAN_MEM_PHYS, LAW_SIZE_32M, LAW_TRGT_IF_BMAN),
#endif
#ifdef CONFIG_SYS_QMAN_MEM_PHYS
SET_LAW(CONFIG_SYS_QMAN_MEM_PHYS, LAW_SIZE_32M, LAW_TRGT_IF_QMAN),
#endif
#ifdef CONFIG_SYS_MG50_16BIT_BASE_PHYS
SET_LAW(CONFIG_SYS_MG50_16BIT_BASE_PHYS, LAW_SIZE_256M, LAW_TRGT_IF_IFC),
#endif
#ifdef CONFIG_SYS_CTI_DISP_BASE_PHYS
SET_LAW(CONFIG_SYS_CTI_DISP_BASE_PHYS, LAW_SIZE_256M, LAW_TRGT_IF_IFC),
#endif
#ifdef CONFIG_SYS_DCSRBAR_PHYS
SET_LAW(CONFIG_SYS_DCSRBAR_PHYS, LAW_SIZE_4M, LAW_TRGT_IF_DCSR),
#endif
#ifdef CONFIG_SYS_NAND_BASE_PHYS
SET_LAW(CONFIG_SYS_NAND_BASE_PHYS, LAW_SIZE_64K, LAW_TRGT_IF_IFC),
#endif
};

FLASH, BMAN and QMAN are the same as in the T1024RDB.

MG50_16BIT_BASE_PHYS  is 0xFA0000000

DISP_BASE_PHYS is set to 0xFB0000000

Both of the above are memory-mapped areas for devices such as nvram, disp, etc.

The CONFIG_SYS_NAND_BASE_PHYS has been changed to 0xFD0000000

We removed CONFIG_SYS_CPLD_BASE_PHYS since the CPLD on this board hangs off the 16_BIT_BASE_PHYS area.
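
The overlap check described above can also be done mechanically. This is a minimal sketch, assuming only the three base addresses quoted in this post and the sizes from the law_table above. The struct window type and the helper names are hypothetical, not U-Boot APIs, and the table would need to be extended by hand with any other windows in use (for example the ranges declared in the kernel dts).

```c
#include <stddef.h>
#include <stdint.h>

/* One physical address window: [base, base + size). */
struct window {
    const char *name;
    uint64_t base;
    uint64_t size;
};

/* Two half-open ranges overlap when each starts before the other ends. */
int windows_overlap(const struct window *a, const struct window *b)
{
    return a->base < b->base + b->size && b->base < a->base + a->size;
}

/* Count the overlapping pairs in a table of n windows. */
size_t count_overlaps(const struct window *w, size_t n)
{
    size_t i, j, hits = 0;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (windows_overlap(&w[i], &w[j]))
                hits++;
    return hits;
}

/* Bases from this post; sizes from the LAW entries (256M, 256M, 64K). */
const struct window board_windows[] = {
    { "MG50_16BIT", 0xFA0000000ULL, 256ULL << 20 },
    { "CTI_DISP",   0xFB0000000ULL, 256ULL << 20 },
    { "NAND",       0xFD0000000ULL, 64ULL << 10 },
};
```

For these three entries count_overlaps finds no overlapping pair: the 16-bit area ends exactly where the DISP area begins. A check like this covers only the windows listed in the table, so it cannot catch a window that is still described somewhere else, such as a device tree node, but no longer mapped.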

On an additional note: after adding debug output to dpaa_eth_shared.c in the kernel, I see that the buffer allocated in the application for BPID 9 is the one actually used and passed into that function (I dumped the dma_mem_vtop values and they match the buffers). The problem is in the page_vaddr() function, which appears to map it to an invalid memory space. phys_start points to the correct memory buffer, and it is from BPID 9.

Thank you again for your quick response and narrowing it down pretty quickly.

Ram Krishnan
