Custom ls1046a board. Linux hangs after trying to activate BMan.

AbelianMeme
Contributor III

We have a custom board with an LS1046A that I have been struggling to get Linux running on for several months. I have tried nearly every Linux branch available in the LSDK GitHub repositories. The version that gets furthest through the boot process is the 5.4.47 branch.

Our board has no native Ethernet ports using the FMan facilities, but we do want to support networking expansion cards via USB and PCIe. It is not clear to me which options the .dts file needs to allow this. As it stands, Linux 5.4.47 boots normally until it reaches the following messages, after which it hangs:

[ 1.870489] bman-fpbr addr 0x0000000000000000 size 0x1000000
[ 1.876537] Bman err interrupt handler present
[ 1.881248] Can't get bman-portal@0 property 'cell-index'
[ 1.886704] Can't get bman-portal@10000 property 'cell-index'
[ 1.892509] Can't get bman-portal@20000 property 'cell-index'
[ 1.898327] Can't get bman-portal@30000 property 'cell-index'
[ 1.904137] Can't get bman-portal@40000 property 'cell-index'
[ 1.909942] Can't get bman-portal@50000 property 'cell-index'
[ 1.915747] Can't get bman-portal@60000 property 'cell-index'
[ 1.921551] Can't get bman-portal@70000 property 'cell-index'
[ 1.927356] Can't get bman-portal@80000 property 'cell-index'
[ 1.933160] Can't get bman-portal@90000 property 'cell-index'
[ 1.938953] No Bman portals available!

 

After this point, everything just hangs. I have no idea what the CPU is doing, but it doesn't appear to be executing kernel code any longer, and the synchronous exception handler is never triggered.

We are using the standard fsl-ls1046a.dtsi file that comes with the 5.4.47 lsdk distribution, with the attached .dts file for our board. I have also attached the .config that we are compiling against, which was created simply by doing a "make defconfig lsdk" after a pristine clone of the github branch.

Can someone suggest what might be going on, and what I need to modify to get this to work properly? We are not concerned with security in this project, so we do not need a trusted architecture. I have not currently loaded any special firmware, as there does not appear to be a need for it. Please let me know if I have misunderstood or overlooked something.

Thank you for any assistance.

1 Solution
AbelianMeme
Contributor III

I finally resolved the problem. The qman driver hangs the entire CPU about 300 ms after issuing the init command (0x01) to MCR. It must stall the main bus, because every processor core stops working. Adding the following to the .dts file resolved it:

&bman_fbpr {
	compatible = "fsl,bman-fbpr";
	alloc-ranges = <0 0 0x10000 0>;
};

&qman_fqd {
	compatible = "fsl,qman-fqd";
	alloc-ranges = <0 0 0x10000 0>;
};

&qman_pfdr {
	compatible = "fsl,qman-pfdr";
	alloc-ranges = <0 0 0x10000 0>;
};

 

I found these settings in the fsl-ls1046a-rdb-sdk.dts file. They were not in the fsl-ls1046a-rdb.dts that I was using as my template.
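
For readers unfamiliar with alloc-ranges: in the reserved-memory binding it gives (address, size) pairs that bound where a dynamically allocated region may be placed. The C sketch below shows how the four cells above would combine, assuming the enclosing reserved-memory node uses #address-cells = <2> and #size-cells = <2> (an assumption here; check the .dtsi you build against). The names cells_to_u64, alloc_range and decode_alloc_range are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: join two 32-bit device-tree cells (high, low)
 * into one 64-bit value. */
static uint64_t cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* One (base, size) window taken from an alloc-ranges property. */
struct alloc_range {
    uint64_t base;
    uint64_t size;
};

/* Assumes 2 address cells followed by 2 size cells (not confirmed by
 * the thread; it depends on the parent node's #address-cells and
 * #size-cells). */
static struct alloc_range decode_alloc_range(const uint32_t cells[4])
{
    struct alloc_range r;

    r.base = cells_to_u64(cells[0], cells[1]);
    r.size = cells_to_u64(cells[2], cells[3]);
    return r;
}
```

Under that assumption, <0 0 0x10000 0> decodes to base 0x0 and size 0x1000000000000 (2^48), which would let the kernel place the region anywhere in the low 2^48 bytes of the address space rather than at a fixed address.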

AbelianMeme
Contributor III

An update on this issue. The Linux kernel is hanging in the function qm_init_pfdr() in the file qman_config.c. It is called from qman_init_ccsr() for device node qman@1880000. qm_init_pfdr() writes an MCR command and then polls until the result field reports idle. Unfortunately, the result never changes from 0x01, so the kernel spins forever waiting for a completion that never comes.

Can someone offer some guidance on what may be going on here, and what I have to correct in order to get this to proceed?

Thank you again for any assistance.

 

AbelianMeme
Contributor III

One further update. I fixed the .dts file to include a "cell-index" and "cpu-handle" property on each portal node as outlined in the LSDK 18.09 documentation, section 8.2.3.2.2.2.3. While that removed the Bman errors, it did not resolve the system hanging in qm_init_pfdr(). The issue stems from the following calls:

qm_out(0x0b04, 0x8);        // REG_MCP(0)
qm_out(0x0b08, 0x7FFF0);    // REG_MCP(1)
qm_out(0x0b00, 0x1000000);  // REG_MCR

while (!MCR_rslt_idle(MCR_get_result(qm_in(0x0b00)))) ;  // Poll the result field of REG_MCR

0x8 is pfdr_start.

0x7FFF0 is pfdr_start + num - 16, where "num" is a parameter passed into the function (so num is 0x7FFF8 here).

0x1000000 is the constant MCR_INIT_PFDR.

I can't understand what is being done here or why it is not working. This sequence results in a hung OS, because the MCR result never returns to idle and the loop never exits. What *should* happen in this case?
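
To make the polling sequence above easier to reason about, here is a minimal off-target C sketch of the result decoding with a bounded retry count. The encodings (result in the top byte of REG_MCR, idle when it is 0x00 or at least 0xf0, success at 0xf0) are written from memory of the qbman driver macros and should be treated as assumptions to confirm against your kernel tree. poll_mcr, fake_mcr, fake_mcr_read and pfdr_last_index are hypothetical names, and the fake register only simulates hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCR_INIT_PFDR 0x01000000u /* constant quoted in the post */

/* Assumption: the result field is the top byte of REG_MCR. */
static uint8_t mcr_get_rslt(uint32_t mcr) { return (uint8_t)(mcr >> 24); }

/* Assumption: 0x00 or any value >= 0xf0 means the interface is idle. */
static bool mcr_rslt_idle(uint8_t rslt) { return rslt == 0 || rslt >= 0xf0; }

/* Assumption: 0xf0 means the command completed successfully. */
static bool mcr_rslt_ok(uint8_t rslt) { return rslt == 0xf0; }

/* Value written to REG_MCP(1) in the sequence above. */
static uint32_t pfdr_last_index(uint32_t pfdr_start, uint32_t num)
{
    return pfdr_start + num - 16;
}

typedef uint32_t (*reg_read_fn)(void *ctx);

/* Bounded poll: 0 on success, -1 on an error result, -2 on timeout. */
static int poll_mcr(reg_read_fn read_mcr, void *ctx, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        uint8_t rslt = mcr_get_rslt(read_mcr(ctx));

        if (mcr_rslt_idle(rslt))
            return mcr_rslt_ok(rslt) ? 0 : -1;
    }
    return -2; /* result never left the busy state */
}

/* Simulated register, for trying the loop without hardware. */
struct fake_mcr {
    uint32_t busy_value; /* returned while the command is pending */
    uint32_t done_value; /* returned once the command "completes" */
    unsigned busy_reads; /* reads that still report busy */
};

static uint32_t fake_mcr_read(void *ctx)
{
    struct fake_mcr *m = ctx;

    if (m->busy_reads > 0) {
        m->busy_reads--;
        return m->busy_value;
    }
    return m->done_value;
}
```

Under these assumptions, a read-back of 0x01 in the top byte is simply the command verb that was just written, meaning the command is still pending rather than finished. A bounded loop like this would only turn the silent spin into a reportable timeout if the core is still executing; it does not address why the hardware never completes the command.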

 

 

pholden
Contributor III

Was there any resolution to this problem? I'm running into the same problem with the QLS1046A SDK device tree.

The kernel gets this far, then hangs:

[ 1.621013] bman-fbpr addr 0x0000000000000000 size 0x1000000
[ 1.627104] Bman err interrupt handler present
[ 1.632141] Bman portal initialised, cpu 0
[ 1.636387] Bman portal initialised, cpu 1
[ 1.640633] Bman portal initialised, cpu 2
[ 1.644874] Bman portal initialised, cpu 3
[ 1.648995] Bman portals initialised

 
