PCIe: RC cannot write into EP


8,592 Views
elijahbrown
Contributor III

The IMX6 is configured as a PCIe EP, which is connected to a RC (not an IMX6).  I've got two iATU configurations, as follows:

    pcie_map_inbound_addr(PCIE_IATU_VIEWPORT_0,
                          TLP_TYPE_MemRdWr,
                          (uint32_t)endpointBuffer,
                          0x90500000,
                          SZ_64K);

    pcie_map_outbound(PCIE_IATU_VIEWPORT_1,
                      TLP_TYPE_MemRdWr,
                      PCIE_ARB_BASE_ADDR,
                      0x310000,
                      SZ_64K);

Here's what the inbound mapping function does:

    uint32_t pcie_map_inbound_addr(uint32_t viewport, uint32_t tlp_type,
                                   uint32_t addr_base_cpu_side, uint32_t addr_base_pcie_side, uint32_t size)
    {
        // configure as an inbound region
        HW_PCIE_PL_IATUVR_WR((viewport & 0x0F) | (1 << 31));

        // configure region's base and limit address
        HW_PCIE_PL_IATURLBA_WR(addr_base_pcie_side);
        HW_PCIE_PL_IATURUBA_WR(0);
        HW_PCIE_PL_IATURLA_WR(addr_base_pcie_side + size - 1);

        // configure target address
        HW_PCIE_PL_IATURUTA_WR(0);
        HW_PCIE_PL_IATURLTA_WR(addr_base_cpu_side);

        // configure TLP type
        HW_PCIE_PL_IATURC1_WR(tlp_type & 0x0F);

        // enable region
        HW_PCIE_PL_IATURC2_WR(((unsigned int)(1 << 31)));

        return addr_base_cpu_side;
    }

Bus mastering is configured in the EP.  The outbound transactions (EP to RC) work fine: I can read and write memory in the RC by reading or writing PCIE_ARB_BASE_ADDR.  But the other way around, inbound transactions (RC to EP) don't work.  The RC is sending TLPs with address 0x90500000, which I want to map into the IMX's DRAM, specifically a 64K buffer named endpointBuffer.  This buffer is 1M aligned to meet both the iATU requirement of 64K alignment and the MMU requirement of 1M alignment, so that I can turn caching off for it.  The RC sets BAR0 to 0x90500000 and BAR2 to 0x310000.  For whatever reason I can't make the BAR1 mask anything non-zero so I'm using BAR2 instead.

What am I missing?  The fact that outbound transactions are working makes me think this has to be close, and just an error in the mapping.  I've tried setting up the inbound mapping to do BAR matching and address matching, neither seems to work.
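For readers checking the alignment constraints mentioned above, here is a minimal sketch of a 64K buffer placed on a 1M boundary plus an alignment test. The names endpoint_buffer and is_aligned are illustrative only (not from the SDK), and the aligned attribute assumes a GCC or Clang toolchain.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K (64u * 1024u)
#define SZ_1M  (1024u * 1024u)

/* Illustrative stand-in for endpointBuffer: 64 KB long, on a 1 MB boundary.
 * The attribute syntax is a GCC/Clang assumption. */
static uint8_t endpoint_buffer[SZ_64K] __attribute__((aligned(SZ_1M)));

/* Returns 1 when addr is a multiple of align; align must be a power of two. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

A 1M-aligned address is automatically 64K aligned, so the single attribute covers both requirements. By the same test, the BAR0 address 0x90500000 is 1M aligned, while the BAR2 address 0x310000 is 64K aligned but not 1M aligned.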

0 Kudos
23 Replies

5,012 Views
carlpii
Contributor I

Have you made any progress on this issue?  I see that there is a new branch to the discussion but that thread is closed to me.

I am also using the i.MX6Q as an endpoint and have finally been able to get the inbound iATU regions to work.  My test setup is a Freescale SDB running the 3.14.28 release (rebuilt with PCI support enabled) connected via an mPCIe to PCIe cable to my custom board.

I also ran into issues with the BAR mask registers - I use BAR0/1 in 64-bit mode with no problems.  I can disable BAR2, but it reverts to having its prefetchable bit set when disabled, so it doesn't appear as all 0's as it should.  The mask register for BAR3 seems to be unwritable, so it is stuck enabled with a size of 256 bytes.

Another issue that bit me was an error in the bare metal SDK header file sdk/drivers/pcie/pcie_common.h.  It has the values reversed for PCIE_IATU_VP_DIR_INBOUND and _OUTBOUND in the enum pcie_iatu_vp_dir_e.
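As a rough illustration of the direction values, the sketch below builds the viewport register value with inbound encoded as 1 in bit 31, which matches the `(viewport & 0x0F) | (1 << 31)` write in the original post. The enum and function names are hypothetical and do not come from pcie_common.h.

```c
#include <assert.h>
#include <stdint.h>

/* Direction encoding assumed from the thread: inbound = 1, outbound = 0. */
enum iatu_dir {
    IATU_DIR_OUTBOUND = 0,
    IATU_DIR_INBOUND  = 1
};

/* Compose the viewport value: region index in bits 3:0, direction in bit 31. */
static uint32_t iatu_viewport_value(uint32_t region, enum iatu_dir dir)
{
    assert(region <= 0x0Fu);
    return (region & 0x0Fu) | ((uint32_t)dir << 31);
}
```

With this encoding, viewport 0 inbound is 0x80000000 and viewport 1 outbound is 0x00000001. If the header has the two enum values swapped, the same call silently programs the outbound side of the region instead.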

Finally, I needed to ensure my memory structure backing the BAR was aligned to a 64KB boundary in spite of my BAR being set to a smaller region - the lower 16 bits of the iATU target address register are hardwired to 0x0000.
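To make the consequence of the hardwired low bits concrete, here is a small sketch that models the address the iATU would actually use if the low 16 bits of the target register always read as zero. The helper names are made up for illustration; the only behavior assumed is the masking described above.

```c
#include <assert.h>
#include <stdint.h>

/* Address the iATU would use, assuming target bits 15:0 are forced to zero. */
static uint32_t iatu_effective_target(uint32_t requested)
{
    return requested & ~UINT32_C(0xFFFF);
}

/* Number of bytes by which inbound accesses land below the intended buffer. */
static uint32_t iatu_target_shift(uint32_t requested)
{
    uint32_t effective = iatu_effective_target(requested);
    assert(effective <= requested); /* masking can only lower the address */
    return requested - effective;
}
```

A shift of zero means the backing memory is safe to use as the target. Any other value means inbound writes would hit memory below the buffer, which is why the backing structure needs 64KB alignment even for a smaller BAR.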

For what it's worth:

- I'm still using the internal reference clock.  I'll switch to the slot clock later.  The signal levels on the SDB reference clock at the mPCIe connector are wrong (the signal swing has no DC bias - it should be biased to the middle of the HCSL 0.0V to 0.7V swing).

- I set up my config space in "pice_init()" prior to enabling the link (writing app_ltssm_enable to 1).

- The iATU mapping for BAR0 to my "mem_array" is:

    HW_PCIE_PL_IATUVR_WR(BF_PCIE_PL_IATUVR_REGION_INDEX(PCIE_IATU_VIEWPORT_0) |
            BF_PCIE_PL_IATUVR_REGION_DIRECTION(PCIE_IATU_VP_DIR_INBOUND));
    HW_PCIE_PL_IATURLTA_WR((reg32_t)mem_array);
    HW_PCIE_PL_IATURUTA_WR((reg32_t)0x00000000);
    HW_PCIE_PL_IATURC1_WR(BF_PCIE_PL_IATURC1_TYPE(TLP_TYPE_MemRdWr));
    HW_PCIE_PL_IATURC2_WR(BF_PCIE_PL_IATURC2_BAR_NUMBER(0) |
            BF_PCIE_PL_IATURC2_RESPONSE_CODE(0x0) |
            BF_PCIE_PL_IATURC2_MATCH_MODE(1) |
            BF_PCIE_PL_IATURC2_REGION_ENABLE(1));
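
For comparison with the address-match setup in the original post (which writes only bit 31 to region control 2), here is a sketch of the raw values for the two match modes. Bit 31 as region enable comes from the code in this thread; bit 30 for BAR match mode and bits 10:8 for the BAR number are assumptions based on the DesignWare iATU layout as used by the Linux DesignWare PCIe driver, and should be checked against the i.MX6 reference manual. The macro and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define IATU_RC2_REGION_ENABLE  (UINT32_C(1) << 31) /* from the thread's code */
#define IATU_RC2_MATCH_MODE_BAR (UINT32_C(1) << 30) /* assumed bit position */
#define IATU_RC2_BAR_SHIFT      8                   /* assumed field position */

/* Region control 2 value for BAR match mode on the given BAR (0..5). */
static uint32_t iatu_rc2_bar_match(uint32_t bar)
{
    assert(bar <= 5u);
    return IATU_RC2_REGION_ENABLE | IATU_RC2_MATCH_MODE_BAR |
           (bar << IATU_RC2_BAR_SHIFT);
}

/* Region control 2 value for address match mode (base/limit registers used). */
static uint32_t iatu_rc2_addr_match(void)
{
    return IATU_RC2_REGION_ENABLE;
}
```

In BAR match mode the region follows whatever address the host writes into the BAR, so the base and limit registers do not need to be programmed with the host-assigned address.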

Good luck!

-Carl

0 Kudos

5,008 Views
elijahbrown
Contributor III

Hi Carl,

No, I still have not been able to get inbound transactions working.  For now I have implemented our protocol using only outbound transactions as a workaround, since I had other things to work on.  My test setup has an x86 motherboard as the RC talking to multiple PCIe EPs, one of which is the IMX6.  The team working on the x86 side has discovered some unknowns in how it does its PCIe setup.  It's not as straightforward as originally thought - at the moment it's running a closed source BIOS which sets up the PCIe hardware and enumerates devices.  But I can't have the IMX EP running while the x86 BIOS enumerates, or it writes something into the IMX6 that crashes the application.  A PCIe analyzer is on the way and should help sort some of this out.

Anyway, I really appreciate your input.  It's frustrating dealing with the horrible documentation and anomalies like you mentioned.  Some BARs can be enabled but can't be disabled, others can be disabled but not enabled.  WTF.  I was beginning to wonder if anyone has ever used this thing as an EP with inbound mapping. 

Do you have PCIE_IATU_VP_DIR_INBOUND defined as 1?  I'm just writing 1 directly into the direction bit.  I set viewport 0 up as inbound mapping and viewport 1 as outbound mapping.  I see you are doing BAR matching.  Does the RC need to have set the address in BAR0 before you set up the inbound mapping, or will it update on the fly?  I've been using address matching just to try and remove that variable.  I do have the memory region 1M aligned and set to non-cacheable in the MMU.  I did 1M alignment since the MMU can only work on 1M boundaries...  I too switched the order of bringing up the LTSSM.  The datasheet says to do all DBI configuration before enabling that.  Not sure if it makes a difference though.

I have not tried configuring the BARs as 64 bit, I've been assuming independent 32 bit BARs.  Did you try configuring them as independent 32 bit BARs?

0 Kudos

5,008 Views
carlpii
Contributor I

Hi Elijah,

Sorry for the delay - I'm still working on my code and haven't had time to clean up something I could post here.

Yes, I have PCIE_IATU_VP_DIR_INBOUND defined as 1.  I have not tried using BAR0 & BAR1 as 32-bit BARs.  And no, the host does not have to program the BAR before I set up inbound address translation. I set up my config space and the BAR matched inbound region before I enable the ltssm to bring the link up.  I don't set any of the memory space, I/O space, or bus master enable bits in config space as that is the responsibility of the host.  As I noted, I use an i.MX6 SDB running Linux (3.14.28) as my test host.  I've attached the modified endpoint driver I used to test the inbound address translation on my board.  Hopefully that will help.

Booting the system is a whole 'nuther issue, though.  I haven't found a way for the i.MX6 to be ready in time for enumeration by the host from a cold power-up.  For my test setup (i.MX6 SDB host running Linux), I added a delay before the host decides the link isn't going to come up.  I don't yet have a way to solve this for the x86 BIOS that will be our production environment.  For now, I have stopped the PCIe slot reset (PERST#) from resetting the i.MX6 on my board.  This way, the x86 host finds my board on a warm reboot but this is not a viable production solution.
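
The host-side delay described above amounts to polling for link-up with a bounded number of attempts. Here is a minimal, hardware-independent sketch of that idea; wait_for_link, link_up_fn, and the fake callback are all hypothetical names used only for illustration, and a real host would read its own link status register and sleep between polls.

```c
#include <assert.h>
#include <stddef.h>

/* Callback that reports whether the link is up; ctx is caller-defined. */
typedef int (*link_up_fn)(void *ctx);

/* Poll up to max_polls times; return 1 if the link came up, 0 on timeout. */
static int wait_for_link(link_up_fn is_up, void *ctx, unsigned max_polls)
{
    assert(is_up != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        if (is_up(ctx))
            return 1;
        /* a real host would delay here before polling again */
    }
    return 0;
}

/* Fake status source for experiments: reports "up" after *ctx failed polls. */
static int fake_link_up(void *ctx)
{
    unsigned *polls_until_up = ctx;
    if (*polls_until_up == 0)
        return 1;
    (*polls_until_up)--;
    return 0;
}
```

Raising max_polls on the host is the equivalent of the added delay. It does not help with a BIOS whose timeout cannot be changed, which is the unsolved case mentioned above.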

Regards,

-Carl

0 Kudos

5,008 Views
aurelian_v
NXP Employee

Hi Elijah!

I think you need to make sure that on the RC side, in your code, you are using CPU addresses and not PCI addresses.

In your case 0x90500000 and 0x310000 are PCI addresses.

On the RC side you need to set up some kind of address translation mechanism from CPU addresses to PCI addresses.

So, for example, when you write from the RC you use a CPU address, which is then translated to a PCI address, and that transaction makes it all the way to EP memory through the inbound BAR-matching/address translation mechanism (on the EP side).
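
The CPU-to-PCI translation described here can be modeled as a simple linear window. This is only a sketch of the idea: the rc_window struct and rc_cpu_to_pci function are hypothetical, and the CPU base address in the usage note is invented for illustration, since the thread does not say where the RC maps its PCI window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One RC-side window: CPU addresses [cpu_base, cpu_base + size) map
 * linearly onto PCI addresses starting at pci_base. */
struct rc_window {
    uint64_t cpu_base;
    uint64_t pci_base;
    uint64_t size;
};

/* Translate a CPU address to the PCI address that appears in the TLP.
 * Returns 1 and stores the result on a hit, 0 if cpu is outside the window. */
static int rc_cpu_to_pci(const struct rc_window *w, uint64_t cpu, uint64_t *pci)
{
    assert(w != NULL && pci != NULL);
    if (cpu < w->cpu_base || cpu - w->cpu_base >= w->size)
        return 0;
    *pci = w->pci_base + (cpu - w->cpu_base);
    return 1;
}
```

For example, with a made-up window of CPU base 0xC0000000, PCI base 0x90500000, and size 64K, a CPU write to 0xC0000010 would go out as a TLP addressed to 0x90500010. On many x86 systems the mapping is 1:1, in which case cpu_base and pci_base are equal.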

0 Kudos

5,014 Views
aurelian_v
NXP Employee

Hi Elijah!

Are you making sure that on the EP side bit 1 is set in the PCI_COMMAND register? This bit "Controls a device's response to Memory Space accesses".

From my experience, you should read 0xFFs from the RC if this bit is 0.

I would also check again the settings for inbound bar matching on EP side, just to make sure the bar number corresponds to what has been requested in the enumeration process.
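
As a quick reference for the bits involved, here is a sketch that decodes a PCI_COMMAND value. The three bit values are the standard PCI command register definitions (I/O space = bit 0, memory space = bit 1, bus master = bit 2); the helper function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_COMMAND_IO     0x0001u /* bit 0: respond to I/O space accesses */
#define PCI_COMMAND_MEMORY 0x0002u /* bit 1: respond to memory space accesses */
#define PCI_COMMAND_MASTER 0x0004u /* bit 2: may issue requests as bus master */

/* Non-zero when the device will respond to inbound memory reads/writes. */
static int ep_decodes_memory(uint16_t command)
{
    return (command & PCI_COMMAND_MEMORY) != 0;
}

/* Non-zero when the device may generate outbound requests. */
static int ep_can_master(uint16_t command)
{
    return (command & PCI_COMMAND_MASTER) != 0;
}

/* Return command with the given enable bits set (low three bits only). */
static uint16_t command_enable(uint16_t command, uint16_t bits)
{
    assert((bits & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER)) == 0);
    return (uint16_t)(command | bits);
}
```

A value with only the bus master bit set (0x0004) would explain working outbound traffic together with failing inbound traffic, so reading the register back after enumeration is a cheap check.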

0 Kudos

5,014 Views
elijahbrown
Contributor III

Thanks, I have double checked that, and indeed the bit is being set.  I am setting bit 1 and bit 2 in the EP's PCI_COMMAND register.  I have configured the inbound mapping to do address matching rather than BAR matching, just to rule out a BAR number mismatch, but that doesn't work either.

The RC sets BAR0 in the EP to 0x90500000.

I configure the IATU to map 0x90500000 to a buffer in the EP's memory space.  The buffer is 1M aligned and the MMU is set to not cache that page.

After enumeration the RC writes into 0x90500000 but it never shows up in the EP's memory.  If the RC tries to read 0x90500000, it just gets 0xFFFFFFFF back.

0 Kudos

5,014 Views
richard_zhu
NXP Employee

Did you ever configure the BAR0 address on the PCIe EP side (imx6 PCIe) in your system?

Here is one example in which the imx6 PCIe RC accesses the memory space of the imx6 EP in the imx6 PCIe EP/RC validation system.

    - setup one new outbound memory region at rc side, used
    to let imx6 pcie rc can access the memory of imx6 pcie ep
    in imx6 pcie rc ep validation system.
    - set the default address of the ddr memory to be 0x4000_0000

    NOTE:
    - default address 0x4000_0000 of ep side would be
    accessed in this demo.
    Test howto:
    step1:
    EP side:
    1.1:
    echo 0x40000000 > /sys/devices/soc0/soc.1/1ffc000.pcie/ep_bar0_addr

    1.2:
    memtool -32 0x40000000 4
    E
    Reading 0x4 count starting at address 0x40000000

    0x40000000:  6FE9E9F6 7583FBB9 39EAEFEA FBDCFD78

    step2:
    RC side:
    memtool -32 0x01000000=58D454DA
    memtool -32 0x01000004=7332095B

    step3:
    EP side:
    memtool -32 0x40000000 4
    E
    Reading 0x4 count starting at address 0x40000000

    0x40000000:  58D454DA 7332095B 39EAEFEA FBDCFD78

0 Kudos

5,014 Views
elijahbrown
Contributor III

As I said in the original post, "The RC sets BAR0 to 0x90500000 and BAR2 to 0x310000.  For whatever reason I can't make the BAR1 mask anything non-zero so I'm using BAR2 instead."  Let's start by talking about why setting the BAR1 mask doesn't seem to have any effect? 

The rest is not really useful unless you have the patched Linux running on both ends, which we don't.  This is a bare metal project, not Linux.  I have referenced the Linux patches to come up with my configuration settings, which you can see in the code I attached above.  I understand the big picture of how to test an RC->EP transaction, which is what you have described.  What is apparently missing is some configuration detail on the EP side.  The Freescale Linux patches I have seen only set up a single *outbound* mapping on the EP, and rely on the RC to send TLPs in the actual address range of the EP.  This is not possible in our case; we need to set up an inbound mapping to remap inbound TLPs in the address range of BAR0 to the IMX6's RAM.

I have studied the datasheet and implemented this as shown in the code I attached earlier, but it just doesn't work.  Please review the code I attached and see if you see anything obviously wrong.  If you are willing or it would help, I will make a stripped down project and send it to you so you can duplicate my setup.  I suspect the issue has to do with the inbound mapping at the EP but I have been over the datasheet many times and it appears to be correct.  However the datasheet PCIe section is known to have errors, so who knows....  Also any suggestions as to troubleshooting *why* it doesn't work would be useful.  None of the error bits in the Command/Status reg are set - I don't know how else to get any error information from the core.

0 Kudos

5,014 Views
richard_zhu
NXP Employee

Hi Elijah:

First of all, can you double check whether BAR0 is set to "0x90500000" or something else?

Secondly, have you captured a PCIe protocol trace with a protocol analyzer on your platform?

If so, we can figure out whether the PCIe RC on your platform issues the outbound TLP or not.

Best Regards

Richard

0 Kudos

5,014 Views
elijahbrown
Contributor III

Yes, I read back BAR0 from both the RC and the EP software, both sides verify it's 0x90500000 so I'm confident that is correct.  Right now I don't have access to a PCIe protocol analyzer.  All we have to go on right now is that the RC works with other EPs so we think it's correct for that reason. 

0 Kudos

5,014 Views
richard_zhu
NXP Employee

HI Elijah:
Based on the FSL Linux BSP release, did you ever validate the imx6 PCIe EP inbound operations on the imx6 PCIe EP/RC validation system on your side (one imx6 used as the PCIe EP, the other one used as the RC)?

The reason there is no explicit inbound iATU configuration is that the original and target addresses are 1:1 when no explicit inbound iATU regions are configured.

BTW, can you send me a picture of the whole setup and the details of the PCIe connectors in your system?

Since I have a PCIe protocol analyzer on hand, let me first check whether I can capture the protocol data on your system.

Best Regards

Richard

0 Kudos

5,014 Views
elijahbrown
Contributor III

No, we do not have the cabling to connect two boards like that, and we are using the Boundary Devices Nitrogen6X board.  I realize there is no inbound iATU setup since the TLP addresses are 1:1 with the EP memory addresses.  Unfortunately that is not possible in our system; the RC is an Intel Atom board whose memory map is totally different from the IMX6's.  So inbound address translation is necessary...

The RC is running Deos, a safety critical RTOS, so I doubt you will be able to replicate our setup exactly.  It's got a normal PCIe edge connector, and we've made a custom cable that goes from that edge connector to the PCIe header on the Nitrogen6X board.  But I think if you just set it up such that the RC sends TLPs with different addresses than the ARM and get the inbound iATU working, that is probably the point we are having trouble with.  It is not demonstrated anywhere in the Freescale examples, maybe it has not been tested?

0 Kudos

5,013 Views
richard_zhu
NXP Employee

HI Elijah:

First of all, we should figure out whether the RC actually sends the outbound TLP (address 0x9050_0000) when you expect the EP (imx6 PCIe) to receive the inbound TLP (address 0x9050_0000).

Referring to the imx6 PCIe EP/RC validation system, one outbound iATU region is mandatory on the RC side if the imx6 PCIe RC wants to access the memory region of the imx6 PCIe EP.

Secondly, the BARs of the imx6 PCIe EP should be configured too, if the PCIe EP is to be enumerated and allocated the corresponding memory spaces by the PCIe RC. The reference code can be found in the FSL Linux BSP.

Part of it is pasted below:

                /* CMD reg:I/O space, MEM space, and Bus Master Enable */
                writel(readl(pp->dbi_base + PCI_COMMAND)
                                | PCI_COMMAND_IO
                                | PCI_COMMAND_MEMORY
                                | PCI_COMMAND_MASTER,
                                pp->dbi_base + PCI_COMMAND);

                /*
                 * configure the class_rev(emaluate one memory ram ep device),
                 * bar0 and bar1 of ep
                 */
                writel(0xdeadbeaf, pp->dbi_base + PCI_VENDOR_ID);
                writel(readl(pp->dbi_base + PCI_CLASS_REVISION)
                                | (PCI_CLASS_MEMORY_RAM << 16),
                                pp->dbi_base + PCI_CLASS_REVISION);
                writel(0xdeadbeaf, pp->dbi_base
                                + PCI_SUBSYSTEM_VENDOR_ID);

                /* 32bit none-prefetchable 8M bytes memory on bar0 */
                writel(0x0, pp->dbi_base + PCI_BASE_ADDRESS_0);
                writel(SZ_8M - 1, pp->dbi_base + (1 << 12)
                                + PCI_BASE_ADDRESS_0);

                /* None used bar1 */
                writel(0x0, pp->dbi_base + PCI_BASE_ADDRESS_1);
                writel(0, pp->dbi_base + (1 << 12) + PCI_BASE_ADDRESS_1);

...
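
To relate the mask write in the snippet above (SZ_8M - 1 written at the BAR0 offset plus 1 << 12) to what the host sees, here is a sketch of the arithmetic. It models the standard PCI sizing probe for a 32-bit memory BAR, where the host writes all ones and decodes the size from the bits that read back as zero. The function names are illustrative, and the model ignores any core-specific meaning of individual mask bits.

```c
#include <assert.h>
#include <stdint.h>

/* Mask value for a power-of-two BAR size, as in the BSP snippet (size - 1). */
static uint32_t bar_mask_for_size(uint32_t size)
{
    assert(size >= 16u && (size & (size - 1u)) == 0u);
    return size - 1u;
}

/* What the host reads back after writing 0xFFFFFFFF to the BAR: address bits
 * covered by the mask read as zero, and the low 4 bits carry type flags. */
static uint32_t bar_probe_readback(uint32_t mask, uint32_t flags)
{
    return (~mask & ~UINT32_C(0xF)) | (flags & UINT32_C(0xF));
}

/* Host-side decode of the BAR size from the probe readback. */
static uint32_t bar_size_from_probe(uint32_t readback)
{
    uint32_t address_bits = readback & ~UINT32_C(0xF);
    return (uint32_t)(~address_bits) + 1u;
}
```

For the 8M example, the mask is 0x007FFFFF, the probe reads back 0xFF800000, and the host decodes a size of 0x00800000. Comparing the size the host reports against the intended mask is one way to confirm that a mask write actually took effect.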

Best Regards

Richard

0 Kudos

5,013 Views
elijahbrown
Contributor III

We don't have access to a PCIe bus analyzer, so there's not much we can do right now to verify the RC is actually sending the TLP.  It is in the works though.  The RC is not another IMX6, it's an Intel Atom, so the PCIe address mapping is totally different from the IMX's.  It does however work with an FPGA configured as an endpoint.

As I mentioned in my first post,

The RC sets BAR0 to 0x90500000 and BAR2 to 0x310000

My software in the EP sets up the BAR masks and config registers as your code above shows - I referenced that patch to figure that out.  You will see if you open the code I attached above.  I have read the EP BARs from both ends to verify they are indeed set as we expect.  The RC enumerates it and everything seems to work except for inbound requests.

0 Kudos

5,014 Views
richard_zhu
NXP Employee

Regarding Linux PCIe bus enumeration, some address spaces are allocated by the RC when the EP device is enumerated by the PCIe RC.

Can you try the following method, then check whether the PCIe EP inbound operations can be executed?
- Step1: find the memory space allocated by the RC when the iMX6 PCIe EP is enumerated during initialization.

For example, 0x0100_0000 ~ 0x017f_ffff, the 8 MByte memory space allocated by the iMX6 RC in the iMX6 PCIe EP/RC validation system.

...

[    0.394066] pci 0000:01:00.0: BAR 0: assigned [mem 0x01000000-0x017fffff]
[    0.394104] pci 0000:01:00.0: BAR 6: assigned [mem 0x01900000-0x0190ffff pref]
[    0.394114] pci 0000:01:00.0: BAR 2: assigned [io  0x1000-0x1fff]
[    0.394141] pci 0000:01:00.0: BAR 3: assigned [mem 0x01910000-0x019100ff pref]

...

- Step2: Configure the "memory space address" into the inbound iATU region of the iMX6 PCIe EP.

For example, use the start/end addresses of the memory space allocated by the PCIe RC to set up the inbound iATU region of the iMX6 PCIe EP:

...

use 0x0100_0000 as the base address, 8 MBytes as the limit.

The address in the iMX6 PCIe EP that the PCIe RC actually wants to access is used as the target address.

...

- Step3: access the memory space allocated by the PCIe RC by issuing outbound region accesses, and check whether it works.
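
The address-match translation in the steps above can be sketched as a small model: an inbound TLP address inside [base, limit] is redirected to target plus its offset from base. The struct and function names are hypothetical; the numbers in the usage note are the ones from the validation example earlier in this thread (PCI window at 0x0100_0000, EP memory at 0x4000_0000).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of one address-match inbound region. limit is inclusive. */
struct iatu_inbound {
    uint32_t base;   /* first PCI address matched */
    uint32_t limit;  /* last PCI address matched */
    uint32_t target; /* EP-side address that base maps to */
};

/* Translate an inbound PCI address to an EP-side address.
 * Returns 1 and stores the result on a match, 0 if the address misses. */
static int iatu_translate(const struct iatu_inbound *r, uint32_t pci_addr,
                          uint32_t *cpu_addr)
{
    assert(r != NULL && cpu_addr != NULL);
    if (pci_addr < r->base || pci_addr > r->limit)
        return 0;
    *cpu_addr = r->target + (pci_addr - r->base);
    return 1;
}
```

With base 0x01000000, limit 0x017FFFFF, and target 0x40000000, an RC write to 0x01000004 lands at 0x40000004 on the EP, which matches the memtool results shown earlier. An address that misses every enabled region is not translated by this model.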

BTW, can you ship a complete development kit of your platform to me?

Then I can check whether I can capture the raw PCIe protocol data with the LeCroy PCIe protocol analyzer.

0 Kudos

5,014 Views
jamesbone
NXP TechSupport

To my understanding, the EP only needs to set its BAR mask, and the RC will determine the EP's PCIe bus address and configure the EP's BAR and inbound offset registers accordingly. However, I can't be sure of the exact mechanism on the RC side, so try just configuring the BAR and inbound offset by the other device code.

0 Kudos

5,014 Views
elijahbrown
Contributor III

What do you mean by the "other device code"?  The EP's BARs, BAR masks, and address translation setup are configured by the application running on the EP.  The RC cannot write the iATU registers as they are above the PCIe config space, and that would not make sense anyway as the RC shouldn't need to know the internal memory map of the EP.  The RC writes the base addresses into the EP BARs correctly, but after that is not able to do any memory writes into the memory space reserved by the BARs.  There are no error bits set in the PCIE_EP_COMMAND register.  This is the problem.... it just doesn't work, but where to start troubleshooting?  The RC is known to work with other FPGA EPs so we are pretty confident it's doing the right thing.

0 Kudos

5,014 Views
gfine
NXP Employee

Hi Elijah,

At first blush, it looks like it should be working.  I have escalated internally.

What is the OS version?

BR,

Glen

0 Kudos

5,014 Views
elijahbrown
Contributor III

Thanks.  This is a bare metal project, no OS.  I started with the freescale bare metal SDK and worked from there...

0 Kudos

5,014 Views
gfine
NXP Employee

Hi Elijah,

OK. So the code you are using is taken from the v1.1 (February 11th, 2013) Platform SDK?

BR,


Glen

0 Kudos