fsl,qman-frame-queues-rx

Solved

bogdantanasa
Contributor II

Hi,

I am using the t4240qds system with a device tree that enables usdpaa. Could someone please explain to me what this syntax means:

fsl,qman-frame-queues-rx = <0x6c 0x1 0x6d 1>;

I assume that 0x6c 0x1 means that there is 1 Rx queue. However, I do not understand what 0x6c means. Is this an address? How does 0x6c relate to the DPAA user manual for T4240 processors?

PS: these values are taken from the default device tree file that comes with SDK 1.6.

Thanks,

Bogdan.

14 Replies
radu_bulie
NXP Employee

You can search for the DPAARM in the info center (nxp.com/infocenter) and look, within that document, for the section on Hash Value Generation.

bogdantanasa
Contributor II

Given an Ethernet interface that will be used by usdpaa (i.e. declared with compatible dpa-ethernet-init), what rules should be followed when allocating buffer pools in the device tree? I see that all examples in SDK 1.6 allocate 3 such pools starting from ID 7. My concrete questions are:

1. Are more pools going to improve the performance?

2. FMAN uses a direct portal to talk to BMAN. If one allocates n buffer pools to that Ethernet interface, does this mean that the remaining ones can be used by the software portals?

3. According to the manual, a T4240 rev 2.0 has 64 buffer pools. Can all of them be used by usdpaa? Or is there a starting index from which one can reserve them (for example the magical 7 that appears in all the examples)?

radu_bulie
NXP Employee

On an interface, one can configure a maximum of 8 buffer pools (EBMPI register). Theoretically there shouldn't be a restriction on configuring them in the dts. However, the usdpaa code will only use 3 buffer pools, so changes should be performed there, at the ppac level.

1. The buffer size is calculated based on the frame size. The BMI optimizes the selection of the buffer to minimize the memory footprint: it tries to obtain the BMan pool with the smallest buffers that can still fit the whole frame, then tries to allocate one external buffer within the selected pool from BMan. If the requested pool is depleted, the BMI requests a buffer from the next larger size pool. If that pool is also depleted, it tries another pool with bigger buffers, and so on. This process continues until either a buffer is found or no more pools with larger buffers are available. Depending on the traffic packets' size, more than one pool should be an improvement (see the sketch at the end of this reply). Note that BMan is only a manager of tokens; no allocations actually occur in BMan.

2. There is no implicit logic that separates the buffers in a bpool. One can define storage profiles per port, and each storage profile may contain buffer pools for specific flows. For example, one can have separate pools for kernel-space traffic and user-space traffic. One should know very well the design of his scheme, i.e. when the different flows acquire or release buffers.

For example, if you send a kernel frame (which comes from a kernel pool) to user space, a copy should be performed into the user-space pool, otherwise a crash would occur.

3. If you start the board with no private interfaces, that is, only usdpaa interfaces, you can use all 64 pools (changes should be performed at the usdpaa level). The starting index for a bpool is 0.
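
As a rough sketch of point 1: one could declare several pools with graduated buffer sizes and attach them all to the interface, so the BMI can pick the smallest buffer that fits each frame. The bpids, counts and sizes below are only examples (not SDK values), and the compatible and threshold properties are omitted for brevity:

bp7: buffer-pool@7 {
	fsl,bpid = <7>;
	fsl,bpool-ethernet-cfg = <0 0x400 0 256 0 0>;	/* small frames */
};
bp8: buffer-pool@8 {
	fsl,bpid = <8>;
	fsl,bpool-ethernet-cfg = <0 0x400 0 768 0 0>;	/* medium frames */
};
bp9: buffer-pool@9 {
	fsl,bpid = <9>;
	fsl,bpool-ethernet-cfg = <0 0x400 0 2048 0 0>;	/* full-size frames */
};

and, in the Ethernet node, something like fsl,bman-buffer-pools = <&bp7 &bp8 &bp9>;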

bogdantanasa
Contributor II

Regarding the configuration of bpools with different sizes: is this achieved by "fsl,bpool-ethernet-cfg"? There are 6 arguments there (which I believe should be interpreted in groups of 2, thus 3 arguments). What is their meaning?

Similarly, for the syntax "fsl,bpool-thresholds" ... I assume this applies when a bpool is close to becoming empty. What do the arguments mean?

PS: I'll buy you a beer when I pass through Bucharest! :smileygrin:

radu_bulie
NXP Employee
fsl,bpool-ethernet-cfg = <count size base_address>;

count = number of buffers in the pool
size = buffer size
base_address = physical address of the bpool.

"Because two Linux partitions have different memory spaces, this physical address will be mapped in both partitions. In scenarios where a single partition is used, this address will be invalid, typically 0, and a dynamic mapping from user space to kernel space will be done. The reason there are two numbers for each of count, size, and base_address is that we support 36-bit addresses on the P4080 (and 64-bit on the P5020). It should be noted that the size of those parameters is set by the root node's #address-cells and #size-cells properties."

Regarding the thresholds, here is an example:

fsl,bpool-thresholds = <0x8 0x20 0x0 0x0>;

Each pool maintains two independent depletion states: a SW depletion state (for software use) and another one for HW blocks (FMAN, CAAM, PME) - the latter corresponds to the last two values in the example. For each there is a depletion entry and a depletion exit threshold. In the case above the pool will enter depletion when it falls below 8 buffers and exit depletion when it rises above 32 buffers. In this case the software will take care of "replenishment" - a callback will be called. The purpose of the HW thresholds is to implement a push-back method, e.g. issuing pause frames when a depletion threshold is reached.
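
Putting the two properties together, a buffer pool node might look roughly like this (the bpid, compatible strings and all values below are only illustrative, not taken from a real dts):

bp7: buffer-pool@7 {
	compatible = "fsl,t4240-bpool", "fsl,bpool";
	fsl,bpid = <7>;
	/* <count size base_address>, each value as a pair of cells:
	   0x400 buffers of 192 bytes, base address 0 => dynamic mapping */
	fsl,bpool-ethernet-cfg = <0 0x400 0 192 0 0>;
	/* <sw-enter sw-exit hw-enter hw-exit>: SW depletion entered below 8
	   free buffers, exited above 32; HW thresholds left disabled here */
	fsl,bpool-thresholds = <0x8 0x20 0x0 0x0>;
};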


PS: :smileyhappy:
bogdantanasa
Contributor II

1. Nice ... but how does Linux know that there is only one partition? I assume by partition you mean that there is a hypervisor (which may not be Freescale's) that runs different Linux instances ... right?

2. I ran different tests with different configurations of the bportals and qportals boot arguments. It seems there is an impact on performance. Currently the driver distributes these portals evenly to the cores if the arguments are missing. If one wants only usdpaa applications, is there any way to allocate ALL the software portals to these applications?

3. Sending frames is equivalent to enqueuing to QMAN. This means that one needs to build the entire Ethernet frame in memory by hand, so to say ... right? It seems the advantage of using the FMAN PCD is only for Rx.

radu_bulie
NXP Employee

1. Linux does not know. One would configure the hv-dts with the desired memory windows.

2. You can try to boot in the usdpaa context without any qportal/bportal args. On the T4240 there are 50 SW portals.

You will have a total of 24 cores, each core with its consumer thread; thus you will use 24 SW affine portals. Make sure you have the argument isol_cpus=0..23.

3. For sending a frame one has to build the frame descriptor structure. FMAN is used for ingress traffic - classification, policing, distribution, header manipulation. One can also use offline ports, on which the same operations mentioned above can be applied before sending the frame to a Tx port. For example, when flows exit the SEC engine they can be sent to an offline port for a forwarding decision before being sent to the Tx port.

bogdantanasa
Contributor II

I want to check a few things with you to make sure I got it right ... :smileyhappy:

Let us say I want to transmit the string "0123456789" through an FMAN port. In my mind the following sequence of steps must be followed.

1. global and per thread initializations

2. allocate the "dma memory"

3. allocate space within the "dma memory" to store the desired string

4. allocate a bman buffer which "points" to the memory from point 3. I must say that this step is not very clear to me. What are the best functions to use? bman_acquire, or are there others?

after point 4 one should have a frame descriptor

5. enqueue the frame descriptor into the Tx frame queue (by invoking qman_enqueue)

there are 2 default tx queues in the device tree

6. get the result in the default Tx confirmation queue if everything is OK; otherwise in the default Tx error queue.

Is this sequence of steps the correct one?

radu_bulie
NXP Employee

Hi,

This device tree property can be read as follows:

0x6c = Rx error frame queue id

1 = number of Rx error frame queues. There is exactly one error frame queue, hence the value of 1. Optionally, one can specify the value 0 to instruct the driver to dynamically allocate the frame queue IDs.

0x6d = Rx default frame queue id

1 = number of Rx default frame queues

Note:

If you have a node like this:

fsl,qman-frame-queues-rx = <0x6c 0x1 0x6d 1 0x180 128>;

that means that 128 queues starting from fqid 0x180 will be allocated dynamically for PCD use (FMAN engine) and assigned to the core-affine portals in a round-robin fashion.
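
For context, here is a rough sketch of how these properties typically sit inside a usdpaa Ethernet node. The Rx values are the ones discussed above; the node name, buffer-pool phandles and the Tx fqids are only illustrative (not taken from the actual t4240qds dts), and the Tx comment reflects the assumed meaning, mirroring the Rx property:

ethernet@0 {
	compatible = "fsl,dpa-ethernet-init";
	fsl,bman-buffer-pools = <&bp7 &bp8 &bp9>;
	/* <rx-error-fqid 1  rx-default-fqid 1  pcd-base-fqid pcd-count> */
	fsl,qman-frame-queues-rx = <0x6c 1 0x6d 1 0x180 128>;
	/* assumed: <tx-error-fqid 1  tx-confirm-fqid 1> */
	fsl,qman-frame-queues-tx = <0x6e 1 0x6f 1>;
};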

Radu

bogdantanasa
Contributor II

Hi,

Thanks for the answer.

Is there any practical reason to have a syntax like this: fsl,qman-frame-queues-rx = <0x6c 0x1 0x6d 1 0x180 128>? I imagine it is the job of the FMAN configurator (i.e. fmc) to say which fqid shall be used.

/Bog.

radu_bulie
NXP Employee

The syntax for the PCD is optional; you can have it or not. The syntax is interpreted by the dpaa-eth driver and allows the creation of that range of queues by the driver. Further on, a user will pick up queues from that range, build a PCD that contains queues from there, and apply the PCD - either with the fmc tool or by applying the model using the fmlib API. FMAN has "no idea" of the queues created by dpaa-eth.

bogdantanasa
Contributor II

Because FMAN has "no idea" about those queues ... what would be the best design? To let the driver create only the default queues and later use fmc for the PCD? It feels like this, since otherwise one has to make sure that fmc uses exactly the same fqid(s) declared in the device tree. What happens if the driver allocates queues for the PCD which are never used, while fmc uses totally different queues?

radu_bulie
NXP Employee

If one wants to have a particular configuration and apply it with fmc, he will create the queues according to the PCD model - queues can be affine to cores or can be used by other engines (the user assigns the consumer at creation time).

There is also the configuration where one would like to classify the traffic before it goes into the Linux stack; in this case he will use the queues created by the dpaa-eth driver.

I think these queues (the PCD queues) are created automatically - you can check in sysfs:

cat /sys/devices/fsl,dpaa.19/ethernet.20/net/fm1-mac1/fqids


Rx error: 262
Rx default: 263
Rx PCD: 14336 - 14463
Tx confirmation (mq): 264 - 287
Tx(recycling): 288 - 311
Tx error: 312
Tx default confirmation: 313
Tx: 314 - 337

 

If you do not have this path, just run:

find / -name "*fqid*" -type f

So it depends on what you want to do with the model. You will have the PCD queues anyway; there is no issue if you decide not to use them.

bogdantanasa
Contributor II

Maybe a bit off-topic ... I assume that the hash function used by FMAN to generate/compute an FQID is proprietary information ... or is there documentation that I missed?
