I'm using an LS2088 with AIOP program code to handle fairly heavy traffic (10G to and from several interfaces). Occasionally the AIOP processing callback gets a packet from QBMan that uses exactly the same buffer as another packet which is already being processed by another task on another core (in the same kind of packet processing callback).
It seems that QBMan has written the later packet over the buffer that the first task is still processing (both FDs otherwise look fine, they just reference the same buffer). When the first task sends its packet, dpni_drv_send returns -EBUSY even though the interface is not actually congested. The later-arriving packet can be processed without any issues, but it releases the buffer in dpni_drv_send, and if the first task handles the -EBUSY by discarding the FD with ARCH_FDMA_DISCARD_FD(), the same buffer gets released into the pool twice, which naturally leads to a lot of problems.
I checked the pool by not discarding the FD on -EBUSY and then inspecting all the addresses in the pool: one of the free addresses (not the duplicated one) has disappeared, so the pool is leaking. It therefore seems that QBMan did reserve a unique buffer for the later packet, but somehow that buffer gets replaced by one already in use, the packet overwrites a packet already being processed, the same buffer ends up floating in the pool twice, and the original unique buffer leaks out of the pool permanently.
QBMAN_FBPR_AR = 0x00000018 indicates a 128 KB FBPR external memory area.
QBMAN_FBPR_HDPTR = 0x0007FBCF: is this really within that area?
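For reference, my own back-of-the-envelope check, assuming HDPTR is a byte offset into the area (if it is in FBPR-entry units instead, the comparison changes):

128 KB = 0x20000 bytes, so valid byte offsets are 0x00000 .. 0x1FFFF
0x0007FBCF > 0x1FFFF, i.e. as a byte offset it would point outside the area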
How is the BMAN internal memory stockpile managed in detail in the HW?
What software are you using?
e.g. LSDK version, MC firmware version, and is the AIOP app from LSDK or from you?
Kind of a mix of LSDK 20.04 and 20.12. MC is currently 10.29.0, but I've been able to reproduce this with several other MC versions. The AIOP application is a custom packet forwarder, forwarding packets from port 1 to port 3, from port 2 to port 3, from port 3 to port 1, and from port 4 to port 2. This creates a congestion point at the egress of port 3, where taildrop is set to drop packets when the combined port 1 + port 2 traffic exceeds the capacity of port 3.
When this issue happens, send on port 1 or port 2 returns -EBUSY, just like port 3 does constantly when its egress traffic exceeds capacity and taildrop is dropping packets; but in the error case there is only a small amount of traffic to ports 1 and 2 and they are not congested.
I am not aware of any existing issue regarding double buffer usage.
In a normal flow any frame that's received on a WRIOP port will be put in a separate buffer acquired from the port's buffer pool and then will be presented in the processing task's workspace.
If a buffer is acquired from a pool then the buffer will no longer be there. Therefore it cannot be acquired twice unless something is incorrectly configured in the application such that the buffer is released twice, meaning that there are two identical buffers in the pool.
So the problem can be looked at this way: who releases a buffer twice?
Can this problem be easily reproduced using a reference use case? What is the packet flow? Do you receive frames continuously on an interface and then at a specific moment notice this?
In my opinion you should try to monitor what's happening in the pool, because I do not see a way of acquiring a buffer twice.
However, as I already said, a buffer can be freed twice in a faulty setup.
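One crude way to monitor this is to shadow the pool state in software. A minimal sketch (all names and sizes are placeholders, it assumes one contiguous buffer region, and it has no locking, so on a multi-core AIOP you would need your SL's spinlock or atomic primitives around the check-and-set):

#include <stdint.h>

#define POOL_BASE  0x80000000ULL  /* placeholder: pool region base address */
#define BUF_SIZE   2048U          /* placeholder: buffer size */
#define BUF_COUNT  4096U          /* placeholder: number of buffers in the pool */

static volatile uint8_t g_buf_state[BUF_COUNT]; /* 1 = buffer is in the pool */

/* Call right before every buffer release; returns -1 on a double free. */
static int track_release(uint64_t addr)
{
	uint32_t idx = (uint32_t)((addr - POOL_BASE) / BUF_SIZE);

	if (idx >= BUF_COUNT)
		return -1;          /* address is not from this pool */
	if (g_buf_state[idx])
		return -1;          /* already in the pool: double free caught */
	g_buf_state[idx] = 1;
	return 0;
}

/* Call right after every frame reception / buffer acquire. */
static void track_acquire(uint64_t addr)
{
	uint32_t idx = (uint32_t)((addr - POOL_BASE) / BUF_SIZE);

	if (idx < BUF_COUNT)
		g_buf_state[idx] = 0;
}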
There is no double release before this happens (only afterwards: when both tasks finish they both release the same buffer, so there are then two instances of the same buffer in the pool, and the issue quickly escalates as those are released back into circulation again and again). This mechanism of introducing double buffers is somehow connected to dpni_drv_send returning -EBUSY even when the interface is not busy. If I deliberately free the same buffer twice, no -EBUSY is returned, but the symptoms are otherwise similar, since the situation escalates once there are two instances of the same buffer in the pool.
Pretty likely QBMan reserves a proper unique buffer for the later packet, but the address is somehow replaced with one already in use and the packet is copied from the FIFO over it. When I deliberately leak the buffer in the event of -EBUSY (just terminate the task without discard_fd), one buffer is leaked, but not the one the FD is holding! That buffer remains in circulation (the later packet is processed normally and its buffer freed), and the issue does not escalate because there is now only one instance of this buffer in circulation, but the pool has leaked.
The packet flow is 2.5G IMIX from port 1 to port 3, 8.6G jumbo frames from port 2 to port 3, 8.6G jumbo frames from port 3 to port 1, and 2.7G IMIX from port 4 to port 1. There is a congestion point at the egress of port 3, and some packets are constantly dropped since they do not fit into the 10G interface. The issue happens randomly, but with this traffic it usually occurs in less than an hour.
It is possible that my understanding is incorrect in the case where the issue happens on the same interface but in parallel tasks.
In that situation my suggestion of different pools per interface will probably not change the behavior.
Anyway, as I said earlier, try to create a simple setup that can reproduce what you observe, then send me the steps so I can set up something similar on my side.
What is the implementation of this macro in your case? I mean ARCH_FDMA_DISCARD_FD.
#define ARCH_FDMA_DISCARD_FD() \
	fdma_discard_fd((struct ldpaa_fd *)HWC_FD_ADDRESS, 0, FDMA_DIS_AS_BIT);
When you see -EBUSY returned, is the frame not discarded all the time? I mean, does the discard have to be executed explicitly?
I ask this because dpni_drv_send has a flags argument:
enum fdma_enqueue_tc_options {
	/** Return after enqueue */
	FDMA_EN_TC_RET_BITS = 0x0000,
	/** Terminate: this command will trigger the Terminate task
	 * command right after the enqueue. If the enqueue failed, the
	 * frame will be discarded. If a frame structural error is found
	 * with the frame to be discarded, the frame is not discarded
	 * and the command returns with an error code. */
	FDMA_EN_TC_TERM_BITS = 0x0400,
	/** Conditional Terminate: trigger the Terminate task command
	 * only if the enqueue succeeded. If the enqueue failed, the
	 * frame handle is not released and the command returns with an
	 * error code. */
	FDMA_EN_TC_CONDTERM_BITS = 0x0800
};
Probably in your case the first option is used. Can you use the second option, which automatically discards the frame? If the frame has a structural error then the discard will not happen (but this should not occur in your setup).
If you use the second option (0x400) then you do not need to do an explicit discard.
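A minimal sketch of what I mean, assuming the AIOP SL prototype int dpni_drv_send(uint16_t ni_id, uint32_t flags); egress_ni is a placeholder for your TX interface, and the header names should be checked against your SL:

#include "fsl_dpni_drv.h"
#include "fsl_fdma.h"

static void send_and_terminate(uint16_t egress_ni)
{
	int err = dpni_drv_send(egress_ni, FDMA_EN_TC_TERM_BITS);

	/* With FDMA_EN_TC_TERM_BITS the task terminates right after a
	 * successful enqueue, and a failed enqueue discards the frame
	 * before terminating, so no explicit ARCH_FDMA_DISCARD_FD() is
	 * needed on -EBUSY. Execution reaches this point only if the
	 * frame could not be discarded due to a frame structural error. */
	if (err)
		fdma_terminate_task();
}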
I already tried that some time ago; it didn't change the behavior in any way.
And what happened back then? Did the discard occur? Did you have to do it manually due to a frame structural error?
The discard happened in the send, as specified in the flags, and in those "double buffer" situations it escalated in the same way as with the external discard without the terminate flag passed to the send.
I will remind you my earlier request:
" try to create a simple setup that can reproduce what you observe (based on one of the available use cases) then send me the steps to have something similar at my side."
An idea for a WR (workaround) could be something like the following:
- supposing that the problem only appears at high rates and under congestion, you can create a global backlog with N entries. This backlog will be accessible by all the tasks.
i. when EBUSY is returned by the send function, the task will not automatically release the buffers. It will take the buffer addresses from the FD and save them in the backlog. No discard will be executed inside the task.
ii. when the EBUSY happens, create a task with TMAN. That task will loop and check the backlog. If the backlog reaches a limit N-j, the task will release all the buffers from the backlog and then terminate itself (or it can loop forever, depending on the use case architecture and the possible impact on performance).
iii. when another EBUSY happens, go to i.
Maybe this way of purging the un-discarded frames will avoid the potential conflict you observe between discarding during EBUSY and another frame reception; see the sketch below.
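A minimal sketch of the backlog idea. BACKLOG_N and the flush threshold are placeholders, and you should check lock_spinlock()/unlock_spinlock() and fdma_release_buffer() against your SL headers before relying on the exact names and signatures:

#include <stdint.h>
#include "fsl_spinlock.h"
#include "fsl_fdma.h"

#define BACKLOG_N      64   /* total backlog entries (placeholder) */
#define BACKLOG_FLUSH  48   /* the N-j flush threshold (placeholder) */

struct backlog_entry {
	uint64_t addr;  /* buffer address taken from the FD */
	uint16_t bpid;  /* pool the buffer must be released to */
};

static struct backlog_entry g_backlog[BACKLOG_N];
static uint32_t g_backlog_cnt;
static uint8_t  g_backlog_lock;  /* byte spinlock shared by all tasks */

/* Step i: on -EBUSY, park the buffer instead of discarding the FD. */
static int backlog_push(uint64_t addr, uint16_t bpid)
{
	int ret = 0;

	lock_spinlock(&g_backlog_lock);
	if (g_backlog_cnt < BACKLOG_N) {
		g_backlog[g_backlog_cnt].addr = addr;
		g_backlog[g_backlog_cnt].bpid = bpid;
		g_backlog_cnt++;
	} else {
		ret = -1;  /* backlog full; the caller must decide what to do */
	}
	unlock_spinlock(&g_backlog_lock);
	return ret;
}

/* Step ii: body of the TMAN-created task; once the threshold is
 * reached, release every parked buffer back to its pool. The icid and
 * flags arguments of fdma_release_buffer() depend on your setup. */
static void backlog_flush(uint16_t icid)
{
	uint32_t i;

	lock_spinlock(&g_backlog_lock);
	if (g_backlog_cnt >= BACKLOG_FLUSH) {
		for (i = 0; i < g_backlog_cnt; i++)
			fdma_release_buffer(icid, 0, g_backlog[i].bpid,
					    g_backlog[i].addr);
		g_backlog_cnt = 0;
	}
	unlock_spinlock(&g_backlog_lock);
}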
" try to create a simple setup that can reproduce what you observe (based on one of the available use cases) then send me the steps to have something similar at my side."
We are currently working on this.
And we have another finding: this issue might be somehow connected to the taildrop and early-drop functionality. Now that we have a modified framework supporting individual buffer pools for each physical port, I tried a totally different congestion handling strategy: allowing the buffer pool to deplete during congestion and using neither taildrop nor early drop. With this kind of setup we have not been able to reproduce the issue (at least not yet).
Make sure that any taildrop config is applied before any traffic is received on a port.
It must be done at init time.
Can you try to reproduce the issue on one of the existing AIOP use cases? In order to help with this I will need a way to reproduce the problem.
As a general comment, if each port has its own buffer pool then it's unlikely that you would see this problem.
My suggestion as a WR is to try, if it is possible in your scenario, a change in your service layer that hardcodes the buffer pools such that interface 1 has bpid 1, interface 2 has bpid 2, and so on.
Grep for configure_bpids_for_dpni and see if you can adjust it.
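Purely as an illustration of the direction (the real configure_bpids_for_dpni() has its own structure, so treat every name below as a placeholder):

/* Hypothetical helper: force a fixed 1:1 interface->bpid mapping so
 * that no two DPNIs ever share a buffer pool. Assumes one pool per
 * interface was created at init time. */
static uint16_t bpid_for_interface(uint16_t ni_id)
{
	/* interface 1 -> bpid 1, interface 2 -> bpid 2, and so on */
	return ni_id;
}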
I tried with each port having its own buffer pool, but unfortunately it didn't change anything and I was still able to reproduce the issue within a relatively short time.
It seems that dpni_drv_send returning -EBUSY when there is no real congestion on the port is somehow related to this special case of double buffer use: when I deliberately make a buffer double-free, no -EBUSY is returned, but naturally the symptoms after that are similar, since there are then two instances of the same buffer in the pool and in circulation.