Does WEC7 OAL support SMP?
Background: A customer is using an FPGA to generate interrupts, but the design fails when more than one core is enabled. The OS is WEC2013; I would assume this isn't an issue with Linux, so there is probably something the customer is not doing. Any advice is welcome. I did confirm that they are not caching the address range for the FPGA.
Below is the customer's information.
We’re having some issues with our i.MX6 and WEC2013 project. We have a scenario that works when we enable only a single core. When we enable multiple cores, with the same OS, changing only the number of enabled cores from 1 to 2, 3, or 4, the scenario fails. Why?
Here is the background. We have an FPGA gathering data that is connected to an i.MX6 Quad core processor. We are using the memory bus to access and read data from the FPGA. We have an interrupt line that also goes from the FPGA to the processor, and a FIFO Almost Full line on the FPGA that we monitor for errors.
Here is the flow:
When our FIFO fills up to a specified level on the FPGA, it generates an interrupt. That interrupt then triggers our Interrupt Service Routine (ISR) which sends an event to our Interrupt Service Thread (IST) which reads in all of the data that is ready. Our IST then signals our Application which processes the data. We also have a FIFO Almost Full (FAF) signal from the FPGA that lets us know when our FIFO is almost full. If the FIFO fills up and we haven’t read the data from the FPGA then the data is lost and we will have errors.
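The flow above implies a latency budget: once the interrupt fires, the IST has to start draining the FIFO before the remaining space fills up. A minimal sketch of that arithmetic follows. The struct, the function names, and every number used with them are hypothetical, since the post does not give the FIFO depth, the interrupt level, or the data rate, and the model ignores any draining that overlaps with filling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical FIFO description; none of these values come from the post. */
typedef struct {
    uint32_t depth_words;            /* total FIFO depth */
    uint32_t irq_level_words;        /* fill level at which the FPGA raises the interrupt */
    uint32_t fill_rate_words_per_ms; /* rate at which the FPGA writes new data */
} fifo_params_t;

/* Microseconds between interrupt assertion and overflow if nothing is read. */
uint32_t fifo_headroom_us(const fifo_params_t *p)
{
    assert(p->fill_rate_words_per_ms > 0);
    assert(p->irq_level_words <= p->depth_words);
    uint64_t free_words = (uint64_t)p->depth_words - p->irq_level_words;
    return (uint32_t)((free_words * 1000u) / p->fill_rate_words_per_ms);
}

/* Returns 1 if a hold-off of holdoff_us uses up the whole budget, else 0. */
int holdoff_overflows(const fifo_params_t *p, uint32_t holdoff_us)
{
    return holdoff_us >= fifo_headroom_us(p) ? 1 : 0;
}
```

With assumed values of a 4096-word FIFO, an interrupt level of 2048 words, and 1024 words per millisecond, the budget is 2000 microseconds, so a hold-off of a few milliseconds would be enough to lose data.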
If we only enable one core of the processor, then we can see an orderly progression through this sequence from ISR to IST to App. The FAF Signal never gets asserted, data is never lost, and it meets all of our timing requirements.
If we enable multiple cores, we see FAF asserted and data gets lost. We see multiple situations. Sometimes our ISR is held off. Sometimes our IST is held off. Sometimes we are in either the IST or ISR and get task switched, and don’t finish reading all of the data in time. Sometimes we are being held off for milliseconds.
Things that we have tried, unsuccessfully, for a multicore scenario:
- Adjusting our application data collection thread priority.
- Adjusting our driver’s IST priority.
- Adjusting our application data collection thread processor affinity.
- Adjusting our driver’s IST processor affinity.
- Adjusting our IRQ’s processor affinity in the Generic Interrupt Controller (GIC).
- Adjusting our IRQ’s priority in the Generic Interrupt Controller (GIC).
Settings for our successful single core scenario:
- Adjusting our application data collection thread priority. The Time Critical (248) value hasn’t worked. We have to set the priority much higher (a numerically lower value), e.g. 50 or 0.
- Adjusting our driver’s IST priority. Same values as above.
- Setting our application data collection thread processor affinity to none, which lets the kernel decide.
- Setting our driver’s IST processor affinity to none, which again lets the kernel decide.
- Leaving our IRQ’s processor affinity in the Generic Interrupt Controller (GIC) at the default (Core 1).
- Leaving our IRQ’s priority in the Generic Interrupt Controller (GIC) the same as that of all the other IRQs.
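For readers unfamiliar with the numbers in the list above: Windows CE thread priorities run from 0 (highest) to 255 (lowest), and the eight classic Win32 levels occupy the bottom of that range, with Time Critical at 248. The sketch below only models that numbering in portable C so the comparison is explicit. The helper names are hypothetical and are not part of any Windows CE API; on the target, the resulting value would typically be passed to CeSetThreadPriority.

```c
#include <assert.h>

/* Windows CE priority range: 0 is the highest priority, 255 the lowest. */
#define CE_PRIORITY_COUNT      256
/* The eight Win32 levels (0 = Time Critical ... 7 = Idle) map to 248..255. */
#define CE_WIN32_PRIORITY_BASE 248

/* Hypothetical helper: convert a Win32 level (0..7) to a CE priority. */
int ce_priority_from_win32_level(int level)
{
    assert(level >= 0 && level <= 7);
    return CE_WIN32_PRIORITY_BASE + level;
}

/* Hypothetical helper: nonzero if priority a outranks b (lower number wins). */
int ce_priority_outranks(int a, int b)
{
    assert(a >= 0 && a < CE_PRIORITY_COUNT);
    assert(b >= 0 && b < CE_PRIORITY_COUNT);
    return a < b;
}
```

Under this numbering, a thread at 50 or 0 outranks one at Time Critical (248), which is why the values quoted above count as "much higher" even though they are numerically smaller.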
With this scenario working, all we do is enable multiple cores in eboot, i.e. set it to 2, 3, or 4 and things stop working. This is the same OS image and code. Nothing changes. The only thing that changes is the number of cores that are enabled. So, again, we return to where we started. What differences are you aware of between single and multiple cores that we are missing or should be aware of? Do you know of anyone else having similar issues? And do you know about their resolutions?