iMXRT1052 and emWin: problems with D cache

krzysiekch
Contributor I

Hi,

I am building a GUI with the emWin library on the iMXRT1052 microcontroller and have run into a performance problem. As far as I know, the LCD and GUI buffers should be placed in a non-cacheable region of SDRAM. This works, but performance drops significantly whenever GUI elements are moved. I tried enabling the D-cache for the whole SDRAM, but then the image on the display is corrupted: the first line is shifted because some pixels are missing, and the whole image is offset.

And here comes my question. What causes this problem, and why does the D-cache need to be disabled for the LCD buffers? Is there any way to clean and flush the cache just before displaying a frame? Maybe this would help with performance. Is there anything else that could help to achieve a fast GUI without lagging elements?

I have already enabled two display buffers and also tried the emWin MEMDEV flags. My working memory region configuration looks like this:

MPU->RBAR = ARM_MPU_RBAR(8, 0x81000000U);

MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 2, 0, 0, 0, 0, ARM_MPU_REGION_SIZE_16MB);

Changing it to:

MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 1, 0, 0, 0, 0, ARM_MPU_REGION_SIZE_16MB);

also doesn't work, although I believe this setting should disable the cache as well.

EDIT: I also tried other cache configurations (including shareable with write through), but I am still having the same problem. It only works with the memory region configured as the Device type.
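
For readers comparing the two settings above: in CMSIS, the arguments of ARM_MPU_RASR are, in order, DisableExec, AccessPermission, TypeExtField (TEX), IsShareable, IsCacheable (C), IsBufferable (B), SubRegionDisable and Size. The sketch below decodes a TEX/C/B combination into the ARMv7-M memory type. The enum and the function name mpu_memory_type are hypothetical helpers for illustration only; they are not part of CMSIS or the SDK.

```c
#include <stdint.h>

/* Memory types selectable through the ARMv7-M MPU TEX/C/B bits. */
typedef enum {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE_SHAREABLE,
    MEM_DEVICE_NON_SHAREABLE,
    MEM_NORMAL_WRITE_THROUGH,     /* cacheable, no write allocate */
    MEM_NORMAL_WRITE_BACK,        /* cacheable, no write allocate */
    MEM_NORMAL_NON_CACHEABLE,
    MEM_NORMAL_WRITE_BACK_ALLOC,  /* cacheable, read and write allocate */
    MEM_OTHER                     /* reserved, implementation defined, or TEX=1xx */
} mem_type_t;

/* Hypothetical helper: map TEX (3 bits), C and B to a memory type. */
mem_type_t mpu_memory_type(unsigned tex, unsigned c, unsigned b)
{
    switch (((tex & 7u) << 2) | ((c & 1u) << 1) | (b & 1u)) {
    case 0x00: return MEM_STRONGLY_ORDERED;        /* TEX=000 C=0 B=0 */
    case 0x01: return MEM_DEVICE_SHAREABLE;        /* TEX=000 C=0 B=1 */
    case 0x02: return MEM_NORMAL_WRITE_THROUGH;    /* TEX=000 C=1 B=0 */
    case 0x03: return MEM_NORMAL_WRITE_BACK;       /* TEX=000 C=1 B=1 */
    case 0x04: return MEM_NORMAL_NON_CACHEABLE;    /* TEX=001 C=0 B=0 */
    case 0x07: return MEM_NORMAL_WRITE_BACK_ALLOC; /* TEX=001 C=1 B=1 */
    case 0x08: return MEM_DEVICE_NON_SHAREABLE;    /* TEX=010 C=0 B=0 */
    default:   return MEM_OTHER;
    }
}
```

Read this way, mpu_memory_type(2, 0, 0) gives MEM_DEVICE_NON_SHAREABLE and mpu_memory_type(1, 0, 0) gives MEM_NORMAL_NON_CACHEABLE, so both settings bypass the D-cache but only the first one is Device memory.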

Best regards,

Krzysiek

13 Replies

victorjimenez
NXP TechSupport

Hello Krzysiek,

Please see my comments on your questions below.

What causes this problem, and why does the D-cache need to be disabled for the LCD buffers?

The basic problem is that the LCD controller has no visibility into the cache. So if the data is updated in the cache, but not written to memory, the LCD will read stale data from the memory. 

Is there any way to clean and flush the cache just before displaying the image frame? Maybe this would help with the performance.

Because the LCD will only ever read data from memory, using the cache in write-through mode should work. In write-through mode, new data will be pushed out to external memory, but if the core needs to read data that is cached it doesn't have to go out to memory again. The core performance will be slower than what you would get for copyback mode, but you shouldn't need to do manual cache operations to flush data.
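
As a rough illustration of the write-through setting, the sketch below assembles the MPU RASR word from its fields using the ARMv7-M bit layout (XN bit 28, AP bits 26:24, TEX bits 21:19, S bit 18, C bit 17, B bit 16, SRD bits 15:8, SIZE bits 5:1, ENABLE bit 0). The function mpu_rasr is a hypothetical host-side helper meant to mirror what the CMSIS ARM_MPU_RASR macro computes; on the target you would use the macro itself.

```c
#include <stdint.h>

/* Hypothetical helper: build an ARMv7-M MPU RASR value with ENABLE set.
 * xn   : 1 disables instruction fetch from the region
 * ap   : access permission (3 = full access, ARM_MPU_AP_FULL in CMSIS)
 * tex, s, c, b : memory type and shareability attributes
 * srd  : sub-region disable mask
 * size : encoded region size N, region spans 2^(N+1) bytes (0x17 = 16 MB)
 */
uint32_t mpu_rasr(uint32_t xn, uint32_t ap, uint32_t tex, uint32_t s,
                  uint32_t c, uint32_t b, uint32_t srd, uint32_t size)
{
    return ((xn   & 0x01u) << 28) |
           ((ap   & 0x07u) << 24) |
           ((tex  & 0x07u) << 19) |
           ((s    & 0x01u) << 18) |
           ((c    & 0x01u) << 17) |
           ((b    & 0x01u) << 16) |
           ((srd  & 0xFFu) <<  8) |
           ((size & 0x1Fu) <<  1) |
           0x01u; /* region enable */
}
```

For a 16 MB region with full access, write-through (TEX=0, C=1, B=0) would be mpu_rasr(0, 3, 0, 0, 1, 0, 0, 0x17), which evaluates to 0x0302002F, while the Device setting from the original post (TEX=2, C=0, B=0) evaluates to 0x0310002F.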

Is there anything else that could help to achieve fast GUI without lagging elements?

If your application fits in internal RAM, that is the best way to improve performance considerably: the processor can write to internal RAM faster than to SDRAM.


Have a great day,
TIC

krzysiekch
Contributor I

Hi,

Thank you for your suggestions, but unfortunately I couldn't solve the problem. I tried more cache configurations for the SDRAM region:

MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 2, 0, 0, 0, 0, ARM_MPU_REGION_SIZE_1GB);
MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 0, 0, 1, 0, 0, ARM_MPU_REGION_SIZE_1GB);
MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 0, 1, 1, 0, 0, ARM_MPU_REGION_SIZE_1GB);
MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 1, 1, 0, 0, 0, ARM_MPU_REGION_SIZE_1GB);

but only the first one (Device mode) works correctly. The remaining configurations cause the same problem. I also added cache cleaning and invalidation before switching the image buffer, but also without any success:

L1CACHE_CleanDCache();
L1CACHE_InvalidateDCache();

Cleaning and invalidating parts of the memory does not change anything. 
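
A side note on cleaning only part of the memory: the by-address cache maintenance operations on the Cortex-M7 act on whole 32-byte cache lines, so a buffer range is usually widened to line boundaries before it is passed to a routine such as the CMSIS SCB_CleanDCache_by_Addr. The sketch below shows only that rounding step. The type cache_range_t and the function cache_align_range are hypothetical names introduced for illustration. For a frame buffer that the CPU writes and the LCD controller only reads, a clean (write-back) should be sufficient; an invalidate is not needed for that direction.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u /* Cortex-M7 L1 data cache line size in bytes */

/* Hypothetical helper type: a cache-line-aligned address range. */
typedef struct {
    uintptr_t start; /* rounded down to a line boundary */
    size_t    size;  /* multiple of the line size covering the request */
} cache_range_t;

/* Widen [addr, addr + size) so it starts and ends on cache line boundaries. */
cache_range_t cache_align_range(uintptr_t addr, size_t size)
{
    const uintptr_t mask = (uintptr_t)DCACHE_LINE_SIZE - 1u;
    uintptr_t start = addr & ~mask;
    uintptr_t end   = (addr + size + mask) & ~mask;
    cache_range_t r = { start, (size_t)(end - start) };
    return r;
}
```

For example, a 100-byte update starting at 0x81000010 is widened to the 128 bytes starting at 0x81000000.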

I will try to check the SDRAM and LCD controller configuration because I have no more ideas regarding the cache.

Krzysiek

UPDATE:

I did some more debugging and found out that disabling the D-cache completely does not fix the display issue. It of course affects the overall performance, but the image is still not displayed correctly. The only thing that seems to work is configuring the MPU region as Device or Strongly-ordered.

victorjimenez
NXP TechSupport

Hello Krzysiek,

With the emwin_slide_show example that we provide within the SDK, I'm not able to reproduce the behavior you mentioned. I left the default cache configurations for the SDRAM region and everything went fine. 

If you are building a GUI with a lot of moving elements, I encourage you to take a look at Embedded Wizard. Embedded Wizard developed all of their drivers with the Cortex-M7 in mind, so their solution performs very well and you should not run into the behavior you are seeing.

In the following link, you will find a getting started guide from Embedded Wizard for the i.MX RT1050. I highly recommend following this guide so you can evaluate the performance.

Best regards,

Victor

krzysiekch
Contributor I

Hi Victor,

I think I have finally found the possible source of the problem. I ran the emwin_slide_show demo from the SDK on the MIMXRT1050-EVK, but I increased the pixel clock from 9.3 MHz to 23.25 MHz by changing the divider:

CLOCK_SetDiv(kCLOCK_LcdifPreDiv, 1);

The effect is the same as the one I described in my previous messages, so I think you will be able to reproduce this behaviour. Moreover, I decreased the pixel clock on my board to about 10 MHz and it started to work correctly. The performance with the write-through cache configuration is very good and the image is displayed correctly.

Now the problem is that I need to set the pixel clock to about 33 MHz, which is the typical clock frequency for my display. From this thread I know that the maximum pixel clock is 75 MHz:

i.MX RT1050 parallel RGB LCD H-res / dot clock rating 

How could I enable the faster clock? Is it related to some other frequency in the system that needs to be changed according to the pixel clock? For the clock configuration I use the Clock Configuration Tool from MCUXpresso. 33.6MHz seems to be a correct value, because I can't see any warnings.

Best regards,

Krzysiek

victorjimenez
NXP TechSupport

Hello Krzysiek,

Thanks for providing more information! Could you please tell me which version of the SDK you are using? In the newer version of the SDK (2.6.0) the ELCDIF driver was updated.

I used the example emwin_slide_show from this version to make some tests. I saw that the LCDIF_clock is set to 67.5 MHz and the example runs perfectly.

pastedImage_7.png

Could you please update the SDK and try your emwin GUI along with the newest drivers at your desired speed (33.6 MHz) to see if the problem persists?

Best regards,

Victor

krzysiekch
Contributor I

Hi Victor!

I downloaded the newest SDK version (2.6.0) on Wednesday, so I guess I have the newest possible version.

I checked once again the emwin_slide_show example code and I can see that the configuration you mentioned is located in function BOARD_BootClockRUN in the clock_config.c file. It is generated by the clock configuration tool and you are right that the pixel clock has the frequency equal to 67.5 MHz.

However in the emwin_slide_show.c file there is another function (BOARD_InitLcdifPixelClock) that overwrites the mentioned configuration. According to the comments the pixel clock is set to 9.3 MHz. Increasing this clock, by changing this line (so changing the PreDiv from 4 to 1):

CLOCK_SetDiv(kCLOCK_LcdifPreDiv, 1);

causes the problems with the image.

It looks like the second function is responsible for setting the final pixel clock.

I will be able to verify it once again on Monday, when I am back home. I could also send you the MCUXpresso project.

Krzysiek

victorjimenez
NXP TechSupport

Hello Krzysiek,

Thanks for sharing more information! I had completely overlooked the modifications made inside the function BOARD_InitLcdifPixelClock. I ran some tests on my side and was able to reproduce the behavior you mentioned. However, the behavior shown is perfectly normal; let me explain why.

The function BOARD_InitLcdifPixelClock, as you stated, overwrites the clock configuration. However, it changes more than just the dividers. With the following lines of code, you are initializing the Video PLL (PLL5) at a frequency of 93 MHz.

    /*
     * Initialize the Video PLL.
     * Video PLL output clock is OSC24M * (loopDivider + (denominator / numerator)) / postDivider = 93MHz.
     */
    clock_video_pll_config_t config = {
        .loopDivider = 31,
        .postDivider = 8,
        .numerator   = 0,
        .denominator = 0,
    };

    CLOCK_InitVideoPll(&config);

Now, with the following lines, you are overwriting the clock configurations.

    /*
     * 000 derive clock from PLL2
     * 001 derive clock from PLL3 PFD3
     * 010 derive clock from PLL5
     * 011 derive clock from PLL2 PFD0
     * 100 derive clock from PLL2 PFD1
     * 101 derive clock from PLL3 PFD1
     */
    CLOCK_SetMux(kCLOCK_LcdifPreMux, 2);

    CLOCK_SetDiv(kCLOCK_LcdifPreDiv, 4);

    CLOCK_SetDiv(kCLOCK_LcdifDiv, 1);

With the first line, you are selecting the source clock for the LCDIF; in this case it is PLL5, which you configured earlier to run at 93 MHz. If you change the PreDiv to one, the LCDIF clock is going to be 93 MHz, which is out of spec since the maximum allowed frequency is 75 MHz.

I made some changes on the clock configurations on MCUXpresso Config Tools, to show what is the final configuration that you have when you exit the function BOARD_InitLcdifPixelClock and you set the PreDiv to one. I think this helps to clarify what I mentioned before. To be inside the spec, the value of PreDiv must be at least 2.

pastedImage_1.png

Best regards,

Victor

krzysiekch
Contributor I

Hi Victor!

I am not 100% sure, but I don't think you are right. Changing the divider value to one in the line:

  CLOCK_SetDiv(kCLOCK_LcdifPreDiv, 1);

sets the division factor to 2, because the register field holds the division factor minus one. The configuration you described (LCDIF clock set to 93 MHz) requires setting both fields to 0. I checked that using the configuration tool. Setting both values to 1:

CLOCK_SetDiv(kCLOCK_LcdifPreDiv, 1);
CLOCK_SetDiv(kCLOCK_LcdifDiv, 1);

divides the clock by 4, so the output value should be 93 MHz / 4 = 23.25 MHz.

Also, looking at the original comment: the clock source is 93 MHz (Video PLL) and it is divided by 10 to obtain 9.3 MHz. This is done by setting those two division fields to 4 and 1. The total division factor is then (4+1)(1+1) = 10.
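
The arithmetic above can be checked with a small host-side sketch. It assumes the Video PLL runs in integer mode (numerator = 0, as in the SDK snippet quoted earlier) and that each divider field divides by its value plus one. The function names video_pll_hz and lcdif_pixel_clock_hz are hypothetical and not part of the SDK.

```c
#include <stdint.h>

/* Video PLL output in Hz, integer mode only (numerator assumed to be 0). */
uint32_t video_pll_hz(uint32_t osc_hz, uint32_t loop_divider, uint32_t post_divider)
{
    return (uint32_t)(((uint64_t)osc_hz * loop_divider) / post_divider);
}

/* LCDIF pixel clock in Hz. pre_div_field and div_field are the values
 * passed to CLOCK_SetDiv(); each field divides by (field + 1). */
uint32_t lcdif_pixel_clock_hz(uint32_t source_hz, uint32_t pre_div_field, uint32_t div_field)
{
    return source_hz / ((pre_div_field + 1u) * (div_field + 1u));
}
```

With a 24 MHz oscillator, loopDivider = 31 and postDivider = 8, video_pll_hz gives 93 MHz. From that source, fields (4, 1) give 9.3 MHz, fields (1, 1) give 23.25 MHz, and only fields (0, 0) would give the full 93 MHz.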

Krzysiek

victorjimenez
NXP TechSupport

Hi Krzysiek!

You are right! Sorry for the misunderstanding. I'm currently checking with the applications team what might be causing this behavior. I will give you an update as soon as possible.

Best regards,

Victor

victorjimenez
NXP TechSupport

Hi Krzysiek!

In the datasheet of the LCD (link), table 7.3.1 indicates that the maximum DCLK frequency is 12 MHz and the typical is 9 MHz, which is why the example is configured for 9.3 MHz. If you exceed the maximum allowed frequency, we cannot guarantee correct operation of the LCD.

pastedImage_1.png

Best regards,

Victor

krzysiekch
Contributor I

Hi Victor.

I suspected that this might be the case, and it looks like it is not related to my problem. In the meantime, however, I found a potential solution for my problem too. I changed the source clock for the LCD interface to PLL3_PFD1 and set the dividers to 2 and 6.

pastedImage_1.png

This configuration generates a 30 MHz pixel clock, which matches my display requirements. The interesting thing is that the other divider configurations I checked do not work and cause the same problems. Unfortunately I don't understand why this configuration works. Maybe the core clock (600 MHz) needs to be a multiple of the pixel clock? I will try to do some more research regarding this issue. I will be grateful for any suggestions.

Best regards

Krzysiek

krzysiekch
Contributor I

Hi Victor,

Thank you for the link - I will go through it.

After some more debugging I found out that the image buffers are correct, but on the display I can see some black pixels at the beginning of the frame. All addresses are set correctly in the LCD controller, but it looks like there is some offset in the pixel data. It is quite interesting because the image is stable, but shifted by those extra black pixels, which are not present in the memory accessed by the LCD controller. Moreover, this offset changes after redrawing a bigger part of the screen, but after that it remains stable again.

I still don't know how to explain this, but it may be the issue with the LCD timing configuration. I will check again the SDK demo and the link you sent - maybe I will find the problem in my code.

Best regards,

Krzysiek

victorjimenez
NXP TechSupport

Hello Krzysiek,

Sorry for the late response. I was making some tests on my side to reproduce the behavior you mentioned. I'm now checking this with an applications engineer. I will give you an update as soon as possible.

Best regards,

Victor.
