Hi all,
I am developing new video scaling software based on the "vpudec" and "videoscale" GStreamer plugins on the i.MX6Q Sabre board.
I have a problem with memory access.
I would like to scale the frames decoded by the VPU.
However, just copying a decoded frame from the memory used by the VPU to another buffer for scaling takes a long time
(approximately 50 ms per VGA frame).
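To put that figure in perspective, the implied copy throughput can be worked out from the frame size. This is only a back-of-the-envelope sketch: it assumes a 4:2:0 frame at 1.5 bytes per pixel, which is an assumption rather than something stated above, and the function names are purely illustrative.

```python
def frame_bytes(width, height, bytes_per_pixel=1.5):
    """Size of one frame in bytes (1.5 bytes/pixel assumes 4:2:0 video)."""
    return int(width * height * bytes_per_pixel)


def copy_throughput_mb_s(width, height, copy_ms, bytes_per_pixel=1.5):
    """Effective copy throughput in MB/s (1 MB = 10**6 bytes)."""
    seconds = copy_ms / 1000.0
    return frame_bytes(width, height, bytes_per_pixel) / seconds / 1e6


if __name__ == "__main__":
    # VGA frame copied in about 50 ms, as reported above
    print(frame_bytes(640, 480))                        # bytes per frame
    print(round(copy_throughput_mb_s(640, 480, 50), 2)) # MB/s
```

Under these assumptions one frame is 460800 bytes, so 50 ms per copy works out to roughly 9 MB/s.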
I also tried the stock GStreamer pipeline:
> gst-launch-1.0 filesrc location=test_640x480.avi typefind=true ! video/x-msvideo ! aiurdemux ! vpudec ! videoscale method=1 ! video/x-raw,width=800,height=480 ! overlaysink
However, loading the VPU-decoded frame in "videoscale" also takes a long time.
Do you know how to resolve this?
As additional information, if I use a software decoder (e.g. avdec_h264 instead of vpudec), the problem does not occur.
I think the problem occurs when ARM software reads the output of the VPU.
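One way to get a reference point for comparison is to time the copy of a frame-sized buffer in ordinary heap memory on the same board. The sketch below does only that: it does not touch the VPU buffers, the helper name `time_copy_ms` is hypothetical, and the 460800-byte frame size assumes VGA at 1.5 bytes per pixel.

```python
import time


def time_copy_ms(size_bytes, repeats=100):
    """Average time in milliseconds to copy size_bytes between two heap buffers."""
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    start = time.perf_counter()
    for _ in range(repeats):
        dst[:] = src  # same-size slice assignment: one bulk copy per iteration
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


if __name__ == "__main__":
    # Assumed frame size: 640 x 480 at 1.5 bytes per pixel
    print("avg copy time: %.3f ms" % time_copy_ms(640 * 480 * 3 // 2))
```

If this baseline is far below 50 ms, the slowdown is more likely tied to how the VPU frame memory is mapped than to the copy size itself.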
My environment is as follows.
- i.MX6Q Sabre board using Freescale Yocto Project BSP code (imx-3.14.52-1.1.0_ga).
- vpudec version: 4.0.8
- wrapper: 1.0.62
- vpulib: 5.4.32
- firmware: 3.1.1.46070
- frame memory: 9 frames (vpu wrapper log is as follows)
input register frame 0: (phy) Y:0x3C180000, U:0x3C1CB000, V:0x3C1CB001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 0: (virt) Y:0x75755000, U:0x757A0000, V:0x757A0001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 0: (phy) Y:0x3C180000, U:0x3C1CB000, V:0x3C1CB001
register mv 0: (phy) 0x3C2A0000, (virt) 0x75742000
input register frame 1: (phy) Y:0x3C300000, U:0x3C34B000, V:0x3C34B001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 1: (virt) Y:0x74D8F000, U:0x74DDA000, V:0x74DDA001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 1: (phy) Y:0x3C300000, U:0x3C34B000, V:0x3C34B001
register mv 1: (phy) 0x3C2C0000, (virt) 0x7572F000
input register frame 2: (phy) Y:0x3C380000, U:0x3C3CB000, V:0x3C3CB001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 2: (virt) Y:0x74D1E000, U:0x74D69000, V:0x74D69001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 2: (phy) Y:0x3C380000, U:0x3C3CB000, V:0x3C3CB001
register mv 2: (phy) 0x3C2E0000, (virt) 0x7571C000
input register frame 3: (phy) Y:0x45F00000, U:0x45F4B000, V:0x45F4B001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 3: (virt) Y:0x74B8F000, U:0x74BDA000, V:0x74BDA001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 3: (phy) Y:0x45F00000, U:0x45F4B000, V:0x45F4B001
register mv 3: (phy) 0x3C6C0000, (virt) 0x75709000
input register frame 4: (phy) Y:0x45F80000, U:0x45FCB000, V:0x45FCB001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 4: (virt) Y:0x74B1E000, U:0x74B69000, V:0x74B69001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 4: (phy) Y:0x45F80000, U:0x45FCB000, V:0x45FCB001
register mv 4: (phy) 0x3C6E0000, (virt) 0x74D0B000
input register frame 5: (phy) Y:0x46000000, U:0x4604B000, V:0x4604B001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 5: (virt) Y:0x74AAD000, U:0x74AF8000, V:0x74AF8001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 5: (phy) Y:0x46000000, U:0x4604B000, V:0x4604B001
register mv 5: (phy) 0x46200000, (virt) 0x74947000
input register frame 6: (phy) Y:0x46080000, U:0x460CB000, V:0x460CB001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 6: (virt) Y:0x74A3C000, U:0x74A87000, V:0x74A87001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 6: (phy) Y:0x46080000, U:0x460CB000, V:0x460CB001
register mv 6: (phy) 0x46220000, (virt) 0x74934000
input register frame 7: (phy) Y:0x46100000, U:0x4614B000, V:0x4614B001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 7: (virt) Y:0x749CB000, U:0x74A16000, V:0x74A16001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 7: (phy) Y:0x46100000, U:0x4614B000, V:0x4614B001
register mv 7: (phy) 0x46240000, (virt) 0x74921000
input register frame 8: (phy) Y:0x46180000, U:0x461CB000, V:0x461CB001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
input register frame 8: (virt) Y:0x7495A000, U:0x749A5000, V:0x749A5001 , Y_TileBot: 0x0, Cb_TileBot: 0x0
vpu register frame 8: (phy) Y:0x46180000, U:0x461CB000, V:0x461CB001
register mv 8: (phy) 0x46260000, (virt) 0x7490E000
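The plane layout can be read off the log above by subtracting addresses. The sketch below parses one log line and prints the Y-to-U and U-to-V offsets; the regular expression and function names are illustrative and not part of the vpu wrapper.

```python
import re

# Matches the "Y:0x..., U:0x..., V:0x..." part of a vpu wrapper log line
ADDR_RE = re.compile(r"Y:0x([0-9A-Fa-f]+), U:0x([0-9A-Fa-f]+), V:0x([0-9A-Fa-f]+)")


def plane_addresses(log_line):
    """Return the (Y, U, V) addresses found in a log line as integers."""
    match = ADDR_RE.search(log_line)
    if match is None:
        raise ValueError("no Y/U/V addresses found")
    return tuple(int(group, 16) for group in match.groups())


def plane_offsets(log_line):
    """Return (U - Y, V - U) in bytes."""
    y, u, v = plane_addresses(log_line)
    return u - y, v - u


if __name__ == "__main__":
    line = ("input register frame 0: (phy) Y:0x3C180000, U:0x3C1CB000, "
            "V:0x3C1CB001 , Y_TileBot: 0x0, Cb_TileBot: 0x0")
    y_size, uv_gap = plane_offsets(line)
    print(hex(y_size), y_size == 640 * 480, uv_gap)
```

The U plane starts 0x4B000 (307200 = 640 x 480) bytes after Y, which matches a VGA luma plane. V sitting one byte after U suggests interleaved chroma (an NV12-style layout), although the log alone does not confirm this.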
Best Regards,
Kenji
Dear igor,
Thank you so much for your advice.
I suspect that this problem is caused by the frame memory being in an uncached area,
which would make loads by ARM software very slow.
However, I don't have conclusive evidence.
Do you know how I can check whether the frame memory addresses above (e.g. 0x3C180000) are in an uncached area?
Is there a configuration file for setting cached/uncached areas?
I use the i.MX6Q Sabre board with the Freescale Yocto Project BSP code (imx-3.14.52-1.1.0_ga).
Best Regards,
Kenji
Hi Kenji
Please try the demo images found on
"i.MX 6 Series Software and Development Tool | NXP".
They should work well with the i.MX6Q Sabre board.
Best regards
igor
Hi Kenji
You can run the mmdc tool from the unit_tests package
to find the reason for the memory bottleneck:
www.nxp.com/lgfiles/NMG/MAD/YOCTO/imx-test-5.4.tar.gz
../test/mmdc
The mmdc tool is a simple test program that performs MMDC profiling.
See the section "MMDC Profiling" of the reference manual.
Best regards
igor