H.264 decoding using ffmpeg + using GPU for display acceleration

16,125 Views
rebelalliance
Contributor III

Hello,

I need to decode low-latency H.264 streams (also known as intra-refresh streams). I have confirmed that the i.MX6 cannot decode such streams in hardware because the hardware decoder requires a full I frame, which these streams do not contain. So I would like to use ffmpeg to decode the stream in software and then push the decoded output to the GPU for display.
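
One way to check whether a given stream really lacks full I frames is to scan the raw H.264 elementary stream for IDR NAL units. The following is a minimal sketch, assuming the video has already been extracted to an Annex B byte stream (start-code delimited); the function names are hypothetical. It only looks at nal_unit_type in each NAL header, so it detects IDR slices (type 5) but not I slices carried in non-IDR NAL units (type 1).

```python
from collections import Counter

IDR_SLICE = 5  # nal_unit_type of an IDR slice in H.264


def iter_nal_types(data: bytes):
    """Yield nal_unit_type for every NAL unit in an Annex B byte stream."""
    pos = 0
    while True:
        # A 4-byte start code (00 00 00 01) also ends in 00 00 01,
        # so searching for the 3-byte form covers both.
        pos = data.find(b"\x00\x00\x01", pos)
        if pos < 0 or pos + 3 >= len(data):
            return
        # NAL header: 1 bit forbidden_zero, 2 bits nal_ref_idc,
        # 5 bits nal_unit_type.
        yield data[pos + 3] & 0x1F
        pos += 3


def count_nal_types(data: bytes) -> Counter:
    """Count NAL units per nal_unit_type."""
    return Counter(iter_nal_types(data))


def has_idr(data: bytes) -> bool:
    """Return True if the stream contains at least one IDR slice."""
    return any(t == IDR_SLICE for t in iter_nal_types(data))


# Usage (hypothetical file name):
#   with open("stream.264", "rb") as f:
#       print(count_nal_types(f.read()))
```

A stream that reports SPS (7), PPS (8) and non-IDR slices (1) but no type 5 units would be consistent with the intra-refresh behaviour described above.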

Has anyone done this or something similar? What would be a recommended architecture for such a sequence? I have tried using MPlayer + SDL + DirectFB but the video is still choppy. I am open to using gstreamer if at all possible (perhaps with the ffmpeg plugin?). I am just trying to brainstorm the best solution for this.

Any input appreciated.

PS: I am using the SabreSD development board with L3.0.35_4.0.0_130424_source.tar.gz to build my rootfs, bootloader, and kernel.

1 Solution
6,014 Views
rebelalliance
Contributor III

I got intra-refresh streams to decode correctly on my i.MX6 using the attached patch.

It also appears aiurdemux cannot demux my streams, neither normal nor low latency, so I used mpegtsdemux from the gst-plugins-bad collection and it works fine.

Problem solved.

26 Replies
5,861 Views
Tarek
Senior Contributor I

Did you run the stream for a long period, say a day or two?

I'm playing multiple network streams and the pipeline freezes after running for many hours.

Thanks

0 Kudos
5,861 Views
rebelalliance
Contributor III

No, I haven't done extended run testing on the pipeline to see if it works ok.  I will try that out and let you know.

0 Kudos
5,857 Views
rebelalliance
Contributor III

It doesn't look like the i.MX6Q has enough juice to decode full frame rate 1080p streams in software only.

0 Kudos
5,857 Views
ChucoChe
NXP Employee

What do you mean by full frame rate and software only?

From the reference manual on the VPU section:

68.4 Functional Description

VPU is a high-performance multi-standard video processing block that supports up to full

HD 1920 x 1088 at 30 fps plus D1 at 30 fps decoding and 1920 x 1088 at 30 fps

encoding.
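
The 1088 in the quoted figure comes from H.264 coding pictures in 16x16 macroblocks: 1080 is not a multiple of 16, so a 1080-line picture is coded as 1088 lines and cropped for display. A small arithmetic sketch of that rounding and of the resulting macroblock rate (the helper names are hypothetical, not taken from the reference manual):

```python
MACROBLOCK = 16  # H.264 macroblocks are 16x16 luma samples


def coded_dimension(pixels: int) -> int:
    """Round a picture dimension up to the next multiple of 16."""
    return -(-pixels // MACROBLOCK) * MACROBLOCK


def macroblocks_per_second(width: int, height: int, fps: int) -> int:
    """Macroblock throughput needed for a given resolution and frame rate."""
    mbs_wide = coded_dimension(width) // MACROBLOCK
    mbs_high = coded_dimension(height) // MACROBLOCK
    return mbs_wide * mbs_high * fps


print(coded_dimension(1080))                   # 1088
print(macroblocks_per_second(1920, 1080, 30))  # 244800
```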

Michel

0 Kudos
5,857 Views
rebelalliance
Contributor III

By software only I mean through ffmpeg or any software-only decoder (no VPU assist). I was forced to go this route because the gstreamer software provided cannot decode low-latency streams. Please see this discussion:

https://community.freescale.com/thread/306365

It looks like gstreamer is unable to work with low-latency TS files. If I extract the raw H.264 video data from a low-latency TS using mplayer and pass the file to mxc_vpu_test.out, then the VPU is able to decode the file just fine:

mplayer -dumpvideo -dumpfile myclip.264 input.ts

mxc_vpu_test.out -D "-i myclip.264 -f 2 -w 1920 -h 1080 -s 1 -t 1"

So it seems to me that the hardware decoder is ok with low latency streams but gstreamer plugin cannot handle it.  That is too bad because i.MX6 satisfies all our requirements except this one and now we are forced to look for other chips that can do this (OMAP4/OMAP5 is able to).

There are example low latency stream files that I have uploaded in the message referenced above.

Is there any way Freescale can implement decoding of intra-refresh/low-latency H.264 streams by modifying gstreamer? This would make this chip fully able to handle broadcast-quality streams.

0 Kudos
5,857 Views
ChucoChe
NXP Employee

DaianeAngolini, is it possible for the customer to code their own decoder that does not need I-frames?

karinavalencia Could we consider a feature request for the customer?

Michel

0 Kudos
5,857 Views
rebelalliance
Contributor III

Please let me know the outcome.

0 Kudos
5,857 Views
LeonardoSandova
Specialist I

It would be interesting to investigate whether there is a gstreamer element that does the parsing (what you are doing manually with mplayer), or to implement one. Just an idea.

Leo

0 Kudos
5,857 Views
rebelalliance
Contributor III

Can the freescale experts at least provide hints on which gstreamer plugin to look at?  Is this an issue with the vpudec plugin?  Would I need to change the vpudec plugin code (gst-fsl-plugins/gst-fsl-plugins-3.0.7/src/video/vpu_dec.full/src/mfw_gst_vpu_decoder.c) to prevent it from waiting for an i-frame?

0 Kudos
5,857 Views
LeonardoSandova
Specialist I

I am not an expert. jackmao any idea?

Leo

0 Kudos
5,857 Views
jack_mao
NXP Employee

All FSL plugins are based on the FSL VPU, codec, and parser. In this case the VPU doesn't support low-latency H.264 decoding, so the related plugins can't fix this issue either; you have to use an open source plugin to decode this kind of stream.

Jack

0 Kudos
5,857 Views
rebelalliance
Contributor III

Jack,

Let me reiterate: if I use 'open source' plugins including ffmpeg for gstreamer or mplayer for full decoding then the performance is horrible.  So that right there is a no-go.

So that leaves us with using gstreamer or the mxc_vpu_test.out as the only options.  As stated above, I can take a low-latency TS, demux it using mplayer, and pass it to mxc_vpu_test.out and it decodes fine.  So we know the hardware can support low latency H.264.

Here are my questions regarding gstreamer:

1) In the gst-fsl-plugins-3.0.7 package, what is the difference between vpu and vpu_dec.full directories?  Pardon my ignorance, but is vpu_dec.full even being compiled?

2) In vpu_dec.full/src/mfw_gst_vpu_decoder.c, I see references in the code for skipping i-frames.  Is this where the decoder is waiting for the i-frame that it never gets from the demuxer?

0 Kudos
5,856 Views
Tarek
Senior Contributor I

Hi Rebelalliance

You should look at vpudec.c; this is what's being compiled. I think mfw_vpu_.... is the old imx5 plugin.

5,861 Views
rebelalliance
Contributor III

That's a good hint Tarek.  I was going crazy looking for the debug messages from that codebase and not spotting any.  Thanks.

0 Kudos
5,861 Views
Tarek
Senior Contributor I

Can you please post the pipeline that is failing and the error messages that you get?

I expect it to be something like:

gst-launch rtspsrc location=rtsp://<ip> ! rtpmp2tdepay ! aiurdemux ! 'video/x-h264' ! vpudec ! mfw_v4lsink

Is that right?

0 Kudos
5,860 Views
rebelalliance
Contributor III

Sure, my pipeline that is failing is (note that if I switch to non-low latency stream then this works and I see accelerated video output):

gst-launch rtspsrc location=rtsp://192.168.1.55 ! rtpmp2tdepay ! mpegtsdemux ! vpudec low-latency=true framedrop=false ! mfw_v4lsink --gst-debug=vpudec:5

I enabled full output of the vpudec plugin since I suspect that is where the error is.  The full debug output of vpudec is attached but here is a snippet:

MFW_GST_V4LSINK_PLUGIN 3.0.7 build on Jun 24 2013 16:33:46.

Setting pipeline to PAUSED ...

[INFO]  Product Info: i.MX6Q/D/S

vpudec versions :)

        plugin: 3.0.7

        wrapper: 1.0.35(VPUWRAPPER_ARM_LINUX Build on Jun 24 2013 16:32:32)

        vpulib: 5.4.12

        firmware: 2.1.9.36350

0:00:00.155390208  3539    0x17050 LOG                   vpudec vpudec.c:395:vpudec_core_mem_alloc_dma_buffer: Call VPU_DecGetMem return 0x0

Pipeline is live and does not need PREROLL ...

Setting pipeline to PLAYING ...

New clock: GstSystemClock

0:00:01.219358469  3539   0x12f158 INFO                  vpudec vpudec.c:1844:gst_vpudec_sink_event: Get newsegment event from 0:02:36.644933333to 99:99:99.999999999 pos 0:00:00.000000000

0:00:01.219797802  3539   0x12f158 INFO                  vpudec vpudec.c:1185:gst_vpudec_setcaps: Get upstream caps video/x-h264

0:00:01.220155802  3539   0x12f158 INFO                  vpudec vpudec.c:1194:gst_vpudec_setcaps: Get codec std 6

0:00:01.220728469  3539   0x12f158 INFO                  vpudec vpudec.c:1234:gst_vpudec_setcaps: got downstream allow caps video/x-raw-yuv, format=(fourcc)NV12, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ]; video/x-raw-yuv, format=(fourcc)I420, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ]; video/x-raw-yuv, format=(fourcc)TNVP, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ]; video/x-raw-yuv, format=(fourcc)YV12, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ]; video/x-raw-yuv, format=(fourcc)TNVF, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ]; video/x-raw-yuv, format=(fourcc)YV12, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ]

[INFO]  bitstreamMode 1, chromaInterleave 1, mapType 0, tiled2LinearEnable 0

0:00:01.226047803  3539   0x12f158 INFO                  vpudec vpudec.c:1287:gst_vpudec_setcaps: Use new tsm scheme

0:00:01.226239470  3539   0x12f158 INFO                  vpudec vpudec.c:1169:gst_vpudec_setconfig: Set drop policy 0

0:00:01.226509137  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 165

0:00:01.227359137  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 165

0:00:01.227543470  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.241830139  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 690

0:00:01.242433472  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 690

0:00:01.242612139  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.242854472  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 165

0:00:01.243435139  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 165

0:00:01.243608806  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.276577136  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 1081

0:00:01.277181469  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 1081

0:00:01.277356135  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.277594135  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 165

0:00:01.278179469  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 165

0:00:01.278355802  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.318594804  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 927

0:00:01.319040804  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 927

0:00:01.319116138  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.319211804  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 165

0:00:01.319634471  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 165

0:00:01.319705471  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.381333136  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 723

0:00:01.381761469  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 723

0:00:01.381832469  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.381926803  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 165

0:00:01.382375469  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 165

0:00:01.382453136  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.419956137  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 674

0:00:01.420394804  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 674

0:00:01.420466804  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.420561470  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 165

0:00:01.420995137  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 165

0:00:01.421068804  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

0:00:01.421164470  3539   0x12f158 LOG                   vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 715

0:00:01.421597470  3539   0x12f158 LOG                   vpudec vpudec.c:1500:gst_vpudec_chain: buf status 0x101 data 715

0:00:01.421671804  3539   0x12f158 INFO                  vpudec vpudec.c:1588:gst_vpudec_chain: Got not enough input message!!

[snip]
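
For a long capture like the attached log, it can help to summarize how many buffers reached vpudec and how often it reported not having enough input. A minimal sketch, assuming the log lines keep the format shown in the snippet above (the function name and the returned keys are hypothetical):

```python
import re

# Matches lines such as:
#   ... vpudec vpudec.c:1447:gst_vpudec_chain: Chain in with size = 165
CHAIN_RE = re.compile(r"gst_vpudec_chain: Chain in with size = (\d+)")
NOT_ENOUGH = "Got not enough input message!!"


def summarize_vpudec_log(lines):
    """Count input buffers, total input bytes, and 'not enough input' reports."""
    sizes = []
    not_enough = 0
    for line in lines:
        match = CHAIN_RE.search(line)
        if match:
            sizes.append(int(match.group(1)))
        elif NOT_ENOUGH in line:
            not_enough += 1
    return {
        "buffers": len(sizes),
        "bytes": sum(sizes),
        "not_enough_input": not_enough,
    }


# Usage (hypothetical file name):
#   with open("vpudec.log") as f:
#       print(summarize_vpudec_log(f))
```

If every buffer is followed by a "not enough input" report, as in the snippet, the decoder never produced a frame from the data it was given.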

0 Kudos
5,856 Views
Tarek
Senior Contributor I

Hi Jack,

I just want to understand this further.

If the mxc_vpu_test app can decode the stream then the VPU must be capable of decoding it, right? The only difference between a gstreamer pipeline and mxc_vpu_test is the demultiplexer. If the Freescale demux doesn't support this type of stream then it's easy enough to replace it with an open source library and still use the hardware acceleration for the decoding and display!

Please advise.

Thanks

5,855 Views
rebelalliance
Contributor III

Tarek,

I am not sure if it's just the demuxer. If you look at vpu_dec.full/src/mfw_gst_vpu_decoder.c, it looks as if the decoder is actively waiting for I-frames. I am not an expert on that code, but that's what it looks like to me, and I know that piece of code is not the demuxer.

0 Kudos
5,857 Views
LeonardoSandova
Specialist I

Hi,

You should post a question to the gstreamer-devel list, asking whether such a demuxer exists.

Leo

0 Kudos