iMX6Q VPU performance

Question asked by Kristoffer Glembo on Mar 15, 2016
Latest reply on Jul 13, 2017 by Don Freiling

Hi there,

 

We're having performance issues when encoding 1920x1080 @ 30 fps on a custom board using an i.MX6Q with LPDDR2 PoP memory. The LPDDR2 memory is clocked at 396 MHz and the VPU is running at 270 MHz.

 

The VPU is not able to keep up at this rate. Using a GStreamer pipeline to create an MP4 from the encoded data, the limit currently seems to be below 24 fps.

 

Using the mxc_vpu_test program from the imx-test package I get the following result on our board (encoding 50 1920x1080 YUV frames):

 

./mxc_vpu_test.out -C config_enc
[INFO]  VPU test program built on Mar 15 2016 13:55:16
[INFO]  Product Info: i.MX6Q/D/S
[INFO]  VPU firmware version: 3.1.1_r46070
[INFO]  VPU library version: 5.4.32
[INFO]  Format: STD_AVC
[INFO]  AVC
[INFO]  Input file "/tmp/foo.bin" opened.
[INFO]  Output file "test.264" opened.
[INFO]  Capture/Encode fps will be 30
[INFO]  ringBufferEnable 0, chromaInterleave 0, mapType 0, linear2TiledEnable 0
[INFO]  Finished encoding: 50 frames
[INFO]  enc fps = 30.06
[INFO]  total fps= 21.32

 

Running exactly the same test on a Nitrogen board gives the following result:

 

./mxc_vpu_test.out -C config_enc
[INFO]  VPU test program built on Mar  2 2016 08:53:02
[INFO]  Product Info: i.MX6Q/D/S
[INFO]  VPU firmware version: 3.1.1_r46070
[INFO]  VPU library version: 5.4.32
[INFO]  Format: STD_AVC
[INFO]  AVC
[INFO]  Input file "/tmp/foo.bin" opened.
[INFO]  Output file "test.264" opened.
[INFO]  Capture/Encode fps will be 30
[INFO]  ringBufferEnable 0, chromaInterleave 0, mapType 0, linear2TiledEnable 0
[INFO]  Finished encoding: 50 frames
[INFO]  enc fps = 39.27
[INFO]  total fps= 26.89
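
To make the two runs easier to compare, the fps figures can be converted into per-frame times. The following is only a rough sketch: it assumes that "enc fps" reflects the time spent in the encoder alone and that "total fps" also includes reading the input file and writing the output, which is my reading of the test program and not something the logs state. The function names are made up for this illustration.

```python
import re

def parse_fps(log_text):
    """Return (enc_fps, total_fps) found in mxc_vpu_test [INFO] output."""
    enc = re.search(r"enc fps\s*=\s*([\d.]+)", log_text)
    total = re.search(r"total fps\s*=\s*([\d.]+)", log_text)
    return float(enc.group(1)), float(total.group(1))

def frame_budget(enc_fps, total_fps):
    """Convert fps figures to milliseconds per frame.

    Returns (encode_ms, total_ms, other_ms), where other_ms is whatever
    is left once the encode time is subtracted (assumed to be file I/O).
    """
    enc_ms = 1000.0 / enc_fps
    total_ms = 1000.0 / total_fps
    return enc_ms, total_ms, total_ms - enc_ms

# Last two lines of each log above, copied verbatim.
custom_log = "[INFO]  enc fps = 30.06\n[INFO]  total fps= 21.32\n"
nitrogen_log = "[INFO]  enc fps = 39.27\n[INFO]  total fps= 26.89\n"

for name, log in (("custom", custom_log), ("nitrogen", nitrogen_log)):
    enc_ms, total_ms, other_ms = frame_budget(*parse_fps(log))
    print(f"{name}: encode {enc_ms:.2f} ms, total {total_ms:.2f} ms, "
          f"other {other_ms:.2f} ms per frame")
```

At 30 fps the budget is about 33.3 ms per frame, so under these assumptions the custom board has essentially no headroom in the encoder itself, before any I/O is counted.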

 

The config_enc config file looks like this:

 

# Write your options here!
# Type of operation encode or decode; encode = 1, decode = 2
operation=1
# read input from file. Mandatory for decode. If not specified for encode
# then default is camera
input=/tmp/foo.bin
# write output to file. For decode, if not specified, then default is LCD
output=test.264
# format; 0 - MPEG4, 1 - H.263, 2 - H.264, 7 - MJPG
format=2
# chromaInterleave, 1 - CbCr is interleaved
chromaInterleave=
# rotation angle (0, 90, 180, 270). Do not specify anything if not needed.
rotation=
# count, number of frames to encode or decode
count=50
# deblocking. 1 - Enable deblock
deblock=
# dering. 1 - Enable dering
dering=
# mirroring (0, 1, 2, 3)
mirror=
# width, display width for decoding or capture/yuv image width for encoding
width=1920
# height, display height for decoding or capture/yuv image height for encoding
height=1080
# bitrate. default is auto
bitrate=0
# gop size. default is 0
gop=15
# This option specifies the end of option list for one instance
# Each option list must end with this option. This is mandatory.
end
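
As a sanity check on the input, the config values give the size the raw file should have. This is a minimal sketch, assuming /tmp/foo.bin holds planar YUV 4:2:0 frames at 8 bits per sample (consistent with chromaInterleave 0, but the pixel format is my assumption). The helper names are hypothetical.

```python
def parse_config(text):
    """Parse key=value lines, skipping comments, blanks and the 'end' marker."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line == "end":
            continue
        key, _, value = line.partition("=")
        opts[key.strip()] = value.strip()
    return opts

def yuv420_frame_bytes(width, height):
    """Bytes in one planar 4:2:0 frame: full-size luma plus two quarter-size chroma planes."""
    return width * height * 3 // 2

# Subset of the config above that matters for the input size.
config_text = """
input=/tmp/foo.bin
width=1920
height=1080
count=50
end
"""

opts = parse_config(config_text)
frame_bytes = yuv420_frame_bytes(int(opts["width"]), int(opts["height"]))
total_bytes = frame_bytes * int(opts["count"])
print(f"{frame_bytes} bytes per frame, {total_bytes} bytes for {opts['count']} frames")
# prints "3110400 bytes per frame, 155520000 bytes for 50 frames"
```

If the file is smaller than this, the test would run out of frames before reaching count=50.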

 

Both boards are running a Yocto Jethro build with the same VPU firmware and library versions, as can be seen in the logs above.

 

The relevant differences between our board and the Nitrogen board are:

 

1. We run the VPU (and AXI) at 270 MHz vs 264 MHz on Nitrogen

2. We run the memory at 396 MHz vs 528 MHz on Nitrogen
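
Putting the two clock differences next to the measured encode rates gives a quick indication of which one the result follows. This is a back-of-the-envelope sketch using only the numbers quoted in this post. It is a hint rather than proof, since the boards may differ in other ways not listed here.

```python
# Figures taken from the list and the test logs in this post.
custom = {"vpu_mhz": 270, "mem_mhz": 396, "enc_fps": 30.06}
nitrogen = {"vpu_mhz": 264, "mem_mhz": 528, "enc_fps": 39.27}

def ratio(key):
    """Custom-board value divided by Nitrogen value for the given key."""
    return custom[key] / nitrogen[key]

for key in ("vpu_mhz", "mem_mhz", "enc_fps"):
    print(f"{key}: custom/nitrogen = {ratio(key):.3f}")
```

The VPU clock ratio is slightly above 1, while the memory clock ratio (0.75) and the encode-rate ratio (about 0.77) are close to each other, which would be consistent with the encoder being limited by memory rather than by its own clock.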

 

I have attached the clock tree dumps from both our board and the Nitrogen board for reference.

 

In the documentation I have looked at, the only requirement I have found is that the VPU must run at 264 MHz or more to encode 1920x1080 @ 30 fps. I can't find any reference to memory frequency in discussions of VPU performance.

 

What are the relevant limitations on VPU performance? What can we do to achieve 30 fps?

 

Best regards,

Kristoffer Glembo

Original Attachment has been moved to: nitrogen_clocks.txt.zip

Original Attachment has been moved to: custom_clocks.txt.zip
