Color accuracy of G2D_blit pixel format conversion

alik
Contributor II

Hello,

I am testing VPU JPEG image decoding using this sample:

https://community.freescale.com/thread/318147

The decoder outputs a YUV pixel format, which I then convert to RGB using g2d_blit(). I then encode the buffer back into JPEG using a software JPEG library. For comparison, I decode and re-encode the same image using only the software JPEG library. I see subtle color differences between the HW and SW decode, and I am attaching two images for comparison. It appears that the VPU decoder or the g2d_blit conversion enhances contrast (see how much darker the top of the white box on the right is for the HW decode compared to the SW decode) and boosts red (see how the cardboard box on the left is redder for the HW decode). The SW decode produces an output image that matches the input image.

Any ideas why this is happening and whether it can be fixed? Is there another way to do the YUV to RGB conversion in hardware?

Thanks,

Alex

This is HW decode:

hw.jpg

This is SW decode, input and output images are identical:

sw.jpg

1 Solution
rogerio_silva
NXP Employee

Another possibility is to use the IPU (or the PxP block, if it is available on your device).

This example shows how to use the IPU device (/dev/mxc_ipu) to perform color space conversion (CSC):

https://github.com/rogeriorps/ipu-examples/tree/master/mx6/csc/example1

You can change the CSC coefficients in the IPU driver. See the function _init_csc in the file drivers/mxc/ipu3/ipu_ic.c:

http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tree/drivers/mxc/ipu3/ipu_ic.c?h=imx_3.1...

Best Regards,

Rogerio

14 Replies
alik
Contributor II

Hello Rogerio,

As per your advice, I tried changing csc coeffs in drivers/mxc/ipu3/ipu_ic.c. I use L3.10.17. I am getting strange results in my tests. There also seems to be a discrepancy with documentation. Currently I am looking at and testing the ycbcr2rgb_coeff[] matrix:

/*     R = (1.164 * (Y - 16)) + (1.596 * (Cr - 128));
       G = (1.164 * (Y - 16)) - (0.392 * (Cb - 128)) - (0.813 * (Cr - 128));
       B = (1.164 * (Y - 16)) + (2.017 * (Cb - 128)); */
  static const uint32_t ycbcr2rgb_coeff[4][3] = {
  {149, 0, 204},
  {149, 462, 408},
  {149, 255, 0},
  {8192 - 446, 266, 8192 - 554}, /* A0, A1, A2 */
  };

I tried interpreting the numbers according to the Reference Manual, section 37.4.5.6, IC Task Parameter Memory. The A0, A1, A2 values just do not make sense and disagree with the formula given above by quite a bit. They are supposed to be 4987, 543, 5191 or close to that in order to give actual values of -222.3, 135.3, -273.3. In the second row, the values 462 and 408 (-206 and -152 with the sign bit decoded) disagree with the formula as well, but not as much. The other values are OK and agree with the formula.

I ran some tests. Here is the behaviour I get with the default values:

Y    U    V        B    G    R    A
16   128  128      0    0    0    0
100  100  0        39   210  0    0
100  0    100      0    168  53   0
0    0    0        0    133  0    0

I then tried setting A0, A1, A2 to zero. I got mostly the behaviour I expected, with one exception: the 462 and 408 coefficients produced the results I would have expected from 306 and 360, which are very close to the values from the formula, even though 306 and 360 were not what I had set!

With A0, A1, A2 set to 0:

Y    U    V        B    G    R    A
16   128  128      255  0    223  0
100  100  0        255  77   116  0
100  0    100      116  35   255  0
0    0    0        0    0    0    0

As can be seen, the values of 77 and 35 for G in 2nd and 3rd lines are unexpected for 462 and 408 coeffs. Those numbers can be obtained only if 306 and 360 were set instead of 462 and 408. Incidentally, 306 and 360 give -50 and -104 with sign bit decoded, very close to values expected from the formula.
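
The G values in the two tables above can be cross-checked with a small software model. The sketch below is inferred from the numbers in this post, not taken from the reference manual or the driver. It assumes that the coefficients are scaled by 128, that the offsets are in half-units, that the result is rounded to nearest and clamped to 0..255, and that 462 and 408 act as -50 and -104. The names csc_row and csc_channel are made up for illustration.

```c
#include <stdint.h>

/* One output channel of the CSC matrix (hypothetical helper type). */
struct csc_row {
    int c_y, c_cb, c_cr; /* signed coefficients, assumed scaled by 128 */
    int a;               /* signed offset, assumed to be in half-units */
};

/* Assumed model: out = clamp(round((c_y*Y + c_cb*Cb + c_cr*Cr) / 128 + a / 2)). */
static uint8_t csc_channel(struct csc_row r, int y, int cb, int cr)
{
    /* Work in units of 1/128, so the offset a/2 becomes a * 64. */
    int32_t acc = r.c_y * y + r.c_cb * cb + r.c_cr * cr + r.a * 64;

    if (acc <= 0)
        return 0;                 /* saturate at the low end */
    acc = (acc + 64) / 128;       /* round to nearest */
    return acc > 255 ? 255 : (uint8_t)acc;
}

/* Default table from the post, with 462 and 408 taken as -50 and -104. */
static const struct csc_row row_r = { 149,   0,  204, -446 };
static const struct csc_row row_g = { 149, -50, -104,  266 };
static const struct csc_row row_b = { 149, 255,    0, -554 };
```

Under these assumptions, csc_channel(row_g, 100, 100, 0) gives 210 and csc_channel(row_b, 100, 100, 0) gives 39, as in the first table. With the offset a set to 0, the G channel gives 77 and 35 for the second and third input rows, as in the second table.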

With A0, A1, A2 set to 4987, 543, 5191 and with 306 and 360 in the second row instead of 462 and 408, I get unexpected results:

Y    U    V        B    G    R    A
16   128  128      255  0    255  0
100  100  0        255  227  255  0
100  0    100      255  255  255  0
0    0    0        255  255  255  0

Am I missing something and can you explain this? Is there any undocumented modification happening to the 2nd and 4th row coeffs?

Thanks,

Alex

rogerio_silva
NXP Employee

Hi Alex, I'm still checking.

rogerio_silva
NXP Employee

Hi Alex,

Thanks for sharing your analysis and patch. It seems the reference manual is not very clear about how negative numbers are represented. I'll check it.

Regards,

Rogerio

rogerio_silva
NXP Employee

Hi Alex,

I ran some tests, and it seems that negative numbers must be encoded as two's complement. Read that way, I think the values make sense.
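
To illustrate the difference between the two readings, here is a minimal sketch that decodes a raw table entry both ways. The function names are hypothetical, and the 9-bit coefficient width is an assumption taken from the range of the values in this thread. The 13-bit width for the offsets follows from the `8192 - 446` notation in the driver table.

```c
#include <stdint.h>

/* Sign-magnitude reading: the top bit is the sign, the rest is the magnitude. */
static int32_t decode_sign_magnitude(uint32_t raw, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);
    int32_t mag = (int32_t)(raw & (sign - 1));

    return (raw & sign) ? -mag : mag;
}

/* Two's-complement reading of a field that is `bits` bits wide. */
static int32_t decode_twos_complement(uint32_t raw, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);

    raw &= (sign << 1) - 1;   /* keep only the bits of the field */
    return (raw & sign) ? (int32_t)raw - (int32_t)(sign << 1) : (int32_t)raw;
}
```

Read as sign-magnitude, 462 and 408 decode to -206 and -152, which is the mismatch reported earlier in the thread. Read as two's complement, they decode to -50 and -104. Divided by 128 (the scale implied by 149 for 1.164), that is about -0.39 and -0.81, close to the -0.392 and -0.813 in the formula. Likewise, `decode_twos_complement(8192 - 446, 13)` returns -446.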

Rgds

Rogerio

alik
Contributor II

That's right and I am surprised I did not realize this. Thanks.

alik
Contributor II

Hi Rogerio,

I figured out which numbers to put there to get the results I want. I did that by treating the block as a black box and deriving the relationships between inputs and outputs. It is my understanding that the documentation improperly describes the encoding of negative coefficients in both cases: for the Cxx terms and for the Ax terms. Positive coefficients seem to be OK. I was waiting for you to post a more comprehensive reply on the community, rather than me just providing a fix that works for me.

I need to convert both ways between RGB and YUV. I found that the G2D YUV->RGB conversion runs quite a bit faster than the IPU, even when using both IPUs on the iMX6. It would be great if G2D could provide conversion in both directions and could also use the libjpeg coefficients. Here is my patch that applies the same CSC coefficients as used in libjpeg, in case it is useful for somebody else.

diff --git a/drivers/mxc/ipu3/ipu_ic.c b/drivers/mxc/ipu3/ipu_ic.c
index 09cd04e..7d24f22 100644
--- a/drivers/mxc/ipu3/ipu_ic.c
+++ b/drivers/mxc/ipu3/ipu_ic.c
@@ -732,15 +732,15 @@ static void _init_csc(struct ipu_soc *ipu, uint8_t ic_task, ipu_color_space_t in
       ipu_color_space_t out_format, int csc_index)
 {
  /*
- * Y =  0.257 * R + 0.504 * G + 0.098 * B +  16;
- * U = -0.148 * R - 0.291 * G + 0.439 * B + 128;
- * V =  0.439 * R - 0.368 * G - 0.071 * B + 128;
+  *      Y  =  0.29900 * R + 0.58700 * G + 0.11400 * B
+  *      Cb = -0.16874 * R - 0.33126 * G + 0.50000 * B  + 128
+  *      Cr =  0.50000 * R - 0.41869 * G - 0.08131 * B  + 128
  */
  static const uint32_t rgb2ycbcr_coeff[4][3] = {
- {0x0042, 0x0081, 0x0019},
- {0x01DA, 0x01B6, 0x0070},
- {0x0070, 0x01A2, 0x01EE},
- {0x0040, 0x0200, 0x0200}, /* A0, A1, A2 */
+ {77, 150, 29},
+ {469, 427, 128}, /* 469 and 427 had to be fudged to produce required results, do not agree with documented calculation in imx6 ref. manual */
+ {128, 405, 491}, /* 405 and 491 had to be fudged */
+ {0, 512, 512}, /* A0, A1, A2 */
  };
  /* transparent RGB->RGB matrix for combining
@@ -752,14 +752,14 @@ static void _init_csc(struct ipu_soc *ipu, uint8_t ic_task, ipu_color_space_t in
  {0x0000, 0x0000, 0x0000}, /* A0, A1, A2 */
  };
-/*     R = (1.164 * (Y - 16)) + (1.596 * (Cr - 128));
-       G = (1.164 * (Y - 16)) - (0.392 * (Cb - 128)) - (0.813 * (Cr - 128));
-       B = (1.164 * (Y - 16)) + (2.017 * (Cb - 128); */
+ /*      R = Y                + 1.40200 * Cr
+  *      G = Y - 0.34414 * Cb - 0.71414 * Cr
+  *      B = Y + 1.77200 * Cb         */
  static const uint32_t ycbcr2rgb_coeff[4][3] = {
- {149, 0, 204},
- {149, 462, 408},
- {149, 255, 0},
- {8192 - 446, 266, 8192 - 554}, /* A0, A1, A2 */
+ {128, 0, 179},
+ {128, 468, 421}, /* 468 and 421 had to be fudged */
+ {128, 227, 0},
+ {8192-359, 270, 8192-454}, /* A0, A1, A2; A0 and A2 had to be fudged */
  };
  uint32_t param;
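
The "fudged" coefficients in this patch line up with plain two's-complement encoding, which can be checked with a short sketch. The helper name encode_coeff is hypothetical. The scale factors (256 for the RGB->YCbCr table, 128 for the YCbCr->RGB table) and the 9-bit field width are assumptions inferred from the positive entries in the patch, e.g. 77 for 0.299 and 128 for 1.0; they are not taken from the reference manual.

```c
#include <stdint.h>

/*
 * Scale a real coefficient, round to nearest, and keep the low `bits` bits.
 * For negative values this yields the two's-complement encoding.
 */
static uint32_t encode_coeff(double value, int scale, unsigned bits)
{
    double scaled = value * scale;
    long q = (long)(scaled >= 0 ? scaled + 0.5 : scaled - 0.5);

    return (uint32_t)q & ((1u << bits) - 1u);
}
```

For example, `encode_coeff(-0.16874, 256, 9)` returns 469 and `encode_coeff(-0.33126, 256, 9)` returns 427, the two values in the Cb row. With a scale of 128, -0.34414 and -0.71414 encode to 468 and 421, the values in the G row.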

Regards,
Alex

alik
Contributor II

Hi Rogerio,

This does indeed seem to be a way to get it working. However, the IMX6 VPU JPEG decoder can only output NV16 or YV16 for my files, because the JPEGs are stored as 4:2:2, and those formats are not among the IPU input formats in L3.10.17, which I am using. The closest format in L3.10 is YV12, though I have seen that YV16 is already supported in L3.14. In addition, the IPU hardware seems to have an output size limitation of 1024x1024 pixels, whereas my JPEGs are 18 megapixels. Having this work in G2D would have been ideal. The reverse conversion, RGB to YCbCr, would also be highly desirable. At the moment, it seems that OpenGL or OpenCL is the next option to look at.

Thanks,

Alex

rogerio_silva
NXP Employee

Hi Alex,

Are you converting still images or video frames?

Depending on the required processing speed, you can try to process your frame in two or more IPU passes because of the 1024x1024 restriction. If your i.MX has 2 IPUs (e.g. i.MX6Q), you could even run both IPUs at the same time.

Have you checked the PxP block? Do you think it could help?

For sure OpenGL/OpenCL can also do it.

Rgds

Rogerio

alik
Contributor II

Hi Rogerio,

I am converting still images from a DSLR, so maintaining frame rate is not an issue, but I obviously want to maximize speed.

I am leaning toward using the IPU and doing the conversion in chunks of 1024x1024 px. How do I employ both IPUs: call g2d_open twice and get two handles? I could not find good documentation on the IPU API; if you can point me to a link, I would appreciate it.
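
The chunking itself can be sketched independently of the IPU API. The code below only computes the tile rectangles for an image under the 1024x1024 output limit mentioned in this thread. The struct and function names are hypothetical, and it ignores any alignment or stride constraints that real IPU tasks may have.

```c
#include <stddef.h>

/* One rectangular piece of the image (hypothetical helper type). */
struct tile {
    unsigned x, y; /* top-left corner in the full image */
    unsigned w, h; /* tile size, never larger than max_tile */
};

/*
 * Cover a width x height image with tiles of at most max_tile x max_tile.
 * Writes up to cap tiles into out and returns how many tiles are needed.
 */
static size_t make_tiles(unsigned width, unsigned height, unsigned max_tile,
                         struct tile *out, size_t cap)
{
    size_t n = 0;

    for (unsigned y = 0; y < height; y += max_tile) {
        for (unsigned x = 0; x < width; x += max_tile) {
            if (n < cap) {
                unsigned rem_w = width - x, rem_h = height - y;

                out[n].x = x;
                out[n].y = y;
                out[n].w = rem_w < max_tile ? rem_w : max_tile;
                out[n].h = rem_h < max_tile ? rem_h : max_tile;
            }
            n++;
        }
    }
    return n;
}
```

For a 5184x3456 image (an assumed size of roughly 18 megapixels; the actual dimensions are not given in the thread), `make_tiles(5184, 3456, 1024, tiles, cap)` reports 24 tiles in a 6x4 grid, with the last column 64 pixels wide and the last row 384 pixels high.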

I use i.MX6Q, as I understand it does not have PxP block.

Thanks,

Alex

rogerio_silva
NXP Employee

Hi Alex,

Correct, i.MX6Q doesn't have PxP.

You can use both IPUs at the same time by creating tasks. Check the code attached.

In this code, I read a 4k image and resize it into one 1920x1080 image using both IPUs at the same time.

Best Regards,

Rogerio

alik
Contributor II

Thank you for the sample, Rogerio. When I asked about g2d_open, I was thinking of the G2D API, which I am also using.

Bio_TICFSL
NXP TechSupport

Hi alik,

The contrast difference could be because you are using one of the optional G2D blit operations, such as blend or dither.

Another way to change the contrast could be with CSC:

Re: The best way to change the contrast of an image

Regards

alik
Contributor II

Hi Bio,

I did make sure that blend, dither, etc. are disabled. I think the root cause is that there are multiple formulas for YCbCr to RGB conversion. According to the IMX6 reference manual, the one used in the GPU clips the Y range, whereas the one normally used in JPEGs uses the full range; see, e.g., https://en.wikipedia.org/wiki/YCbCr

I do not know which conversion is used for MJPEG; I would expect it to be the JPEG one. In that case Freescale has a bug in the implementation. Making the conversion algorithm selectable would also be a solution.
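
To make the difference concrete, here is a minimal sketch that applies both formulas quoted earlier in this thread to the same pixel: the video-range formula from the original ipu_ic.c comment and the full-range formula used by libjpeg. Whether G2D actually applies the video-range formula is an assumption based on the explanation above. The function names are made up for illustration.

```c
#include <stdint.h>

/* Round to nearest and clamp to the 8-bit range. */
static uint8_t to_u8(double v)
{
    if (v <= 0.0)
        return 0;
    if (v >= 255.0)
        return 255;
    return (uint8_t)(v + 0.5);
}

/* Video-range formula from the original driver comment (Y nominally 16..235). */
static void ycbcr_to_rgb_video(int y, int cb, int cr, uint8_t rgb[3])
{
    rgb[0] = to_u8(1.164 * (y - 16) + 1.596 * (cr - 128));
    rgb[1] = to_u8(1.164 * (y - 16) - 0.392 * (cb - 128) - 0.813 * (cr - 128));
    rgb[2] = to_u8(1.164 * (y - 16) + 2.017 * (cb - 128));
}

/* Full-range formula used by libjpeg (Y in 0..255, Cb and Cr centered on 128). */
static void ycbcr_to_rgb_jpeg(int y, int cb, int cr, uint8_t rgb[3])
{
    rgb[0] = to_u8(y + 1.402 * (cr - 128));
    rgb[1] = to_u8(y - 0.34414 * (cb - 128) - 0.71414 * (cr - 128));
    rgb[2] = to_u8(y + 1.772 * (cb - 128));
}
```

For a neutral pixel with Y = 60, the video-range formula gives 51 where the JPEG formula gives 60; for Y = 200 it gives 214 instead of 200. Dark tones get darker and bright tones get brighter, which would look like increased contrast. For Y = 128, Cb = 128, Cr = 160, the red channel comes out as 181 instead of 173, consistent with the redder result described in the first post.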
