Using PBO is very slow (imx8mqevk)


1,786 Views
wooyeoljun
Contributor I

Hi community

I tried to test PBOs on the imx8mqevk (Linux 4.9.51, Wayland).

The test code is very simple.

* Test1: Load image -> glTexImage2D -> render: 52 fps

* Test2: Load image -> bind PBO (2 PBOs) -> glTexSubImage2D -> render: 6~7 fps

The image size is 5120x960
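
To put the two frame rates in perspective, the per-frame upload size can be worked out directly. The sketch below is only an estimate: it assumes tightly packed 8-bit RGB (3 bytes per pixel), which matches GL_RGB8 with GL_UNSIGNED_BYTE in the code further down, although PIX_FORMAT and DATA_SIZE are not shown in the post. The names `frameBytes` and `bytesPerSecond` are made up for this illustration.

```cpp
#include <cstdint>

// Assumption: 3 bytes per pixel (8-bit RGB, no row padding).
// PIX_FORMAT and DATA_SIZE are not defined in the post.
constexpr std::uint64_t kWidth = 5120;
constexpr std::uint64_t kHeight = 960;
constexpr std::uint64_t kBytesPerPixel = 3;

// Bytes that must reach the texture for one frame.
constexpr std::uint64_t frameBytes() {
    return kWidth * kHeight * kBytesPerPixel;
}

// Bytes per second that must reach the texture at a given frame rate.
constexpr std::uint64_t bytesPerSecond(std::uint64_t fps) {
    return frameBytes() * fps;
}

// Compile-time checks of the arithmetic.
static_assert(frameBytes() == 14745600, "about 14.7 MB per frame");
static_assert(bytesPerSecond(52) == 766771200, "Test1: about 767 MB/s");
static_assert(bytesPerSecond(6) == 88473600, "Test2 low end: about 88 MB/s");
static_assert(bytesPerSecond(7) == 103219200, "Test2 high end: about 103 MB/s");
```

Under that assumption Test1 moves roughly 767 MB/s into the texture, while Test2 manages only about 88 to 103 MB/s, even though the PBO path also adds a CPU memcpy of the same 14.7 MB into the mapped buffer every frame.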

I know that in OpenGL ES 2.0 I can use GL extensions for direct mapping, but I have to use GLES 3.0 and can't use GL extensions.

I heard that by using a PBO, OpenGL can perform an asynchronous DMA transfer between a PBO and a texture object (see "OpenGL Pixel Buffer Object (PBO)").

But in my test, using PBOs is slower.

Below is my PBO code. Thank you.


glGenTextures(1, &textureId);
glBindTexture(GL_TEXTURE_2D, textureId);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, IMAGE_SIZE_WIDTH, IMAGE_SIZE_HEIGHT, 0, PIX_FORMAT, GL_UNSIGNED_BYTE, NULL);
glBindTexture(GL_TEXTURE_2D, 0);


glGenBuffers(2, pboIds);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pboIds[0]);
glBufferData(GL_PIXEL_UNPACK_BUFFER, DATA_SIZE, 0, GL_STREAM_DRAW);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pboIds[1]);
glBufferData(GL_PIXEL_UNPACK_BUFFER, DATA_SIZE, 0, GL_STREAM_DRAW);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);


Timer fpsTimer;
fpsTimer.reset();
cout <<"loop start"<<endl;
int index =0;
int nextIndex = 0;
/// Infinite loop
for (unsigned int i = 0;; i++) {
    /// Print FPS once per second
    float fFPS = fpsTimer.getFPS();
    if (fpsTimer.isTimePassed(1.0f))
    {
        printf("FPS:\t%.1f\n", fFPS);
    }
    /// Clear screen
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    index = (index + 1) % 2;
    nextIndex = (index + 1) % 2;

#if 1 // do not use PBO
    //glBindTexture(GL_TEXTURE_2D, textureId);
    //glTexImage2D(GL_TEXTURE_2D, 0, 0, 0, IMAGE_SIZE_WIDTH, IMAGE_SIZE_HEIGHT, GL_RGB, GL_UNSIGNED_BYTE, test[index].data);
    glBindTexture(GL_TEXTURE_2D, textureId);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, IMAGE_SIZE_WIDTH, IMAGE_SIZE_HEIGHT, 0, PIX_FORMAT, GL_UNSIGNED_BYTE, test[index].data);
#else // use PBO

    glBindTexture(GL_TEXTURE_2D, textureId);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pboIds[index]);

    // copy pixels from PBO to texture object
    // Use offset instead of pointer.
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, IMAGE_SIZE_WIDTH, IMAGE_SIZE_HEIGHT, PIX_FORMAT, GL_UNSIGNED_BYTE, 0);
    // bind PBO to update texture source
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pboIds[nextIndex]);

    // Note that glMapBuffer() causes sync issue.
    // If GPU is working with this buffer, glMapBuffer() will wait (stall)
    // until GPU finishes its job. To avoid waiting (idle), you can call
    // first glBufferData() with NULL pointer before glMapBuffer().
    // If you do that, the previous data in PBO will be discarded and
    // glMapBuffer() returns a new allocated pointer immediately
    // even if GPU is still working with the previous data.
    glBufferData(GL_PIXEL_UNPACK_BUFFER, DATA_SIZE, 0, GL_STREAM_DRAW);
    // map the buffer object into client's memory
    GLubyte* ptr = (GLubyte*)glMapBufferRange(GL_PIXEL_UNPACK_BUFFER, 0, DATA_SIZE, GL_MAP_WRITE_BIT);
    if (ptr)
    {
        // update data directly on the mapped buffer
        // updatePixels(ptr, DATA_SIZE);
        // cout <<"here"<<endl;
        // camImage.data = ptr;
        // cap >> camImage; // capture displayImage from camera
        //cout <<"memcpy"<<endl;
        memcpy(ptr, test[index].data, DATA_SIZE);
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER); // release the mapped buffer
    }

    // it is good idea to release PBOs with ID 0 after use.
    // Once bound with 0, all pixel operations are back to normal ways.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

#endif

    //glBindTexture(GL_TEXTURE_2D, textureId);
    //glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, IMAGE_SIZE_WIDTH, IMAGE_SIZE_HEIGHT, 0, GL_RGB, GL_UNSIGNED_BYTE, test.data);

    render(1280,720);
    /// Swap EGL buffer
    eglBufferSwap();

    glBindTexture(GL_TEXTURE_2D, 0);
}
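
The two-PBO rotation in the loop above can be checked in isolation, without a GL context. The sketch below reproduces only the `index` / `nextIndex` arithmetic from the post. `PboRoles` and `rolesForFrame` are hypothetical names introduced for this illustration, and it assumes `index` starts at 0 and is advanced before use, as in the posted loop.

```cpp
// Which PBO plays which role in one loop iteration.
struct PboRoles {
    int upload;  // PBO that glTexSubImage2D reads from in this frame
    int fill;    // PBO that is mapped and written by memcpy in this frame
};

// Roles for the given zero-based frame number, mirroring
//   index = (index + 1) % 2;  nextIndex = (index + 1) % 2;
// with index initialised to 0 before the loop.
constexpr PboRoles rolesForFrame(unsigned frame) {
    return PboRoles{static_cast<int>((frame + 1) % 2),
                    static_cast<int>((frame + 2) % 2)};
}

// Frame 0 uploads from PBO 1 and fills PBO 0.
static_assert(rolesForFrame(0).upload == 1, "frame 0 uploads from PBO 1");
static_assert(rolesForFrame(0).fill == 0, "frame 0 fills PBO 0");
// Frame 1 uploads from the PBO that frame 0 filled.
static_assert(rolesForFrame(1).upload == rolesForFrame(0).fill,
              "each frame uploads what the previous frame wrote");
```

So the rotation itself behaves as intended: each frame uploads from the buffer written one frame earlier and never maps the buffer it has just submitted. The one exception is the very first frame, which uploads from pboIds[1] before anything has been copied into it, so its contents are undefined at that point.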

2 Replies

1,549 Views
N_Coesel
Contributor III

I'm seeing something similar when trying to use a PBO to copy memory from the VPU decoder on the i.MX8. Using a PBO is slower than copying the data directly. In theory the GPU should use DMA to read from the DMA memory area, so this should be lightning fast.


1,549 Views
Bio_TICFSL
NXP TechSupport

Hi wooyeol,

Performance depends on many factors. One thing I would recommend is to use the glImage extension.

When you use conventional images and textures, a copy operation is involved, which reduces performance. I suggest you try the glImage extension. Note that the recipes in meta-browser now contain packageconfigs to enable EGL support, so you don't need to pass this parameter.

Which BSP are you using? Your Test2 should have better performance.
