i.MX6Q H.264 VPU encoding

simonpg88 · ‎09-20-2016

Hi,
I use the i.MX6Q VPU to decode a JPEG image and then encode it to an H.264 file.

The input is a JPEG image in YUV422 color format, whereas I want to encode to a YUV420 H.264 stream file. For this reason I use the IPU CSC to convert from YUV422 to YUV420.

Therefore the conversion flow is:
1) Decode the JPEG image with the VPU
2) Convert the raw data from YUV422 to YUV420 color format with the IPU CSC
3) Encode the converted raw data to an H.264 stream
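
For reference, the two planar layouts involved in step 2 differ only in the size of the chroma planes. The following is a minimal sketch of that arithmetic, not part of either library: the names `PlaneSizes`, `yuv422p_plane_sizes` and `yuv420p_plane_sizes` are hypothetical, and it assumes tightly packed planes (stride equal to width), which real VPU framebuffers do not necessarily guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Byte sizes of the planes of a planar YUV frame with 8 bits per sample.
 * Assumption: planes are tightly packed (stride == width). */
typedef struct {
    size_t y_size;     /* luma plane */
    size_t cbcr_size;  /* one chroma plane (Cb and Cr have the same size) */
    size_t total_size; /* y_size + 2 * cbcr_size */
} PlaneSizes;

/* Planar 4:2:2: chroma is subsampled horizontally only. */
static PlaneSizes yuv422p_plane_sizes(size_t width, size_t height)
{
    PlaneSizes s;
    assert(width % 2 == 0);
    s.y_size = width * height;
    s.cbcr_size = (width / 2) * height;
    s.total_size = s.y_size + 2 * s.cbcr_size;
    return s;
}

/* Planar 4:2:0: chroma is subsampled horizontally and vertically. */
static PlaneSizes yuv420p_plane_sizes(size_t width, size_t height)
{
    PlaneSizes s;
    assert(width % 2 == 0 && height % 2 == 0);
    s.y_size = width * height;
    s.cbcr_size = (width / 2) * (height / 2);
    s.total_size = s.y_size + 2 * s.cbcr_size;
    return s;
}
```

For a 640x480 frame this gives 614400 bytes for 4:2:2 (16 bits per pixel) and 460800 bytes for 4:2:0 (12 bits per pixel), so the chroma plane offsets and the total size change after the conversion even though the luma plane stays the same.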

I use these libraries / examples to manage the VPU and CSC easily:
VPU: https://github.com/Freescale/libimxvpuapi
CSC: https://github.com/jafp/imx6_ipu_csc

I think the JPEG decoding and the color format conversion work well: I checked the raw images (before and after color format conversion) with http://rawpixels.net/ and everything looks fine.

That said, I have problems with the H.264 encoding.
Playing the encoded H.264 stream in VLC, I see a green overlay on the image and I don't know the reason. Why?
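
As background on why such an artifact tends to be green: in YCbCr, chroma samples of 0 (rather than the neutral value 128) map to a strong green, so a green cast often points at chroma data that is zeroed or read from the wrong place. The sketch below only illustrates that arithmetic, using the full-range BT.601 (JFIF) equations. The names `Rgb`, `clamp_to_u8` and `yuv_to_rgb_jfif` are hypothetical, and the matrix a player such as VLC actually applies may differ (for example limited-range BT.601 or BT.709). Whether this is the cause here is an assumption, not something the post confirms.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb;

/* Clamp to [0, 255] and round to the nearest integer. */
static uint8_t clamp_to_u8(double v)
{
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

/* Full-range BT.601 (JFIF) YCbCr -> RGB for one pixel. */
static Rgb yuv_to_rgb_jfif(uint8_t y, uint8_t cb, uint8_t cr)
{
    double d_cb = (double)cb - 128.0; /* 128 is the neutral chroma value */
    double d_cr = (double)cr - 128.0;
    Rgb out;
    out.r = clamp_to_u8(y + 1.402 * d_cr);
    out.g = clamp_to_u8(y - 0.344136 * d_cb - 0.714136 * d_cr);
    out.b = clamp_to_u8(y + 1.772 * d_cb);
    return out;
}
```

With mid-gray luma, `yuv_to_rgb_jfif(128, 128, 128)` yields gray (128, 128, 128), while `yuv_to_rgb_jfif(128, 0, 0)` saturates to pure green (0, 255, 0).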

I attach the raw images (before and after color format conversion), the original image and the output H.264 file.

Thank you all.

Simone.

Code:

void init(FILE *input_file, FILE *output_file) {

    ImxVpuEncOpenParams open_params;

    ImxVpuRawFrame input_frame;

    ImxVpuEncodedFrame output_frame;

    ImxVpuEncParams enc_params;

    unsigned int output_code;

    long size;

    void *buf;

    ImxVpuJPEGDecInfo info;

    uint8_t *mapped_virtual_address;

    size_t num_out_byte;

    encContex = calloc(1, sizeof(EncContex));

    jpegContex = calloc(1, sizeof(JpegContex));

    jpegContex->fin = input_file;

    encContex->fout = output_file;

    /* Open the JPEG decoder */

    imx_vpu_jpeg_dec_open(&(jpegContex->jpeg_decoder), NULL, 0);

    /* Decode the JPEG and retrieve the image size */

    /* Determine size of the input file to be able to read all of its bytes in one go */

    fseek(jpegContex->fin, 0, SEEK_END);

    size = ftell(jpegContex->fin);

    fseek(jpegContex->fin, 0, SEEK_SET);

    /* Allocate buffer for the input data, and read the data into it */

    buf = malloc(size);

    fread(buf, 1, size, jpegContex->fin);

    /* Perform the actual JPEG decoding */

    ImxVpuDecReturnCodes dec_ret = imx_vpu_jpeg_dec_decode(jpegContex->jpeg_decoder, buf, size);

    if (dec_ret != IMX_VPU_DEC_RETURN_CODE_OK) {

        fprintf(stderr, "could not decode this JPEG image : %s\n", imx_vpu_dec_error_string(dec_ret));

        return;

    }

    /* Input data is not needed anymore, so free the input buffer */

    free(buf);

    /* Get some information about the frame

     * Note that the info is only available *after* calling imx_vpu_jpeg_dec_decode() */

    imx_vpu_jpeg_dec_get_info(jpegContex->jpeg_decoder, &info);

    fprintf(

        stderr,

        "aligned frame size: %u x %u pixel  actual frame size: %u x %u pixel  Y/Cb/Cr stride: %u/%u/%u  Y/Cb/Cr size: %u/%u/%u  Y/Cb/Cr offset: %u/%u/%u  color format: %s\n",

        info.aligned_frame_width, info.aligned_frame_height,

        info.actual_frame_width, info.actual_frame_height,

        info.y_stride, info.cbcr_stride, info.cbcr_stride,

        info.y_size, info.cbcr_size, info.cbcr_size,

        info.y_offset, info.cb_offset, info.cr_offset,

        imx_vpu_color_format_string(info.color_format)

    );

    if (info.framebuffer == NULL) {

        fprintf(stderr, "could not decode this JPEG image : no framebuffer returned\n");

        return;

    }

    FILE* shm_fd = fopen("/dev/shm/tmp.raw", "wb");

    /* Map the DMA buffer of the decoded picture, write out the decoded pixels, and unmap the buffer again */

    num_out_byte = info.y_size + info.cbcr_size * 2;

    fprintf(stderr, "decoded output picture:  writing %zu byte\n", num_out_byte);

    mapped_virtual_address = imx_vpu_dma_buffer_map(info.framebuffer->dma_buffer, IMX_VPU_MAPPING_FLAG_READ);

    fwrite(mapped_virtual_address, 1, num_out_byte, shm_fd);

    imx_vpu_dma_buffer_unmap(info.framebuffer->dma_buffer);

    /* Decoded frame is no longer needed, so inform the decoder that it can reclaim it */

    imx_vpu_jpeg_dec_frame_finished(jpegContex->jpeg_decoder, info.framebuffer);

    fclose(shm_fd);

    { // color space conversion

        int file_fd;

        FILE* file_out;

        file_fd = open("/dev/shm/tmp.raw", O_RDWR, 0);

        printf("color space conversion init\n");

        int file_size = lseek(file_fd, 0, SEEK_END);

        printf("post lseek, file_size:%i\n", file_size);

        void * raw_image = mmap(0, file_size, PROT_READ, MAP_SHARED, file_fd, 0);

        printf("post mmap\n");

        ipu_csc_t csc;

        ipu_csc_format_t input_format = { info.aligned_frame_width, info.aligned_frame_height, 16, V4L2_PIX_FMT_YUV422P };

        // ipu_csc_format_t output_format = { info.aligned_frame_width, info.aligned_frame_height, 16, V4L2_PIX_FMT_YUV420 };

        ipu_csc_format_t output_format = { info.aligned_frame_width, info.aligned_frame_height, 12, V4L2_PIX_FMT_YUV420 };

        if (ipu_csc_init(&csc, &input_format, &output_format) < 0) {

            perror("ipu csc init failed");

            return;

        }

        printf("post ipu_csc_init\n");

        // Output buffer

        printf("output_image size: %i\n", info.aligned_frame_width * info.aligned_frame_height * output_format.bpp / 8);

        unsigned char output_image[ info.aligned_frame_width * info.aligned_frame_height * output_format.bpp / 8 ];

        memset(output_image, 0, sizeof(output_image));

        printf("post memset\n");

        if (ipu_csc_convert(&csc, raw_image, output_image) < 0) {

            perror("ipu_csc_convert");

        } else {

            printf("Conversion done.\n");

        }

        close(file_fd);

        file_out = fopen("/dev/shm/tmp_conv.raw", "wb");

        fwrite(output_image, sizeof(output_image), 1, file_out);

        fclose(file_out);

        ipu_csc_close(&csc);

        printf("color conversion end\n");

    } // color space conversion end

    /* Init h264 encoder */

    memset(&open_params, 0, sizeof(open_params));

    imx_vpu_enc_set_default_open_params(IMX_VPU_CODEC_FORMAT_H264, &open_params);

    open_params.bitrate = 0;

    open_params.frame_width = info.aligned_frame_width;

    open_params.frame_height = info.aligned_frame_height;

    open_params.frame_rate_numerator = 25;

    open_params.frame_rate_denominator = 1;

    open_params.color_format = IMX_VPU_COLOR_FORMAT_YUV420;

    /* Load the VPU firmware */

    imx_vpu_enc_load();

    /* Retrieve information about the required bitstream buffer and allocate one based on this */

    imx_vpu_enc_get_bitstream_buffer_info(&(encContex->bitstream_buffer_size), &(encContex->bitstream_buffer_alignment));

    encContex->bitstream_buffer = imx_vpu_dma_buffer_allocate(

        imx_vpu_enc_get_default_allocator(),

        encContex->bitstream_buffer_size,

        encContex->bitstream_buffer_alignment,

        0

    );

    /* Open an encoder instance, using the previously allocated bitstream buffer */

    imx_vpu_enc_open(&(encContex->vpuenc), &open_params, encContex->bitstream_buffer);

    /* Retrieve the initial information to allocate framebuffers for the

     * encoding process (unlike with decoding, these framebuffers are used

     * only internally by the encoder as temporary storage; encoded data

     * doesn't go in there, nor do raw input frames) */

    imx_vpu_enc_get_initial_info(encContex->vpuenc, &(encContex->initial_info));

    encContex->num_framebuffers = encContex->initial_info.min_num_required_framebuffers;

    fprintf(stderr, "num framebuffers: %u\n", encContex->num_framebuffers);

    /* Using the initial information, calculate appropriate framebuffer sizes */

    imx_vpu_calc_framebuffer_sizes(info.color_format, info.actual_frame_width, info.actual_frame_height, encContex->initial_info.framebuffer_alignment, 0, 0, &(encContex->calculated_sizes));

    fprintf(

        stderr,

        "calculated sizes:  frame width&height: %dx%d  Y stride: %u  CbCr stride: %u  Y size: %u  CbCr size: %u  MvCol size: %u  total size: %u\n",

        encContex->calculated_sizes.aligned_frame_width, encContex->calculated_sizes.aligned_frame_height,

        encContex->calculated_sizes.y_stride, encContex->calculated_sizes.cbcr_stride,

        encContex->calculated_sizes.y_size, encContex->calculated_sizes.cbcr_size, encContex->calculated_sizes.mvcol_size,

        encContex->calculated_sizes.total_size

    );

    /* Allocate memory blocks for the framebuffer and DMA buffer structures,

     * and allocate the DMA buffers themselves */

    encContex->framebuffers = malloc(sizeof(ImxVpuFramebuffer) * encContex->num_framebuffers);

    encContex->fb_dmabuffers = malloc(sizeof(ImxVpuDMABuffer*) * encContex->num_framebuffers);

    for (unsigned int i = 0; i < encContex->num_framebuffers; ++i) {

        /* Allocate a DMA buffer for each framebuffer. It is possible to specify alternate allocators;

         * all that is required is that the allocator provides physically contiguous memory

         * (necessary for DMA transfers) and respects the alignment value. */

        encContex->fb_dmabuffers[i] = imx_vpu_dma_buffer_allocate(imx_vpu_dec_get_default_allocator(), encContex->calculated_sizes.total_size, encContex->initial_info.framebuffer_alignment, 0);

        imx_vpu_fill_framebuffer_params(&(encContex->framebuffers[i]), &(encContex->calculated_sizes), encContex->fb_dmabuffers[i], 0);

    }

    /* allocate DMA buffers for the raw input frames. Since the encoder can only read

     * raw input pixels from a DMA memory region, it is necessary to allocate one,

     * and later copy the pixels into it. In production, it is generally a better

     * idea to make sure that the raw input frames are already placed in DMA memory

     * (either allocated by imx_vpu_dma_buffer_allocate() or by some other means of

     * getting DMA / physically contiguous memory with known physical addresses). */

    encContex->input_fb_dmabuffer = imx_vpu_dma_buffer_allocate(imx_vpu_dec_get_default_allocator(), encContex->calculated_sizes.total_size, encContex->initial_info.framebuffer_alignment, 0);

    imx_vpu_fill_framebuffer_params(&(encContex->input_framebuffer), &(encContex->calculated_sizes), encContex->input_fb_dmabuffer, 0);

    /* Actual registration is done here. From this moment on, the VPU knows which buffers to use for

     * storing temporary frames into. This call must not be done again until encoding is shut down. */

    imx_vpu_enc_register_framebuffers(encContex->vpuenc, encContex->framebuffers, encContex->num_framebuffers);

    /* Set up the input frame. The only field that needs to be

     * set is the input framebuffer. The encoder will read from it.

     * The rest can remain zero/NULL. */

    memset(&input_frame, 0, sizeof(input_frame));

    // input_frame.framebuffer = info.framebuffer;

    input_frame.framebuffer = &(encContex->input_framebuffer);

    /* Set the encoding parameters for this frame. quant_param 0 is

     * the highest quality in h.264 constant quality encoding mode.

     * (The range in h.264 is 0-51, where 0 is best quality and worst

     * compression, and 51 vice versa.) */

    memset(&enc_params, 0, sizeof(enc_params));

    enc_params.quant_param = 0;

    enc_params.acquire_output_buffer = acquire_output_buffer;

    enc_params.finish_output_buffer = finish_output_buffer;

    enc_params.output_buffer_context = NULL;

    /* Set up the output frame. Simply setting all fields to zero/NULL

     * is enough. The encoder will fill in data. */

    memset(&output_frame, 0, sizeof(output_frame));

    for (int nframe = 0; nframe < 100; nframe++) {

        uint8_t *mapped_virtual_address;

        void *output_block;

        FILE* raw_file;

        raw_file = fopen("/dev/shm/tmp_conv.raw", "rb");

        /* Read uncompressed pixels into the input DMA buffer */

        mapped_virtual_address = imx_vpu_dma_buffer_map(encContex->input_fb_dmabuffer, IMX_VPU_MAPPING_FLAG_WRITE);

        fread(mapped_virtual_address, 1, num_out_byte, raw_file);

        imx_vpu_dma_buffer_unmap(encContex->input_fb_dmabuffer);

        /* The actual encoding */

        imx_vpu_enc_encode(encContex->vpuenc, &input_frame, &output_frame, &enc_params, &output_code);

        /* Write out the encoded frame to the output file. The encoder

         * will have called acquire_output_buffer(), which acquires a

         * buffer by malloc'ing it. The "handle" in this example is

         * just the pointer to the allocated memory. This means that

         * here, acquired_handle is the pointer to the encoded frame

         * data. Write it to file, and then free the previously

         * allocated block. In production, the acquire function could

         * retrieve an output memory block from a buffer pool for

         * example. */

        output_block = output_frame.acquired_handle;

        fwrite(output_block, 1, output_frame.data_size, encContex->fout);

        free(output_block);

        fclose(raw_file);

    }

    /* Close the previously opened encoder instance */

    imx_vpu_enc_close(encContex->vpuenc);

    /* Free all allocated memory (both regular and DMA memory) */

    imx_vpu_dma_buffer_deallocate(encContex->input_fb_dmabuffer);

    free(encContex->framebuffers);

    for (unsigned int i = 0; i < encContex->num_framebuffers; ++i)

        imx_vpu_dma_buffer_deallocate(encContex->fb_dmabuffers[i]);

    free(encContex->fb_dmabuffers);

    imx_vpu_dma_buffer_deallocate(encContex->bitstream_buffer);

    /* Unload the VPU firmware */

    imx_vpu_enc_unload();

    free(encContex);

    /* Shut down the JPEG decoder */

    imx_vpu_jpeg_dec_close(jpegContex->jpeg_decoder);

    free(jpegContex);

}

Original Attachment has been moved to: prova.h264.zip

Original Attachment has been moved to: tmp.raw.zip

Original Attachment has been moved to: tmp_conv.raw.zip

igorpadykov · ‎09-21-2016

Hi Simone

For VPU encoding examples, see the imx-test package (../mxc_vpu_test):
www.nxp.com/lgfiles/NMG/MAD/YOCTO/imx-test-5.4.tar.gz
or check section 7.3.3, "Video encoding", of the attached Linux Guide.
Use the official NXP BSPs from this link:
http://www.nxp.com/products/microcontrollers-and-processors/arm-processors/i.mx-applications-process...

Best regards
igor