imxrt1064 USB video bit rate



1,157 Views
Contributor IV

Hello,

I have been experimenting with USB video lately, and I would like to know whether an i.MX RT1064 can stream uncompressed 640 x 480 video (16 bits per pixel) over USB at 30 FPS. If not, can it be achieved using another data format (say, MJPEG)?

To demonstrate: Using YUYV, an image as specified above yields 0.8 FPS. However, if I set the resolution to 320 x 240 instead the FPS jumps to 3.24.

Can this device do better?

42 Replies

101 Views
NXP TechSupport

Hi Tamir, 

I apologize for the delayed response; with the migration of the community I lost track of this case. 

Could you please confirm if you are still facing the same issue? 

 

Regards, 

Victor 

0 Kudos

194 Views
Contributor I

Hi,

I would like to add that I believe we are having the exact same problem - on an RT1062 in our case. I have a UVC streaming application derived from the virtual camera SDK example but set to stream uncompressed 640x480 UYVY. I have patched USB_DeviceSetSpeed to set the endpoint descriptor correctly for buffer sizes > 1024, and the device enumerates correctly. If I set the maximum packet size to 1024 (for a single transfer per microframe) then the stream works perfectly at about 10 fps, but if I enable multiple transfers then the stream stops altogether and the host reports USB errors as seen in Linux Wireshark (usbmon). The endpoint is shown by lsusb with the expected 3x1024 size.


I notice that the MULT field appears to be dealt with correctly in usb_device_ehci.c, as the OP already pointed out, but it is not clear what happens regarding the buffer DMA. Is it okay to send the whole 3x1024 in a single buffer, or does it need to be sent as three separate transfers of 1024 bytes by the application? If the latter, how could we patch the USB stack to improve on this and reduce interrupt load?

There is a comment in usb_device_dci.c:

* Currently, only one transfer request can be supported for one specific endpoint.
* If there is a specific requirement to support multiple transfer requests for one specific endpoint, the application
* should implement a queue in the application level.
* The subsequent transfer could begin only when the previous transfer is done (get notification through the endpoint
* callback).

What counts as a "transfer" here? In the isochronous case the additional transfers occur within the same microframe (so no extra SOF) - does the driver count this as the same transfer or three separate ones?
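
As a rough illustration of what "a queue in the application level" could look like, here is a minimal sketch of a fixed-depth FIFO of pending transfers. All names (transfer_t, transfer_queue_t, queue_push, queue_pop, QUEUE_DEPTH) are hypothetical and not part of the NXP SDK, and the sketch is not interrupt-safe as written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_DEPTH 4u /* hypothetical depth, tune to the application */

/* One pending transfer: a buffer and its length in bytes. */
typedef struct {
    const uint8_t *data;
    uint32_t length;
} transfer_t;

/* Fixed-size FIFO; zero-initialise before first use. */
typedef struct {
    transfer_t slots[QUEUE_DEPTH];
    size_t head;  /* index of the oldest entry */
    size_t count; /* number of queued entries */
} transfer_queue_t;

/* Append a transfer; returns false when the queue is full. */
static bool queue_push(transfer_queue_t *q, const uint8_t *data, uint32_t length)
{
    assert(q != NULL);
    if (q->count == QUEUE_DEPTH) {
        return false;
    }
    size_t tail = (q->head + q->count) % QUEUE_DEPTH;
    q->slots[tail].data = data;
    q->slots[tail].length = length;
    q->count++;
    return true;
}

/* Remove the oldest transfer; returns false when the queue is empty. */
static bool queue_pop(transfer_queue_t *q, transfer_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0u) {
        return false;
    }
    *out = q->slots[q->head];
    q->head = (q->head + 1u) % QUEUE_DEPTH;
    q->count--;
    return true;
}
```

The idea would be to pop the next entry from the endpoint callback once the previous transfer has completed and hand it to USB_DeviceSendRequest. That wiring is an assumption and is not shown here.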

Mike

0 Kudos

144 Views
Contributor I

I have this working now and perhaps my solution will be of benefit to @tamir_michael as well.

To answer my own question, yes the driver does appear to handle splitting the three transfers. It is therefore okay to send 3072 bytes to USB_DeviceSendRequest. This means that increasing the max transfer size to 3072 and ensuring that the descriptor contains the correct 0x1400 value for wMaxPacketSize should be all that is required to modify the SDK example.
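
For anyone wondering where 0x1400 comes from: in a high-speed isochronous endpoint descriptor, bits 10..0 of wMaxPacketSize hold the packet size and bits 12..11 hold the number of additional transactions per microframe. A minimal sketch of the encoding follows; the helper name and macro names are made up for illustration and are not SDK identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define EP_SIZE_MASK        0x07FFu /* bits 10..0: packet size in bytes */
#define EP_ADDITIONAL_SHIFT 11u     /* bits 12..11: additional transactions */

/* Hypothetical helper: build wMaxPacketSize for a high-speed isochronous endpoint. */
static uint16_t make_hs_iso_max_packet_size(uint16_t packet_size, uint8_t additional)
{
    assert(packet_size <= 1024u); /* high-speed isochronous limit per transaction */
    assert(additional <= 2u);     /* the value 3 is reserved */
    return (uint16_t)((packet_size & EP_SIZE_MASK) |
                      ((uint16_t)additional << EP_ADDITIONAL_SHIFT));
}
```

With a packet size of 1024 and 2 additional transactions this gives 0x0400 | 0x1000 = 0x1400, i.e. 3 x 1024 bytes per microframe.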

The transfer errors we were seeing appeared to occur at the end of a frame, when the packet is shorter than the maximum size. In particular, when the amount of data to send requires less than the full 3 transfers. I believe this is because the underlying EHCI driver (usb_device_ehci.c) does not set the value in the dTD multiplierOverride field. My solution was to add the following code in USB_DeviceEhciTransfer at the point where the dTD is populated:

        if (ehciState->qh[index].capabilttiesCharacteristicsUnion.capabilttiesCharacteristicsBitmap.mult) {
            // For isochronous endpoints with additional transfers enabled, override the number of transfers
            // if we don't need them all for this frame
            uint16_t mult = (sendLength + ehciState->qh[index].capabilttiesCharacteristicsUnion.capabilttiesCharacteristicsBitmap.maxPacketSize - 1) /
                ehciState->qh[index].capabilttiesCharacteristicsUnion.capabilttiesCharacteristicsBitmap.maxPacketSize;
            if (mult == 0) {
                // RM states "when Total Bytes = 0; then MultO should be 1"
                mult = 1;
            }
            dtd->dtdTokenUnion.dtdTokenBitmap.multiplierOverride = mult;
        } else {
            dtd->dtdTokenUnion.dtdTokenBitmap.multiplierOverride = 0;
        }

This calculates the required number of transfers for the given block size and sets MultO to a non-zero value if it is less than the full amount. It is conditional on "mult" being non-zero in the queue head, which should hopefully prevent it from being set on non isochronous IN endpoints.
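
The arithmetic can be checked in isolation from the driver. Below is a minimal standalone sketch of the same ceiling division; the function name is hypothetical, and the upper bound of 3 is my assumption based on the high-speed limit of three transactions per microframe.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of transactions needed to send send_length bytes. */
static uint8_t mult_override_for(uint32_t send_length, uint16_t max_packet_size)
{
    assert(max_packet_size > 0u);
    /* Ceiling division: a partially filled packet still costs one transaction. */
    uint32_t n = (send_length + max_packet_size - 1u) / max_packet_size;
    if (n == 0u) {
        n = 1u; /* a zero-length transfer still needs MultO = 1 */
    }
    assert(n <= 3u); /* assumption: caller never queues more than 3 packets */
    return (uint8_t)n;
}
```

For example, 3072 bytes with 1024-byte packets gives 3, a 2048-byte tail gives 2, and a short 12-byte tail at the end of a video frame gives 1.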

It would be great if NXP could comment on the validity of this fix. The reference manual kind of implies that setting MultO is merely a performance thing, but perhaps the UVC spec requires it? I can confirm that with this added I can now achieve 30fps at 640x480 uncompressed YCbCr.

Mike

0 Kudos

234 Views
NXP TechSupport

Hello Tamir, 

Regarding your question, the RT1064 is capable of sending a 640 x 480 image through USB at 30 fps.

Best Regards, 

Victor 

0 Kudos

234 Views
Contributor IV

Hi and thanks for your reply,

Could you elaborate on how this can be done?

Obviously I'm doing something wrong, but I have no idea how to boost data transfer. Can you help me with this? Let's take your video over USB demo program as an example: it sends out an MJPEG stream with a resolution of 176x144 at 30 FPS. Can you change this program to send out MJPEG at 640x480 at 30 FPS? I tried, but then the FPS falls dramatically (in addition to visual distortions, which are expected because I did not change the image size, but that is not important for this experiment).

I have attached a very similar mcuXpresso program that sends out video over USB at 320x240 using YUYV, but it manages only 3.2 FPS. It is very similar to the demonstration program NXP provides.

Can you point out how to boost the FPS of this program or any other?

0 Kudos

234 Views
NXP TechSupport

Hi Tamir, 

I checked with the applications team, and a while ago they developed a UVC demo that works at 30 fps. This demo is for the RT1050-EVKB and was developed using the IAR IDE. Although you are using a different RT and probably a different IDE, you can use this project to see all the configuration changes they needed to make to reach 30 fps. I have already sent the project to your distributor so that he can send it to you directly. Let me know if you don't receive it and I will send it to you. 

Best Regards, 

Victor 

0 Kudos

119 Views
Contributor I

I have exactly the same question. Would you please send me a copy of the example as well? Or tell me what it takes to make it available. Thanks.

0 Kudos

234 Views
Contributor IV

Hi Victor,

The program you provided indeed delivers 30FPS. 

However, I do have some comments and questions remaining:

1. I have noted that the USB interrupt is blocked until a full frame has been received from CSI. This behavior makes the code as it stands unsuitable for production use when the main loop / OS task must do timely work. Is it possible to postpone the work of the USB ISR so that it is executed in the main loop / task? (If the main loop is expected to send out the data rather than the ISR, the sending will happen _after_ the CPU has finished processing the ISR, which may be unacceptable.)

2. I tried to replicate this blocking behavior, but somehow the CSI peripheral stopped generating interrupts. The CSI buffer queue filled up after only a few insertions, because the CSI stopped reporting that a frame was available. However, the same blocking code, if executed outside the ISR, does not stop the CSI, and it keeps on generating interrupts. The CSI interrupt priority is 0, which is higher than USB's 3.

Any comments will be very much appreciated!

0 Kudos

234 Views
NXP TechSupport

Hi Tamir, 

I'm glad to hear that the example project was useful for you. I will reach out to the applications team to see if they have any additional information regarding this example project that might help answer your questions. I will give you an update as soon as possible. Thanks a lot for your patience. 

Regards, 

Victor 

0 Kudos

234 Views
Contributor IV

Hi Victor,

I managed to get my controller to deliver video data (640x480x2) at 13.28 FPS. Trying to increase this by enabling 1 or 2 extra 1024-byte transactions (as your demonstration program does) always yields errors on the USB bus, so I can deliver at most 1024 bytes per 125 us. To attempt the higher rate, I programmed "wMaxPacketSize" with the right bit pattern and delivered 3072 bytes via the USB ISR. The device enumerates and video delivery starts, but now and then a piece of the image is corrupt, which breaks everything. Did I miss a step in adjusting my program to deliver more data?

0 Kudos

234 Views
NXP TechSupport

Hi Tamir, 

In your previous reply, you mentioned the following: "The program you provided indeed delivers 30FPS." But now you are saying that the example project doesn't work properly. Could you please clarify this? 

Also, regarding your requirement to postpone the work of the USB ISR so that it can be executed in the main loop: I discussed this with the applications team, and they modified the example that your distributor sent you before to match this new requirement. I have already sent this example to your distributor so he can send it to you. 

Regards, 

Victor 

0 Kudos

234 Views
Contributor IV

Hi Victor,

Thanks for the effort - I very much appreciate this.

Regarding your request for clarifications - my apologies, I should have explained the situation better.

Your demonstration program works fine on a MIMXRT1050 and I understand how it works. It takes advantage of 2 extra 1024 byte transfers available to USB high speed connections. It does so by setting the right values in the wMaxPacketSize field, as the specification requires. That allows this program to deliver 3072 bytes per 125us which translates to 30 FPS.

I'm working with the MIMXRT1064, so I simply replicated what I hoped were equivalent settings in my program, hoping to achieve the same bit rate. Alas, I cannot deliver more than 1024 bytes per 125us without bus errors, using either our hardware or NXP's evaluation board. I wonder if there is a detail I forgot. I thought that setting the maximum packet size to 1024 bytes, enabling the proper bits in wMaxPacketSize and actually delivering the data would be sufficient to reach the data rates I need. I guess the question can be formulated simply as follows: If I start from the video demo provided with the SDK for the 1064, what do I need to do to make it capable of high-speed transfers (read: delivering 3072 bytes per 125us / USB interrupt)?

0 Kudos

234 Views
NXP TechSupport

Hi Tamir, 

Thank you so much for the detailed information! I will check this with the engineer who created this demo project. I will get back to you as soon as possible. Thank you for your patience. 

Regards, 

Victor 

0 Kudos

234 Views
NXP TechSupport

Hi Tamir, 

I received the following response from the engineer who created this demo project: "The example code I shared with you can run on RT1064-EVK without any modification. I have validated that."

Regards, 

Victor 

0 Kudos

234 Views
Contributor IV

Hi Victor,

Thanks for your response.

I understand that, but what I'm really interested in is why, if I replicate the elements from the sample _I think_ should guarantee up to 3072 bytes worth of video data per 125 us, I get errors on the host side if I exceed 1024 bytes.

To be pin-point precise:

Using your own video sample (SDK v2.7.0), in the file "usb_device_descriptor.h", why does the host generate errors and display no video if I set the value 

#define HS_STREAM_IN_PACKET_ADDITIONAL_TRANSACTION (0U)  /* MAX Value is 2U*/

to 1 or 2,

and in addition

#define HS_STREAM_IN_PACKET_SIZE (1024U) // instead of 512

and change nothing else at all in the sample?

Can you please ask your engineer to do those changes on the standard SDK (v2.7.0) video example, and tell you what additional changes he had to do to get them to work with a host (i.e. the host displaying the video)?

The example we got from you was built using IAR so there may be something I'm missing here (we only use mcuXpresso). Can you please ask the engineer you mentioned _how_ he would go about transforming one of the standard SDK examples to deliver more data per unit time (3072 bytes per 125us) ?

Thank you.

0 Kudos

234 Views
Specialist V

Hi Tamir

To use more than one data packet per microframe, the endpoint descriptor must advertise this. I think you identified that with the HS_STREAM_IN_PACKET_ADDITIONAL_TRANSACTION setting of 2, meaning 2 additional data packets are sent.

In the HS USB header setting you also need the MULT field set to match:

pastedImage_1.png

Overall, the data rate and the ISO data settings must match, otherwise the host may refuse to do anything. The same can be seen with audio class settings that define a certain bandwidth combined with endpoint settings that advertise values requesting much more than the advertised payload needs. Such details are not visible, but hosts probably do some rough sanity checks: if the payload settings advertise 1MB and the endpoints request 24MB of bandwidth, the host presumably refuses to work (with or without error messages).

Regards

Mark

[uTasker project developer for Kinetis and i.MX RT]

0 Kudos

234 Views
Contributor IV

Hi Mark and thanks for your reply,

I believe the value for the "mult" field is taken care of in "usb_device_ehci.c", here:

USB_DeviceEhciEndpointInit()

......

if (USB_ENDPOINT_ISOCHRONOUS == transferType)
{
    if (maxPacketSize > USB_DEVICE_MAX_HS_ISO_MAX_PACKET_SIZE)
    {
        maxPacketSize = USB_DEVICE_MAX_HS_ISO_MAX_PACKET_SIZE;
    }
    ehciState->qh[index].capabilttiesCharacteristicsUnion.capabilttiesCharacteristicsBitmap.mult =
        1U + ((epInit->maxPacketSize & USB_DESCRIPTOR_ENDPOINT_MAXPACKETSIZE_MULT_TRANSACTIONS_MASK) >>
              USB_DESCRIPTOR_ENDPOINT_MAXPACKETSIZE_MULT_TRANSACTIONS_SHFIT);
}

where 

USB_DESCRIPTOR_ENDPOINT_MAXPACKETSIZE_MULT_TRANSACTIONS_MASK = 0x1800

and USB_DESCRIPTOR_ENDPOINT_MAXPACKETSIZE_MULT_TRANSACTIONS_SHFIT = 11
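
To make the mask and shift concrete, here is a small standalone sketch of the same extraction, using the two values quoted above. The function and macro names are mine and not SDK identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define MULT_TRANSACTIONS_MASK  0x1800u /* bits 12..11 of wMaxPacketSize */
#define MULT_TRANSACTIONS_SHIFT 11u

/* Hypothetical helper: total transactions per microframe (1, 2 or 3). */
static uint8_t qh_mult_from_descriptor(uint16_t w_max_packet_size)
{
    uint8_t additional =
        (uint8_t)((w_max_packet_size & MULT_TRANSACTIONS_MASK) >> MULT_TRANSACTIONS_SHIFT);
    assert(additional <= 2u); /* the bit pattern 3 is reserved */
    return (uint8_t)(1u + additional);
}
```

So a wMaxPacketSize of 0x0400 yields mult = 1, 0x0C00 yields 2, and 0x1400 yields 3.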

Here is my fundamental problem:

I have a program that streams over USB at 13.28 FPS with a packet size of 1024 bytes. I can slow it down by sending less data per 125 us without issues. However, once I set "HS_STREAM_IN_PACKET_ADDITIONAL_TRANSACTION" to 1 or 2, the video signal is lost, even if I send out up to 3072 bytes per interrupt. Is this purely a timing issue? What I see in practice is that no video frames are detected _at all_ by the host.

Is it allowed to "throttle" the FPS by adjusting the amount of data delivered per microframe? For example, assuming an image dimension of 640x480x2, delivering up to 2340 bytes per 125us should yield 30FPS. However, an IAR example program created for the 1050 delivers up to the maximum amount of 3072 bytes per 125us, only to block in the ISR waiting for the next image to come in from the camera. One might as well "spread" the video data over the entire 33ms available at 30FPS, rather than over the 25ms the example uses followed by a wait period (the block I mentioned above). Is such a policy of "spreading" the data acceptable in principle?
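
The 25ms and 33ms figures can be reproduced with the sketch below. It assumes 8000 microframes per second (one every 125us) and ignores any per-packet header overhead; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MICROFRAMES_PER_SECOND 8000u /* one high-speed microframe every 125 us */

/* Smallest payload per microframe that sustains the requested frame rate. */
static uint32_t bytes_per_microframe(uint32_t frame_bytes, uint32_t fps)
{
    assert(fps > 0u);
    uint64_t bytes_per_second = (uint64_t)frame_bytes * fps;
    return (uint32_t)((bytes_per_second + MICROFRAMES_PER_SECOND - 1u) /
                      MICROFRAMES_PER_SECOND);
}

/* Microframes needed to send one frame at a given payload per microframe. */
static uint32_t microframes_per_frame(uint32_t frame_bytes, uint32_t payload)
{
    assert(payload > 0u);
    return (frame_bytes + payload - 1u) / payload;
}
```

For a 640x480x2 frame (614400 bytes), 30FPS needs at least 2304 bytes per microframe. At 3072 bytes per microframe a frame takes 200 microframes (25ms), while at 2340 bytes it takes 263 microframes (about 32.9ms).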

If you have access to a MIMXRT1064-EVK, you can very easily reproduce this problem, as I pointed out in a post above, using the NXP SDK video-over-USB examples (only a very limited number of changes are needed, also demonstrated above). You could even use the example code our contact person should have delivered to NXP some days ago, a program that does not even need a camera to demonstrate the problem. Could you or somebody else please help me understand why this is happening? It is perfectly possible that there is no firmware problem at all and that the SDK's video-over-USB sample blocking in the USB interrupt while waiting for an image is the "solution", but then I'd expect the similar artificial delays I tried (just to get an image) to yield a similar result, and they don't.

I would like to point out that even if my controller sends out a frame per 33ms I see no image on the host side. I have verified this with a scope, using the "data throttling" method I mentioned above.

The fact of the matter is that the product will have to be redesigned unless I get this to work soon...

Thanks in advance.

0 Kudos

234 Views
NXP TechSupport

Hi Tamir, 

I have a couple of questions just to be sure that I understood correctly the changes that you are making to the SDK example. 

  • The only two changes that you are making is setting HS_STREAM_IN_PACKET_ADDITIONAL_TRANSACTION to 2 and HS_STREAM_IN_PACKET_SIZE to 1024, correct? 
  • When you make these two changes the SDK example project stops working at all, correct? 
  • Are there any other changes that you have made to the example to achieve the 30fps?  
  • Just to confirm, are you using the dev_video_virtual_camera_bm example? 

Regards, 

Victor 

0 Kudos

234 Views
Contributor IV

Hi Victor,

- "The only two changes that you are making is setting HS_STREAM_IN_PACKET_ADDITIONAL_TRANSACTION to 2 and HS_STREAM_IN_PACKET_SIZE to 1024, correct? "

Yes.

- "When you make these two changes the SDK example project stops working at all, correct? "

Yes.

- "Are there any other changes that you have made to the example to achieve the 30fps?"

1. Replaced the blocking wait in the USB ISR (where it waits for the camera ISR) with a balanced payload per interrupt (without blocking; see (*) below) so that I can reach 30 FPS. In my case that is 2340 bytes per interrupt:

2340 bytes/125us = 18720 bytes/ms = 18720000 bytes/second, slightly more than the 640*480*2*30 = 18432000 bytes/second needed for 30 FPS.

2. I've set HS_STREAM_IN_INTERVAL to 1.

- "Just to confirm, are you using the dev_video_virtual_camera_bm example?"

Yes.

Thanks again for your assistance.

Tamir

(*) By only adjusting the value of HS_STREAM_IN_INTERVAL to 1 (i.e. leaving HS_STREAM_IN_PACKET_ADDITIONAL_TRANSACTION set to 0), I can deliver 1024 bytes/interrupt without blocking the USB ISR, yielding 13.28 FPS:

1024 bytes/125us = 8192 bytes/ms = 8192000 bytes/sec

8192000/(640*480*2) = 13.33 FPS
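
The same calculation as a small sketch, in case anyone wants to try other payload sizes. It assumes one transfer per 125us microframe and ignores header overhead, which is presumably why the measured 13.28 FPS is slightly below the theoretical 13.33; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MICROFRAMES_PER_SECOND 8000u /* one high-speed microframe every 125 us */

/* Frame rate in thousandths of a frame per second (integer math only). */
static uint32_t milli_fps(uint32_t payload_per_microframe, uint32_t frame_bytes)
{
    assert(frame_bytes > 0u);
    uint64_t bytes_per_second =
        (uint64_t)payload_per_microframe * MICROFRAMES_PER_SECOND;
    return (uint32_t)((bytes_per_second * 1000u) / frame_bytes);
}
```

For a 614400-byte frame this gives 13333 (13.33 FPS) at 1024 bytes, 30468 (about 30.5 FPS) at 2340 bytes, and 40000 (40 FPS) at the full 3072 bytes per microframe.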

0 Kudos

234 Views
NXP TechSupport

Hi Tamir, 

Thank you for answering my questions! I'm currently checking this with the applications team and I will give you an update as soon as possible. 

Regards, 

Victor 

0 Kudos