The VPU is the "block" that encodes and decodes video, for example taking YUV frames in and producing H.264.
By default, the i.MX53 BSP has one "special" camera input (CSI). This camera input is handled by dedicated hardware (please take a look at the IPU documentation for details).
As far as I know, there are only CSI0 and CSI1; in other words, there are only 2 "hardware camera inputs".
If you're using 4 USB cameras, I don't know whether the USB bus will be fast enough for all the video streams.
But wait. There are cameras that output H.264 directly; is that your case?
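
To put rough numbers on the USB bandwidth concern, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not something taken from your setup: 640x480 frames, 2 bytes per pixel (a packed YUV 4:2:2 format such as YUYV), 30 fps, and all cameras sharing a single USB 2.0 high-speed bus at its nominal 480 Mbit/s signalling rate. The usable throughput is lower than the nominal rate in practice, so treat the result as optimistic.

```python
# Rough estimate of raw (uncompressed) video bandwidth over USB.
# All figures below are illustrative assumptions, not measured values.

USB2_HIGH_SPEED_BPS = 480_000_000  # nominal USB 2.0 high-speed rate, bits/s


def raw_stream_bps(width, height, bytes_per_pixel, fps):
    """Bits per second needed for one uncompressed video stream."""
    return width * height * bytes_per_pixel * fps * 8


def fits_on_bus(num_cameras, per_stream_bps, bus_bps=USB2_HIGH_SPEED_BPS):
    """True if the combined streams stay within the nominal bus rate."""
    return num_cameras * per_stream_bps <= bus_bps


# Assumed camera mode: 640x480, YUYV (2 bytes/pixel), 30 fps.
per_stream = raw_stream_bps(640, 480, 2, 30)

print("one stream (bits/s):", per_stream)
print("4 cameras fit on one bus:", fits_on_bus(4, per_stream))
```

Under these assumptions, four raw streams together need more than the nominal bus rate, which is why cameras that compress on board (H.264 or similar) matter here: they send far less data per frame. Change the resolution, pixel format, and frame rate to match your actual cameras before drawing any conclusion.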