In this article we are going to discuss the following topics:
- how to use pre-trained CNNs in MATLAB
- how to build a simple program to classify objects using a CNN
- how to compare three CNNs in terms of accuracy and speed
- how to use the ISP camera on NXP's SBC S32V234 Evaluation Board to feed data into MATLAB simulations in real time
The NXP Vision Toolbox offers support for integrating:
- CNNs designed and built from scratch in MATLAB;
- pre-trained CNNs provided by MATLAB;
- CNNs imported from other deep learning frameworks via the ONNX format;
...into your final application.
The NXP Vision Toolbox has an intuitive m-script API and a custom built-in wrapper that enables both simulation and deployment of any CNN supported by MATLAB on the NXP S32V embedded processors, for rapid prototyping and evaluation purposes.
As mentioned at the beginning, the main focus will be on MATLAB simulation of CNN algorithms. The NXP Vision Toolbox API allows users to create a neural network that can then be executed in:
- MATLAB simulation;
- Real-time on NXP S32V hardware;
...using an easy syntax as shown below: nxpvt.CNN(...)
For more information about the nxpvt.CNN object, type help nxpvt.CNN in the MATLAB command line.
Therefore, when you want to create a CNN object from a pre-saved .mat file that contains a MATLAB-formatted, MATLAB-supported neural network, you just need to specify:
- the .mat file;
- the input size;
You then need to load the class names (still a .mat file with the class names).
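As a hedged sketch of those two steps (the file names match the ones generated later in this article, and the constructor's argument order is inferred from the description above, so your toolbox version may differ):

```matlab
% Sketch only: create the CNN object from a pre-saved network file,
% then load the class names from a second .mat file.
inputSize = [227 227 3];                  % AlexNet's 3-channel input size
cnnObj    = nxpvt.CNN('alexnet.mat', inputSize);
classData = load('alexnet_classes.mat');  % .mat file holding the class names
```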
Currently, NXP Vision Toolbox supports only 3-channel neural networks, but we are planning to add support for a variable number of channels in the future.
You can find out the input size of a MATLAB CNN very easily by using the command:
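For example, with the pre-trained AlexNet from the Deep Learning Toolbox, the first (image input) layer reports the input size:

```matlab
net = alexnet;            % requires the Deep Learning Toolbox model for AlexNet
net.Layers(1).InputSize   % ans = [227 227 3]
```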
Additional information about the CNN you wish to know such as class names, can be found by looking at the last (classification) layer of the network:
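Inspecting the last layer of the same AlexNet network looks like this (the property is named Classes in recent MATLAB releases; older releases expose ClassNames instead):

```matlab
net = alexnet;
net.Layers(end)           % inspect the final classification layer
net.Layers(end).Classes   % the 1000 ImageNet class names
```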
To save the network to a .mat file and to save the class names to a .mat file you should run the following commands:
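A minimal way to do this with plain MATLAB save commands (the variable names stored inside the .mat files are our choice here, not mandated by the toolbox):

```matlab
net     = alexnet;                   % pre-trained network from MATLAB
classes = net.Layers(end).Classes;   % class names from the classification layer
save('alexnet.mat', 'net');          % the network with all its layers
save('alexnet_classes.mat', 'classes');
```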
The result will be the creation of two .mat files in the current folder: alexnet.mat, which contains the actual network with all its layers, and alexnet_classes.mat, which contains the class names. As shown below, the toolbox also provides helper functions for this.
You can find a number of CNN examples in the vision_toolbox\examples\cnn folder. The NXP Vision Toolbox has three CNN examples, which can detect objects in a static image, in frames from your laptop's webcam, or in frames gathered from the MIPI-CSI camera attached to the S32V board.
Below is a simple m-script that implements object classification based on the AlexNet CNN. In this example the input image is obtained from the webcam using the built-in NXP Vision Toolbox wrapper nxpvt.webcam().
Then, a new alxNet object is created based on the pre-trained AlexNet CNN provided by MATLAB.
The frames captured from the webcam are then processed one by one in an infinite loop, by feeding them to the predict() method associated with the object created at the beginning of the program.
The first five predictions are then displayed on top of the frame acquired from the webcam using another built-in NXP Vision Toolbox wrapper, nxpvt.imshow. Since this is simulation only, the program could be written without any nxpvt. prefixes, but in preparation for code generation in the next module we think it is better to get used to this notation as soon as possible.
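The flow just described can be sketched as follows. This is only an outline: nxpvt.webcam, nxpvt.CNN, predict and nxpvt.imshow are taken from the text, while the snapshot call, the predict output signature and the top-5 formatting are assumptions; the shipped cnn_alexnet.m is the authoritative version.

```matlab
% Hedged sketch of the AlexNet webcam classification loop.
cam    = nxpvt.webcam();                         % webcam wrapper from the toolbox
alxNet = nxpvt.CNN('alexnet.mat', [227 227 3]);  % pre-saved AlexNet, 227x227x3 input
data   = load('alexnet_classes.mat');            % class names saved earlier
while true
    frame  = snapshot(cam);          % capture method name is an assumption
    scores = alxNet.predict(frame);  % run the network on the current frame
    % sort the scores, pick the top five and format them as text (omitted)
    nxpvt.imshow(frame);             % display the frame with the predictions
end
```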
These 20-something lines of code create an AlexNet object and represent all the code one needs to write to run the algorithm in simulation and deploy it on the board. Before using the code, you should get hold of the pre-trained alexnet network and save it to a file using the nxpvt.save_cnn_to_file command:
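A minimal sketch of that step (the argument order of nxpvt.save_cnn_to_file is an assumption; check its help text in your installation):

```matlab
net = alexnet;                               % pre-trained network from MATLAB
nxpvt.save_cnn_to_file(net, 'alexnet.mat');  % write the network to a .mat file
```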
There are also three scripts that do this automatically: save_alexnet_to_file, save_googlenet_to_file and save_squeezenet_to_file.
Running the AlexNet CNN example is done by executing cnn_alexnet.m.
After running the m-script you should see the following information in your MATLAB figure. As can be seen, the algorithm is able to identify objects against the background with high confidence.
As you can see by looking at the cnn_alexnet.m, cnn_googlenet.m and cnn_squeezenet.m files, essentially the same code is used in all three. Apart from using the appropriate .mat file for the CNN object, all other aspects are pretty much the same.
You can also use the cnn_alexnet_image.m, cnn_googlenet_image.m and cnn_squeezenet_image.m. We have tested the CNN predictions on our colleague Marius-lucian Andrei's dog and got different results. All of them were able to figure out that it's a dog, but the most accurate one was SqueezeNet which predicted it's a Pekinese.
DISCLAIMER: The dog is mixed-breed so we can't really blame any of them :-)
In terms of speed and accuracy on frames from a webcam video in MATLAB, we observed the following performance:
These performance differences aren't necessarily conclusive, since they depend a lot on the lighting and shadows in the images. SqueezeNet was optimized for a small footprint in terms of size, and its accuracy is comparable with AlexNet's.
You can also get the images from the S32V234 board and run the algorithm in MATLAB. To do so, you have to use the S32V234 connectivity object. An example using SqueezeNet is provided in the vision_toolbox\examples\cnn\s32v234\ folder. First, you should get your board's IP address:
The example uses port 50000 by default, so be sure this port is free. The example runs SqueezeNet by using the already discussed nxpvt.CNN API to wrap the network, and creates an object for communication with the board through a TCP/IP socket. Getting frames from the camera follows the same approach as the Raspberry Pi MATLAB object, so if you're familiar with that, it should be really easy to set up. You simply need to create an nxpvt.s32v234 object that takes an IP address and a port. On top of that, in the same manner as for the camera board on the Raspberry Pi, you need another object specifying the camera index (1 for a MIPI-A-connected camera, 2 for MIPI-B) and the resolution, in the following way:
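A hedged sketch of those two objects (the IP address is a placeholder for your board's address; the camera object's name and exact signature are assumptions based on the description above):

```matlab
% Sketch only: connectivity object plus camera object, Raspberry Pi style.
s32obj = nxpvt.s32v234('192.168.1.100', 50000);    % board IP + default port
cam    = nxpvt.cameraboard(s32obj, 1, [1280 720]); % 1 = MIPI-A camera, resolution
```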
As you can see, s32obj.cameraInUse becomes true after the cameraboard object is connected to the S32V board, and the whole rigmarole of initializing and configuring the low-level drivers and the ISP-specific details is handled automatically. Furthermore, getting a frame from the camera becomes as easy as calling the snapshot method on the cameraboard object:
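Assuming the camera object was created as above:

```matlab
frame = snapshot(cam);   % current ISP frame, returned as a MATLAB image array
```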
This communication object is used in the SqueezeNet demo below, which runs the object detection algorithm in MATLAB on frames from the attached camera:
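The demo can be outlined as follows. Again, this is a hedged sketch: the camera object's name and the .mat file names are assumptions, and the shipped script in vision_toolbox\examples\cnn\s32v234\ is the authoritative version.

```matlab
% Sketch of the SqueezeNet-over-TCP/IP demo: frames come from the board,
% classification runs in MATLAB on the host.
s32obj = nxpvt.s32v234('192.168.1.100', 50000);     % placeholder IP, default port
cam    = nxpvt.cameraboard(s32obj, 1, [1280 720]);  % 1 = MIPI-A camera
sqzNet = nxpvt.CNN('squeezenet.mat', [227 227 3]);  % SqueezeNet also uses 227x227x3
data   = load('squeezenet_classes.mat');            % class names for the labels
while s32obj.cameraInUse
    frame  = snapshot(cam);           % frame streamed from the board's ISP camera
    scores = sqzNet.predict(frame);   % run the network on the received frame
    nxpvt.imshow(frame);              % display (predictions overlaid in the demo)
end
```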
If we run the above m-script in MATLAB with data from the S32V Sony camera we should obtain:
Using these pre-trained models has the limitation of only being able to classify images based on the predefined classes the models were trained on. However, you can use Transfer Learning, which we will discuss in a future article, to train the network to recognize a set of user-defined custom classes.
In conclusion, simulation in the MATLAB environment with the NXP Vision Toolbox is pretty straightforward, and the flexibility that MATLAB provides through the Deep Learning Toolbox and its pre-trained models has been fully integrated, giving MATLAB developers a friendly and familiar environment.