This article covers the following topics: an introduction, getting things up and running, running the examples, code insights, and conclusions.
1. INTRODUCTION
MATLAB scripts can be deployed on NXP boards using the NXP Vision Toolbox. The toolbox utilities take care of the hardware configuration and initialization that users would otherwise have to do themselves, so MATLAB users can easily start running their own code on the boards without knowing specific hardware and low-level software details. The toolbox also includes a module that supports Convolutional Neural Network (CNN) deployment.
We will return to Transfer Learning and cover it in more detail in the next course, Course #5.
The NXP Vision Toolbox uses MATLAB's capabilities to generate CNN code that uses ARM Neon technology, which can accelerate these computation-intensive algorithms to some extent. We are actively working on support for NXP's next-generation machine learning software, AIRunner, which leverages the integrated Vision APEX accelerator, in order to deploy stable, real-time object detection, semantic segmentation and other useful algorithms that can improve the driver's experience behind the wheel.
2. GETTING THINGS UP AND RUNNING
This course assumes you have already set up the MATLAB environment in C1: HW and SW Environment Setup. The steps described there are listed below for completeness:
The Debian version can also be used, but it requires some extra packages to be installed on the board, and the current version of the toolbox does not integrate with it for deploying scripts on the board. This will most likely change in our next release, which is expected to support Debian out of the box.
The ARM Compute Library can be downloaded from https://github.com/ARM-software/ComputeLibrary
The NXP Vision Toolbox CNN examples were built using version 18.03, so download that version to avoid any backward or forward compatibility issues. Scroll down the page until you find the Binaries section.
After downloading and unpacking the ARM Compute Library, set the ARM_COMPUTELIB system variable to point to the top of the installation folder.
The ARM Compute Library folder should contain the linux-arm-v8a-neon folder with the correct libraries.
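The two conditions above (the ARM_COMPUTELIB variable and the linux-arm-v8a-neon folder) can be checked with a small host-side script. The following is a minimal Python sketch and is not part of the NXP Vision Toolbox; the function name `check_arm_compute` and the messages it returns are hypothetical and chosen only for illustration.

```python
import os
from pathlib import Path

# Messages are illustrative only; they are not produced by any NXP or ARM tool.
NOT_SET = "ARM_COMPUTELIB is not set"
NOT_A_FOLDER = "ARM_COMPUTELIB does not point to a folder"
NEON_MISSING = "linux-arm-v8a-neon folder is missing"


def check_arm_compute(env=None):
    """Return a list of problems found with the ARM Compute Library setup.

    `env` is a mapping of environment variables; it defaults to os.environ.
    An empty list means both conditions described in the text are met.
    """
    env = os.environ if env is None else env
    root = env.get("ARM_COMPUTELIB")
    if not root:
        return [NOT_SET]
    root_path = Path(root)
    if not root_path.is_dir():
        return [NOT_A_FOLDER]
    if not (root_path / "linux-arm-v8a-neon").is_dir():
        return [NEON_MISSING]
    return []


if __name__ == "__main__":
    problems = check_arm_compute()
    for problem in problems:
        print(problem)
    if not problems:
        print("ARM Compute Library setup looks fine")
```

The sketch only verifies that the folder exists; it does not check which library files are inside it or which version was downloaded.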
To run the CNN examples in the toolbox, the following MATLAB Add-Ons must be installed:
3. RUNNING THE EXAMPLES
At this point, the NXP Vision Toolbox should be ready for deploying scripts on the board.
If you are not familiar with configuring the Linux OS that runs on the NXP SBC S32V234 evaluation board, or with network configuration in Windows, please refer to this thread: https://community.nxp.com/docs/DOC-335345
To download any application generated and compiled in MATLAB to the NXP microprocessor, you simply need to set the IP address of the board in the configuration structure and then call the nxpvt_codegen script provided by the NXP Vision Toolbox.
This is just a mechanism for automating the deployment on the board. A script can also be compiled without using the Deploy option in the configuration structure. In this case, the resulting .elf and the binaries generated by MATLAB for the neural network must be copied to the board manually.
The executable (.elf file) can be found in the ../codegen/exe/SCRIPT_NAME/build-v234ce-gnu-linux-o folder if the compilation was done with optimization ( config.Optimize = true ), or in the ../codegen/exe/SCRIPT_NAME/build-v234ce-gnu-linux-d folder if the compilation was done with debug ( config.Optimize = false ).
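The folder naming rule above can be written down as a short helper. This is a minimal Python sketch for use on the host, under the assumption that the layout is exactly the one described in the text; the function name `build_dir` and its parameters are hypothetical and are not part of the toolbox.

```python
from pathlib import PurePosixPath


def build_dir(codegen_root, script_name, optimize):
    """Return the folder that should hold the .elf for a given script.

    The "-o" suffix is used for optimized builds (config.Optimize = true)
    and "-d" for debug builds (config.Optimize = false), as described in
    the text. The path is only computed, not checked on disk.
    """
    suffix = "o" if optimize else "d"
    return (
        PurePosixPath(codegen_root)
        / "exe"
        / script_name
        / f"build-v234ce-gnu-linux-{suffix}"
    )


if __name__ == "__main__":
    # cnn_squeezenet is the example script discussed later in the article.
    print(build_dir("../codegen", "cnn_squeezenet", optimize=True))
    print(build_dir("../codegen", "cnn_squeezenet", optimize=False))
```

Forward slashes are used here to match the paths as written in the article; on a Windows host the separators may differ.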
After copying the executable, you can find the network binary files directly in the codegen/ folder. The files there include the Makefile used for compilation, the labels of the classes that the network supports, and the network implementation together with the network's layers, weights and biases. The files you need to copy to the board are the weights, biases and average binary files.
If you use this manual deployment method, you also have to make sure that the loader can find libarm_compute.so and libarm_compute_core.so through the LD_LIBRARY_PATH environment variable. You can either set it to point to a custom folder into which you copied the .so files, or simply copy them to the /lib/ folder.
If this step is omitted, running the executable produces an error stating that the arm_compute shared libraries cannot be found.
Deploying with the nxpvt_codegen script takes care of all the copying for you, and no extra steps are required.
The same 3 ready-to-run examples from the previous course https://community.nxp.com/docs/DOC-343430 that are available in the NXP Vision Toolbox can be deployed with no extra changes to the scripts. The pictures below give an idea of how well the networks are doing in terms of accuracy and performance.
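For the manual deployment path, the library requirement described above can be checked before starting the executable. The following is a minimal Python sketch, assuming a Python interpreter is available wherever it is run; the names `REQUIRED_LIBS` and `missing_libs` are hypothetical. It is a simplification: it looks only in the LD_LIBRARY_PATH entries and in /lib, while the real loader also consults other locations.

```python
import os
from pathlib import Path

# The two shared libraries named in the text.
REQUIRED_LIBS = ("libarm_compute.so", "libarm_compute_core.so")


def missing_libs(ld_library_path, extra_dirs=("/lib",)):
    """Return the required libraries that are not found in any searched folder.

    `ld_library_path` is a colon-separated string, as in the environment
    variable. Empty entries are skipped in this sketch. `extra_dirs` lists
    additional folders to search; /lib is the one mentioned in the text.
    """
    dirs = [entry for entry in ld_library_path.split(":") if entry]
    dirs.extend(extra_dirs)
    return [
        lib
        for lib in REQUIRED_LIBS
        if not any((Path(folder) / lib).is_file() for folder in dirs)
    ]


if __name__ == "__main__":
    missing = missing_libs(os.environ.get("LD_LIBRARY_PATH", ""))
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("Both arm_compute libraries were found")
```

An empty result means both files were found in at least one of the searched folders.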
AlexNet Object Detector deployed on the S32V board
GoogLeNet Object Detector deployed on the S32V board
SqueezeNet Object Detector deployed on the S32V board
As expected, SqueezeNet delivers the best performance of the three, with fairly good accuracy. For a short presentation of these 3 pre-trained networks, see the second course of this tutorial, C2: Introduction to Deep Learning.
4. CODE INSIGHTS
As shown in the previous articles of this course, about 20 lines of code (including display and annotations) are enough to deploy a neural network to the board, which fits the goal of bringing the embedded world into MATLAB. We will use cnn_squeezenet.m to show how easy it is to use the toolbox to run things on the S32V.
We first need to save the SqueezeNet network object from MATLAB to a .mat file to be able to generate code from it.
Next, let's take a look at the cnn_squeezenet.m script.
We start by creating an input object that reads from the camera attached to MIPI-A, by passing 1 as the parameter to nxpvt.webcam at line 3. We create the CNN using the nxpvt.CNN object, passing it the saved squeezenet.mat that represents the actual network and the input size that the network accepts. We load squeezenet_classes.mat using the loadClassNames method of the object. We then loop to get a continuous stream from the camera, grab each image with the snapshot() method of the nxpvt.webcam object and run predict on that image. The predict method returns the classes together with their percentages, in descending order of percentage.
We display the top 5 classes and call nxpvt.toc() to determine how much time was needed to predict and display the image (measured from the nxpvt.tic call), in order to compute the frames per second. And we're done!
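The two computations in this loop, picking the top 5 classes and turning the elapsed time into frames per second, are simple enough to illustrate outside the toolbox. The following is a minimal Python sketch of the logic only; it does not use or reproduce the nxpvt API, and the function names `top_k` and `frames_per_second` are hypothetical.

```python
def top_k(classes, scores, k=5):
    """Return the k (class, score) pairs with the highest scores.

    The text says predict already returns results in descending order;
    the sort here only makes the sketch independent of that guarantee.
    """
    ranked = sorted(zip(classes, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def frames_per_second(elapsed_seconds):
    """Convert the time spent on one predict-and-display step to FPS."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return 1.0 / elapsed_seconds


if __name__ == "__main__":
    # Made-up labels and scores, used only to exercise the functions.
    labels = ["cat", "dog", "car", "cup", "pen", "key"]
    scores = [0.05, 0.40, 0.10, 0.25, 0.15, 0.05]
    for label, score in top_k(labels, scores):
        print(f"{label}: {score:.0%}")
    print(frames_per_second(0.25), "frames per second")
```

On the board, the elapsed time would come from the nxpvt.tic and nxpvt.toc pair described above rather than from a constant.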
5. CONCLUSIONS
The NXP Vision Toolbox eliminates the hassle and extra steps that would otherwise be necessary for deploying Convolutional Neural Networks on the target directly from MATLAB. It also allows custom networks that are supported by MATLAB to be run with ease, providing smooth integration and headache-free execution. In the next course, we will focus extensively on how to retrain networks with Transfer Learning.