Machine Vision Course #2: CNN Overview

Single Layer Perceptron

Before getting into the hardware-specific details, this course will cover a bit of background on how CNNs work and what they are used for. CNNs are widely used in image and video recognition applications, so they have naturally stirred up interest in the automotive world. Like other neural networks, CNNs are made up of neurons with learnable weights and biases. Each neuron receives several inputs, takes a weighted sum over them, passes the sum through an activation function, and responds with an output.

The basic building block of a neural network is the perceptron, depicted in the figure below:

[Figure: a single layer perceptron]

The perceptron consists of weights (including a special weight called the bias), a summation processor, and an activation function.

There is also an extra input node for the bias: a constant input whose trainable weight lets the neuron shift its output independently of the other inputs.

All the inputs are individually weighted, added together and passed into the activation function. 

Every activation function (or non-linearity) takes a single number and performs a certain fixed mathematical operation on it. Several activation functions are encountered in practice (a short code sketch follows the list):

  • Sigmoid: takes a real-valued input and squashes it to range between 0 and 1

               σ(x) = 1 / (1 + exp(−x))

  • tanh: takes a real-valued input and squashes it to the range [-1, 1]

               tanh(x) = 2σ(2x) − 1

  • ReLU: ReLU stands for Rectified Linear Unit. It takes a real-valued input and thresholds it at zero (replaces negative values with zero)

               f(x) = max(0, x)
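
As an illustration, here is a minimal Python sketch of these three activations (NumPy-based; the sample input values are arbitrary choices for demonstration):

    import numpy as np

    def sigmoid(x):
        # Squashes a real-valued input into the range (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    def tanh(x):
        # Squashes a real-valued input into the range [-1, 1].
        return np.tanh(x)

    def relu(x):
        # Thresholds at zero: negative values are replaced with zero.
        return np.maximum(0.0, x)

    x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(sigmoid(x))                                    # all values in (0, 1)
    print(np.allclose(tanh(x), 2 * sigmoid(2 * x) - 1))  # True: the tanh identity above
    print(relu(x))                                       # [0.  0.  0.  0.5 2. ]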

The main function of the bias is to provide every node with a trainable constant value (in addition to the normal inputs that the node receives).
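
Putting the pieces together, the following is a minimal sketch of a perceptron's forward computation; the input values, weights, and bias below are hypothetical, chosen only for illustration:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def perceptron(inputs, weights, bias, activation=sigmoid):
        # Each input is individually weighted and summed, and the bias
        # (a trainable constant) is added to the weighted sum.
        weighted_sum = np.dot(inputs, weights) + bias
        # The activation function turns the sum into the neuron's output.
        return activation(weighted_sum)

    x = np.array([0.5, -1.0, 2.0])   # hypothetical inputs
    w = np.array([0.4, 0.6, -0.2])   # hypothetical learned weights
    b = 0.1                          # hypothetical bias
    print(perceptron(x, w, b))       # a single output in (0, 1)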

Multi Layer Perceptron

A Multi Layer Perceptron (MLP) contains one or more hidden layers (in addition to one input and one output layer). While a single layer perceptron can only learn linear (linearly separable) functions, a multi layer perceptron can also learn non-linear functions such as XOR.

[Figure: a multi layer perceptron with weights and biases]

All connections have weights associated with them, and each layer has its own bias. The process by which a Multi Layer Perceptron learns is called the Backpropagation algorithm. Initially, all the edge weights are randomly assigned. For every input in the training dataset, the neural network is activated and its output is observed. This output is compared with the desired output, which is already known, and the error is "propagated" back to the previous layers, where the weights are "adjusted" accordingly. This process is repeated until the output error falls below a predetermined threshold. The "adjusting" of the weights makes use of the Gradient Descent algorithm for minimizing the error function, which we will not go into in detail; a minimal sketch of the training loop is shown below.
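
As an illustration, here is a small MLP trained with backpropagation and gradient descent on the XOR problem. The layer sizes, learning rate, epoch count, and squared-error loss are illustrative assumptions, not prescribed by the course:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Training data: XOR, a non-linear function that no single layer
    # perceptron can represent.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    # Initially all the edge weights are randomly assigned;
    # each layer has its own bias vector.
    W1 = rng.normal(size=(2, 4)); b1 = np.zeros((1, 4))
    W2 = rng.normal(size=(4, 1)); b2 = np.zeros((1, 1))
    lr = 1.0   # learning rate (illustrative choice)

    for epoch in range(5000):
        # Forward pass: the network is activated and its output observed.
        h = sigmoid(X @ W1 + b1)    # hidden layer
        out = sigmoid(h @ W2 + b2)  # output layer

        # Compare the output with the desired output.
        err = out - y

        # Backward pass: the error is "propagated" back through the layers
        # (gradient of the squared error through the sigmoids).
        d_out = err * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)

        # The weights are "adjusted" by gradient descent.
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0, keepdims=True)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0, keepdims=True)

    print(out.round(2).ravel())   # should approach [0, 1, 1, 0]

Depending on the random initialization and learning rate, the loop may need more iterations or occasionally settle in a poor local minimum; in practice one would also stop once the error drops below a chosen threshold.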

Convolutional Neural Networks
