LPC55S69 + PowerQuad Part 4 : Matrix and Vector Processing


Eli_H
NXP Pro Support

In some of my past articles on the PowerQuad, we examined common signal processing operations (the IIR BiQuad and the Fast Fourier Transform) and showed how to use the PowerQuad DSP engine to accelerate the computations. The matrix engine in the PowerQuad can be used to perform common matrix and vector operations, freeing the M33 CPU(s) to perform other tasks in parallel. In general, the matrix engine is limited to a maximum size of 16x16 (256 elements).

 

pastedImage_1.png

Figure 1:     PowerQuad Matrix Operation Maximum Sizes

 

Hadamard Product

 

A simple but useful operation that is common in processing pipelines is the Hadamard (elementwise) product. Think of it as multiplying two signals together, sample by sample. Let us say we have two input vectors/signals that are 1x32 in size:

 

pastedImage_2.png

Figure 2.   Hadamard Product

 

A quick note: because the Hadamard product only needs a single element from each of the inputs to produce each element in the output, the actual shape of the matrix/vector is inconsequential. For example, a 16x16 matrix and a 1x256 vector would yield the same result if the input data is organized the same way in memory.
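To make the operation concrete, here is a minimal CPU-side reference sketch of the Hadamard product. This is just the math the PowerQuad matrix engine performs in hardware; the function name is my own, not from the SDK.

```c
#include <assert.h>

/* Reference (CPU) implementation of the Hadamard (elementwise) product.
 * Shape is irrelevant: a 16x16 matrix and a 1x256 vector stored the same
 * way in memory produce the same 256 output values. */
static void hadamard_product(const float *a, const float *b,
                             float *out, int length)
{
    for (int i = 0; i < length; i++) {
        out[i] = a[i] * b[i];
    }
}
```

Each output element depends on exactly one element from each input, which is why the engine can stream through the data without caring about row/column layout.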

 

The cartoon in Figure 2 illustrates a common application of the Hadamard product: windowing of a time domain signal. In my last article, we looked at how Discrete Fourier Transforms are constructed from basic mathematical operations. There was one assumption I made about the nature of the signal we were comparing to our cosine/sine references. Consider the cartoon in Figure 3.

 

pastedImage_6.png

Figure 3:    The Rectangular Window as a “Default”.

 

Let us say we captured 32 samples of a signal via an Analog to Digital Converter (ADC). In the "real" world, that signal existed before and after the 32-point "window" of time. Here is a philosophical question to consider:

 

Is there any difference between our 32 samples and an infinitely long input multiplied by a "rectangular" window of 1's around our region of interest?

 

In fact, there is not! The simple act of "grabbing" 32 samples yields a mathematical entity that is the product of some infinitely long input signal and a 32-point rectangle of 1's (with zeros elsewhere) around our signal. When we consider operations such as the Discrete Fourier Transform, what we are transforming is the input signal multiplied by a window function. Using mathematical properties of the Fourier Transform, it can be shown that this multiplication in the time domain is a convolution in the frequency domain: a copy of the window's transform is centered ("shifted") onto each frequency component of the signal. There is a lot of theory available to explain this effect, but the takeaway is that the rectangular window exists by the simple act of grabbing a finite number of samples. One of my pet peeves is seeing literature that refers to "windowed vs. non-windowed transforms". The Rush song "Free Will" has a memorable lyric:

 

"If you choose not to decide, you still have made a choice"

 

By doing nothing, we have selected a rectangular window (which shows up as a sin(x)/x artifact around the frequency bins). While we cannot eliminate the effects of the window, we do have some choice in how the window artifacts are shaped. By multiplying the input signal by a known shape, we can control the artifacts in the frequency domain caused by the window. Figure 2 shows a common window called a "Hanning" (or Hann) window.

 

In the context of the LPC55S69 and the PowerQuad, the matrix engine can be used to apply a "window" to an input signal. Since applying a window before computing a Fast Fourier Transform is a common operation, consider using the Hadamard product in the PowerQuad to do the work.

 

Vector Dot Product

 

In my last article, I showed that the Discrete Fourier Transform is the dot product between a signal and a cosine/sine reference, applied at many different frequency bins. I wanted to point this out here as the PowerQuad matrix engine can compute a dot product. While the FFT is certainly the "workhorse" of frequency domain processing, it is not always the best choice. There are use cases where you may only need to perform frequency domain analysis at a single frequency bin (or just a few). In this case, directly computing the transform via the dot product may be a better choice.

 

One constraint of using an FFT is that the bins of the resultant spectrum are spaced at the sample rate divided by the number of samples. This means the bins may not align to frequencies important to your application. The only knobs you have to adjust are the sample rate and the number of samples (which must be a power of two). There are cases where you may need to align your analysis to an exact frequency that has no convenient relationship to your sample rate. In this case, you could use the dot product operation with the exact frequencies of interest. I have worked on applications that required frequency bins that were logarithmically spaced. In these cases, directly computing the DFT was the best approach to achieve the results we needed.

 

The FFT certainly has computational advantages for many applications, but it is NOT the only method for frequency domain analysis. Speed is not always the primary requirement, so don't automatically assume you need an FFT to solve a problem. I wanted to point this out in the context of matrix processing as the PowerQuad can still be used in these scenarios to do the work, keeping the main CPU free for general purpose operations.

 

Also, I do want to mention that in these special cases there are alternate approaches besides direct computation of the DFT with the dot product, such as Goertzel's method. Even in these cases, you can use features in the PowerQuad to compute the result. In the case of Goertzel's method, the IIR BiQuad engine would be a great fit.

 

Matrix Multiplication

 

There are literally hundreds of applications where you need to efficiently perform matrix multiplication, scaling, inversion, etc. Just keep in mind that the PowerQuad can do this work efficiently if the matrix dimensions are 16x16 or smaller (9x9 in the case of inversion). One possible application that came to mind is Field Oriented Control (FOC). FOC applications use special matrix transformations to simplify analysis and transform motor currents into a direct/quadrature reference frame:

 

pastedImage_9.png

Figure 4:   DQZ Transform

 

Alpha–beta transformation - Wikipedia 

Direct-quadrature-zero transformation - Wikipedia 
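As an illustration of the small matrix products an FOC loop runs every PWM period, here is a sketch of the amplitude-invariant Clarke (alpha-beta) transform as a 2x3 matrix applied to the three phase currents. This is reference math on the CPU, not SDK code; the PowerQuad matrix engine could evaluate the same product.

```c
#include <assert.h>
#include <math.h>

/* Amplitude-invariant Clarke transform, written out from the matrix:
 *   [alpha]   [ 2/3    -1/3     -1/3   ] [ia]
 *   [beta ] = [  0   1/sqrt(3) -1/sqrt(3)] [ib]
 *                                          [ic]
 * Maps three-phase currents onto a stationary two-axis frame. */
static void clarke_transform(float ia, float ib, float ic,
                             float *alpha, float *beta)
{
    *alpha = (2.0f * ia - ib - ic) / 3.0f;
    *beta  = (ib - ic) / sqrtf(3.0f);
}
```

The follow-on Park (DQZ) rotation is another small matrix multiply of the same flavor, so the whole transform chain stays comfortably within the engine's 16x16 limit.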

 

Another neat application would be to accelerate an embedded graphics application. The PowerQuad matrix engine could handle the 2D and 3D coordinate transformations that form the basis of a "mini" vector graphics and polygon rendering capability. When I got started with computing, video games drove my interest in "how things work". I remember the awe I felt when I first saw games that could rotate shapes on a screen. It "connected" when I found a computer graphics text that showed this matrix equation:

 

pastedImage_7.png

Figure 5:   2D Vector Rotation Matrix.

 

This opened my mind to many other applications, as the magic was now accessible to me. Maybe I am just dreaming a bit, but having a hardware co-processor such as the PowerQuad can enable some interesting work!

 

Getting Started with PowerQuad Matrix Math

 

Built into the SDK for the LPC55S69 are plenty of PowerQuad examples. The "powerquad_matrix" project contains examples that exercise the PowerQuad matrix engine.

 

pastedImage_15.png

Figure 6:  PowerQuad Matrix Examples in the SDK.

 

Let us take a quick peek at the vector dot product example:

 

pastedImage_17.png

Figure 7:  PowerQuad Vector Dot Product.

 

As you can see, very little setup is required to use the PowerQuad for a matrix/vector computation. There are a handful of registers accessible over the AHB bus that need to be configured, and then the PowerQuad will do the work. I hope this article got you thinking of some neat applications for the LPC55S69 and the PowerQuad. Next time we are going to wrap up the PowerQuad articles with a neat application demonstration. After that, we are going to look at some interesting graphics and IoT applications with the LPC55S69. Stay tuned!

 

In the meantime, here are all my previous LPC55 articles just in case you missed them.

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/01/22/lpc55-mcu-series-there-...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/05/lpc5500-series-theres-a...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/20/lpc5500-series-theres-a...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/13/mini-monkey-part-1-how-...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/29/mini-monkey-part-2-usin...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/04/19/lpc55s69-mini-monkey-bu...

 

https://community.nxp.com/videos/9003

 

https://community.nxp.com/videos/8998

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/08/02/lpc55s69-embedded-graph...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/06/15/lpc55s69-powerquad-part...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/05/lpc55s69-powerquad-part...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/21/lpc55s69-powerquad-part...

 

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/28/x-ray-the-monkey-mini-m...

 
