MCUs Community Articles


Eli_H
NXP Employee

In some of my past articles on the PowerQuad, we examined some common signal processing operations (the IIR BiQuad and the Fast Fourier Transform) and then showed how to use the PowerQuad DSP engine to accelerate the computations. The matrix engine in the PowerQuad can be used to perform common matrix and vector operations, freeing the M33 CPU(s) to perform other tasks in parallel. In general, the matrix engine is limited to a maximum size of 16x16 (or 256 operations).

pastedImage_1.png

Figure 1:     PowerQuad Matrix Operation Maximum Sizes

Hadamard Product

A simple, but useful, operation that is common in a processing pipeline is the Hadamard (elementwise) product.  Think of it as multiplying two signals together.        Let us say we have two input vectors/signals that are 1x32 in size:

pastedImage_2.png

Figure 2.   Hadamard Product

A quick note: because the Hadamard product only needs a single element from each of the inputs to produce each element in the output, the actual shape of the matrix/vector is inconsequential. For example, a 16x16 matrix and a 1x256 vector would yield the same result if the input data is organized the same way in memory.

The cartoon in Figure 2 illustrates a common application of the Hadamard product: windowing of a time domain signal. In my last article, we looked at how Discrete Fourier Transforms are constructed from basic mathematical operations. There was one assumption I made about the nature of the signal we were comparing to our cosine/sine references. Consider the cartoon in Figure 3.

pastedImage_6.png

Figure 3:    The Rectangular Window as a “Default”.

Let us say we captured 32 samples of a signal via an Analog to Digital Converter (ADC). In the “real” world, that signal existed before and after the 32-point “window” of time. Here is a philosophical question to consider:

Is there any difference between our 32 samples and an infinitely long input multiplied by a “rectangular” window of 1’s around our region of interest?

In fact there is not! The simple act of “grabbing” 32 samples yields a mathematical entity that is the “product” of some infinitely long input signal and a 32-point rectangle of 1’s (with zeros elsewhere) around our signal. When we consider operations such as the Discrete Fourier Transform, what we are transforming is the input signal multiplied by a window function. Using mathematical properties of the Fourier Transform, it can be shown that this multiplication in the time domain becomes a convolution in the frequency domain, which places a “shifted” copy of the window’s Fourier Transform at each frequency component of the signal. There is a lot of theory available to explain this effect, but the takeaway is that the rectangular window exists by the simple act of grabbing a finite number of samples. One of my pet peeves is seeing literature that refers to “windowed vs non-windowed transforms”. The Rush song “Freewill” has a memorable lyric:

“If you choose not to decide, you still have made a choice”

By doing nothing, we have selected a rectangular window (which shows up as a sin(x)/x artifact around the frequency bins).    While we cannot eliminate the effects of the window, we do have some choice in how the window artifacts are shaped.  By multiplying the input signal by a known shape, we can control artifacts in the frequency domain caused by the window.    Figure 2 shows a common window called a “Hanning” window.  

In the context of the LPC55S69 and the PowerQuad, the matrix engine can be used to apply a different “window” to an input signal. Since applying a window before computing a Fast Fourier Transform is a common operation, consider using the Hadamard product in the PowerQuad to do the work.
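As a minimal host-side sketch of the windowing step (plain Python rather than PowerQuad code; the helper names `hann_window` and `hadamard` are made up for illustration), applying a window is nothing more than an elementwise product of two equal-length buffers:

```python
import math

def hann_window(n):
    """Return an n-point symmetric Hann ("Hanning") window, n >= 2."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def hadamard(a, b):
    """Elementwise (Hadamard) product of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [x * y for x, y in zip(a, b)]

N = 32
# Test signal: 3 cycles of a cosine across the 32-sample buffer.
signal = [math.cos(2.0 * math.pi * 3 * i / N) for i in range(N)]
window = hann_window(N)

# The windowed signal tapers to zero at both ends of the buffer.
windowed = hadamard(signal, window)
```

On the LPC55S69, the loop inside `hadamard` is the part that would be handed to the PowerQuad matrix engine; the window coefficients only need to be computed once.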

Vector Dot Product

In my last article, I showed that the Discrete Fourier Transform is the dot product between a signal and a cosine/sine reference applied to many different frequency bins. I wanted to point this out here as the PowerQuad matrix engine can compute a dot product. While the FFT is certainly the “workhorse” of frequency domain processing, it is not always the best choice for some applications. There are use cases where you may only need to perform frequency domain analysis at a single (or just a few) frequency bins. In this case, directly computing the transform via the dot product may be a better choice.

One constraint of using an FFT is that the bins of the resultant spectrum are spaced at the sample rate divided by the number of samples. This means the bins may not align to frequencies important to your application. The only knobs you have to adjust are the sample rate and the number of samples (which must be a power of two). There are cases where you may need to align your analysis to an exact frequency which may not have a convenient relationship to your sample rate. In this case, you could use the dot product operation with the exact frequencies of interest. I have worked on applications that required frequency bins that were logarithmically spaced. In these cases, directly computing the DFT was the best approach to achieve the results we needed.
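To make the idea concrete, here is a small host-side sketch in Python (not PowerQuad code; `dft_at_frequency` is a hypothetical helper name) that evaluates the transform at one arbitrary frequency with a pair of dot products. The buffer length of 480 is deliberately not a power of two:

```python
import math

def dft_at_frequency(x, freq_hz, sample_rate_hz):
    """Correlate x against cosine/sine references at one arbitrary frequency.

    Returns (real, imag). This is a direct dot product, not an FFT.
    """
    re = 0.0
    im = 0.0
    for n, sample in enumerate(x):
        angle = 2.0 * math.pi * freq_hz * n / sample_rate_hz
        re += sample * math.cos(angle)
        im -= sample * math.sin(angle)
    return re, im

fs = 48000.0
N = 480  # exactly 10 cycles of a 1 kHz tone at 48 kHz
tone = [math.cos(2.0 * math.pi * 1000.0 * n / fs) for n in range(N)]

re_on, im_on = dft_at_frequency(tone, 1000.0, fs)    # frequency of interest
re_off, im_off = dft_at_frequency(tone, 3000.0, fs)  # an unrelated frequency
mag_on = math.hypot(re_on, im_on)
mag_off = math.hypot(re_off, im_off)
```

The cost is one multiply/add pair per sample per frequency, so this only pays off when the number of frequencies of interest is small.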

The FFT certainly has computational advantages for many applications but it is NOT the only method for frequency domain analysis. Speed is not always the primary requirement for some applications, so don't automatically think you need an FFT to solve a problem. I wanted to point this out in the context of matrix processing as the PowerQuad could still be used in these scenarios to do the work and keep the main CPU free for general purpose operations.

Also, I do want to mention that in these special cases there are alternate approaches besides the direct computation of the DFT with the dot product, such as Goertzel’s method. Even in these cases, you can use features in the PowerQuad to compute the result. In the case of Goertzel’s method, the IIR BiQuad engine would be a great fit.
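For reference, a minimal Python sketch of Goertzel's method is shown below (host-side code with a made-up function name, not the PowerQuad API). The inner loop is a second-order recursion with two feedback terms, which is the reason a biquad engine is a natural fit for it:

```python
import math

def goertzel_magnitude(x, freq_hz, sample_rate_hz):
    """Magnitude of the spectrum of x at one frequency via Goertzel's recursion."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    coeff = 2.0 * math.cos(omega)
    s1 = 0.0  # s[n-1]
    s2 = 0.0  # s[n-2]
    for sample in x:
        s0 = sample + coeff * s1 - s2  # two-pole recursion, one multiply per sample
        s2 = s1
        s1 = s0
    # Combine the last two state values into the squared magnitude.
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return math.sqrt(max(power, 0.0))

fs = 48000.0
tone = [math.cos(2.0 * math.pi * 1000.0 * n / fs) for n in range(480)]
mag_on = goertzel_magnitude(tone, 1000.0, fs)   # tone frequency
mag_off = goertzel_magnitude(tone, 3000.0, fs)  # unrelated frequency
```

Only the final step needs the extra multiplies; everything before it is the per-sample recursion.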

Matrix Multiplication

There are literally hundreds of applications where you need to efficiently perform matrix multiplication, scaling, inversion, etc. Just keep in mind the PowerQuad can do this work efficiently if the matrix dimensions are 16x16 or smaller (9x9 in the case of inversion). One possible application that came to mind was Field Oriented Control (FOC). FOC applications use special matrix transformations to simplify analysis and transform motor currents into a direct/quadrature reference frame:

pastedImage_9.png

Figure 4:   DQZ Transform

Alpha–beta transformation - Wikipedia 

Direct-quadrature-zero transformation - Wikipedia 
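As a rough illustration of these transforms, the Python sketch below applies an alpha-beta (Clarke) transform followed by a rotation into the d/q frame (Park). It assumes the amplitude-invariant scaling (the 2/3 factor); Figure 4 may use a different scaling convention, and the function names are made up for this example:

```python
import math

def clarke(ia, ib, ic):
    """Amplitude-invariant alpha-beta transform of three phase currents (assumed scaling)."""
    alpha = (2.0 / 3.0) * (ia - 0.5 * ib - 0.5 * ic)
    beta = (2.0 / 3.0) * (math.sqrt(3.0) / 2.0) * (ib - ic)
    return alpha, beta

def park(alpha, beta, theta):
    """Rotate the alpha-beta vector into the d/q frame at electrical angle theta."""
    d = alpha * math.cos(theta) + beta * math.sin(theta)
    q = -alpha * math.sin(theta) + beta * math.cos(theta)
    return d, q

# Balanced three-phase currents of amplitude 2.0 at an arbitrary angle.
theta = 0.7
amplitude = 2.0
ia = amplitude * math.cos(theta)
ib = amplitude * math.cos(theta - 2.0 * math.pi / 3.0)
ic = amplitude * math.cos(theta + 2.0 * math.pi / 3.0)

alpha, beta = clarke(ia, ib, ic)
d, q = park(alpha, beta, theta)  # d carries the amplitude, q is ~0
```

Both steps are small matrix-vector products, which is the kind of work the matrix engine is built for.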

Another neat application would be to accelerate an embedded graphics application. I was thinking that the PowerQuad Matrix Engine could handle 2D and 3D coordinate transformations that could form the basis for a “mini” vector graphics and polygon rendering capability. When I got started with computing, video games drove my interest in “how things work”. I remember the awe I felt when I first saw games that could rotate shapes on a screen. It "connected" when I found a computer graphics text that showed this matrix equation:

pastedImage_7.png

Figure 5:   2D Vector Rotation Matrix.

This opened my mind to many other applications as the magic was now accessible to me.  Maybe I am just dreaming a bit but having a hardware co-processor such as the PowerQuad can yield some interesting work!
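A 2D rotation like the one in Figure 5 can be sketched in a few lines of Python. This assumes the usual counterclockwise convention for the rotation matrix, and the helper names are hypothetical:

```python
import math

def rotation_matrix(theta):
    """2x2 matrix that rotates a column vector counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

# Rotate the corners of a square by 90 degrees.
square = [[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
quarter_turn = rotation_matrix(math.pi / 2.0)
rotated = [mat_vec(quarter_turn, corner) for corner in square]

# A unit vector on the x axis ends up on the y axis.
x_axis_rotated = mat_vec(quarter_turn, [1.0, 0.0])
```

Every vertex of a shape goes through the same 2x2 product, so a batch of vertices maps naturally onto a single matrix multiplication.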

Getting Started with PowerQuad Matrix Math

Built into the SDK for the LPC55S69 are plenty of PowerQuad examples.   “powerquad_matrix” has plenty of examples that exercise the PowerQuad matrix engine.

pastedImage_15.png

Figure 6:  PowerQuad Matrix Examples in the SDK.

Let us take a quick peek at the vector dot product example:

pastedImage_17.png

Figure 7:  PowerQuad Vector Dot Product.

As you can see, there is actually very little setup required for the PowerQuad to perform a matrix/vector computation. There are a handful of registers on the AHB bus that need to be configured and then the PowerQuad will do the work. I hope this article got you thinking of some neat applications with the LPC55S69 and the PowerQuad. Next time we are going to wrap up the PowerQuad articles with a neat application demonstration. After that we are going to look at some interesting graphics and IoT applications with the LPC55S69. Stay tuned!

In the meantime, here are all my previous LPC55 articles just in case you missed them.

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/01/22/lpc55-mcu-series-there-...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/05/lpc5500-series-theres-a...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/20/lpc5500-series-theres-a...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/13/mini-monkey-part-1-how-...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/29/mini-monkey-part-2-usin...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/04/19/lpc55s69-mini-monkey-bu...

https://community.nxp.com/videos/9003

https://community.nxp.com/videos/8998

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/08/02/lpc55s69-embedded-graph...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/06/15/lpc55s69-powerquad-part...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/05/lpc55s69-powerquad-part...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/21/lpc55s69-powerquad-part...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/28/x-ray-the-monkey-mini-m...


Eli_H
NXP Employee

I had some design updates for “Rev B” of my Mini-Monkey design that I wanted to get in the "queue" for testing. For the next revision, I wanted to try PCB:NG for the board fabrication and assembly. PCB:NG is an “on-demand” PCB assembly service focused on turnkey prototypes via a simple web interface. The pricing looked attractive and it appeared that the Mini-Monkey fit within their standard design rules. The Mini-Monkey design uses an NXP LPC55S69 microcontroller that is in a 0.5mm pitch VFBGA98 package. NXP offers guidance on how to use this device with low-cost design rules and I thought this would be a great test for PCB:NG. I had success with Rev A at Macrofab and thought I would give PCB:NG a shot.

Getting your design uploaded is straightforward with PCB:NG.   You can upload your Gerber files and get a preview of the PCB.  As you move through the process, the web interface will give you an updated price:

pastedImage_1.png

Figure 1: PCB:NG Gerber Upload

The online PCB:NG interface includes a Design For Manufacture (DFM) check.  The check is exhaustive and includes all the common DFM rules such as trace width, clearance, drill hits etc.  In my case, I had some features that violated minimum solder mask slivers and copper to board outline clearances.  The online tool allows you to “ignore” DFM violations that may not be an issue.   I was able to look through all the violations and mark which ones were of no concern.

Once the Gerber files are uploaded, you can add your parts as well as the pick/place data. The PCB:NG interface will show you part pricing and availability as soon as your Bill of Materials (BOM) is uploaded. You have the option to mark parts as Do Not Place (DNP) if you do not want them populated. In my case, I had 2 components on the Mini-Monkey BOM (a battery and a display) that I did not include as they required some manual assembly steps that I was going to perform once I had the units in hand.

pastedImage_2.png

Figure 2 : PCB:NG BOM Upload

Along with the BOM, you must upload XYRS placement data. The XYRS data can be combined in the spreadsheet file used for the BOM. The PCB:NG viewer will also show you where it thinks all the placements are, and you can make manual adjustments if necessary.

pastedImage_3.png

Figure 3 : PCB:NG Part Placement Interface

 

Results!

I had placed my order on 2020-06-10. Throughout the process, PCB:NG sent email updates when materials were in house, when production started, etc. I did have to send in a note that one of the parts (a MEMS microphone) was sensitive to cleaning processes. I received a response the same day noting the exception (PCB:NG uses a no-clean process) and saying that they would add the part to their internal database of exceptions.

I had placed the order when they were in the middle of some equipment upgrades. When I checked the price a few days ago, I found that it was lower ($380 vs $496) after the new process upgrades. I consider the service a huge value given that they handle some potentially difficult parts. Getting the BGA packaged microcontroller and the LGA packaged MEMS microphone soldered professionally was well worth the price. The boards shipped out 2020-06-29. It was a bit longer than the published lead time but communication during the process was good. I think I caught the team in the middle of some equipment upgrades which may have delayed things a bit. PCB:NG took some extra time to get me photos from the X-ray inspection of the BGA and LGA parts. Getting these photos was well worth the wait!

pastedImage_4.png

Figure 4: LPC55S69 VFBGA98 Post Assembly X-ray - View 1

pastedImage_5.png

Figure 5: LPC55S69 VFBGA98 Post Assembly X-ray – View 2

pastedImage_6.png

Figure 6: MEMS Microphone (LGA) Post Assembly X-ray

As you can see from the X-ray images, the solder joints were good. It was also cool seeing the via structures in the PCB and bond wires in the IC packages. You can even see tiny little via structures in the VFBGA98 package itself. How did the build turn out? Here is a video of the Mini-Monkey Rev B:

Video Link : 10227 

PCB:NG was also kind enough to show the Mini-Monkeys getting setup for placement:

Video Link : 10228 

Final Thoughts

The experience with PCB:NG was excellent. The boards turned out great and I was able to test all my changes quickly. Having someone else handle part procurement and assembly is a huge value to me as it allows me to focus on other aspects of the design such as firmware development for the board bring-up. One possible improvement to the online PCB:NG interface would be the ability to submit ODB++ or IPC-2581 data. These formats bake in more information and could really streamline design upload. I will certainly be using PCB:NG in the future for my prototypes. The on-demand model is helpful, especially when you are busy and need some help accelerating your development efforts.

Onward to Revision C!    I think I may add eMMC storage and improve the battery circuit.   If you want to see the current raw design files,   they are available on BitBucket in Altium Designer format.

Hardware:

https://bitbucket.org/ehughes_/minimonkey-hw/src/master/ 

Test Software:

https://bitbucket.org/ehughes_/minimonkey-sw/src/master/ 

I'll be posting more updates on the Mini-Monkey as new revisions are complete. In the meantime, here are other articles and videos related to the LPC55S69. Cheers!

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/01/22/lpc55-mcu-series-there-...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/05/lpc5500-series-theres-a...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/20/lpc5500-series-theres-a...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/13/mini-monkey-part-1-how-...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/29/mini-monkey-part-2-usin...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/04/19/lpc55s69-mini-monkey-bu...

https://community.nxp.com/videos/9003

https://community.nxp.com/videos/8998

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/06/15/lpc55s69-powerquad-part...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/05/lpc55s69-powerquad-part...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/21/lpc55s69-powerquad-part...

Eli_H
NXP Employee

In my last article, we examined a common time domain filter called the Biquad and how it could be computed using the LPC55S69 PowerQuad engine.        We will now turn our attention to another powerful component of the PowerQuad, the “Transform Engine”.      The PowerQuad transform engine can compute a Fast Fourier Transform (FFT) in both a power and time efficient manner leaving your main CPU cores to handle other tasks.  

Before we look at the implementation on the LPC55S69, I want to illustrate what exactly an FFT does to a signal. The meaning of the data is often glossed over or, even worse, explained in purely mathematical terms without a description of *context*. I often hear descriptions like “transform a signal from the time domain to the frequency domain”. While these types of descriptions are accurate, I think that many do not get an intuitive feel for what the numbers *mean*. I remember my first course in ordinary differential equations. The professor was explaining the Laplace transform (which is a more generalized case of the Fourier Transform) and I asked the question “What does s actually mean in a practical use case?”.

pastedImage_1.png

Figure 1. Laplace Transform. What is s?

My professor was a brilliant mathematician and a specialist in complex analysis. He could explain the transform from 3 different perspectives with complete mathematical rigor. Eventually we both got frustrated and he said “s will be a large complex number in engineering applications”. It turned out the answer was simple in terms of the electrical engineering problems we were solving. After many years and using Laplace again in Acoustic Grad School, it made sense, but at the time it was magical. I hope to approach the FFT a bit differently and you will see it is simpler than you may think. While I cannot address all the aspects of using FFTs in this article, I hope it gives you a different perspective for “getting started”.

Rulers, Protractors, and Gauges

One of my favorite activities is wood working.   I am not particularly skilled in the art, but I enjoy using the tools, building useful things, and admiring the beauty of the natural product.  I often tell people that “getting good” at wood working is all about learning how to measure, gauge and build the fixtures to carry out an operation.      When you have a chunk of wood, one of the most fundamental operations is to measure its length against some fixed standard.      Let us begin with a beautiful chunk of Eastern Hemlock:

pastedImage_14.png

Figure 2.  A 12” x 3” x 26” Piece of rough sawn eastern hemlock. 

One of the first things we might want to do with this specimen is use some sort of standard gauge to compare it to:

pastedImage_17.png

Figure 3.  Comparing our wood against a reference.

We can pick a standard unit and compare our specimen to a scale based upon that unit. In my case the unit was “inches” of length, but it could be anything that helps us solve our problem at hand. Often you want to pick a unit and coordinate system that scales well to the problem at hand. If we want to measure circular “things”, we might use a protractor as it makes understanding the measurement easier. The idea is to work in a system that is in the “coordinate system” of your problem. It makes sense to use a ruler to measure a “rectangular” piece of wood.

What does this have to do with DSP and Fourier transforms? I hope to show you that a Fourier Transform (and its efficient discrete implementation, the FFT) is simply a set of gauges that can be used to understand a time domain signal. We can then use the PowerQuad hardware to carry out the “gauging”. For the sake of this discussion, let us consider a time domain signal such as this:

pastedImage_20.png

Figure 4.   An example time domain signal

This particular signal is a bit more complex than the simple sine wave used in previous articles. How exactly would we “gauge” this signal? Amplitude? Frequency? Compute statistics such as variance? Most real-world signals can have quite a bit of complexity, especially when they are tied to some physical process. For example, if we are looking at the result of some sort of vibration measurement, the signal could look very complicated as there are many physical processes contributing to the shape. In vibration analysis, the physical “things” we are examining will move and vibrate according to well understood physics. The physics show that the systems can be modeled with even order differential equations. This means we can always write the behavior of the system over time as the sum of sinusoidal oscillations. So, what would be a good gauge to use to examine our signal? Well, we could start with a cosine wave at some frequency of interest:

pastedImage_23.png

Figure 5.   Gauging our signal against a cosine wave.

Choosing a cosine signal as a reference gauge can simplify the problem as we can easily identify the properties of our unit of measure,  i.e. its frequency and amplitude.      We can fix the amplitude and frequency of our reference and then compare it to our signal.     If we do our math correctly, we can get a number that indicates how well correlated our input signal is to a cosine wave of a particular frequency and a unit amplitude.     So, how exactly do we perform this correlation?     It turns out to be a simple operation.    If we think about the input signal and our reference gauge as discrete arrays of numbers (i.e. vectors),  we compute the dot-product between them:

pastedImage_26.png

Figure 6.   Computing the correlation between a test signal and our reference gauge.

The operation is straightforward. Both your signal input and the “gauge” have the same number of samples. Multiply the corresponding elements of each array together and add up the results. Using an array-like notation, the input is represented by “x[n]” and the gauge is represented by “re[n]” where n is an index in the array:

Output = x[0]*re[0] + x[1]*re[1] + x[2]*re[2] + ...

What we end up with is a single number (scalar). Its magnitude is proportional to how well correlated our signal is to the particular gauge we are using. As a test, you could write some code and use a cosine wave as your input signal. The test code could adjust the frequency of the input and, as the frequency of the input gets closer to the frequency of the gauge, the magnitude of the output would go up.
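A minimal sketch of that test in Python might look like the following. It is host-side illustration code with made-up helper names, using a 64-sample buffer and a gauge that completes 4 cycles across it:

```python
import math

N = 64              # samples in both the input and the gauge
GAUGE_CYCLES = 4.0  # the gauge completes 4 cycles across the buffer

def cosine(cycles, n_samples):
    """Cosine wave that completes `cycles` periods over n_samples points."""
    return [math.cos(2.0 * math.pi * cycles * n / n_samples) for n in range(n_samples)]

def dot(a, b):
    """Multiply matching elements and add up the results."""
    return sum(x * y for x, y in zip(a, b))

gauge = cosine(GAUGE_CYCLES, N)

# Try an input far from, near, and exactly at the gauge frequency.
results = {c: dot(cosine(c, N), gauge) for c in (2.0, 3.75, 4.0)}
for cycles, output in results.items():
    print(f"input cycles = {cycles:4.2f}  output = {output:8.3f}")
```

The output is near zero for the far-away input, larger for the nearby one, and largest (N/2) when the input matches the gauge. With a single gauge the result also depends on the phase of the input.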

As you can see, the math here is just a bunch of multiplies and adds, just like the IIR filter from our last article. There is one flaw, however, with this approach. There is a special case of the input where the output will be zero. If the signal input is a cosine wave of the *exact* same frequency as the gauge and is 90 degrees phase shifted with respect to the reference gauge, we would get a zero output.

pastedImage_28.png

Figure 7.  A special case of our reference gauge that would render zero output.
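The special case is easy to reproduce. In this small Python sketch (illustrative host-side code, not from the SDK), the input has the same frequency as the gauge but is shifted by 90 degrees:

```python
import math

N = 64
CYCLES = 4

def dot(a, b):
    """Multiply matching elements and add up the results."""
    return sum(x * y for x, y in zip(a, b))

angles = [2.0 * math.pi * CYCLES * n / N for n in range(N)]
gauge = [math.cos(a) for a in angles]
in_phase = [math.cos(a) for a in angles]
shifted = [math.cos(a - math.pi / 2.0) for a in angles]  # same frequency, 90 degrees late

in_phase_output = dot(in_phase, gauge)  # large: N/2
shifted_output = dot(shifted, gauge)    # essentially zero
```

The shifted input contains exactly as much energy at the gauge frequency as the in-phase input, yet the single cosine gauge reports none of it.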

This is not desirable as we can see that the input is correlated to our reference gauge; it is just shifted a bit in time. There is a simple fix and we can even use our piece of hemlock lumber to illustrate.

pastedImage_32.png

Figure 8.  Gauging along a different side of the lumber.

In Figure 3, I showed a ruler along the longest length of the wood. We can also rotate the ruler and measure along the shorter side. It is the same gauge, just used a different way. Imagine that board was only 1” wide but 24” long. I could ask an assistant to use a ruler and measure the board. Which of those two numbers is “correct”? The assistant could report to me either of those numbers and be technically correct. We humans generally assume length to be the longer side of a rectangular object but there is nothing special about that convention. In Figure 6, we were only measuring along one “side” of the signal. It is possible to get a measurement that is zero (or very small) while having a signal that looks very similar to the gauge (like in Figure 7). We can fix this by “rotating” our ruler similar to Figure 8 and measuring along both “sides” of the signal.

pastedImage_36.png

Figure 9.  Using two reference gauges.  One is “rotated” 90 degrees.

In Figure 9, I added another “gauge” labeled “A” in purple. The original gauge is labeled “B”. The only difference between the two gauges is a 90 degree phase shift. This is equivalent to rotating my ruler in Figure 8 and measuring the “width” of my board. In Figure 9, I am showing 3 of the necessary multiply/add operations but you would carry out the multiply/add for all points in the signal. Writing it out:

B = x[0]*Re[0] + x[1]*Re[1] + x[2]*Re[2] + ...

A = x[0]*Im[0] + x[1]*Im[1] + x[2]*Im[2] + ...

In this new formulation we get a pair of numbers A,B for our output.   Keep in mind that we are gauging our input against a *single* frequency of reference signals at a unit amplitude.     This is analogous to measuring the length and width of our block of wood.      Another way of thinking about it is that we now have a measuring tool that evaluates along 2 axes which are “orthogonal”.     It is almost like a triangle square.

pastedImage_39.png

Figure 10.  A two-axis gauge.

Once we have our values A & B, it is typical to consider them as a single complex number:

Output = B + iA

The complex output gives us a relative measure of how we are correlated to our reference gauges. To get a relative amplitude, simply compute the magnitude:

||Output|| = sqrt(A^2 + B^2)

You could even extract the phase:

Phase = arctan(A/B)

It is common to think about the output in “polar” form (magnitude/phase). In vibration applications you typically want to understand the magnitude of the energy at different frequency components of a signal. There are applications in communications, such as orthogonal frequency-division multiplexing (OFDM), where you work directly with the real and imaginary components.
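Putting the two gauges together, a minimal Python sketch of the measurement could look like this. It assumes the “Re” gauge is a cosine and the “Im” gauge is a sine of the same frequency (the sign convention in the figures may differ), and `two_axis_gauge` is a made-up name:

```python
import math

def two_axis_gauge(x, cycles):
    """Correlate x against a cosine gauge (B) and a sine gauge (A) of the same frequency."""
    n_samples = len(x)
    a = 0.0  # sine gauge, "Im" (assumed convention)
    b = 0.0  # cosine gauge, "Re"
    for n, sample in enumerate(x):
        angle = 2.0 * math.pi * cycles * n / n_samples
        b += sample * math.cos(angle)
        a += sample * math.sin(angle)
    return a, b

N = 64
# The troublesome input: same frequency as the gauges, shifted by 90 degrees.
x = [math.cos(2.0 * math.pi * 4 * n / N - math.pi / 2.0) for n in range(N)]

a, b = two_axis_gauge(x, 4)
magnitude = math.hypot(a, b)  # sqrt(A^2 + B^2)
phase = math.atan2(a, b)      # angle of B + iA, quadrant-aware
```

`atan2` is used rather than a plain arctangent of a ratio so that the quadrant is preserved and a zero real part does not cause a division by zero. With both gauges in play, the 90 degree shifted input now reports its full magnitude of N/2.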

I previously stated that the correlation we were performing is essentially a vector dot product operation.    The dot product shows up in many applications. One of which is dealing with vectors of length 2 where we use the following relationship:

pastedImage_42.png

The interesting point here is that the dot product is a simple way of getting a relationship of the angle and magnitude between two vectors a and b. It is easy to think about a and b as vectors on a 2D plane, but the relationship extends to vectors of any length. For digital data, we work with discrete samples, so we define everything in terms of the dot product. We are effectively using this operation to compute magnitudes and find angles between “signals”. In the continuous time world, there is the concept of the inner-product space. It is the “analog” equivalent of the dot product and underpins the mathematical models for many physical systems.
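Assuming the relationship pictured above is the familiar a·b = |a||b|cos(θ), the angle between two vectors of any length can be recovered with a few lines of Python (the helper names are illustrative only):

```python
import math

def dot(a, b):
    """Multiply matching elements and add up the results."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length (magnitude) of a vector."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in radians between two non-zero vectors of equal length."""
    cos_theta = dot(a, b) / (norm(a) * norm(b))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding just past +/-1
    return math.acos(cos_theta)

right_angle = angle_between([1.0, 0.0], [0.0, 1.0])
half_right_angle = angle_between([1.0, 0.0], [1.0, 1.0])
```

The same function works unchanged on two 64-sample signals; a sine and a cosine of the same frequency come out “perpendicular”, which is the zero-output case described earlier.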

At this point we could stop and have a brute force technique of comparing a signal against a single frequency reference.    If we want to determine if a signal had a large component of a particular frequency, we could tailor our reference gauges to the *exact* frequency we are looking for.  The next logical step is to compare our signal against a *range* of reference gauges of different frequencies:

pastedImage_44.png

Figure 11: Using a range of reference gauges at different frequencies.

In Figure 11, I show four different reference gauges at frequencies that have an integer multiple relationship. There is no limit to the number of frequencies you could use.     With this technique, we can now generate a “spectrum” of outputs at all the frequencies of interest for a problem.    This operation has a name:  the Discrete Fourier Transform (DFT).    One way of writing the operation is:

pastedImage_2.png

Figure 12. The Discrete Fourier Transform (DFT)

N is the number of samples in the input signal.

k is the frequency index of the cosine/sine reference gauges. We can generate a “frequency” spectrum by computing the DFT over a range of “k” values. It is common to use a linear spacing when selecting the frequencies. For example, if your sample rate is 48 kHz and you are using N=64 samples, it is common (we will see why later) to use 64 reference gauges spaced (48000/64) Hz = 750 Hz apart.
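As a sketch of the direct computation (Python on a host, not PowerQuad code), the DFT is one complex dot product per value of k. The example assumes the common exp(-j·2π·k·n/N) sign convention, which may differ from the form shown in Figure 12:

```python
import cmath
import math

def dft(x):
    """Direct DFT: one complex dot product per frequency index k."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples) for n in range(n_samples))
        for k in range(n_samples)
    ]

fs = 48000.0
N = 64
bin_spacing_hz = fs / N  # spacing between reference gauges

# A 3 kHz tone lands exactly on one of the gauges.
x = [math.cos(2.0 * math.pi * 3000.0 * n / fs) for n in range(N)]
spectrum = dft(x)

# Look for the strongest gauge in the first half of the spectrum.
peak_bin = max(range(N // 2), key=lambda k: abs(spectrum[k]))
peak_hz = peak_bin * bin_spacing_hz
```

This brute-force version needs N multiply/adds for each of the N gauges, which is the cost the FFT reduces.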


The “Fast Fourier Transform”

The Fast Fourier Transform is a numerically efficient method of computing the DFT. It was developed by J. W. Cooley and John Tukey in 1965 as a method of performing the computation with fewer adds and multiplies as compared to the direct implementation shown in Figure 12. The development of the FFT was significant as we can do our number crunching much more efficiently by imposing a few restrictions on the input. There are a few practical constraints that need to be considered when using an FFT implementation:

  1. The length of your input must be a power of 2.   i.e. 32, 64, 128, 256.
  2. The “bins” of the output are spaced in frequency by the sample rate of your signal divided by the number of samples in the input. As an example, if you have a 256-point signal sampled at 48 kHz, the array of outputs corresponds to frequencies spaced at 187.5 Hz. In this case the “bins” would correlate to 0 Hz, 187.5 Hz, 375 Hz, etc. You cannot have arbitrary input lengths or arbitrary frequency spacing in the output.
  3. When the inputs to the FFT/DFT are “real numbers” (i.e. samples from an ADC), the array of results exhibits a special symmetry. Consider an input array of 256 samples. The FFT result will be 256 complex numbers. The 2nd half of the output is a “mirror” (complex conjugates) of the 1st half. This means that for a 256-sample input, you get 128 usable “bins” of information. Each bin has a real and imaginary component. Using our example in #2, the bins would be aligned to 0 Hz, 187.5 Hz, 375 Hz, all the way up to one half of our sample rate (24 kHz).
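Constraints 2 and 3 can be checked numerically. The Python sketch below (illustrative host-side code using a direct DFT and arbitrary sample values) computes the bin frequencies for the 256-point, 48 kHz example and verifies the mirror symmetry on a short real-valued input:

```python
import cmath

def dft(x):
    """Direct DFT, used here only to check the symmetry of the result."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples) for n in range(n_samples))
        for k in range(n_samples)
    ]

# Constraint 2: bin spacing is the sample rate divided by the input length.
fs = 48000.0
N = 256
spacing_hz = fs / N
bin_freqs = [k * spacing_hz for k in range(N // 2)]  # the usable bins

# Constraint 3: for real input, bin N-k is the complex conjugate of bin k.
samples = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
X = dft(samples)
mirror_error = max(
    abs(X[len(samples) - k] - X[k].conjugate()) for k in range(1, len(samples))
)
```

`mirror_error` comes out at the level of floating-point rounding, and the 0 Hz bin has no imaginary part.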

You can read more details about how the FFT works as well as find plenty of instructional videos on the web. Fundamentally, the algorithm expresses the DFT of signal length N recursively in terms of two DFTs of size N/2.  This process is repeated until you cannot divide the intermediate results any further.   This means you must start with a power of 2 length.  This particular formulation is called the Radix-2 Decimation in Time (DIT) Fast Fourier Transform.  The algorithm gains its speed by re-using the results of intermediate computations to compute multiple DFT outputs.  The PowerQuad uses a formulation called “Radix-8” but the same principles apply.
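The recursive splitting described above can be sketched compactly in Python. This is the textbook radix-2 DIT formulation for illustration only; it is not the Radix-8 scheme the PowerQuad uses, and it is written for clarity rather than speed:

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of 2")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # DFT of the even-indexed samples (size N/2)
    odd = fft_radix2(x[1::2])   # DFT of the odd-indexed samples (size N/2)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle           # each intermediate result is reused
        out[k + n // 2] = even[k] - twiddle  # for two different outputs
    return out

def dft(x):
    """Direct DFT for comparison."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

samples = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
max_error = max(abs(a - b) for a, b in zip(fft_radix2(samples), dft(samples)))
```

Both functions produce the same spectrum up to rounding; the FFT simply gets there with far fewer multiplies as N grows.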

Using the PowerQuad FFT Engine

The underlying math to a DFT/FFT boils down to multiplies and adds along with some buffer management.   The implementation can be pure software, but this algorithm is a perfect use case for a dedicated coprocessor.    The good news is that once you understand the inputs and outputs of a DFT/FFT, using the PowerQuad is quite simple and you can really accelerate your particular processing task. The best way to get started with using the PowerQuad FFT is to look at the examples in the SDK.     There is an example project called “powerquad_transform” which has examples that test the PowerQuad hardware.

pastedImage_1.png

Figure 13.  PowerQuad Transform examples in the MCUXpresso SDK for the LPC55S69

In the file powerquad_transform.c, there are several functions that will test the PowerQuad engine in its different modes.     For now, we are going to focus on the function PQ_RFFTFixed16Example(void).

pastedImage_4.png

This example will set up the PowerQuad to accept data in a 16-bit fixed-point format. To test the PowerQuad, a known sequence of input and output data is used to verify results. The first thing I would like to point out is that the PowerQuad transform engine uses fixed-point/integer processing only. If your data is in floating point, you will need to convert it beforehand. This is possible with the matrix engine in the PowerQuad. I personally only ever use FFTs with fixed-point data, as most of my source data comes straight from an analog-to-digital converter. Because of the processing gain of the FFT, I have never seen any benefit of using a floating-point format for FFTs other than some ease of use for the programmer. Let us look at the buffers used in the example:

pastedImage_7.png

Notice the input data length, FILTER_INPUT_LEN (which is 32 samples). The arrays used to store the outputs are twice that length. Remember that an FFT will produce the same number of *complex* samples in the output as there are samples in the input. Since our input samples are real values (scalars) and the outputs have real/imaginary components, it follows that we need twice the length to store the result. I stated before that one of the implications of the FFT with real-valued inputs is that we have a mirror spectrum with complex conjugate pairs. Focusing on the reference for testing the FFT output in the code:

pastedImage_10.png

The 1st pair 100,0 corresponds to the 1st bin which is a “DC” or 0Hz component.  It should always have a “zero” for the imaginary component.    The next bins can be paired up with bins from the opposite end of the data:

76,-50  <->   77,49

29,-62   <->   29, 61

-1, -34   <->  -1,33

These are the complex conjugate pairs exhibiting mirror symmetry. You can see that they are not quite equal. We will see why in a moment. After all the test data is initialized, there is a data structure used to initialize the PowerQuad:

pastedImage_1.png

One of the side effects of computing an FFT is that you get gain at every stage of the process. When using integers, it is possible to get clipping/saturation, so the input needs to be downscaled to ensure the signal does not numerically overflow during the FFT process. The macro FILTER_INPUTA_PRESCALER is set to “5”. This comes from the length of the input being 32 samples or 2^5. The core function of the Radix-2 FFT is to keep splitting the input signal in half until you get to a 2-point DFT. It follows that we need to downscale by 2^5 as we can possibly double the intermediate results at each stage in the FFT. The PowerQuad uses a Radix-8 algorithm, but the need for downscaling is effectively the same. I believe that some of the inaccuracy we saw in the complex conjugate pairs in the test data was from the combination of input array values that are numerically small and the pre-scale setting. Note that the pre-scaling is a built-in hardware function of the PowerQuad.
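The arithmetic behind the prescaler can be sketched in a few lines of Python. This is only a model of the reasoning above, not of the PowerQuad hardware: the helper names are my own, and I am assuming the downscale behaves like a plain arithmetic right shift, which may not match the hardware's rounding.

```python
Q15_MAX = 32767  # largest positive value in a signed 16-bit sample


def prescale_bits(fft_length):
    """Right shift that keeps an N-point FFT of full-scale input in range."""
    if fft_length <= 0 or fft_length & (fft_length - 1):
        raise ValueError("FFT length must be a power of 2")
    return fft_length.bit_length() - 1  # log2(N)


def worst_case_bin(samples):
    """Upper bound on any output bin magnitude: all samples add in phase."""
    return sum(abs(s) for s in samples)


N = 32
full_scale = [Q15_MAX] * N

shift = prescale_bits(N)
unscaled_peak = worst_case_bin(full_scale)
scaled_peak = worst_case_bin([s >> shift for s in full_scale])

print(f"shift = {shift} bits")
print(f"worst case without prescale: {unscaled_peak}")
print(f"worst case with prescale:    {scaled_peak}")

# The cost: small inputs lose resolution. Anything below 2**shift becomes 0.
small_sample_after_shift = 20 >> shift
print(f"a sample of 20 becomes {small_sample_after_shift}")
```

The last line hints at why numerically small test inputs combined with a 5-bit prescale can produce results that are slightly off.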

The PowerQuad needs an intermediate area to work from. There is a special 16KB region starting at address 0xE0000000 dedicated to the PowerQuad. The PowerQuad has a 128-bit interface to this region, so it is optimal to use this region for the FFT temporary working area. You can find more details about this private RAM in AN12292 and AN12383.

Once you configure the PowerQuad, the next step is to tell the PowerQuad where the input and result data are stored with the function PQ_transformRFFT().

pastedImage_3.png

Notice in the implementation of the function, all that is happening is setting some more configuration registers over the AHB bus and kicking off the PowerQuad with a write to the CONTROL register. In the example code, the CPU blocks until the PowerQuad is finished and then checks the results. It is important to point out that in your own application, you do not have to block until the PowerQuad is finished. You could set up an interrupt handler to flag completion and do other work with the general purpose M33 core. Like I stated in my article on IIR filtering with the PowerQuad, the example code is a good place to start but there are many opportunities to optimize your particular algorithm. Example code tends to include additional logic to check function arguments to make the initial experience better. Always take the time to look through the code to see where you can remove boilerplate that might not be useful.

Parting Thoughts

  • The PowerQuad includes a special engine for computing Fast Fourier Transforms.
  • The FFT is an efficient implementation of the Discrete Fourier Transform. This process just compares a signal against a known set of reference gauges (sines and cosines).
  • The PowerQuad has a private region to do its intermediate work. Use it for best throughput.
  • Also consider the memory layout and AHB connections of where your input and output data lives. There may be additional performance gains by making sure your input DSP data is in a RAM block that is on a different port than the RAM used in your application for general purpose tasks. This can help with contention when different processes are accessing data. For example, SRAM0–3 are all on different AHB ports. You might consider locating your input/output data in SRAM3 and having your general-purpose data in SRAM0-2. Note: You still need to use 0xE0000000 for the PowerQuad TEMP configuration for its intermediate working area.

pastedImage_5.png

At this point you can begin looking through the example transform code. Also make sure to read through AN12292 and AN12383 for more details. While there are more nuances and details to FFT and “frequency domain” processing, I will save those for future articles. Next time I hope to show some demos of the PowerQuad FFT performance on the Mini-Monkey and illustrate some other aspects of the PowerQuad. Until then, check out some of the additional resources below on the LPC55S69.

Additional LPC55S69 resources:

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/01/22/lpc55-mcu-series-there-...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/05/lpc5500-series-theres-a...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/20/lpc5500-series-theres-a...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/13/mini-monkey-part-1-how-...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/29/mini-monkey-part-2-usin...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/04/19/lpc55s69-mini-monkey-bu...

https://community.nxp.com/videos/9003

https://community.nxp.com/videos/8998

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/06/15/lpc55s69-powerquad-part...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/07/05/lpc55s69-powerquad-part...

Eli_H
NXP Employee

In my last article, we started discussing the PowerQuad engine in the LPC55S69 as well as the concept of data in the “time domain”. Using the Mini-Monkey board, we showed the function of collecting a bucket of data over time. I chose to use a microphone as a data source as it is easy to visualize and understand. You can now easily imagine replacing the microphone with *anything* that changes over time. In this article we are going to look at some common algorithms for processing data in the time domain. In particular, we will look at the “Dual Biquad IIR” engine in the LPC55S69 PowerQuad. An IIR biquad is a commonly used building block as it is possible to configure the filter for many common filtering use cases. This article is not intended to review all of the DSP theory behind IIR filter implementations but I do want to highlight some key points and the PowerQuad implementation.

Digital Filtering with Embedded Microcontrollers

When sampling data “live”, one can imagine data being continuously recorded at a known rate.      A time domain filter will accept this input data and output a new signal that is modified in some way.

pastedImage_2.png

Figure 1.   Filtering In the Time Domain

The concept here is that the output of the filter is just another time domain signal. You may choose to do further processing on this new signal or output it to a Digital-to-Analog Converter (DAC). If we are thinking in terms of “sine waves”, a digital filter adjusts the amplitude and phase of the input signal. As we apply different frequency inputs (or a sum of different frequencies), the filter attenuates or amplifies the sinusoidal components. So, how does one compute a digital filter? It is quite simple. Let us start with a simple case:

pastedImage_4.png

Figure 2.   Sample by Sample Filter Processing using a History of the Input

One operation we perform is to *mix* the most recent input sample with samples we have previously recorded. The result of this operation is our next *output* sample. The name of this filter configuration is an FIR or Finite Impulse Response filter. One way to write this algorithm is to use a “C array style” notation and difference equations.

x[n]       The current input

x[n-1]     Our previous input

x[n-2]     An input from 2 samples ago

y[n]       Our next output

 

Figure 2 could be written as

y[n] = b0*x[n] + b1*x[n-1]  + b2*x[n-2]

All we are doing is multiplying our input sample and its history by constant coefficients and then adding them up.    We are multiplying then accumulating! The constants b0, b1 and b2 control the frequency response of the filter.    By choosing these numbers correctly, we can attenuate “high” frequencies (low pass filter), attenuate low frequencies (high pass filter), or perform some combination of the two (band pass filter).     We can also use more samples from the input history.    For example, instead of just using the previous 3 samples, one could use 128 samples.      A filter of this type (FIR) can require quite a bit of time history to get precise control over its frequency response.         The code to implement this structure is simple but can be very CPU intensive as you need to do the multiply and adds for *every* sample at your signal sample rate.  
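As a minimal sketch, the multiply-and-accumulate loop described above looks like this in Python. The function name and the sample coefficients are my own, picked only to illustrate the structure. On an MCU this would be C (or a hardware engine), but the arithmetic is the same.

```python
def fir_filter(samples, coeffs):
    """y[n] = b0*x[n] + b1*x[n-1] + ... using a history of past inputs."""
    # history[0] holds x[n], history[1] holds x[n-1], and so on
    history = [0.0] * len(coeffs)
    out = []
    for x in samples:
        # Shift the newest sample in and drop the oldest one
        history = [x] + history[:-1]
        # Multiply each tap by its coefficient, then accumulate
        out.append(sum(b * h for b, h in zip(coeffs, history)))
    return out


# Usage: feeding in a single "1" followed by zeros plays back the coefficients
impulse_response = fir_filter([1.0, 0.0, 0.0, 0.0, 0.0], [0.5, 0.3, 0.2])
print(impulse_response)
```

The response to a single impulse dies out after as many samples as there are coefficients, which is where the “finite” in Finite Impulse Response comes from. Notice also that every output sample costs one multiply and one add per coefficient.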

There is an adjustment we can make to figure 2 that can allow for tighter control over our frequency response without having to use a long time history.

pastedImage_1.png

Figure 3.   Sample by Sample Filter Processing using a History of the Input and Output

The key difference between figure 2 and figure 3 is that we can also mix in previous filter *outputs* to generate the output signal. Adding this “feedback” can yield some interesting properties and is the root of another class of digital filters called IIR (Infinite Impulse Response) filters.

y[n] = b0*x[n] + b1*x[n-1]  + b2*x[n-2]  + a1*y[n-1] + a2*y[n-2]

One of the primary advantages of this approach is that you need fewer coefficients than an FIR filter structure to get a desired frequency response. There are always trade-offs when using IIR filters vs. FIR filters so be sure to read up on the differences. The example I showed in figure 3 is called a “biquad”. A biquad filter is a common filter building block that can be easily cascaded to construct larger filters. There are several reasons to use a biquad structure, one of which being that there are many design tools that can generate the coefficients for all of the common use cases. Several years ago, I built a tool around a set of design equations that were useful for audio filtering.
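The biquad difference equation above translates almost directly into code. Here is a minimal Python sketch using the same sign convention as the equation in this article (feedback terms are added). Be aware that many design tools, SciPy included, define a1 and a2 with the opposite sign, so check the convention before copying coefficients across.

```python
def biquad_df1(samples, b0, b1, b2, a1, a2):
    """Direct Form I biquad:
    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] + a1*y[n-1] + a2*y[n-2]
    """
    x1 = x2 = 0.0  # x[n-1], x[n-2]
    y1 = y2 = 0.0  # y[n-1], y[n-2]
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 + a1 * y1 + a2 * y2
        # Age the input and output histories by one sample
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


# Usage: a single feedback coefficient already gives a response that
# keeps decaying instead of stopping after a fixed number of samples
decay = biquad_df1([1.0, 0.0, 0.0, 0.0], b0=1.0, b1=0.0, b2=0.0, a1=0.5, a2=0.0)
print(decay)
```

With only one non-trivial feedback coefficient, the impulse response halves on every sample and, in exact arithmetic, never reaches zero. That is the “infinite” in Infinite Impulse Response.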

https://community.nxp.com/docs/DOC-100240

http://shepazu.github.io/Audio-EQ-Cookbook/audio-eq-cookbook.html

pastedImage_4.png

Figure 4.  An IIR Biquad Filter Design Tool.

At the time I made the tool shown in figure 4, I was using biquad filter structures for tone controls on a guitar effects processor. The frequency and phase response plots were designed to show frequencies of interest of an electric guitar pickup. There are lots of options for coming up with coefficients and numerous libraries to help. For example, you could use Python:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.iirfilter.html

In my guitar effects project, I embedded the filter design equations in my C code so I could recompute coefficients dynamically!
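As an illustration of what “embedding the design equations” can look like, here is a minimal Python sketch of the low-pass formulas from the Audio EQ Cookbook linked above. It is not the code from my guitar project. The function name is my own, and the last two lines flip the sign of the feedback coefficients so they match the “+ a1*y[n-1] + a2*y[n-2]” form used in this article (the cookbook subtracts those terms).

```python
import math


def lowpass_biquad(f0_hz, sample_rate_hz, q=math.sqrt(0.5)):
    """Audio EQ Cookbook low-pass, normalized so that a0 == 1.

    Returns (b0, b1, b2, a1, a2) for the difference equation
    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] + a1*y[n-1] + a2*y[n-2].
    """
    w0 = 2.0 * math.pi * f0_hz / sample_rate_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)

    a0 = 1.0 + alpha
    b0 = (1.0 - cos_w0) / 2.0 / a0
    b1 = (1.0 - cos_w0) / a0
    b2 = b0
    # Cookbook feedback terms are -2*cos(w0) and (1 - alpha), both subtracted.
    # Negate them to fit the "added feedback" form used in this article.
    a1 = 2.0 * cos_w0 / a0
    a2 = -(1.0 - alpha) / a0
    return b0, b1, b2, a1, a2


# Usage: a 1 kHz low-pass for 48 kHz audio
b0, b1, b2, a1, a2 = lowpass_biquad(1000.0, 48000.0)

# Sanity checks on the response at the two extremes of the spectrum
dc_gain = (b0 + b1 + b2) / (1.0 - a1 - a2)   # should be close to 1
nyquist_numerator = b0 - b1 + b2             # should be close to 0
print(dc_gain, nyquist_numerator)
```

Because the function is just a handful of trig calls, it is cheap enough to re-run whenever a cutoff or Q setting changes at run time.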

Using the PowerQuad IIR Biquad Engines

The PowerQuad in the LPC55S69 has dedicated hardware to compute IIR biquad filters. Like an FIR filter, the actual code to implement a biquad filter is straightforward. An IIR filter may be simple to code but can use quite a bit of CPU time to crunch through all the multiply and accumulate operations. The PowerQuad is available to free up the CPU from performing the core computational component of the biquad computation. A good starting point for using the PowerQuad IIR biquad engine is to use the MCUXpresso SDK. It is important to note that the SDK will be a starting point. The SDK code is written to cover as many use cases as possible and to demonstrate the different functions of the PowerQuad. It can be helpful to read through the source code and decide which pieces you need to extract for your own application. DSP code often requires some hand tuning and optimization for a particular use case. The PowerQuad is connected via the AHB bus and the Cortex-M33 co-processor interface. Let’s take a look at the SDK source code to see how the IIR engine works.

Using the “Import SDK Examples” wizard in MCUXpresso, you will find PowerQuad examples under driver_examples PowerQuad

pastedImage_1.png

Figure 5.  Selecting the PowerQuad Digital Filter Example

The powerquad_filter project has quite a few examples of the different filter configurations. We are going to focus on a floating point biquad example as a starting point. In the file powerquad_filter.c, there are several test functions that will demonstrate a basic filter setup. I am using LPC55S69 SDK 2.7.1 and there is a function around line 455 (Note the spelling mistake PQ_VectorBiqaudFloatExample).

pastedImage_3.png

Figure 6.  Vectorized Floating Point IIR Filter Function

The 1st important point to note is that PowerQuad computes IIR filters using “Direct Form II”.        In the previous figures I showed the filter using “Direct Form I”.      When one is 1st introduced to IIR filters, “Direct Form I” is the natural starting point as it is the clearest and most straightforward implementation.    It is possible however to re-arrange the flow of multiplies and adds and get the same arithmetic result.

pastedImage_5.png

Figure 7.  IIR Direct Form II

https://ccrma.stanford.edu/~jos/filters/Direct_Form_II.html

  

When using "Direct Form II", we do not need to store history of both inputs and outputs. Instead, we store an intermediate computation which is labeled v[n]. During the computation of the filter, the intermediate history v[n] must be saved. We will refer to these intermediate values as the filter “state”. To set up the PowerQuad for IIR filter operation, there are a handful of registers on the AHB bus where the state and coefficients are stored. In the SDK examples, the state of the filter is initialized with PQ_BiquadRestoreInternalState().
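To see that Direct Form II really does give the same answer with less history, here is a minimal Python sketch that runs both forms side by side. It follows the sign convention of the difference equation earlier in this article. The function names and test coefficients are my own, and the sketch makes no claim about the PowerQuad's internal register layout or sign convention.

```python
def biquad_df1(samples, b0, b1, b2, a1, a2):
    """Direct Form I: keeps two past inputs and two past outputs."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 + a1 * y1 + a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def biquad_df2(samples, b0, b1, b2, a1, a2):
    """Direct Form II: keeps only the intermediate values v[n-1], v[n-2]."""
    v1 = v2 = 0.0  # the filter "state"
    out = []
    for x in samples:
        v0 = x + a1 * v1 + a2 * v2                 # feedback half first
        out.append(b0 * v0 + b1 * v1 + b2 * v2)    # then the feedforward half
        v2, v1 = v1, v0
    return out


# Usage: arbitrary stable coefficients, chosen only for illustration
coeffs = dict(b0=0.2, b1=0.4, b2=0.2, a1=0.5, a2=-0.25)
test_input = [1.0, 0.0, -2.0, 0.5, 3.0, 0.0, 0.0, 1.0]

df1_out = biquad_df1(test_input, **coeffs)
df2_out = biquad_df2(test_input, **coeffs)
largest_difference = max(abs(a - b) for a, b in zip(df1_out, df2_out))
print(f"largest DF1 vs DF2 difference: {largest_difference:.3e}")
```

The two outputs agree to within floating-point rounding, while the Direct Form II version carries two state values instead of four.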

pastedImage_9.png

Figure 8.   Restoring/Initializing Filter State

Once the PowerQuad IIR engine is initialized,  data samples can be processed through the filter.   Let us take a look at the function PQ_VectorBiqaudDf2F32() in fsl_powerquad_filter.c

pastedImage_13.png

Figure 9.   Vectorized IIR Filter Implementation.

This function is designed to process longer blocks of input samples, ideally in multiples of 8. Note that many of the SDK examples are designed to make it simple to get started but could be easily tuned to remove operations that may not be applicable in your application code. For example, the modulo operation to determine if the input block is a multiple of 8 is something that could be easily removed to save CPU time. In your application, you have complete control over buffer sizes and can easily optimize and remove unnecessary operations. The actual computation of the filter can be observed in the code block that processes the 1st block of samples.

pastedImage_17.png

Figure 10.  Transferring Data to the IIR Engine with the ARM MCR Coprocessor Instruction

Data is transferred to the PowerQuad with the MCR instruction. This instruction transfers data from a CPU register to an attached co-processor (the PowerQuad in this case). The PowerQuad does the work of crunching through the Direct Form II IIR structure. While it takes some CPU intervention to move data into the PowerQuad, the PowerQuad is much more efficient at the multiplies and adds for the filter implementation.

To get the result, the MRC instruction is used.   MRC moves data from a co-processor to a CPU register.

pastedImage_20.png

Figure 11.  Retrieving the IIR Filter result with the MRC instruction.

Further down in PQ_VectorBiquadDf2F32(), there is assembly code tuned to inject data in blocks of 8 samples.    Looking at PQ_Vector8BiquadDf2F32():

pastedImage_23.png

Figure 12.  Vectorized Data Insertion into the PowerQuad.

Notice all the MCR/MRC instructions to transfer data in and out of the biquad engine. All the other instructions are “standard” ARM instructions to get data into the registers that feed the coprocessor. Take some time to run the examples in the SDK. They are structured to inject a known sequence to verify correct filter operation. Now that you have seen some of the internals, you can use the pieces you need from the SDK to implement your signal processing chain.

Some take-aways

  • The PowerQuad can help accelerate biquad filters. There are 2 separate biquad engines built into the PowerQuad.
  • The PowerQuad IIR functions are configured through registers on the AHB bus and the actual input/output samples transferred through the Cortex M33 coprocessor interface.
  • The SDK samples are a good starting point to see how to configure and transfer data to the PowerQuad. There are optimization opportunities for your particular application so be sure to inspect all of the code.
  • If you need more than two biquad filters, you will need to preserve the “state” of the filter. This can be a potentially expensive operation if you are constantly saving/restoring state. In this case you will want to consider processing longer blocks of data.
  • You may not need to save the entire “state” of the filter. For example, if the filter coefficients are the same for all of your filters, all you need to save and restore is v[n].
  • While the PowerQuad can speed up (6x) the core IIR filter processing, you still need the CPU to set up the PowerQuad and feed in samples. Consider using the second Cortex M33 core in the LPC55S69 to do your data shuffling.
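The save/restore idea from the take-aways above can be modeled in software. The Python sketch below is only a stand-in: the BiquadEngine class and process_shared helper are hypothetical names of my own, not SDK functions, and the class just mimics one Direct Form II engine whose coefficients and state can be swapped in and out between blocks.

```python
class BiquadEngine:
    """Software stand-in for a single hardware biquad engine."""

    def __init__(self):
        self.coeffs = (1.0, 0.0, 0.0, 0.0, 0.0)  # b0, b1, b2, a1, a2
        self.state = (0.0, 0.0)                   # v[n-1], v[n-2]

    def process(self, block):
        b0, b1, b2, a1, a2 = self.coeffs
        v1, v2 = self.state
        out = []
        for x in block:
            v0 = x + a1 * v1 + a2 * v2
            out.append(b0 * v0 + b1 * v1 + b2 * v2)
            v2, v1 = v1, v0
        self.state = (v1, v2)
        return out


def process_shared(engine, filters, blocks):
    """Run one block per filter through one shared engine.

    filters: list of dicts holding 'coeffs' and 'state' (updated in place)
    blocks:  one input block per filter, in the same order
    """
    outputs = []
    for filt, block in zip(filters, blocks):
        engine.coeffs = filt["coeffs"]
        engine.state = filt["state"]       # restore this filter's history
        outputs.append(engine.process(block))
        filt["state"] = engine.state       # save it for the next block
    return outputs


# Usage: two different filters time-share one engine, 4 samples at a time
filters = [
    {"coeffs": (0.2, 0.4, 0.2, 0.5, -0.25), "state": (0.0, 0.0)},
    {"coeffs": (1.0, -1.0, 0.0, 0.9, 0.0), "state": (0.0, 0.0)},
]
signal_a = [1.0, 0.0, 2.0, -1.0, 0.5, 0.0, 0.0, 3.0]
signal_b = [0.0, 1.0, 1.0, 1.0, -2.0, 0.25, 0.0, 1.0]

engine = BiquadEngine()
first_half = process_shared(engine, filters, [signal_a[:4], signal_b[:4]])
second_half = process_shared(engine, filters, [signal_a[4:], signal_b[4:]])
print(first_half[0] + second_half[0])
```

Each swap costs one restore and one save per filter per block, so the overhead per sample shrinks as the blocks get longer. If every filter shared the same coefficients, only the two state values would need to move.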

You now have a head start on performing time domain filtering with the LPC55S69 PowerQuad. We examined IIR filters, which have lots of applications in audio and sensor signal processing, but the PowerQuad can also accelerate FIR filters. Next time we are going to dive a little deeper with some frequency domain processing with the PowerQuad transform engine. The embedded transform engine can accelerate processing of Fast Fourier Transforms *significantly*. Stay tuned for more embedded signal processing goodness!

Additional LPC55S69 resources:

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/01/22/lpc55-mcu-series-there-...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/05/lpc5500-series-theres-a...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/02/20/lpc5500-series-theres-a...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/13/mini-monkey-part-1-how-...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/03/29/mini-monkey-part-2-usin...

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/04/19/lpc55s69-mini-monkey-bu...

https://community.nxp.com/videos/9003

https://community.nxp.com/videos/8998

https://community.nxp.com/community/general-purpose-mcus/lpc/blog/2020/06/15/lpc55s69-powerquad-part...

Eli_H
NXP Employee

The Mini-Monkey is now officially “out the door”. I just sent the files to Macrofab and can’t wait to see the result. Before I talk a bit about Macrofab, we will look at what is going to get built. A few weeks ago, I introduced a design based upon the LPC55S69 in the 7mm VFBGA98. The goal was to show that this compact package can be used with low cost PCB/Assembly service without having to use the more expensive build specifications. The Mini-Monkey board will also be used to show off some of the neat capabilities of the PowerQuad DSP engine in future design blogs. Here is what we ended up with for the first version:

pastedImage_2.png

pastedImage_3.png

Figure 1.  Mini-Monkey Revision A

Highlights

  • Lithium-Polymer battery power with micro-USB Charging
  • High-speed USB 2.0 Interface
  • SWD debug via standard ARM .050” and tag-connect interface
  • Digital MEMS microphone with I2S Interface
  • 240x240 1.54” IPS Display with HS-SPI interface
  • Op-amp buffer for one of the 1MSPS ADC channels
  • 3 push buttons.  One can be used to start the USB ROM bootloader
  • External Power Input
  • 16MHz Crystal
  • 11 dedicated IO pins connected to the LPC55S69.   Functions available:
    • GPIO
    • Dedicated Frequency Measurement Block
    • I2C
    • UART
    • State Configurable Timers (Both input and output)
    • Additional ADC Channels
    • CTIMERs
  • The HS-SPI used for the IPS display is also brought to IO pins

I am a firm believer in not trying to get anything perfect on the 1st try. It is incredibly inexpensive to prototype ideas quickly so I decided to try to get 90% of what I wanted in the first version. As we will see, it is inexpensive to iterate on this design to work in improvements. Without too much trouble, I was able to get everything I wanted on 2 signal layers while filling in a power reference on the top and bottom sides. If this was a production design, I would probably elect to spend a bit more to get two solid inner reference planes by using a 4-layer design. Once a design hits QTY 100 or more, the cost of using a 4-layer stack-up can be negligible. A 4-layer stack-up makes the design much easier to execute and to make compliant with EMI/RFI requirements. For most of my “industrial” designs where I know that it won’t be high quantity, I always start at 4-layer unless it is a simple connector board.

For this 1st run, I wasn’t trying to push the envelope with how much I could get done with low cost design rules and a 2-layer stack-up. The VFBGA leaves quite a bit of space for fanning out IO. Quite a bit can be done on the top layer without vias. I had a few IO that ended up in more difficult locations, but routing was completed quickly.

pastedImage_8.png

Figure 2.  Mini-Monkey VFBGA Fanout

As you can see, I did not make use of all the IO. If I had used a 4-layer board it would be simpler to get quite a bit more of the IO fanned out. Moving to smaller vias, traces and a 4-layer stack-up would probably allow one to get all IO’s connected. For this design, I was trying to move quickly as well as use the standard “prototype” class specs from Macrofab. This means 5 mil traces, 10 mil drills with a 4-mil annular ring. If you can push to 3.5mil trace/space, NXP AN12581 has some suggestions.

I did want to take a minute to talk about Macrofab. I normally employ the services of a local contract manufacturer but this time I elected to give this online service a try. After going through the order process, I must say I was thoroughly impressed! The 1st step is to upload your PCB design files. I use the Altium Designer PCB package and Macrofab recommends uploading in ODB++ format. Since this format has quite a bit more meta-data baked in than standard Gerbers, the online software can infer quite a bit about your design.

pastedImage_12.png

Figure 3.  Macrofab PCB Upload

The Macrofab software gives you a cool preview of your PCB with a paste mask out of the gate.  Note that this design is using red solder mask as that is what is included in the prototype class service.  Once you have all the PCB imported, you can now upload a Bill of Materials (BOM).

pastedImage_16.png

Figure 3.  Macrofab BOM Upload

Macrofab provides clear guidance on how to get your BOM formatted for maximum success. Once the BOM is uploaded, the online tool searches distributors and you can select what parts you want to use. The tool also allows one to leave items as Do Not Place (DNP). I was impressed that it found almost everything I wanted out of the box. Pricing and lead time are transparent.

 

Next up is part placement:

pastedImage_21.png

Figure 4.  Macrofab Part Placement

Using the ODB++ data, the Macrofab software was able to figure out my placements.   I was thoroughly impressed with this step as it was completely automatic.      The tool allows you to nudge components if needed.    Once placements are approved, the tool will give you a snapshot of the costs.

pastedImage_23.png

 Figure 5.  Cost Analysis and Ordering

What I liked here was how transparent the process was. Using the prototype class service, a single board was $152. This is an absolute steal when you consider that all of the setup costs, parts and PCBs are baked in. If you consider the value of your time, this is an absolute no brainer. I also like that it gives you a cost curve for low volume production. In the future, I am going to have a hard time using another service that can’t give me this much data with so little work.

I ended up ordering 3 prototype units.  Total cost plus 2-day UPS shipping was $465.67.      Note, I did end up leaving one part off the board for now:  the 1.54” IPS display.     This part requires some extra “monkeying” around as it is hot bar soldered and needs some 2-sided tape.    I decided to solder the 1st three prototypes on my bench to get a better feel for the process of using this display.  However, I am more than happy to push the BGA and SMT assembly off to someone else.

It looks like the boards are going to ship on the 1st of May. I’ll post a video and update when they come in. So far, the experience with Macrofab has been quite positive and I am eager to see the results. Once I get the design up and running, I’ll post documentation to bitbucket.

Eli_H
NXP Employee

In part two of this series on designing with the LPC55S69 VFBGA98 package, I am going to show you how to use the NXP MCUXpresso SDK tools to help with the physical design process. Combining some features in MCUXpresso with my PCB tool of choice, Altium Designer, I can significantly reduce the time in the CAD process.

The first step in designing a PCB with a new MCU is to add the part into your component libraries. Component library management can be a source of passionate disagreements between design engineers. My own view on library management is rooted in many years of making mistakes! These simple mistakes ultimately caused delays and made projects more difficult than they needed to be. Oftentimes these mistakes were also driven by a desire to "save time". Given my experience, there are a few overarching principles I adhere to.

  1. The individual making the component should also be the one who has to stay the weekend and cut traces if a mistake is made. This obviously conflicts with the “librarian/drafter” model but I literally have seen projects where the librarian made a mistake on a 1000+ pin BGA that cost >$5k. This model was put in a library and marked as “verified”. The person making the parts needs some skin in the game! In this case, the drafting team claimed they had a process that included a double check but *no one in that process knew the context of how the part was going to be used*.
  2. Pulling models from the internet or external libraries is OK as a starting point but it is just that, a starting point. You must treat every pin as if it was wrong and verify. Since many organizations have specific rules on how a part should look, you will need to massage the model to meet your own needs. Software engineers shake their heads at this rule. "Why not build on somebody else's libraries? It is what we do!" Well, a mistake in a hardware library can take weeks if not months to really solve. The cost, time and frustration impact can be huge. We hardware engineers can't simply "re-compile".
  3. I don’t trust any footprint unless I know it has been used in a successful design. The context of how a part is used is very important (which leads to #4).
  4. I believe design re-use is best done at a schematic snippet level, not an individual part. After all, once I get this Mini-Monkey board complete, I will never again start with just the LPC55S69. I want all the “stuff” surrounding the chip that makes it work!

To the casual observer, these principles seem onerous and time consuming but I have found that they *save me time over the course of the project*. Making your own parts may seem time consuming but it *does not have to be*. There are tools that can make your life simpler and the task less arduous. Also, making your own CAD part is useful for a few other reasons:

  1. You have to go through a mental exercise when looking at each of the pins. It forces your brain to think about functionality in a slightly different way. When starting with a new part/family, repeated exposure is a very good way to learn.
  2. Looking at the footprint early on gets your brain in a planning mode for when you do get started.

One could argue that this is “lost” time as compared to getting someone else to do the CAD library management, but I feel strongly that it saves time in the long run. I have witnessed too many projects sink time into unnecessary debugging due to bad CAD part creation. I feel the architect of the design needs to be intimately involved and take ownership of the process.

The LPC55S69 in the VFBGA package has only 98 pins. With no automation or tools, it would not take all that long to build a part right from the datasheet. However, it is on the edge of being a time consuming endeavor. Also, when I build schematic symbols, I tend to label the pins with all possible IO capabilities allowed by the MCU pin mux. This can make the part quite large but it also helps me see what else is available on a pin if I am in a debug pinch. Creating pins with all this detail can be quite time consuming. I use Altium Designer for all of my PCB design and it has some useful automation to make parts more quickly. NXP’s MCUXpresso tool also has a unique feature that can really help board designers get work done quickly.

Creating the Pin List

Built into MCUXpresso is a pins tool that is *very* useful in large projects for setting up the pin muxes and doing some advanced planning. While it is primarily a tool for bootstrapping pin setup for the firmware, it can also be useful to drive the CAD part creation process. Simply create a new project and start the pins tool:

OpenPins.gif

The pins tool gives you a tabular and physical view of pin assignments. This is very useful when planning your PCB routing. We will use the export feature to get a list of all the pins, numbers and labels.

OpenPins2.gif

The pins tool generates a CSV file that you can bring into your favorite editor. Not only do I get the pin/ball numbers,   I get all of the IO options available via the MCU pin mux. 

pastedImage_10.png

Using the Pin List To Generate Component Pins

 With just a few modifications, I can get the spreadsheet into a format useful for the Altium Smart Grid Paste Tool.

pastedImage_1.png

Altium Designer requires a few extra columns of metadata to be able to import the data into a grouping of pins in the schematic library editor. At this point you could group the pins to your personal preference. I personally like to see all of the pin functions on the schematic, but this does create rather large symbols. The good news here is that by using MCUXpresso and Altium you can make this a 10-minute job, not a 3-hour one. Imagine going through the reference manual line by line!

pastedImage_11.png

pastedImage_12.png

pastedImage_14.png

pastedImage_15.png

Voilà! A complete symbol. It just took a few minutes of massaging to get what I wanted. Like I stated previously, a 98-pin package is not that bad to do manually, but you can imagine the effort for a 200 or 300 pin part (such as the i.MX RT!).

 

The VFBGA package is 7mm x 7mm with a 0.5mm pitch. There are balls removed from the grid for easier route escaping when using this part with lower-cost fabrication processes.
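As a sanity check on a generated footprint, the nominal ball positions can be sketched in a few lines of Python. This is only an illustration: the 13x13 array size and the two depopulated ball names below are assumptions, and the real depopulation pattern must come from the NXP package drawing.

```python
# JEDEC-style BGA row letters skip I, O, Q, S, X and Z.
ROW_LETTERS = "ABCDEFGHJKLMNPRTUVWY"

def bga_grid(rows, cols, pitch_mm, removed=()):
    """Return {ball_name: (x_mm, y_mm)} for a grid centered on the origin."""
    balls = {}
    for r in range(rows):
        for c in range(cols):
            name = f"{ROW_LETTERS[r]}{c + 1}"
            if name in removed:
                continue  # depopulated position, no pad
            x = (c - (cols - 1) / 2) * pitch_mm
            y = ((rows - 1) / 2 - r) * pitch_mm
            balls[name] = (x, y)
    return balls

# Assumed 13x13 array at 0.5 mm pitch; the removed set is a made-up example.
full = bga_grid(13, 13, 0.5)
partial = bga_grid(13, 13, 0.5, removed={"G6", "G7"})
print(len(full), len(partial), full["A1"])
```

Comparing a list like this against the pads in the finished footprint is a quick way to catch a pad that should have been deleted by hand.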

pastedImage_20.png

 

Once again,   with a quick look at NXP documentation and using the Altium IPC footprint generator,   we can make quick work of getting an accurate footprint.

pastedImage_22.png

pastedImage_3.png

The IPC footprint generator steps you through the entire process.  All you need is the reference drawing.   

A quick note about the IPC footprint tool in this use case. The NXP VFBGA has quite a few balls removed to allow for easier escaping. While the IPC footprint generator can automatically remove certain regions, I found that this particular arrangement needed a few minutes of hand work to delete the unneeded pads given the unique pattern.

 

By using Altium and NXP’s MCUXpresso tool together, I was able to get my CAD library work done very quickly. And because I spent some time with the design tools, I became more familiar with the IOs and the physical package. This really helps get the brain primed for the real design work.

Preview.gif

At this point in the process I have a head start on the schematic entry and PCB layout. Next time we are going to dive in a bit to see what connections we need to bootstrap the LPC55S69 to get it up and running. We will take a look at some of the core components needed to get the MCU to boot and some peripheral functions that will help the Mini-Monkey come alive!

Eli_H
NXP Employee

Now that we have discussed the LPC5500 series at a high level and investigated some of the cool features, it is time to roll up our sleeves and work on some real hardware. In this next series of articles, I want to step through a simple hardware design using the LPC55S69. We are going to step a bit beyond the application notes and go through a simple design, using Altium Designer to implement a simple project.

Many new projects start with development boards (such as the LPC55S69-EVK) to evaluate a platform and to take a first cut at some of the software development work. Getting to a form-factor compliant state quickly can be just as important as the firmware efforts. Getting a design into a manufacturable form is a very important step in the development process. With new hardware, I like to address all of my “known unknowns” early in the process, so I almost always make my own test PCBs right away. The LPC5500 series devices are offered in some easy-to-use QFP100 and QFP64 packages. Designers also have the option of a very small VFBGA98 package. Many engineers flinch when you mention BGA, let alone a “fine pitch” BGA. I hope to show you that it is not as bad as you may think and that one can even route this chip on 2 layers.

pastedImage_1.pngpastedImage_2.png

Figure 1.  The LPC55S69 VFBGA98 Package. QFP100 comparison on the bottom.

The LPC55S69 is offered at an attractive price but packs a ton of functionality and processing power into a very small form-factor that uses little energy in both the active and sleep cases.     Having all of this processing horsepower in a small form-factor can open new opportunities.  Let’s see what we can get done with this new MCU.

The “Mini-Monkey” Board

In this series of “how to” articles, I want to step through a design with the LPC55S69 in the VFBGA and *actually build something*.   The scope of this design will be limited to some basic design elements of bringing up a LPC55S69 while offering some interesting IO for visualizing signal processing with the PowerQuad hardware.      Several years ago, I posted some projects on the NXP community using the Kinetis FRDM platform.   One of the projects showcased some simple DSP processing on an incoming audio signal.

pastedImage_4.png

https://www.youtube.com/watch?v=Nn7DweR--Po&list=PLWM8NW5LEukhCAvE7voge_-L8waDyQSgo&index=3&t=1s

The “Monkey Listen” project used an NXP K20D50 FRDM board with a custom “shield” that included a microphone and a simple OLED display. For this effort I wanted to do something similar except using the LPC55S69 in the VFBGA98 package with some beefed-up visualization capabilities. There is so much more horsepower in the LPC55S69 and we now have the potential to do neat applications such as real-time feature detection in an audio signal, etc. Also, given the copious amounts of RAM in the LPC55S69, I wanted to step up the game a bit in the display. The small VFBGA98 package presents an opportunity to pack quite a bit into a small space. So much has happened since the K20D50 hit the street!

I recently found some absolutely gorgeous IPS displays with a 240x240 pixel resolution from buydisplay.com. They are only a few dollars and have a simple SPI interface. I wired a display to an LPC55S69-EVK for a quick demonstration:

pastedImage_8.png

   Figure 2:  The LPC55S69EVK driving the 240x240 Pixel 1.54” IPS display.

It was difficult for me to capture how beautiful this little 1.54” display is with my camera.  You must see it to believe it!    Given the price I figured I would get a boxful to experiment with for this design project!

pastedImage_11.png

Figure 3:   240x240 Pixel 1.54” IPS display from buydisplay.com

The overarching design concept with the “mini-monkey” is to fit a circuit under the 1.54” display that uses LPC55S69 with some interesting IO:

  • USB interface
  • LIPO Battery and Charger circuitry
  • Digital MEMS microphone
  • SWD debugging
  • Buttons
  • Access to the on-chip ADC

I want to pack some neat features beneath the screen that can do everything the “Monkey Listen” project can, just better. With access to the PowerQuad, the sky is the limit on what kinds of audio processing can be implemented. The plan is to see how much we can fill up underneath the display to make an interesting development platform. I started a project in Altium Designer and put together a concept view of the new “Mini-Monkey” board to communicate some of the design intent:

pastedImage_13.png

Figure 4:  The “Mini-Monkey” Concept PCB based upon the LPC55S69 in the VFBGA98 package

While this is not the final product, I wanted to give you an idea of where I was going. The “Mini-Monkey” will be a compact form-factor board that can be used for some future articles on how to make use of the LPC5500 series PowerQuad feature. There will be some extra IO made available to enable some cool new projects to showcase the awesome capabilities of the LPC55S69. Got some ideas for the "Mini-Monkey"? Leave a comment below!

In the next article we will be looking at the schematic capture phase and how we can use NXP’s MCUXpresso SDK to help automate some of the work required in Altium Designer. I will be showing some of the basic elements of getting an LPC55S69 design up and running from scratch. We will then look at designing with the VFBGA98 package and get some boards built. I hope I now have you interested, so stay tuned. In the meantime, check out this application note on using the VFBGA package on a 2-layer board:

https://www.nxp.com/docs/en/application-note/AN12581.pdf

Eli_H
NXP Employee

I recently wrote about the ample processing capabilities built into the LPC55S69 MCU  in addition to the Dual USB capabilities and large banks of RAM.  Now it is time to explore some peripherals and features that are often overlooked in the LPC family but are very beneficial to many embedded system designs.

The State Configurable Timer

An absolute gem in the LPC family is the “State Configurable Timer” (SCT). It has been implemented in many LPC products and I feel it is one of the most under-rated and often misunderstood peripherals. When I first encountered the SCT, I wrote it off as a “fancy PWM” unit. This was a mistake on my part, as the SCT is an extremely powerful peripheral that can solve many logic and timing challenges. I have personally been involved in several design efforts where I could remove the need for an additional programmable logic device on a PCB by taking advantage of the SCT in an LPC part. At its core, the SCT is an up/down counter that can be sequenced with up to 16 events. The events can be triggered by IO or by one of 16 possible counter matches. An event can then update a state variable, generate IO activity (set, clear, toggle), or start/stop/reverse the counter.

Consider an example which is similar to a design problem I previously used the SCT for.

Given a 1 cycle wide Start input signal:

i.) Assert a PowerCtrl signal on the 3rd Clk cycle after the start.
ii.) 2 Clk cycles after the assertion of PowerCtrl, output exactly 2 pulses on the Tx output pin at a programmable period.
iii.) 5 Clk cycles after ii.), de-assert PowerCtrl.
iv.) 2 Clk cycles after the de-assertion of PowerCtrl, output a 1 cycle pulse to the Complete pin.
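The requirements above map naturally onto a table of counter matches that each set or clear an output, which is the spirit of how SCT events work. The Python model below is only a sketch of that idea, not SCT register programming. It assumes one-cycle-wide Tx pulses, counts the 5 cycles in iii.) from the end of the second pulse, and starts the counter at 0 on the Start pulse. All names are made up for illustration.

```python
def build_events(period):
    """Map counter match values to output actions, in the spirit of SCT events."""
    power_on = 3                    # i.) third Clk cycle after Start
    tx_first = power_on + 2         # ii.) two cycles after PowerCtrl asserts
    tx_second = tx_first + period   # second pulse, one period later
    tx_done = tx_second + 1         # assumed one-cycle-wide Tx pulses
    power_off = tx_done + 5         # iii.) five cycles after the pulse train
    complete = power_off + 2        # iv.) two cycles after PowerCtrl de-asserts
    return {
        power_on: [("PowerCtrl", 1)],
        tx_first: [("Tx", 1)],
        tx_first + 1: [("Tx", 0)],
        tx_second: [("Tx", 1)],
        tx_done: [("Tx", 0)],
        power_off: [("PowerCtrl", 0)],
        complete: [("Complete", 1)],
        complete + 1: [("Complete", 0)],
    }

def simulate(period):
    """Return the output levels for every Clk cycle, starting at the Start pulse."""
    if period < 2:
        raise ValueError("period must be at least 2 Clk cycles")
    events = build_events(period)
    outputs = {"PowerCtrl": 0, "Tx": 0, "Complete": 0}
    trace = []
    for count in range(max(events) + 1):
        for pin, level in events.get(count, []):
            outputs[pin] = level
        trace.append(dict(outputs))
    return trace

trace = simulate(4)
for pin in ("PowerCtrl", "Tx", "Complete"):
    print(f"{pin:>9}: " + "".join(str(step[pin]) for step in trace))
```

Eight match events cover the whole sequence in this model, comfortably within the 16 events the SCT offers.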

pastedImage_1.png

This task could be done in pure software if the incoming Clk was slow enough. Most timer/counter units in competing MCUs would not be able to implement this particular set of requirements. In my use case (an acoustic transmitter), I was able to implement this completely in the SCT with minimal CPU intervention and no external circuitry. This is a scenario where I might otherwise consider an external CPLD or FPGA, but the SCT is more than capable of implementing the behavior. I highly recommend grabbing the manual for the LPC55 family and reading chapter 24. If you have never used a peripheral like the SCT, I highly recommend learning about it.

  

Programmable Logic Unit

In addition to the SCT, there is a small amount of programmable logic in the LPC55 family. The PLU is an array of twenty 5-input look-up tables (LUTs) and four flip-flops. From the external pins of the LPC55xx, there are 6 inputs to the PLU fabric and 8 outputs. While this is not a large amount of logic, it is certainly enough to replace some external glue logic you might have in your design. There is even a free tool to draw your logic schematically or describe it using the Verilog HDL.

pastedImage_2.png

I often find I need just a handful of gates in a design to glue a few things together, and the PLU is the perfect peripheral for this need.
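If you have not worked with LUT-based logic before, the idea is simple: a 5-input LUT is just a 32-entry truth table, and the five input bits form the index into it. The Python sketch below models that concept only. The bit ordering and the helper names are assumptions for illustration and do not represent the PLU register format.

```python
def make_lut(func, n_inputs=5):
    """Pack func's truth table into an integer, one bit per input combination."""
    table = 0
    for index in range(1 << n_inputs):
        bits = [(index >> i) & 1 for i in range(n_inputs)]
        if func(*bits):
            table |= 1 << index
    return table

def eval_lut(table, *bits):
    """Look up the output for the given input bits (bit 0 first)."""
    index = sum(bit << i for i, bit in enumerate(bits))
    return (table >> index) & 1

# Example glue logic: out = (a AND b) OR (NOT c); inputs d and e are ignored.
glue = make_lut(lambda a, b, c, d, e: (a and b) or not c)
print(f"truth table = 0x{glue:08X}")
print(eval_lut(glue, 1, 1, 1, 0, 0), eval_lut(glue, 0, 0, 1, 0, 0))
```

Any function of up to five signals fits in one such table, which is why a small number of LUTs goes a long way for glue logic.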

LPC Boot ROM

Another indispensable feature that has been in the LPC series since the beginning is a bootloader in ROM. For me, it is a must-have, as it means I can program/recover code via one of many interfaces without a JTAG/SWD connection. For factory/production programming and test, it saves quite a bit of hassle. The boot ROM allows device programming over SPI, UART, I2C or USB. I typically use the UART or USB interface with FlashMagic. This feature has certainly benefited me on *every* embedded project, especially when it comes to production programming and test. There have even been times when it was handy for recovering a firmware image in the field. Many designs include some sort of bootloader, and having an option that is hard coded in ROM is a great benefit that you get for free in the LPC family.

It is difficult to capture all the benefits of the new LPC55 family, but we hope you are interested. The LPC55 family is offered in many convenient IC packages, is low power (both active and sleep) and is packed with useful peripherals. The LPC55S69 development board is available at low cost. Combining the low-cost hardware tools with the MCUXpresso SDK, you can start LPC55 development today. From here we are going to start looking at some interesting how-tos and application examples with the LPC55 family. Stay tuned and visit www.nxp.com/LPC55S6x to learn more.

Eli_H
NXP Employee

I recently wrote about the ample processing capabilities built into the LPC55S69 MCU. In this article I am going to highlight some very useful IO interfaces and memory.

Dual USB

One killer feature in some of the other LPC parts (for example the LPC4300 series and the LPC54000 series) is the *dual* USB interface. Dual USB enables some very interesting use cases and it is something that sets the LPC portfolio apart from its competitors. For the LPC5500 MCU series, High-Speed USB and Full-Speed USB with on-chip PHY features are fully supported, providing up to 480Mbit/s of speed. Let’s examine a scenario I commonly encounter.

 

In my projects, I like to have both USB device and USB host capabilities on separate connectors. Instead of using USB On-the-Go (OTG) with a single connector, it has been my experience that many deeply embedded and industrial projects benefit from separate connectors. Consider the arrangement in figure 1.

 pastedImage_1.png

Figure 1:   Dual USB with FAT File System, SDIO and CDC.

On the device side, I almost always implement a mass storage class device along with a communications class device. The mass storage interface is connected to the SDIO port through the FatFS IO layer so a PC can access sectors on the SD card. FatFS is my go-to library for embedded FAT file systems. It is open source and battle tested. While I choose to always pull the files from the author’s site, the MCUXpresso SDK has FatFS built in. With this arrangement, files can be easily copied between a PC and the LPC5500 system. Data logging and configuration storage are now built into your application. The CDC interface can provide a virtual COM port interface to implement a basic shell.

I use the USB host port for mass storage as well. Like the SDIO interface, I connect the host drivers (examples are in the MCUXpresso SDK) through the FatFS IO layer so my system can read and write files on a thumb drive. One very useful application in my projects is a secondary bootloader. There have been several products I have worked on that required field updatability, but the users did not necessarily have access to a PC.

  

To update the system, data files and new firmware can be placed on a thumb drive and inserted into the LPC5500 system. A bootloader can then perform the necessary programming to update the internal flash. In addition to firmware updates, the host port could also be used to copy device configuration information. A technician would just carry a USB “key” to update units. Having both USB device and host using the two LPC55S69 USB interfaces can unlock many benefits.

With the SDIO interface and USB host, one is not limited to the more common SD cards and thumb drives. There are other options for more robust physical interfaces. Instead of a removable SD card, a soldered-down eMMC can be used. For the USB host interface, there are rugged “DataKey” options available. Also note that the DataKeys come with an SDIO interface as well.

 pastedImage_3.png

 

Figure 2:   Rugged Memory Options.   DataKey (Left) and eMMC (Right)

 

One last tidbit is that the SDIO interface can also be used to connect to many high speed WIFI chipsets.   It is an option that is easy to forget about.

Copious amounts of RAM

While I certainly came up in a time when RAM was scarce, I love having access to a large amount of it. At 320KB of RAM, there is no shortage of RAM in the LPC55S69! Relating to the USB and file storage application, large RAM buffers can be important for optimizing transfer speeds. It is common to write SD cards and thumb drives in 512-byte blocks. This transfer size, however, is not always the optimum case for overall speed. The controller in the memory card has to erase internal NAND flash in much larger sector sizes, resulting in slow write performance. It has been my experience that queueing up data until I have at least 16KB can improve overall transfer speeds by up to an order of magnitude. In most of my use cases, I implement a software cache of at least 16KB to speed the transfer of large files. Larger caches can yield better results. These file system caches can consume quite a bit of memory, so it is very helpful that the LPC5500 series has quite a bit of RAM available.
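The queueing idea is easy to model. Below is a minimal Python sketch of a write cache that collects small writes and only hands data to the storage layer in 16KB chunks. The `WriteCache` class and its `write_fn` callback are hypothetical names for illustration. On the MCU the same structure would sit between the application and the file system write call.

```python
class WriteCache:
    """Collect small writes and hand them to the media in large chunks."""

    def __init__(self, write_fn, chunk_size=16 * 1024):
        self._write_fn = write_fn      # hypothetical callback that stores a chunk
        self._chunk_size = chunk_size
        self._buffer = bytearray()

    def write(self, data):
        """Queue data and emit every complete chunk that has accumulated."""
        self._buffer += data
        while len(self._buffer) >= self._chunk_size:
            self._write_fn(bytes(self._buffer[:self._chunk_size]))
            del self._buffer[:self._chunk_size]

    def flush(self):
        """Write out whatever is left, for example before closing the file."""
        if self._buffer:
            self._write_fn(bytes(self._buffer))
            self._buffer.clear()

chunks = []
cache = WriteCache(chunks.append)
for _ in range(40):
    cache.write(bytes(512))    # forty 512-byte blocks, 20KB in total
cache.flush()
print([len(chunk) for chunk in chunks])
```

Forty small writes turn into two media writes here. The trade-off is that anything still sitting in the cache is lost on a power failure, so the flush policy deserves some thought in a data logger.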

Given the security features of the LPC55S69, the extra RAM can make integration of SSL stacks for IoT applications much simpler. One example is the use of WolfSSL for implementation of SSL/TLS. While it targets the embedded space, SSL processing can be complicated and require a significant amount of stack and heap. In one particular use case I had with an embedded IoT product, I needed 35KB of stack and about 40KB of heap to handle all of the edge cases when dealing with connections to the internet over TLS. The large reserve of RAM in the LPC55S69 easily allows for these larger security and encryption stacks.

 

Another use for the large memory capability is a graphics back-buffer. It would be simple to hook a high-resolution IPS display to the LPC55S69 and be able to store a complete image back buffer in memory. For example, a 240x240 IPS display with 16-bit color depth would require 112.5KiB of RAM! There is plenty of RAM left in the LPC55S69 for your other tasks. In fact, you could dedicate one of the CPUs in the LPC55S69 to handling all the graphics rendering. The copious amount of RAM enables neat applications such as wearables, industrial displays and compact user interfaces.
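The back-buffer figure is simple arithmetic: width times height times bytes per pixel. A quick Python check (the function name is just for illustration):

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Size of one full-frame buffer in bytes."""
    return width * height * bits_per_pixel // 8

size = framebuffer_bytes(240, 240, 16)
print(size, "bytes =", size / 1024, "KiB")
```

The same function makes it easy to see what a different panel or color depth would cost before committing to a display.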

pastedImage_5.png

Figure 3.   A 240x240 IPS Display with SPI Interface from BuyDisplay.com

 

One other important aspect of the RAM in the LPC55S69 is its organization. It is intelligently segmented (with 272KB contiguous in the memory map) via a bus matrix to allow the Arm® Cortex®-M33 cores, PowerQuad, CASPER and DMA engine access to memory with minimal contention between bus masters.

 

pastedImage_8.png

 Figure 4.   LPC55S69 Memory Architecture.

 

The LPC5500 Series offers a lot in a small, low-power package. The large amount of internal SRAM and the dual USB interface enable many applications and make development simpler. Stay tuned for part 3 of the LPC5500 series overview. I will be further examining some interesting peripherals in the LPC5500 series that set it apart from its competition.

For more information, visit: www.nxp.com/LPC55S6x.

Eli_H
NXP Employee

Most of my life, programming and embedded microcontrollers have been a passion of mine. Over the course of my career I have gained experience on many different architectures, including some that are very specialized for specific applications. Even with the current diverse market of specialized devices, I continue to find the general-purpose microcontroller market the most interesting. I believe this stems from how I first fell in love with computing. It can be traced back to the 7th grade when we were learning “Computer Literacy” with the Apple IIe computer. During the course, students learned how to code programs in the BASIC language. Projects spanned everything from simple graphics to printing and games. Simultaneous to that experience, I learned that my other 7th grade passion, playing the Nintendo, was connected to the activities in computer literacy. Through a popular gaming magazine, I discovered that the chip that powered the Nintendo was the same device that powered the computers at school, the venerable “6502”. That was the real moment of epiphany. If a CPU could be both a gaming system and a word processor, it could really *do anything* I wanted. It wasn’t long before I was digging into the intricate details of the 6502 to power my creations. The 6502 was my 1st general purpose CPU.

Fast forward 30 years … The exact same principle applies today. We have an incredible amount of power in small packages. There is a lot you can accomplish with seemingly little. I am always on the lookout for new parts that may appear to be “vanilla” on the surface but have some hidden gems that really help me accomplish cool projects. The NXP LPC5500 series really appealed to my sensibilities as I immediately saw features that make it relevant to today’s design challenges. In the coming weeks I want to highlight some features of the LPC5500 series. This is not intended to be an all-encompassing review of the LPC5500 series, but I hope to hit on some highlights that could be beneficial to your design challenges. In this article we are going to focus a bit on the LPC55S69 device and its core platform. There is a lot under the hood!

First – It is actually 4 processors in 1!

From the block diagram in figure 1, one can see that there are two Arm® Cortex®-M33 cores. This by itself is an extremely useful feature given the low cost and low active power aspects of this device. I have made good use of the other LPC families with asymmetric cores (such as the LPC43xx device with a Cortex®-M4 and -M0). Having a 2nd core is very useful in offloading common tasks. In my experience with the LPC43xx, I used the Cortex®-M0 as a dedicated graphics co-processor to offload UI tasks from the Cortex®-M4 while it was doing other time-critical DSP operations.

In the case of the LPC55S69, both cores are Cortex®-M33. The Cortex®-M33 is a new offering from Arm based upon the Armv8-M instruction set architecture. Like the Cortex-M4, it has hardware floating point and DSP instructions but also includes TrustZone. TrustZone enables new security states to ensure your critical code can be protected. Another notable new feature is a co-processor interface for streamlining integration with dedicated co-processors. This feature is germane to the LPC5500 series as there are 2 coprocessors that we are about to talk about. You can learn more about the Cortex®-M33 here.

 

I can’t count the number of design scenarios where I wished I had an extra programmable CPU that could handle a task that might be extremely time critical but not actually need a lot of code space. For example, I have used OLED displays that have a non-standard I/O interface that needs to be bit-banged. It became a great opportunity to have the 2nd core do the work. You could even turn that 2nd core into a small graphics co-processor.

pastedImage_2.png

Figure 1.  The LPC55S6x MCU Family Block Diagram

I mentioned four processors. So, where are the 3rd and 4th processors? Number three is hidden in the “DSP accelerator” block. The Cortex®-M4 core, upon which many other LPC microcontrollers are built, has DSP-specific instructions that can accelerate certain math functions. I have given seminars at the Embedded Systems Conference on using the DSP instructions in a general-purpose CPU scenario. The LPC55S69 DSP accelerator (a.k.a. PowerQuad) is a separate core whose sole purpose is to accelerate DSP-specific tasks. While PowerQuad is not a pure general-purpose CPU, it can perform tasks that would significantly burden one of the Cortex-M33 cores. In many cases you can get a 10x improvement over conventional software implementations of certain algorithms. PowerQuad covers all the common use cases such as Fast Fourier Transforms (FFTs), IIR filters, convolution, trigonometric functions and matrix math. It has enough “brains” to do almost all the work so your main general-purpose CPU(s) are free for other tasks. The PowerQuad is enabled by a very specific new feature in the Cortex-M33 (ARM®v8‑M specifically) that allows coprocessors to be connected to the CPU through a simple interface. Data transfer to the coprocessor is low latency and can sustain a bandwidth of up to twice the memory interface to the processor.

Lastly, the 4th processor is another specialized core called “CASPER”. CASPER is a high-performance accelerator that is optimized for cryptographic computations. At its core, CASPER is a dual multiply-accumulate-shift engine that can operate on large blocks of data. CASPER has special access to 2 blocks of RAM so data can be accessed in parallel. Applications of CASPER include accelerating cryptographic functions such as public key verification (i.e. TLS/SSL), hash computations or even blockchain. As CASPER is a general math engine, it is also possible to perform DSP operations in parallel with the PowerQuad. With a little bit of imagination, one could achieve quite a bit with minimal intervention from the general-purpose Cortex®-M33 cores.

pastedImage_1.png

Figure 2.  PowerQuad (Left) and CASPER (right) Accelerators

While the PowerQuad and CASPER processing engines are not technically 3rd and 4th general-purpose cores, they can easily do the work that you might normally require of an entire CPU. We will be talking much more about these features in the future, but here is the key take-away:

The PowerQuad DSP and CASPER accelerators are powerful math engines that can allow you to number crunch at a rate similar to dedicated DSPs, all while still reserving your general-purpose processors to handle other system tasks.

All of this functionality is delivered on a low power 40nm process technology packaged in approachable footprints at a low price point. Interested yet?  I know I am!

For more information, visit: www.nxp.com/LPC55S6x.

brendonslade
NXP Employee

Embedded Artists are having a Winter Sale, offering the LPC54018 IoT module for only 5 Euros:

LPC54018 IoT powered by Amazon Web Services (AWS) - Embedded Artists 

The baseboard to accompany the module is also reduced to only 20 Euros!

omar_cruz
NXP Employee

If you are searching for a high-performance, power efficient, yet cost sensitive MCU for your designs, then here is the exciting news: NXP recently introduced the LPC51U68 MCU.

LPC51U68 Chip Image.png

Power efficient solution for future IoT designs

Based on the Arm® Cortex®-M0+ core, the LPC51U68 MCU pushes the performance of the core to 100MHz, which is more than two times faster than current Cortex-M0+-based products. The LPC51U68 MCU also provides expanded memory resources of up to 96KB of on-chip SRAM and 256KB of on-chip flash program memory with a flash accelerator. It also features a USB 2.0 full-speed device controller supporting crystal-less operation, eight flexible serial communication peripherals, each of which can be enabled as a UART, SPI or I2C interface, and up to two I2S interfaces. The LPC51U68 MCU integrates a variety of timers including three general purpose timers, one versatile timer with PWM (SCTimer/PWM), one RTC/alarm timer, a multi-rate timer and watchdog timers. On the analog side, an on-chip 12-channel ADC with 12-bit resolution and conversion rates of up to 5Msps is provided, along with temperature sensors. With all the features integrated, the LPC51U68 MCU brings unparalleled design flexibility, computing performance and integration into today’s demanding IoT and industrial applications.

LPC51U68 Block Diagram.jpg

Extraordinary compatibility

While considering the LPC51U68 MCU as an upgrade of the LPC11U68 MCU family, it provides pin-function compatibility with Arm® Cortex®-M4 based LPC5410x and LPC5411x MCU families in the same packages and pinout versions enabling a smooth transition to the power-efficient MCUs based on Arm® Cortex®-M4 core.

 

Low power design for energy efficiency

While providing excellent computing power with the Arm Cortex-M0+ core, the LPC51U68 MCU displays ultra-low-power consumption and a unique low-power design. The microcontroller supports four low-power modes and API-driven power profiles, providing developers with easy-to-use dynamic current management at runtime and fast wake-up times from the microcontroller’s reduced power modes.

LPC51U68 Run Currents.png

 

Make your design easier with tools supported

LPC51U68 MCUs are fully supported by NXP’s MCUXpresso software and tools, which bring together the best of NXP’s software enablement into one platform for a shared software experience across a broader set of Arm® Cortex®-M MCUs. In addition, this new MCU is supported by the LPCXpresso51U68 development board, designed to enable evaluation and prototyping with the LPC51U68 MCU. The board features an on-board CMSIS-DAP / SEGGER J-Link compatible debug probe, expansion options based on Arduino UNO and PMod, plus additional expansion port pins and more.

 

LPCXpresso51U68 Development Board

 LPC51U68 Development Board Image.png

To learn more about the LPC51U68 MCU, visit http://www.nxp.com/LPC51U68

 

Live at Computex 2018 this week is: The High Performance Gaming Mouse Controlling Hundreds of Full Color LEDs Powered by LPC51U68

  • A 100MHz Arm® Cortex®-M0+ delivering real-time response for game players
  • 96K SRAM for LED patterns allows for smooth transitions
  • Built-in USB drivers in ROM and supports 1K report rate
  • 8 Flexcomm serial channels to drive up to 800 LEDs with full color control at the same time

brendonslade
NXP Employee

The LPC800-DIP board is now being sold by Coridium for just $10:

http://www.coridium.us/coridium/shop/boards/bd07-special

 snipcart-thumb-image

They have their own version of gcc and BASIC running on it too, available for just a few $$

http://www.coridium.us/coridium/shop/software/s07-basic-lpc824

Enjoy! 

justinbmortimer
NXP Employee

Nice project from Kevin Townsend showing off the capabilities of the LPC824, using its state configurable timer to drive a NeoPixel and IR distance application.

LPC824 NeoPixel IR Distance Sensor

justinbmortimer
NXP Employee

I love seeing how our LPC community uses our microcontrollers ... keep sharing what you create and let's continue to support and invest in each other.  

Pokitto LPC11U68 Game Gadget on Kickstarter - EB sold out soon!

peter_furtner
NXP Employee

LPCXpert V3.4 is the latest release of a freeware expert tool for the CORTEX-M based LPC families of microcontrollers. This tool simplifies the selection of an MCU device, speeds up the creation of application code and initialization code, and supports generation of an application-specific schematic symbol. This version supports more than 410 different CORTEX-M based microcontrollers from NXP.

 

LPCXpert supports all phases of development. During the MCU selection phase, LPCXpert supports selection of a target MCU by providing selection features in the "MCU Select" tab. During the software implementation phase, LPCXpert provides a graphical user interface to configure the pinout (Pin-MUX) and the peripheral interfaces of the target device. LPCXpert then also generates projects providing a framework of reference applications. These applications configure the Clock Generation Unit (CGU) and the on-chip peripheral interfaces of the device to test and demonstrate the setup.

 

New and enhanced features include support for the LPCOpen software package from NXP. Features also include generation of a schematic symbol for Altium Designer and CadSoft EAGLE V6.2, generation of projects for the NXP LPCXpresso and MCUXpresso IDEs, IAR Embedded Workbench (EWARM), Keil µVision and GNU C compilers, as well as links to internet sites for additional information.

Using LPCXpert it is possible to set the pins of each peripheral (e.g. SPI, CAN, I2C, EMC, ETH, ...) and to configure the features of each pin (pull-up, pull-down, ...). In addition, LPCXpert V3.4 also supports configuration of pre-built demo code for the LPC8xx and LPC54xxx families of MCUs.

 

Based on the configuration, LPCXpert can generate a C code project or a schematic symbol. In addition, LPCXpert saves up to 8 different pin-mux configurations and restores from up to 10 different configurations. Additional information and the download are available from the following website:

--> http://www.lpcxpert.com

justinbmortimer
NXP Employee

Back in Austin TX after a fun trip to Nürnberg, Germany.

I am humbled and energized after spending a week with 30,000+ engineers at this year’s Embedded World!

IMG_1192.JPG

Amazing.  That’s the only way to describe the passion and enormity of our LPC FANS across Europe.  LPC is deep-rooted in the hearts of many and I am lucky to be a part of this inspired, tight-knit community.    Embedded World has a strong place in the heart of LPC.

 

My personal highlights from the event .... LPC FANS, What's Popular, MCUXpresso & Geoff.

  • LPC FANS.

My favorite experience of the event was standing proudly at the LPC pedestal shaking hands with 1000s of LPC FANS.  I enjoyed connecting with each of you, hearing about your success, ideas and future needs.  Everything begins with great people … and we will continue to learn and find inspiration from you.  Thank you for your guidance as we build our next generation of differentiated microcontrollers!

IMG_1221a.JPG

  • What's Popular?  LPC800 & LPC54000 demos & give-aways!

The 8-bit MCU market is moving to the 32-bit world and we are excited to show off the cool features of the LPC800 series, but EW was really more focused on the LPC54600 family.  High performance and integration for power-sensitive applications.  We showed off a variety of demos and partner solutions at the pedestal.  Stunning, low-power, cost effective GUIs made easy with Embedded Wizard and TouchGFX.

Longtime NXP partner, Embedded Systems Academy showcased the dramatic improvements the LPC5461x family of CAN-FD controllers can make in various industrial applications.

IMG_1208.JPG

With our partners, we gave away tons of LPC boards, from our power-optimized, full speed USB enabled LPC54114 board, to our newest LPCXpresso54608 platform.  And over 200 engineers & students left on the final day with an LPC800 DIP board, a very fun platform to experience what everyone is talking about ... LPC800!  Much more to come this year, stay tuned!

  • MCUXpresso.

NXP spent a year working on a unified development experience and Embedded World was the near final step in our MCUXpresso roll-out.  Erich Styger & Andy Beeson did an unbelievable job showing our new tools, read more at Erich's blog.

  • Geoff Lees

pastedImage_10.png

#1, LPCFANS and I both loved seeing Geoff engaged and interacting with numerous customers, fans and partners.  Many of you commented how "cool" it was to see Geoff approachable and engaged throughout the event. Year after year, his commitment to the industry and his visible presence at Embedded World is inspiring.  I am not sure where he finds his energy!

#2, Geoff stole the show (like only Geoff can do). Check out Junko Yoshida's article in EETimes, where Geoff "unveiled a sweeping plan to broadly migrate design and production of general-purpose processors and microcontrollers from CMOS nodes to the FD-SOI" ... all-in on 28 FD-SOI.  Get ready for our next generation of breakthrough processors and microcontrollers.

Thank you to the NXP team and valued partners for your hard work making this event a huge success!  

IMG_1195.JPG

IMG_1204.JPG

And to our LPC FANS, thank you for continuing to believe ... until next time! 

  Justin

justinbmortimer
NXP Employee

We want people all over the world to get their hands on an LPC54608 development board (OM13092) - not just our large customers, so here's your chance ... 

pastedImage_2.png

By December 31, 2016, start your own discussion on our community (put LPCXpresso54608 in your title) and share how you might use a new or existing LPC microcontroller in an application. Maybe it's a new feature or product you want to try, something you've already created, or a new idea you want to build in 2017.  Be creative and have fun sharing ideas.  Up to 100 submissions (non-NXP employees) will be selected to receive an LPC54608 board (OM13092).

thanks for sharing!

Justin

justinbmortimer
NXP Employee

The year when two titans came together forming what is now the #1 supplier to the broad based MCU market.  To the industry, it was the year of the merger, clouded in uncertainty.

Not so for LPC. 

    While the world wondered, what might happen?  ... LPC simply got to work.

For the LPC team, 2016 marks a year of celebration – surpassing our goals and closing one of the most successful years in history.  But it was much more than just a banner year shipping microcontrollers. So many milestones were achieved:

  • we reinforced our core team, a collection of new and familiar faces
  • defined a new product roadmap & go to market strategy
  • rebuilt relationships with our global distributors and partners
  • listened to feedback and took immediate actions
  • restructured & strengthened our global support infrastructure - an ongoing process
  • launched the LPC83x & LPC5411x families to market, establishing our team’s clear direction
    • LPC800 an entry-level 8-bit alternative at the right price
    • LPC54000 the mainstream MCU series for everyone
  • strengthened our ecosystem, traveling the world, rebuilding trust and visiting customers

For me personally, getting to know all of our LPC fans, customers and partners has been such a gratifying experience. Honestly, I have never met such a committed and hard working network of people supporting and using LPC around the world; you're the ones that make a world of difference, thank you!  

The energy level and encouragement is amazing!  I've loved our conversations together at events, comments on the community, and emails directly from you!

With 2016 coming to a close, it’s clear that LPC is NXP’s general purpose microcontroller business for the broad market.  We focus on our customers through products that are easy to use, well documented and supported online in this community.

Now what?

As we enter 2017, our campaign to inspire creativity across the globe moves to its second phase.  We will focus on introducing what is quickly becoming our flagship microcontroller family, LPC546xx, which has the humble honor of following in the huge footsteps of one of the most recognized and successful microcontrollers in history, the LPC1768.  Later, we move to completing the highly anticipated expansion of the LPC800 MCU series.

Final Reflections

So, how will 2016 be remembered by us?  It was the year when a small group of hard-working individuals came together as one team to build something great again: something that leaves a lasting impression and changes the industry forever.

What comes next from LPC is coming soon.  And we cannot wait to tell you about it.

thank-you-languages-post.png

Thank you for continuing to trust and believe in LPC.  

  Happy holidays and see you in 2017!

     - Justin Mortimer, Marketing manager

justinbmortimer
NXP Employee

Our global ecosystem, from NXP field engineers to distributors and partners, is training up on LPC and now, more specifically, on the newest LPC546xx family.

But we also want to make sure the engineers actually using our MCUs in their applications receive the same information directly from us ... we will continue to keep you updated & informed.

Links to a few of the recent presentations prepared by our LPC AE team:

State Configurable Timer

LPC key feature_SCT 

SPI Flash Interface

LPC key feature_SPIFI 

Graphics LCD Controller

LPC key feature_LCD 

Dual Core Architecture (found in LPC541xx)

LPC key feature_DualCore 

jessegarcia-b45
NXP Employee

Links to material being referenced:

 

LPCXpresso IDE Download

 

Setup Guide

 

This guide will be the first of many entries where I will show you how to get started with LPC. Today this entry will focus on setting up the IDE and highlighting which products are supported by LPCXpresso IDE.

  1. Visit the link at the top of the post that will direct you to the LPCXpresso IDE page. As of this time, the current version is v8.8.2
  2. Click the gray download link.

    1.png

    Note: You will need an account in order to download the IDE. Log in or create an account.
  3. Once signed in, you will be presented with the following window.

    2.png

    This guide assumes you will install on Windows. The steps will be more or less the same regardless of which operating system is used for the installation. We will register the software in a later step.
  4. Once you've selected your operating system you will be presented with the following options

    3.png

    It is always recommended that you download the most recent version of the IDE, but links are provided for previous editions, if necessary.  Clicking on the link automatically starts the installer. Each installer serves as a standalone package. If you are upgrading to the newest version, keep in mind that the old version remains on the computer. You may opt to manually uninstall old versions.
  5. Once you launch the installer and agree to the licensing terms, you will be prompted for an installation directory. Use the default directory.

    4.png

    Note: C:\nxp contains all LPCXpresso installations. You can open previous versions here if needed.

  6. Once the software installation finishes you will be prompted to install various drivers. You can select "Always trust software from 'NXP Semiconductors USA. Inc.'" to not have to individually approve each driver's installation.

    5.jpg

  7. You will be presented with the following window once the installation process has completed. You are free to review the version documentation and the IDE User Guide if you wish.

    6.jpg
  8. Once you launch LPCXpresso for the first time, you will be presented with the following window letting you know that you do not have an active license for the IDE. This limits you to debugging code up to 8k in size.

    7.jpg
  9. To increase this limit, request a free license: click Help in the menu bar, scroll down to "Activate" and select "Create serial number and register (Free Edition)".

    8.jpg
  10. A new window will come up with your serial number as shown below. Select "Open in external browser" to open a browser window and generate the activation key.

    10.png
  11. Once the new browser window loads, you will be presented with your activation key listed below the serial number. Highlight and copy this key.

    11.png
  12. Follow a similar process to the one used to request the activation key, but this time select "Activate (Free Edition)".

    12.jpg
  13. Paste the activation key into the new window that pops up.

    12.png
  14. Once you press OK, you will receive confirmation that your copy of LPCXpresso has been licensed. This allows you to use all of the features of the IDE and raises your debug limit to 256k.
  15. You will be prompted to restart LPCXpresso and when it relaunches the welcome page will show that your copy is fully activated.

    13.jpg

    Note: Once you have an activated key, this key will also be used by MCUXpresso in the future.

This tutorial demonstrated how to set up the free edition of LPCXpresso, however, activating the Pro edition is very similar.

As of this writing LPCXpresso IDE v8.8.2 can be used to develop on the following platforms:

  • LPC81x/LPC82x/LPC83x
  • LPC11xx
  • LPC11Uxx
  • LPC11Exx
  • LPC12xx
  • LPC13xx
  • LPC15xx
  • LPC17xx
  • LPC18xx
  • LPC2xxx
  • LPC3xxx
  • LPC40xx
  • LPC43xx
  • LPC5410x/LPC5411x

Next week, I will demonstrate how to install and set up IAR and Keil for LPC. In the coming weeks, once I have shown you how to configure the software environments, I will post getting started guides for different LPCXpresso development boards. Stay tuned!

jessegarcia-b45
NXP Employee

In case you missed it, we extended our low-cost LPC800 series back in September with the addition of the LPC83x family. The LPC83x family introduces new functionality to our streamlined LPC800 series, which includes LPC81x and now LPC83x.  If more functionality is needed, our extremely popular superset LPC82x family is likely the one you need.

 

The LPC800 series is a great 8- and 16-bit alternative for use in various systems, such as end node connectivity, gesture sensing for HMI, basic motor control, power line communication and battery power management. Applications are endless, from IoT smart home to building control, industrial automation, children’s toys, and more.

LPC83xBlockDiagram.png

The LPC83x family includes an option for 32 kB of flash, with the addition of an 18-channel DMA and an up to 12-channel, 12-bit ADC.  Rich capability bundled with a low price has allowed the LPC800 series to become the most actively quoted and fastest growing LPC family to date, with millions of units shipping in 2016.

 

The LPC832, available in TSSOP20 with 16 kB of flash, and the LPC834, available in HVQFN33 with 32 kB of flash, are just the start of the LPC800 relaunch.  Just wait until 2017 when many new product families are launched to market!

OM13071_PC824v2_LR_Main.jpg

If the LPC83x fits your requirements for your next design, the recommended board to purchase is the LPC824-MAX (OM13071) and using the free LPCXpresso IDE, you can use code bundles for the LPC800 series to speed up your design.

For more information on the LPC83x family please visit the links below

LPC800 Series Summary

LPC832 Product Page

LPC834 Product Page

LPC824-MAX Development Board

LPCXpresso IDE Summary

justinbmortimer
NXP Employee

So I am getting ready for Electronica and brought a bunch of boards home.  I was testing and packing everything when my daughter got her hands on one of the demos ... this is what happened.

Ok, so the title is somewhat of a stretch ... my daughter is 3, but my son is 2!  And although they didn't develop the demo & flash the board ... they sure did love playing with the coloring demo!!

The demo my kids are fighting over is taken from our SDK using Segger emWin's graphics library.   It was really nice and easy to get up and running right out of the box.

This was the first time my kids were actually interested in my work :)

justinbmortimer
NXP Employee

Our first software development kit (SDK) based demo leveraging the nice touch display :)  ... it may look simple, but the board is alive and working well!

And if you didn't see our LPC54608 family introduction ... check it out here: Introducing the First LPC546xx Family

justinbmortimer
NXP Employee

If you haven't heard, our LPC54000 power-efficient platform is expanding to address what our 10,000+ customers have been waiting for patiently ... an extension to some of our most popular devices in the market, including LPC1700.

The time has come ... lead customers are developing on the LPCXpresso54608 board now!

By the way, our LPC54000 platform may have a different part number nomenclature compared to what you have grown to love in our existing LPC families, but in the end, it's just a part number for ordering and marking.  So let me introduce the first of a few new LPC546xx families coming very soon ....

LPC5460x

This is our Baseline HMI & Connectivity MCU family for the industrial, IoT and general embedded markets, a great starting point for your development, which is why we will bring this to market first.

The family leverages a 180MHz Cortex-M4 and fits a sweet spot with respect to balancing power and performance.  But it is far more than just another Cortex-M4 and also more than just an LPC1700 series upgrade.  With options for up to 512Kbyte of Flash, 200Kbyte of SRAM, 16Kbyte EEPROM, and additional memory interface options, we think you will like what you see.

We've combined some new features, such as a 12MHz power-optimized free running oscillator, trimmed to 1% accuracy over temperature and voltage, and a 5Msps 12-bit, 12-channel ADC, with a bunch of peripherals you have grown to love, such as FS & HS USB, CAN, Ethernet, and a Graphics Display Controller.

With over 21 communication interfaces, including 10 Flexible Serial Interfaces, the key for this product family is flexibility, giving you the option to customize the device for your application.
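To put a couple of those figures in perspective, here is a quick back-of-the-envelope sketch. It assumes a 3.3 V ADC reference and reads "1% accuracy" as a symmetric ±1% band; neither detail is stated in this post, and the helper names are made up purely for illustration.

```python
# Rough numbers for the ADC and oscillator figures quoted above.
# ASSUMPTIONS (not from the post): 3.3 V ADC reference, and "1%" read as +/-1%.

ADC_BITS = 12               # from the post: 12-bit ADC
ADC_RATE_SPS = 5_000_000    # from the post: 5 Msps
FRO_HZ = 12_000_000         # from the post: 12 MHz free running oscillator
FRO_TOLERANCE = 0.01        # from the post: trimmed to 1%
VREF_VOLTS = 3.3            # assumed, for illustration only


def adc_codes(bits):
    """Number of distinct output codes for an ADC of the given resolution."""
    return 1 << bits


def adc_lsb_volts(bits, vref):
    """Voltage represented by one ADC step (one LSB)."""
    return vref / adc_codes(bits)


def frequency_band(nominal_hz, tolerance):
    """Lowest and highest frequency allowed by a symmetric tolerance."""
    delta = nominal_hz * tolerance
    return nominal_hz - delta, nominal_hz + delta


if __name__ == "__main__":
    low, high = frequency_band(FRO_HZ, FRO_TOLERANCE)
    print(f"ADC codes:        {adc_codes(ADC_BITS)}")
    print(f"ADC step size:    {adc_lsb_volts(ADC_BITS, VREF_VOLTS) * 1e3:.3f} mV")
    print(f"Sample period:    {1e9 / ADC_RATE_SPS:.0f} ns")
    print(f"Oscillator range: {low / 1e6:.2f} MHz to {high / 1e6:.2f} MHz")
```

Under those assumptions, a 12-bit converter resolves 4096 levels, and a ±1% band around 12 MHz spans 11.88 MHz to 12.12 MHz. Check the datasheet for the actual reference voltage and oscillator specification before relying on these numbers.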

Stand by as we start going deeper into the product details and software of this new product family!!

#LPCisBACK

justinbmortimer
NXP Employee

Hi!  We love staying connected with all of our LPC customers, which is why it would be great if you could follow our LPC MCU community by clicking on the link from the main forum page.  This allows us to stay better connected and ensure you always receive the latest information direct from the product line.   Portfolio updates, announcements and more will be shared, so stay tuned.  By the way, we have a few design challenges planned and definitely want to make sure you hear the latest details as they are shared.  You can stop following at any time.

In case you missed any of our recent portfolio and roadmap blogs, you can find the direct links below:

  1. An Update on our Microcontroller Portfolio
  2. #LPCisBACK ... are you Ready for the Relaunch?
  3. LPC Outlines MCU Roadmap for the Broad Market

follow along.png

I added a direct link for easy reference to the recent press release where NXP reinforced their strong commitment to the LPC portfolio and our broad market customers. 

NXP’s Microcontroller Business Reinforces Market Commitment with Extended Product Longevity of Key L...

The LPC team is really excited about what is in store, so stay tuned for the latest updates.

#LPCisBACK

justinbmortimer
NXP Employee

Whooohooo  ... LPCXpresso54608 boards have arrived! ... development teams are hard at work finishing our release and preparing demos, lead customers have them in hand and we're looking good for our full market launch!  I can't wait for all of you to get a board in the coming months, so you can start developing and sharing what you have built using the latest LPC MCU.

#LPCisBACK

LPC54608_1.jpeg

LPC54608_2.jpeg

justinbmortimer
NXP Employee

The newly assembled LPC team has been working hard since day one and now, we are ready to unveil our official plans!

As we approach our relaunch, which starts this November, we will be sharing all of the details on this forum with our 10,000+ customers.  First up, the 2017 product roadmap ... followed by regular updates on LPC5460x, our upcoming launch, which kicks-off our busy year of product announcements and launches.

Our business is built on trust and innovation, which is why we want you, our followers, to hear our exciting plans first and direct from the product line.  So stay tuned and share with your friends and coworkers.

LPC is really proud and excited for what we have planned.  And we think you will feel the same.

I hope you are ready ... #LPCisBACK

justinbmortimer
NXP Employee

Dear LPC and Kinetis customers, partners and enthusiasts,

It has been roughly 9 months since LPC and Kinetis merged under the new NXP. Both LPC and Kinetis teams are now under the same business line. Everyone has been working diligently to make sure the basic infrastructures are not broken and customers using either LPC or Kinetis are well supported.

We hope word has gotten out that both product families are doing well.  We have you to thank.

 

Which Family will Survive the Merger?

The answer is simple.  Both.

Over the past 9 months, both families have been introducing new products to the market and more are under way. Both LPC and Kinetis are among the top 5 ARM-based MCU vendors and both have continued to grow. Our competitors have been hoping we will sacrifice one family after the merger, but we haven’t. Investments have continued on both product lines.

You will see the fruits of our labor in the coming months with exciting new products coming your way.

In the long run, we intend to bring the best of both families together for an even stronger ARM-based portfolio, and we plan to share a glimpse of that future in the next post.  We apologize in advance for not showing more, but we don’t want our competition to know everything we're doing.

Will NXP continue the Longevity Program?

Yes. Please continue to use your favorite LPC and Kinetis devices for your projects. We will continue to honor our longevity program and look for ways to strengthen it. We are in it for the long haul.

And if you didn't see our recent Longevity press statement, please check it out here:

NXP Semiconductors :: Press Release 

 

Will the development tools merge?

Similar to our hardware strategy, we will continue to support our existing software platforms as well as enable our customers with strong tool partners, who like NXP, are committed to support both families.

As with our hardware strategy, we have brought together the software teams and are using their combined strength to help weave together our products and shape our future. Plans are in motion, with some big announcements coming soon.  I think you will like what you see.

Thank you.

As with any merger, not all things go smoothly, but we are committed to making things better as we march forward with a stronger MCU portfolio. We appreciate your patience and loyalty to these great products.
