Michael
If flash is secured it can be unlocked with KDS (it will tell you that it is secured and ask whether you want to un-secure it).
I have a couple of boards that stopped communicating (always the OpenSDA - as if it were dead).
In one case, while doing low power tests, I found that I always corrupt the OpenSDA loader if I leave the current measurement jumper unconnected and connect to a Win 10 PC. This can be explained by the fact that (at least on that board) not having the target processor powered causes the OpenSDA chip (K20) to start in its own bootloader mode, which is not compatible with Win 10. Win 10 tries to write hidden files to its USB-MSD device and corrupts the loader. I can recover from this by reloading the OpenSDA loader on a Win 7 PC - I am very careful not to let this happen when doing current measurements but sometimes it still does....
To the issues:
- Basically, if the system (however implemented, whether in interrupts, main loop, tasks or in the "virtual Interrupt Handler") needs 249us of every 250us cycle to do its work, the maximum saving is 1/250 (0.4%) in terms of power reduction, even if there were zero overhead. The hard limit at the end of the day is therefore what the handler is doing - any additional overhead to control the handler (wake-up/sleep times, interrupt latency, task switching, other code in the system) is on top of this and so is best kept to a minimum.
- If I use the "LOW_POWER_CYCLING_ENABLED" state I have absolute maximum performance since there is neither interrupt latency nor scheduling overhead. If I don't need or want the system to be responsive to other events I don't ever need to leave this state and so I can stay there.
- However I have the advantage that I can power up the system with full operation (eg. allowing a user to configure the system in a comfortable manner, if the product also needs to offer that). At some point I can switch to the optimal cycling mode with close to zero overhead.
- The further advantage is that I can temporarily switch to the full-operation (eg. at a defined time).
- If I want, as in the reference, I can still allow events - such as the command line console - to briefly go back to the full operation (with its higher power consumption) to handle the input. Of course if I don't want this I simply don't allow it.
Therefore there is no compromise if I want "pure" low power cycle operation. But in many real world designs one will still like to have other capabilities, even if only used once during the very first setup; after that it still has zero effect on the final operation and its efficiency.
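As a rough sketch of the arithmetic behind the first point above: if the work needs a fixed time in every cycle, the remainder of the cycle is the most that can ever be spent asleep. The function name is only illustrative, and the result is an upper bound on sleep time (it assumes zero wake-up/sleep overhead and says nothing about the residual sleep current).

```python
def max_sleep_fraction(busy_us: float, period_us: float) -> float:
    """Upper bound on the fraction of each cycle that can be spent asleep.

    Assumes zero overhead for entering and leaving the low power state,
    so a real system will always do somewhat worse than this.
    """
    if period_us <= 0 or not 0 <= busy_us <= period_us:
        raise ValueError("need 0 <= busy_us <= period_us and period_us > 0")
    return (period_us - busy_us) / period_us


# The example from the text: 249us of work in every 250us cycle.
print(f"best case: {max_sleep_fraction(249, 250):.1%} of the cycle")
```

Any control overhead (interrupt latency, scheduling, wake-up time) effectively adds to busy_us, which is why it is worth keeping small.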
To the currents:
There are unfortunately still some surprises which I can't (yet) fully explain. In the reference case the only activity outside the low power cycle loop is an RTC interrupt every 1s (negligible) plus the command line handling. The command line handling "only" takes place when used, so if I don't touch the terminal it adds no overhead - if I do touch it the system requires more power for maybe one or two cycles, depending on what it needs to do. It doesn't affect the cycle operations though, since this is still taking place in an interrupt and so no cycles are lost - it is only the temporary power consumption that is affected.
Therefore uTasker itself is not involved during the tests; no further activity whatsoever.
These are some things that I have noted concerning actual current consumption:
1. If I turn on my board (in WAIT mode, which means that the processor is set to the WAIT state with no activity (i.e. the 50% or so with the 25us cycle)) I measure 2.5mA. The current slowly increases up to about 3.5mA - possibly as the chip heats up (?). I didn't notice this effect in such tests before....and I have been doing them regularly on many different Kinetis parts for some years, whereby my typical TICK is 50ms (and not 250us).
2. If I test with 250us cycle time I really don't see the linear saving. It is as though there is an invisible period where the processor still consumes power - it doesn't follow the rule that you and I both expect that if the duration in each cycle is halved the current should halve. In fact there is very little difference in the current consumption between 50% and 100% duty cycle. It improves only when the duty cycle gets below about 40% or so.
I was in two minds as to whether it would be best to show a 250us or 500us cycle period. The results are NOT the same - it is as if this invisible "dead-time" is much less of a factor and the relationship is closer to that expected when the cycle period is increased.
Here is a complete set of measurements using 500us and 250us cycle so that you can see (4MHz core and 800kHz Flash so that VLPR is possible). There is some limit that can't be explained by the pure relation between being in RUN and being in VLPS (or other) since the waveforms measured show that the processor is RUNning for the same time in each case.
500us cycle period
RUN 4.2mA
WAIT 3.5mA (lpc has no effect)
STOP 2.2mA (with lpc = 1.1mA)
VLPR 2.45mA (lpc has no effect)
VLPW 1.47mA (lpc has no effect)
VLPS 2.2mA (with lpc = 0.9mA)
LLS2 2.7mA (with lpc = 0.8mA)
This all makes pretty good sense - note that the low power loop mode is even more effective in LLS2 because we know (from previous investigations) that it saves the much higher interrupt latency, so the theory and practice are doing quite well...
250us cycle period
RUN 4.2mA
WAIT 3.8mA (lpc has no effect)
STOP 3.8mA (with lpc = 1.8mA)
VLPR 2.45mA (lpc has no effect)
VLPW 2.3mA (lpc has no effect)
VLPS 3.75mA (with lpc = 1.6mA)
LLS2 3.9mA (with lpc = 1.7mA)
I monitor the waveform in each case and the waveforms are as expected but one can see that the current is not. There are three interesting cases:
1. STOP mode is not effective at saving current in the 250us case, although the STOP mode is really entered for 50% of the time...!
2. VLPS is the same (makes sense because both are based on the STOP mode)
3. LLS mode is only effective when using the "low power cycle" mode due to the fact that its wake-up time is rather long and it can't quite achieve the rate - effectively it results in it always being in RUN mode since it has to wake up as soon as it gets there....With the LPC optimisation it can benefit from some decent sleep time....
My first suspicion is that there is some "energy" requirement to move to and from the STOP based modes that causes a "knee" in the current measurements and so a limit. Current saving only becomes effective when the STOP/RUN ratio is lower than around 35%. If this is not achieved it is basically useless...(at least in this configuration)
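To make this suspicion a little more concrete, here is a minimal sketch of a model in which each entry to and exit from a STOP based mode costs a fixed extra time at RUN current. This is purely an assumption for illustration: the transition time and sleep current used below are hypothetical placeholders, not measured values, and the model has not been fitted to the figures above.

```python
def average_current_ma(period_us, run_us, transition_us, i_run_ma, i_sleep_ma):
    """Average current over one cycle with a fixed transition cost.

    transition_us is a hypothetical extra time per cycle (entering plus
    leaving the sleep mode) that is assumed to be spent at RUN current.
    """
    active_us = min(period_us, run_us + transition_us)
    sleep_us = period_us - active_us
    return (active_us * i_run_ma + sleep_us * i_sleep_ma) / period_us


I_RUN_MA = 4.2       # RUN current at 4MHz, from the tables above
I_SLEEP_MA = 0.2     # hypothetical sleep current (placeholder)
TRANSITION_US = 50   # hypothetical transition cost (placeholder)

# Same 50% duty cycle at two cycle periods.
for period_us in (500, 250):
    ideal = average_current_ma(period_us, period_us / 2, 0, I_RUN_MA, I_SLEEP_MA)
    real = average_current_ma(period_us, period_us / 2, TRANSITION_US,
                              I_RUN_MA, I_SLEEP_MA)
    print(f"{period_us}us cycle: ideal {ideal:.2f}mA, with transition {real:.2f}mA")
```

In such a model a fixed per-cycle cost weighs twice as heavily at 250us as at 500us, which would at least be consistent with the shorter cycle showing much less saving at the same duty cycle. Whether this is the real mechanism still needs to be confirmed by measurement.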
The overall conclusion is that one needs to TEST, TEST and TEST - also in the REAL conditions that the final product will be required to run under.
But I went a little further and checked what would happen if we allow the processor to run faster when it is not in a sleep state - setting the 48MHz IRC as clock to give 48MHz core and 24MHz Flash.
250us cycle period
RUN 17.0mA
WAIT 15.3mA (lpc has no effect)
STOP 2.22mA (with lpc = 1.2mA)
VLPR - not possible
VLPW - not possible
VLPS 1.95mA (with lpc = 0.86mA)
LLS2 2.44mA (with lpc = 0.77mA)
As I have noted on several occasions - sometimes it is better to RUN fast and Sleep deep for as long as possible. These results show that the practice again follows the theory - this suggests more strongly that switching in and out of STOP mode with a slow clock is energy intensive because now we are running at a good speed when active and still achieving less current overall.
Now I also have no problems running at your preferred 122us cycle since even occasional scheduling overhead is peanuts overall in terms of actual overhead.
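The comparison between the two 250us tables can be tabulated with a few lines of script. This is only a convenience sketch: the currents are copied from the 250us measurements in this post, while the dictionary layout and function names are illustrative.

```python
# Currents in mA from the two 250us tables in this post,
# stored as (without lpc, with lpc) for each core clock.
MEASURED_250US_MA = {
    "STOP": {"4MHz": (3.8, 1.8), "48MHz": (2.22, 1.2)},
    "VLPS": {"4MHz": (3.75, 1.6), "48MHz": (1.95, 0.86)},
    "LLS2": {"4MHz": (3.9, 1.7), "48MHz": (2.44, 0.77)},
}


def reduction(slow_ma, fast_ma):
    """Fractional drop in average current when moving to the fast clock."""
    return (slow_ma - fast_ma) / slow_ma


def compare_clocks(table):
    """Return {mode: (reduction without lpc, reduction with lpc)}."""
    rows = {}
    for mode, clocks in table.items():
        slow, fast = clocks["4MHz"], clocks["48MHz"]
        rows[mode] = (reduction(slow[0], fast[0]), reduction(slow[1], fast[1]))
    return rows


for mode, (plain, lpc) in compare_clocks(MEASURED_250US_MA).items():
    print(f"{mode}: {plain:.0%} less without lpc, {lpc:.0%} less with lpc")
```

In all three STOP based modes the 48MHz configuration draws less average current than the 4MHz one, with and without lpc.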
122us cycle period
RUN 16.8mA
WAIT 16.1mA (lpc has no effect)
STOP 2.92mA (with lpc = 1.81mA)
VLPR - not possible
VLPW - not possible
VLPS 3.70mA (with lpc = 1.58mA)
LLS2 4.66mA (with lpc = 1.54mA)
I would probably choose VLPS with lpc at this faster working rate because I would also allow optional use of USB. These are the recordings at 122us for VLPS (without lpc) and (with lpc):

RUN/SLEEP ratio 12.58us/122us

RUN/SLEEP ratio 3.42us/122us
LPT saving 9.16us per 122us period, which equates to (real) 57% saving in power consumption.
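For anyone wanting to check these figures, the arithmetic can be reproduced as follows. The numbers are taken from the 122us VLPS measurements above; the helper names are just for illustration.

```python
PERIOD_US = 122.0


def run_fraction(run_us, period_us=PERIOD_US):
    """Fraction of each cycle spent in RUN."""
    return run_us / period_us


def saving_fraction(before_ma, after_ma):
    """Fractional reduction in average current."""
    return (before_ma - after_ma) / before_ma


run_without_lpc_us = 12.58   # RUN time per cycle, VLPS without lpc
run_with_lpc_us = 3.42       # RUN time per cycle, VLPS with lpc

print(f"RUN time saved per cycle: {run_without_lpc_us - run_with_lpc_us:.2f}us")
print(f"RUN fraction without lpc: {run_fraction(run_without_lpc_us):.1%}")
print(f"RUN fraction with lpc:    {run_fraction(run_with_lpc_us):.1%}")

# Measured VLPS currents at 122us: 3.70mA without lpc, 1.58mA with lpc.
print(f"current saving:           {saving_fraction(3.70, 1.58):.0%}")
```

The current saving is calculated from the measured currents rather than from the RUN times, which is why it is quoted as the "real" saving.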
Don't forget that I still have a complete system (the serial command shell reacts identically and if I want to use USB for a short period where the current consumption doesn't have priority I can do so without any effort). Therefore there is no advantage in not using the scheduler (only disadvantages in fiddly development/tests and potentially much higher costs and maintenance later when features need to be added).
Regards
Mark
P.S. Would you like me to post binaries for you to test and verify your HW? Unfortunately I had to stop the 32kHz clock output because it shared the UART Tx line, so I need to disable the serial interface if the clock reference needs to be measured. Since the clock/wake-up delays (at least at 4MHz) are known I decided to not use it any more and just rely on the SLEEP/RUN ratios because the command line is extremely useful for doing efficient tests.
Alternatively, the uTasker development (including all this on the development branch) is available as GIT and SVN repository for uTasker users. If you want a free commercial license to avoid needing to rely on the NXP code (which is not designed for dynamic use) just tell me. I still support free users here - especially ones who are obviously clued up since it helps drive the development (which is already leaps and bounds beyond the NXP packages, but gets more ahead each day ;-)