This week I’m going to focus on switch statements. While easy to implement and a good way to improve code readability, frequently executed switch statements can burn a considerable number of clock cycles. It’s easy to fall into the trap of thinking that in a switch statement the micro examines a variable and auto-magically jumps to the correct block of code for that value. Since switch statements are really compiled into a sequence of if-then-else tests, frequently executed ones can be very wasteful of power. Here are several ways to make them more efficient; some of them can be combined for even greater improvements:
- Arrange the order of the case statements so that the most frequently executed cases are listed first. Check the compiled code to make sure the assembly actually tests the cases in the order they are listed, or abandon the switch statement and implement your own sequence of if-then-else statements to guarantee the order you want.
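As a minimal sketch of the explicit if-then-else approach (the event names, handlers, and frequency ordering are assumptions for illustration, not from the post):

```c
#include <stdint.h>

/* Hypothetical event codes and handlers. */
enum { EV_TICK = 7, EV_RX = 3, EV_ERR = 1 };

static int on_tick(void) { return 0; }
static int on_rx(void)   { return 1; }
static int on_err(void)  { return 2; }

/* Explicit if-then-else chain: the most frequent event is tested
   first, guaranteeing the check order regardless of how the
   compiler would have arranged a switch statement. */
static int handle_event(uint8_t ev)
{
    if (ev == EV_TICK)          /* by assumption, ~90% of events */
        return on_tick();
    else if (ev == EV_RX)
        return on_rx();
    else if (ev == EV_ERR)
        return on_err();
    return -1;                  /* unknown event */
}
```

In the common case only one comparison executes, no matter what the compiler would have done with an equivalent switch.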
- You may be able to sacrifice some simplicity in the code and do a binary decode on the switch variable. The example below for a simple 4-case switch statement doesn’t appear to offer much improvement but is just intended to illustrate the binary decode. As the number of cases increases, the savings in time and power can be considerable: an 8-case switch statement goes from potentially seven tests to only three, a 16-case switch statement goes from potentially 15 tests to four, and so on. If a few values are encountered significantly more often or are more time critical than the others, you can test for those values specifically before starting the binary decode. If you do this, remove the code for those values from the decoder for clarity and to reduce code size.
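A sketch of the binary decode for a 4-value switch variable; the handler names are hypothetical placeholders:

```c
#include <stdint.h>

/* Placeholder handlers -- in real code these would do the work
   of each case. */
static int handle_cmd0(void) { return 0; }
static int handle_cmd1(void) { return 1; }
static int handle_cmd2(void) { return 2; }
static int handle_cmd3(void) { return 3; }

/* Binary decode of a 4-value switch variable: two bit tests
   reach any handler, versus up to three sequential equality
   tests for an if-then-else chain. */
static int dispatch(uint8_t cmd)        /* cmd is 0..3 */
{
    if (cmd & 0x02) {                   /* values 2 or 3 */
        return (cmd & 0x01) ? handle_cmd3() : handle_cmd2();
    } else {                            /* values 0 or 1 */
        return (cmd & 0x01) ? handle_cmd1() : handle_cmd0();
    }
}
```

Every path through the decoder executes exactly two tests, which is where the log2(N) scaling described above comes from.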
- Testing for ranges of values of the switch variable with an if-then-else sequence, with each range handled by its own smaller switch statement, can considerably improve the efficiency. The example below shows splitting what would be a 16-value switch statement into two “if” statements, each with an eight-value switch statement, reducing the worst case from 15 tests to 8. Breaking it down further to four “if” statements, each with a four-value switch statement, reduces the worst case to six tests.
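A sketch of the two-way split, assuming a command value in the range 0..15 and placeholder handler IDs. (Note that some compilers will turn a dense 16-case switch into a jump table on their own, so check the generated assembly before hand-splitting.)

```c
#include <stdint.h>

/* One range test selects one of two 8-value switch statements,
   so the worst case is 1 + 7 = 8 tests instead of 15. The
   returned handler IDs are placeholders. */
static int dispatch16(uint8_t cmd)      /* cmd is 0..15 */
{
    if (cmd < 8) {
        switch (cmd) {                  /* values 0..7 */
        case 0: return 100;
        case 1: return 101;
        case 2: return 102;
        case 3: return 103;
        case 4: return 104;
        case 5: return 105;
        case 6: return 106;
        case 7: return 107;
        }
    } else {
        switch (cmd) {                  /* values 8..15 */
        case 8:  return 108;
        case 9:  return 109;
        case 10: return 110;
        case 11: return 111;
        case 12: return 112;
        case 13: return 113;
        case 14: return 114;
        case 15: return 115;
        }
    }
    return -1;                          /* out of range */
}
```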
- Nesting switch statements can produce similar improvements in the worst-case number of tests required. To do this effectively you really need two variables, or a variable that can be cleanly split into two fields, like a “mode” for the first-level switch and a “command code” for the second-level switch statements. The example below shows how an instruction opcode parser could be done with the opcode split into an instruction type field and a command code field.
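One way the opcode parser might be sketched, assuming a hypothetical encoding with the instruction type in the upper nibble and the command code in the lower nibble (the layout and the instruction names are illustrative, not from any real instruction set):

```c
#include <stdint.h>

/* Nested switch opcode parser. The first level decodes the
   instruction type, the second level decodes the command code,
   so the worst case is the sum of the two switch depths rather
   than a linear scan of every opcode value. */
static int parse_opcode(uint8_t opcode)
{
    switch (opcode >> 4) {              /* first level: instruction type */
    case 0x0:                           /* e.g. ALU instructions */
        switch (opcode & 0x0F) {        /* second level: command code */
        case 0x0: return 1;             /* e.g. ADD */
        case 0x1: return 2;             /* e.g. SUB */
        default:  return -1;
        }
    case 0x1:                           /* e.g. memory instructions */
        switch (opcode & 0x0F) {
        case 0x0: return 3;             /* e.g. LOAD */
        case 0x1: return 4;             /* e.g. STORE */
        default:  return -1;
        }
    default:
        return -1;                      /* unknown instruction type */
    }
}
```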
By now you should be seeing that writing low power firmware requires assuming a level of control in how you structure your code to minimize execution time. Next week I’ll continue on low power firmware design with arrays/structures and a discussion about complex algorithms and floating point math.
The last two posts were about general concepts around low power firmware design. This week I’ll start getting into details including some code examples.
- Use the largest clock pre-scaler that provides the resolution your firmware needs. The pre-scaler is typically a 4 to 8 bit counter while the counter/timer may be 8, 16 or 32 bits. Letting the small pre-scaler absorb the fast clock so the larger timer counts more slowly can reduce power considerably, particularly for free-running timers.
- Software based timers running on a tick interrupt are easy to implement and may be necessary if your application requires more than a few timers to be running simultaneously. Dedicated timers will be more power efficient IF the time-out period can be achieved with a single terminal count interrupt from the timer.
- When using a periodic tick interrupt for software timers, use the longest tick interval your code can tolerate. If sections of your code require tight timing, it will usually be more efficient to use one timer with a longer tick interval for general use and another timer with a short tick interval for the tight timing.
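A minimal sketch of tick-driven software timers; TIMER_COUNT, the ISR name, and the expiry flags are assumptions for illustration:

```c
#include <stdint.h>

#define TIMER_COUNT 4

static volatile uint16_t sw_timer[TIMER_COUNT];   /* ticks remaining */
static volatile uint8_t  expired[TIMER_COUNT];    /* set on timeout */

/* Called once per periodic tick interrupt. Each active timer is
   decremented; when one reaches zero its flag is set for the
   main loop to act on. */
void tick_isr(void)
{
    for (int i = 0; i < TIMER_COUNT; i++) {
        if (sw_timer[i] != 0 && --sw_timer[i] == 0)
            expired[i] = 1;
    }
}
```

Any number of timers share one hardware timer this way, at the cost of taking the tick interrupt continuously, which is why the tick interval should be as long as the code can tolerate.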
- For a timeout protection timer that doesn’t have tight timing requirements, consider using an RTC alarm interrupt for a 1 second (or longer) timeout. The RTC running at 32 kHz should use much less power than an 8 or 16 bit timer using a clock divided down from the micro’s much faster clock. This also provides a means for long timeout periods without taking periodic tick interrupts. On most modern micros, the RTC is functional even without a battery voltage present but may still require a dedicated crystal, so be sure to check the datasheet if you aren’t using the RTC for its intended purpose.
- Turn off timers when they aren’t being used. If software based timers are appropriate for your application, turn off the free-running timer when it is not being used. This sounds like a no-brainer but it’s common to leave the free-running timer running all the time. For dedicated timers, this is usually just a matter of selecting the right mode for the timer so it stops automatically when it reaches the terminal count.
- For ultra-low power, when possible use a timer that counts down to 0 or counts up to all ones and generates an interrupt. Using a counter and match register containing the terminal count value requires more circuitry to be active and will consume more power.
If you are writing firmware in any high-level language, you need to become intimately familiar with the compiler and learn what it does well and what it doesn’t. The only way to do that is to write some code and then examine the assembly code it generates.
- Efficiency – Every instruction your micro executes that isn’t required is wasted power. Compiler efficiency is usually a case of you get what you pay for. Free and low-cost compilers based on the GNU compiler typically produce code 2X to 5X larger, slower and less power efficient than a compiler written for a specific architecture.
- In-line assembly code – If you decide to use assembly language for time critical sections of code or for other reasons, you need to research how your compiler handles in-line assembly instructions. In some compilers, the registers you think you are using are actually memory based variables. If your assembly code uses many variables or contains a loop that is executed more than a few times, you are generally better off calling a function written in actual assembly code, since the function calling overhead uses less power than the pseudo assembly code. If you do use in-line assembly language, review the compiler-generated assembly language before and after your assembly language code to see how much overhead the compiler imposes for saving/restoring the micro state information.
- Compiler options – Compilers usually have many options, some of which can greatly increase or decrease the power efficiency of your code. A few things to look for:
- Position independent code – Disable this option unless you absolutely need it. The relative addressing required for position independent code will use more power than fixed location code on every jump/call instruction executed.
- Optimization options – Compilers typically provide options to optimize the generated code for speed or for size. The optimizations made for speed should also improve power efficiency, since fewer clock cycles mean less power, but they will result in larger programs. If you are tight on code space, check whether your compiler supports optimization on a per-file basis so you can optimize the most frequently executed sections of code.
Structures are great for organizing variables but you need to consider whether this convenience is worth the cost in power compared to individual variables. A few things to consider:
- Every time a structure element is used the micro has to add the element offset to the structure base address. On most 32-bit micros this is USUALLY achieved with an indexed addressing mode so no additional clock cycles are required (but check the assembly code to make sure). On a low-end 8-bit micro this requires code to calculate the address, taking 8 to 10 instructions or more.
- Arrays of structures further complicate the math involved in calculating addresses. To calculate the offset into the array, the array index must be multiplied by the structure size and that is done in software on most micros used in embedded applications. If the array only contains a few instances of the structure it can be considerably more power efficient to have individual structures with another variable containing a pointer to the structure to use. Another technique for use with larger arrays will be discussed in the “Arrays and structures” section.
- Don’t assume your compiler is smart enough to calculate the base address for a particular structure in an array once and reuse it across several consecutive lines of C code that access that structure. There is a good chance it will recalculate that base address for each line of C code. To use less power, use a pointer to the structure in this situation; the compiler should then only have to add each member’s offset to the pointer.
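A sketch of the pointer technique; the sensor structure and field names are hypothetical:

```c
#include <stdint.h>

struct sensor {
    uint16_t raw;
    int16_t  offset;
    int16_t  value;
};

static struct sensor sensors[4];

/* Caching the element address in a pointer so the compiler is not
   tempted to recompute base + index * sizeof(struct sensor) for
   every access in the function. */
void update_sensor(uint8_t idx)
{
    struct sensor *s = &sensors[idx];   /* address computed once */
    s->value = (int16_t)(s->raw + s->offset);
    s->raw   = 0;                       /* further accesses reuse s */
}
```

Check the generated assembly: each `s->` access should now be a fixed offset from the pointer register rather than a fresh multiply-and-add.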
- One place where structures may work to save power is in parameter passing. Particularly with low-end 8 bit micros, passing multiple parameters can be considerably more expensive in terms of power and time than passing a pointer to a structure containing those parameters. The example below illustrates this, even using the structure for the return value.
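A sketch of the idea, with hypothetical motor parameters and the status returned through the same structure:

```c
#include <stdint.h>

/* One pointer is passed instead of three separate arguments, and
   the callee writes its result back into the same structure. */
struct motor_params {
    uint16_t speed;         /* in */
    uint8_t  direction;     /* in: 0 = forward, 1 = reverse */
    uint8_t  ramp_rate;     /* in */
    int16_t  status;        /* out: written by the callee */
};

static void set_motor(struct motor_params *p)
{
    /* ... program the hardware from p->speed etc. ... */
    p->status = (p->ramp_rate != 0) ? 0 : -1;   /* result via struct */
}

/* Caller side: fill the structure once, pass a single pointer. */
static int start_motor(void)
{
    struct motor_params mp = { 1200u, 0u, 10u, 0 };
    set_motor(&mp);
    return mp.status;
}
```

On a low-end 8-bit micro, passing one pointer typically costs far fewer instructions than marshalling three separate parameters onto the stack or into registers.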
Next week I’ll continue on low power firmware design with switch statements and arrays/structures.