**Dynamic Range and Precision**

You may have seen dB specs thrown around for various products available on the market today. Table 1 lists a few fairly established products along with their assigned signal quality, measured in dB.

*Table 1: Dynamic range comparison of various audio systems.* So what exactly do those numbers represent? Let's start by getting some definitions down. Use Figure 1 as a reference signal for the following "cheat sheet" of the essentials.

*Figure 1: Relationship between some important terms in audio systems.* The dynamic range of the human ear (the ratio of the loudest to the quietest signal level) is about 120 dB. In systems where noise is present, dynamic range is described as the ratio of the maximum signal level to the noise floor. In other words,

Dynamic Range (dB) = Peak Level (dB) - Noise Floor (dB)

The noise floor in a purely analog system comes from the electrical properties of the system itself. In digital systems, audio signals also acquire noise from the ADCs and DACs, as well as from the quantization errors due to sampling.

Another important measure is the signal-to-noise ratio (SNR). In analog systems, this means the ratio of the nominal signal to the noise floor, where "line level" is the nominal operating level. On professional equipment, the nominal level is usually 1.228 Vrms, which translates to +4 dBu. The headroom is the difference between nominal line level and the peak level where signal distortion starts to occur. The definition of SNR is a bit different in digital systems, where it is defined as the dynamic range.

Now, armed with an understanding of dynamic range, we can start to discuss how this is useful in practice. Without going into a long derivation, let's simply state what is known as the "6 dB rule". This rule is key to the relationship between dynamic range and computational word width. The complete formulation is described in the equation below, but in shorthand the 6 dB rule means that the addition of one bit of precision will lead to a dynamic range increase of 6 dB. Note that the 6 dB rule does not take into account the analog subsystem of an audio design, so the imperfections of the transducers on both the input and the output must be considered separately.

Dynamic Range (dB) = 6.02n + 1.76 ≈ 6n dB

where

n = the number of precision bits

The "6 dB rule" dictates that the more bits we use, the higher the audio quality we can attain. In practice, however, there are only a few realistic choices of word width. Most devices suitable for embedded media processing come in three word width flavors: 16-bit, 24-bit, and 32-bit. Table 2 summarizes the dynamic ranges for these three types of processors.

*Table 2: Dynamic range of various fixed-point architectures.* Since we're talking about the 6 dB rule, it is worth mentioning something about the nonlinear quantization methods that are typically used for speech signals. A telephone-quality linear PCM encoding requires 12 bits of precision. However, our ears are more sensitive to audio changes at small amplitudes than at high amplitudes. Therefore, the linear PCM sampling is overkill for telephone communications. The logarithmic quantization used by the A-law and μ–law companding standards achieves a 12-bit PCM level of quality using only 8 bits of precision. To make our lives easier, some processor vendors have implemented A-law and μ–law companding into the serial ports of their devices. This relieves the processor core from doing logarithmic calculations.

After reviewing Table 2, recall once again that the dynamic range of the human ear is around 120 dB. Because of this, 16-bit data representation doesn't quite cut it for high quality audio. This is why vendors introduced 24-bit processors. However, these 24-bit systems are a bit non-standard from a C compiler standpoint, so many audio designs these days use 32-bit processing.

Choosing the right processor is not the end of the story, because the total quality of an audio system is dictated by the quality level of the "lowest-achieving" component. Besides the processor, a complete system includes analog components like microphones and speakers, as well the converters to translate signals between the analog and digital domains. The analog domain is outside of the scope of this discussion, but the audio converters do cross into the digital realm.

Let's say that you want to use the AD1871 for sampling audio. The datasheet for this converter explains that it is a 24-bit converter, but its dynamic range is not the theoretical 144 dB – it is 105 dB. The reason for this is that a converter is not a perfect system, and vendors publish only the useful dynamic range.

If you were to hook up a 24-bit processor to the AD1871, then the SNR of your complete system would be 105 dB. The noise floor would amount to 144 dB – 105 dB = 39 dB. Figure 2 is a graphical representation of this situation. However, there is still another component of a digital audio system that we have not discussed yet: computation on the processor's core.

* Figure 2: An audio system's SNR consists of the weakest component's SNR.* Passing data through a processor's computational units can potentially introduce a variety of errors. One is quantization error. This can be introduced when a series of computations causes a data value to be either truncated or rounded (up or down). For example, a 16-bit processor may be able to add a vector of 16-bit data and store this in an extended-length accumulator. However, when the value in the accumulator is eventually written to a 16-bit data register, some of the bits are truncated.

Take a look at Figure 3 to see how computation errors can affect a real system. For an ideal 16-bit A/D converter (Figure 3a), the signal-to-noise ratio would be 16 x 6 = 96 dB. If quantization errors did not exist, then 16-bit computations would suffice to keep the SNR at 96 dB. Both 24-bit and 32-bit systems would dedicate 8 and 16 bits, respectively, to the dynamic range below the noise floor. In essence, those extra bits would be wasted.

However, all digital audio systems do introduce some round-off and truncation errors. If we can quantify this error to take, for example, 18 dB (or 3 bits), then it becomes clear that 16-bit computations will not suffice in keeping the system's SNR at 96 dB (Figure 3b). Another way to interpret this is to say that the effective noise floor is raised by 18 dB, and the total SNR is decreased to 96 dB – 18 dB = 78 dB. This leads to the conclusion that having extra bits below the noise floor helps to deal with the nuisance of quantization.

Figure 3 (a) Allocation of extra bits with various word width computations for an ideal 16-bit, 96 dB SNR system, when quantization error is neglected (b) Allocation of extra bits with various word width computations for an ideal 16-bit, 96 dB SNR system, when quantization noise is present.

**Numeric Formats for Audio**

There are many ways to represent data inside a processor. The two main processor architectures used for audio processing are fixed-point and floating-point. Fixed-point processors are designed for integer and fractional arithmetic, and they usually natively support 16-bit, 24-bit, or 32-bit data. Floating-point processors provide very good performance with native support for 32-bit or 64-bit floating-point data types. However, floating-point processors are typically more costly and consume more power than their fixed-point counterparts, and most real systems must strike a balance between quality and engineering cost.

**Fixed-point Arithmetic**

Processors that can perform fixed-point operations typically use two's complement binary notation for representing signals. A fixed-point format can represent both signed and unsigned integers and fractions. The signed fractional format is most common for digital signal processing on fixed-point processors. The difference between integer and fractional formats lies in the location of the binary point. For integers, the binary point is to the right of the least significant digit, whereas fractions usually have their binary point to the left of the sign bit. Figure 4a shows integer and fractional formats.

While the fixed-point convention simplifies numeric operations and conserves memory, it presents a tradeoff between dynamic range and precision. In situations that require a large range of numbers while maintaining high resolution, a radix point that can shift based on magnitude and exponent, (i.e., floating-point) is desirable.

* Figure 4. (a) Fractional and integer formats*

**Floating-point Arithmetic**Using floating-point format, very large and very small numbers can be represented in the same system. Floating-point numbers are quite similar to scientific notation representation of rational numbers. They are described with a mantissa and an exponent. The mantissa dictates precision, and the exponent controls dynamic range.

There is a standard that governs floating-point computations of digital machines. It is called IEEE-754 (Figure 4a) and can be summarized as follows for 32-bit floating-point numbers. Bit 31 (MSB) is the *sign bit*, where a 0 represents a positive sign and a 1 represents a negative sign. Bits 30 through 23 represent an exponent field (*exp_field*) as a power of 2, biased with an offset of 127. Finally, bits 22 through 0 represent a fractional mantissa (*mantissa*). The hidden bit is basically an implied value of 1 to the left of the radix point.

The value of a 32-bit IEEE floating-point number can be represented with the following equation:

(-1)^{sign_bit} x (1.mantissa) x 2^{(exp_field " 127)}

With an 8-bit exponent and a 23-bit mantissa, IEEE-754 reaches a balance between dynamic range and precision. In addition, IEEE floating-point libraries include support for additional features such as ±infinity, zero, and NaN (not a number).

* Figure 4. (a) Fractional and integer formats (b) IEEE 754 32-bit single-precision floating-point format.*

Table 3 shows the smallest and largest values attainable from the common floating-point and fixed-point types.

* Table 3. Comparison of dynamic range for various data formats.*

**Emulation on 16-bit Architectures**

As explained earlier, 16-bit processing does not provide a high enough SNR for high quality audio, but this does not mean that you shouldn't choose a 16-bit processor. For example, while a 32-bit floating-point machine makes it easier to code an algorithm that preserves 32-bit data natively, a 16-bit processor can also maintain 32-bit integrity through emulation at a much lower cost. Figure 5 illustrates some of the possibilities for choosing a data type for an embedded algorithm.

In the remainder of this section, we'll describe how to achieve floating-point and 32-bit extended precision fixed-point functionality on a 16-bit fixed-point machine.

* Figure 5: Depending the goals of an application, there are many data types that can satisfy system requirements.*

**Floating-point emulation on fixed-point processors**

On most 16-bit fixed-point processors, IEEE-754 floating-point functions are available as library calls from either C/C++ or assembly language. These libraries emulate the required floating-point processing using fixed-point multiply and ALU logic. This emulation requires additional cycles to complete. However, as fixed-point processor core clock speeds venture into the 500 MHz - 1 GHz range, the extra cycles required to emulate IEEE-754-compliant floating-point math become less significant.

It is sometimes advantageous to use a "relaxed" version of IEEE-754 in order to reduce computational complexity. This means that the floating-point arithmetic doesn't implement the standard features such ±infinity, zero, and NaN.

A further optimization is to use a more native type for the mantissa and exponent. Take, for example, Analog Devices' fixed-point Blackfin processor architecture, which has a register file set that consists of sixteen 16-bit registers that can be used instead as eight 32-bit registers. In this configuration, on every core clock cycle, two 32-bit registers can source operands for computation on all four register halves. To make optimized use of the Blackfin register file, a two-word format can be used. In this way, one word (16 bits) is reserved for the exponent and the other word (16 bits) is reserved for the fraction.

**Double-Precision Fixed-Point Emulation**

There are many applications where 16-bit fixed-point data is not sufficient, but where emulating floating-point arithmetic may be too computationally intensive. For these applications, extended-precision fixed-point emulation may be enough to satisfy system requirements. Using a high-speed fixed-point processor will insure a significant reduction in the amount of required processing. Two popular extended-precision formats for audio are 32-bit and 31-bit fixed-point representations.

**32-Bit-Accurate Emulation**

32-bit arithmetic is a natural software extension for 16-bit fixed-point processors. For processors whose 32-bit register files can be accessed as two 16-bit halves, the halves can be used together to represent a single 32-bit fixed-point number. The Blackfin processor's hardware implementation allows for single-cycle 32-bit addition and subtraction.

For instances where a 32-bit multiply will be iterated with accumulation (as is the case in some algorithms we'll talk about soon), we can achieve 32-bit accuracy with 16-bit multiplications in just 3 cycles. Each of the two 32-bit operands (R0 and R1) can be broken up into two 16-bit halves (R0.H / R0.L and R1.H / R1.L).

*igure 6. 32-bit multiplication with 16-bit operations.*

From Figure 6, it is easy to see that the following operations are required to emulate the 32-bit multiplication R0 x R1 with a combination of instructions using 16-bit multipliers:

Four 16-bit multiplications to yield four 32-bit results:

- R1.L
**x** R0.L - R1.L
**x** R0.H - R1.H
**x** R0.L - R1.H
**x** R0.H

Three operations preserve bit place in the final answer (the **>>** symbol denotes a right shift). Since we are performing fractional arithmetic, the result is 1.63 (1.31 x 1.31 = 2.62 with a redundant sign bit). Most of the time, the result can be truncated to 1.31 in order to fit in a 32-bit data register. Therefore, the result of the multiplication should be in reference to the sign bit, or the most significant bit. This way the rightmost least significant bits can be safely discarded in a truncation:

- (R1.L x R0.L)
**>>** 32 - (R1.L x R0.H)
**>>** 16 - (R1.H x R0.L)
**>>** 16

The final expression for a 32-bit multiplication is:

((R1.L x R0.L) >> 32 + (R1.L x R0.H) >> 16) + ((R1.H x R0.L) >> 16 + R1.H x R0.H)

On the Blackfin architecture, these instructions can be issued in parallel to yield an effective rate of a 32-bit multiplication in three cycles.

**31-Bit-Accurate Emulation**

We can reduce a fixed-point multiplication requiring at most 31-bit accuracy to just 2 cycles. This technique is especially appealing for audio systems, which usually require at least 24-bit representation, but where 32-bit accuracy may be a bit excessive. Using the "6 dB rule," 31-bit-accurate emulation still maintains a dynamic range of around 186 dB, which is plenty of headroom even with all the quantization effects.

From the multiplication diagram shown in Figure 6, it is apparent that the multiplication of the least significant half-word R1.L **x** R0.L does not contribute much to the final result. In fact, if the result is truncated to 1.31, then this multiplication can only have an effect on the least significant bit of the 1.31 result. For many applications, the loss of accuracy due to this bit is balanced by the speeding up of the 32-bit multiplication through eliminating one 16-bit multiplication, one shift, and one addition.

The expression for 31-bit accurate multiplication is:

((R1.L x R0.H) + (R1.H x R0.L) ) >> 16 + (R1.H x R0.H)

On the Blackfin architecture, these instructions can be issued in parallel to yield an effective rate of a 2 cycles for each 32-bit multiplication.

So that's the scoop on numeric formats for audio. In the final article of this series, we'll talk about some strategies for developing embedded audio applications, focusing primarily on data movement and building blocks for common algorithms.

*This series is adapted from the book "Embedded Media Processing" (Newnes 2005) by David Katz and Rick Gentile. See the book's web site for more information.*