## DSP Processors, Embodiments and Alternatives: Part I

Uptil now, we described digital signal processing in general terms, focusing on DSP fundamentals, systems and application areas. Now, we narrow our focus in DSP processors. We begin with a high level description of the features common to virtually all DSP processors. We then describe typical embodiments of DSP processors, and briefly discuss alternatives to DSP processors such as general purpose microprocessors, microcontrollers(for comparison purposes) (TMS320C20) and FPGA’s. The next several blogs provide a detailed treatment of DSP processor architectures and features.

So, what are the “special” things that a DSP processor can perform? Well, like the name says, DSP processors do signal processing very well. What does “signal processing” mean? Really, it’s a set of algorithms for processing signals in the digital domain. There are analog equivalents to these algorithms, but processing them digitally has been proven to be more efficient. This has been trend for many many years. Signal processing algorithms are the basic building blocks for many applications in the world from cell phones to MP3 players, digital still cameras, and so on. A summary of these algorithms is shown below:

• FIR Filter: $y(n)=\sum_{k=0}^{N}a_{k}x(n-k)$
• IIR Filter: $y(n)=\sum_{k=0}^{M}a_{k}x(n-k)+\sum_{k=1}^{N}b_{k}y(n-k)$
• Convolution: $y(n)=\sum_{k=0}^{N}x(k)h(n-k)$
• Discrete Fourier Transform: $X(k)=\sum_{n=0}^{N-1}x(n)\exp{[-j(2 \pi l/N)nk]}$
• Discrete Cosine Transform: $F(u)=\sum_{x=0}^{N-1}c(u).f(x).\cos{\frac{\pi}{2N}u(2x+1)}$

One or more of these algorithms are used in almost every signal processing application. FIR filters and IIR filters are almost fundamental to any DSP application. They remove unwanted noise from signals being processed, convolution algorithms are used to look for similarities in signals, discrete Fourier transforms are used to represent signals in formats that are easier to process, and discrete cosine transforms are used in image processing applications. We will discuss the details of some of these algorithms later, but there are things to notice about  this entire list of algorithms. First, they all have a summing operation, the function. In the computer world, this is equivalent to an accumulation of a large number of elements which is implemented using a “for” loop. DSP processors are designed to have large accumulators because of this characteristic. They are specialized in this way. DSPs also have special hardware to perform the “for” loop operation so that the programmer does not have to implement this in software, which would be much slower.

The algorithms above also have multiplication of two different operands. Logically, if we were to speed up this operation, we would design a processor to accommodate the multiplication and accumulation of two operands like this very quickly. In fact, this is what has been done with DSPs. They are designed to support the multiplication and accumulation of data sets like this very quickly; for most processors, in just one cycle. Since these algorithms are very common in most DSP applications, tremendous execution savings can be obtained by exploiting these processor optimizations.

There are also inherent structures in DSP algorithms that allow them to be separated and operated on in parallel. Just as in real life, if I can do more things in parallel, I can get more done in the same amount of time. As it turns out, signal processing algorithms have this characteristic as well. So, we can take advantage of this by putting multiple orthogonal (nondependent) execution units in our DSP processors and exploit this parallelism when implementing these algorithms.

DSP processors must also add some reality in the mix of these algorithms shown above. Take the IIR filter described above. You may be able to tell just by looking at this algorithm that there is a feedback component that essentially feeds back previous outputs into the calculation of the current output. Whenever you deal with feedback, there is always an inherent stability issue. IIR filters can become unstable just like other feedback systems. Careless implementation of feedback systems like the IIR filter can cause the output to oscillate instead of asymptotically decaying to zero (the preferred approach). This problem is compounded in the digital world where we must deal with finite word lengths, a key limitation in all digital systems. We can alleviate this using saturation checks in software or use a specialized instruction to do this for us. DSP processors, because of the nature of signal processing algorithms, use specialized saturation underflow/overflow instructions to deal with these conditions efficiently.

There is more I can say about this, but you get the point. Specialization is really all it is about with DSP processors; these processors specifically designed to do signal processing really well. DSP processors may not be as good as other processors when dealing with nonsignal processing centric algorithms (that’s fine; I am not any good at medicine either). So, it’s important to understand your application and choose the right processor. (A previous blog about DAC and ADC did mention this issue).

(We describe below common features of DSP processors.)

Dozens of families of DSP processors are available on the market today. The salient feature of some of the commonly used families of DSP Processors are summarized in Table 1. Throughout these series, we will use these processors as examples to illustrate the architectures and features that can be found in commercial DSP processors.

Most DSP processors share some common features designed to support repetitive, numerically intensive tasks. The most important of these features are introduced briefly here. Each of these features and many others will be examined in greater detail in this blog article series.

Table 1.

$\begin{tabular}{|c|c|c|c|c|} \hline Vendor & Processor Family & Arithmetic Type & Data Width & Speed (MIPS) or MHz\\ \hline Analog Devices & ADSP 21xx & Fixed Point & 16 bit & 25MIPS\\ \hline Texas Instruments & TMS320C55x & Fixed Point & 16 bit & 50MHz ro 300MHz \\ \hline \end{tabular}$

Fast Multiply Accumulate

The most often cited of DSP feature of DSP processors is the ability to perform a multiply-accumulate operation (often called a MAC) in a single instruction cycle. The multiply-accumulate operation is useful in algorithms that involve computing a vector product, or matrix product, such as digital filters, correlation, convolution and Fourier transforms. To achieve this functionality, DSP processors include a multiplier and accumulator integrated into  the main arithmetic processing unit (called the data path) of the processor. In addition, to allow a series of multiply-accumulate operations to proceed without the possibility of arithmetic overflow, DSP processors generally provide extra bits in their accumulator registers to accommodate growth of  the accumulated result. DSP processor data paths will be discussed in detail in some later blog here.

Multiple Access Memory Architecture.

A second feature shared by most DSP processors is the ability to complete several accesses to memory in a single instruction cycle. This allows the processor to fetch an instruction while simultaneously fetching operands for the instruction or storing the result of the previous instruction to memoryHigh bandwidth between the processor and memory is essential for good performance if repetitive data intensive operations are required in an algorithm, as is common in many DSP applications.

In many processors, single cycle multiple memory accesses are subject to restrictions. Typically, all but one of the memory locations must reside on-chip, and multiple memory accesses can take place only with certain instructions. To support simultaneous access of multiple memory locations, DSP processors provide multiple on-chip  buses, multiported on-chip memories, and in some casesm multiple independent memory banks. DSP memory structures are quite distinct from those of general purpose processors and microcontrollers. DSP processor memory architectures will be investigated in detail later.

Specialized Execution Control.

Because many DSP algorithms involve performing repetitive computations, most DSP processors provide special support for efficient looping. Often, a special loop or repeat instruction is provided that allows the programmer to implement a for-next loop without expending any instruction cycles for updating and testing the loop counter or for jumping back to the top of the loop.

Some DSP processors provide other execution control features to improve performance, such as context switching and low-latency, low overhead interrupts for fast input/output.

Hardware looping and interrupts will also be discussed later in this blog series.

Perfipheral and Input/Output Interfaces

To allow low-cost, high performance input and output (I/O), most DSP processors incorporate one or more serial or parallel I/O interfaces, and specialized I/O handling mechanisms such as Direct Memory Access (DMA). DSP processor peripheral interfaces are often designed to interface directly with common peripheral devices like analog-to-digital and digital-to-analog converters.

As integrated circuit manufacturing techniques have improved in terms of density and flexibility, DSP processor vendors have included not just peripheral interfaces, but complete peripheral devices on-chip. Examples of this are chips designed for cellular telephone applications, several of which incorporate a DAC and ADC on chip.

Various features of DSP processor peripherals will be described later in this blog series.

In the next blog, we will discuss DSP Processor embodiments.

More later,

Aufwiedersehen,

Nalin Pithwa

This site uses Akismet to reduce spam. Learn how your comment data is processed.