Instruction / maintenance manual for the Texas Instruments TMS320 DSP
TMS320 DSP Algorithm Standard Rules and Guidelines User's Guide Literature Number: SPRU352G June 2005 – Revised February 2007.
Contents
Preface
1 Overview
4 Algorithm Performance Characterization
4.1 Data Memory
4.1.1 Heap Memory
6.2 Algorithm and Framework
6.3 Requirements for the Use of the DMA Resource
6.4 Logical Channel
List of Figures
1-1 TMS320 DSP Algorithm Standard Elements
1-2 DSP Software Architecture
Preface
Read This First
This document defines a set of requirements for DSP algorithms that, if followed, allow system integrators to quickly assemble production-quality systems from one or more such algorithms.
Related Documentation
• Chapter 6 - Use of the DMA Resource, develops guidelines and rules for creating eXpressDSP-compliant algorithms that utilize the DMA resource.
• Appendix A - Rules and Guidelines, contains a complete list of all rules and guidelines in this specification.
Chapter 1
Overview
This chapter provides an overview of the TMS320 DSP Algorithm Standard.
1.1 Scope of the Standard
[Figure 1-1. TMS320 DSP Algorithm Standard Elements. Level 1: General Programming Guidelines (C callable, no hard coded addresses, reentrant, etc.). Level 2: Algorithm Component Model. Level 3: Rules for TMS320C2x, Rules for TMS320C5x, Rules for TMS320C6x. Level 4: Telecom, Imaging, Audio, Automotive, Other.]
1.1.1 Rules and Guidelines
1.2 Requirements of the Standard
Level 3 contains the guidelines for specific families of DSPs. Today, there are no agreed-upon guidelines for algorithms with regard to the use of processor resources.
1.3 Goals of the Standard
1.4 Intentional Omissions
This section contains the goals of this standard. While it may not be possible to perfectly attain these goals, they represent the primary concerns that should be addressed after addressing the required elements described in the previous section.
1.5 System Architecture
1.5.1 Frameworks
[Figure: algorithms (ALG) exchanging Cmd and Status with a framework, above core run time support]
To support the ability of a system integrator to rapidly evaluate algorithms from various vendors, a mechanism should be defined that allows a component to be used for evaluation only.
1.5.2 Algorithms
1.5.3 Core Run-Time Support
Careful inspection of the various frameworks in use reveals that, at some level, they all have algorithm components. While there are differences in each of the frameworks, the algorithm components share many common attributes.
Chapter 2
General Programming Guidelines
In this chapter, we develop programming guidelines that apply to all algorithms on all DSP architectures, regardless of application area.
2.1 Use of C Language
2.2 Threads and Reentrancy
2.2.1 Threads
Almost all recently developed software modules follow these common sense guidelines already, so this chapter just formalizes them.
2.2.2 Preemptive vs. Non-Preemptive Multitasking
2.2.3 Reentrancy
Non-preemptive multitasking relies on each thread to voluntarily relinquish control to the operating system before letting another thread execute.
2.2.4 Example
$$y_n = x_n - x_{n-1} + \frac{13}{32}\,x_{n-2}$$
The definition of reentrant code often implies that the code does not retain "state" information. That is, if you invoke the code with the same data at different times, by the same or other thread, it will yield the same results.
2.3 Data Memory
void PRE_filter1(int input[], int length, int *z)
{
    int i, tmp;
    for (i = 0; i < length; i++) {
        tmp = input[i] - z[0] + (13 * z[1] + 16) / 32;
        z[1] = z[0];
        z[0.
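As a rough illustration of the reentrancy point, the sketch below keeps the filter history in a caller-supplied structure rather than in static data, so two threads can each filter their own stream without interfering. The names PreFilterState and pre_filter_sketch are hypothetical, and the last two update steps are an assumption about how the history is shifted, since the listing above is cut off.

```c
#include <stddef.h>

/* Hypothetical per-instance history: z[0] holds x[n-1], z[1] holds x[n-2]. */
typedef struct {
    int z[2];
} PreFilterState;

/* Filters samples in place using y[n] = x[n] - x[n-1] + (13/32) * x[n-2].
 * The "+ 16" rounds the fixed-point division by 32.
 * All state lives in *st, so the function itself holds no static data. */
void pre_filter_sketch(int *samples, size_t length, PreFilterState *st)
{
    for (size_t i = 0; i < length; i++) {
        int y = samples[i] - st->z[0] + (13 * st->z[1] + 16) / 32;
        st->z[1] = st->z[0];   /* assumed history shift */
        st->z[0] = samples[i]; /* assumed history update */
        samples[i] = y;
    }
}
```

Because each caller owns its PreFilterState, invoking the function on two independent streams with identical input and identical initial state gives identical output.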
2.3.1 Memory Spaces
2.3.2 Scratch versus Persistent
While the amount of on-chip data memory may be adequate for each algorithm in isolation, the increased number of MIPS available on modern DSPs encourages systems to perform multiple algorithms concurrently with a single chip.
[Figure: physical memory from 0000 to FFFF. Algorithms A, B, and C share one scratch region, while Persistent A, Persistent B, and Write-Once C occupy their own regions.]
2.3.3 Algorithm versus Application
[Figure labels: Persistent, Scratch, Shared, Private, Shadow]
Guideline 1 Algorithms should minimize their persistent data memory requirements in favor of scratch memory. In addition to the types of memory described above, there are often several memory spaces provided by a DSP to algorithms.
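One way to see why Guideline 1 matters is to total the data memory a system needs. The sketch below assumes the algorithms never preempt one another, so a single scratch region sized for the largest request can be reused by all of them, while persistent memory must be reserved separately for each. The type AlgMemNeeds and the function total_data_words are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical summary of one algorithm's data memory needs, in words. */
typedef struct {
    size_t persistent_words;
    size_t scratch_words;
} AlgMemNeeds;

/* Persistent memory adds up across algorithms; scratch memory is shared,
 * so only the largest scratch request counts (assumes no preemption). */
size_t total_data_words(const AlgMemNeeds *algs, size_t count)
{
    size_t persistent_sum = 0;
    size_t scratch_max = 0;
    for (size_t i = 0; i < count; i++) {
        persistent_sum += algs[i].persistent_words;
        if (algs[i].scratch_words > scratch_max) {
            scratch_max = algs[i].scratch_words;
        }
    }
    return persistent_sum + scratch_max;
}
```

Under this model, moving a buffer from persistent to scratch memory can only lower or keep the total, which is the motivation behind the guideline.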
2.4 Program Memory
2.5 ROM-ability
Like the data memory requirements described in the previous section, it is important that all eXpressDSP-compliant algorithms are fully relocatable; i.e., there should never be any assumption about the specific placement of an algorithm at a particular address.
2.6 Use of Peripherals
Rule 5 Algorithms must characterize their ROM-ability; i.e., state whether or not they are ROM-able. Obviously, self-modifying code is not ROM-able. We do not prohibit self-modifying code in algorithms; we only require documentation of the ROM-ability of an algorithm.
Chapter 3
Algorithm Component Model
In this chapter, we develop additional rules and guidelines that apply to all algorithms on all DSP architectures regardless of application area.
3.1 Interfaces and Modules
[Figure: a client, the FIR module interface, and its implementation files]
client.c: #include <fir.h> ... FIR_apply(); }
fir.h: typedef struct FIR_obj *FIR_Handle; extern void FIR_init(); extern void FIR_exit(); extern FIR_Handle FIR_create();
fir_apply.asm: FIR_apply: .global FIR_apply
fir_create.
3.1.1 External Identifiers
Rule 7 All header files must support multiple inclusions within a single source file. The general technique for ensuring this behavior for C header files is illustrated in the code below.
/* * ======== fir.
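Since the listing is cut off here, the following is only a minimal sketch of the usual include-guard idiom that satisfies Rule 7, not the standard's own header. The guard macro FIR_SKETCH_ and every identifier inside it are hypothetical.

```c
/* Include guard: on a second inclusion FIR_SKETCH_ is already defined,
 * so the preprocessor skips everything up to the matching #endif. */
#ifndef FIR_SKETCH_
#define FIR_SKETCH_

/* Without the guard, including this twice would redefine the enum. */
enum { FIR_SKETCH_MAX_TAPS = 16 };

typedef struct FIR_SketchObj *FIR_SketchHandle;

extern void FIR_sketchInit(void);
extern void FIR_sketchExit(void);

#endif /* FIR_SKETCH_ */
```

The guard name is conventionally derived from the header name so that it stays unique across modules from different vendors.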
3.1.2 Naming Conventions
3.1.3 Module Initialization and Finalization
3.1.4 Module Instance Objects
To simplify the way eXpressDSP-compliant client code is written, it is valuable to maintain a single consistent naming convention.
[Figure: the FIR module (FIR_Config FIR; FIR_init(); FIR_exit(); FIR_create();) creates firObject, which holds Int length; Int coeff[] (read-only coefficient array); and Int delay[] (filter input history buffer)]
3.1.5 Design-Time Object Creation
3.
3.1.7 Module Configuration
3.1.8 Example Module
Guideline 4 All modules that support object creation should support run-time object creation. Note that the eXpressDSP-compliant algorithms are a special type of module.
3.1.9 Multiple Interface Support
typedef struct FIR_Obj { /* FIR_Obj definition */
    int hist[16]; /* previous input value */
    int frameLen; /* input frame length */
    int.
3.1.10 Interface Inheritance
3.1.11 Summary
module's header file defines a concrete interface; the functions defined in the header uniquely identify a specific (or concrete) implementation within a system.
3.2 Algorithms
[Figure: the FIR module (FIR_Config FIR; FIR_init(); FIR_exit(); FIR_Fxcs FIR_IALG;) implements the IALG interface (IALG_Fxns)]
Element          Description                                                               Required
struct XYZ_Obj   Module's object definition; normally not defined in the module's header.  yes
3.3 Packaging
3.3.1 Object Code
Rule 13 Each of the IALG methods implemented by an algorithm must be independently relocatable. In practice, this simply means that each method should either be implemented in a separate file or placed in a separate COFF output section.
3.3.2 Header Files
3.3.3 Debug Versus Release
<module> is the name of the module (containing characters from the set [a-z0-9]), <vers> is an optional version number of.
If multiple versions of the same component are provided by a single vendor, the different versions must be in different libraries (as described above) and these libraries must be named as follows: <module><vers>_<vendor>_<variant>.
Chapter 4
Algorithm Performance Characterization
In this chapter, we examine the performance information that should be provided by algorithm components to enable system integrators to assemble combinations of algorithms into reliable products.
4.1 Data Memory
4.1.1 Heap Memory
The only resources consumed by eXpressDSP-compliant algorithms are MIPS and memory. All I/O, peripheral control, device management, and scheduling are managed by the application, not the algorithm.
4.1.2 Stack Memory
4.1.3 Static Local and Global Data Memory
In the example above, the algorithm requires 960 16-bit words of single-access on-chip memory, 720 16-bit words of external persistent memory, and there are no special alignment requirements for this memory.
4.2 Program Memory
Algorithms must characterize their static data memory requirements by filling out a table such as that illustrated below. Each row represents the requirements for an individual object file that is part of the algorithm's implementation.
Code
Code Sections   Size   Align
a.obj(.text)    768    0
b.obj(.text)    125    32
4.3 Interrupt Latency
4.4 Execution Time
4.4.1 MIPS Is Not Enough
In most DSP systems, algorithms are started by the arrival of data and the arrival of data is signaled by an interrupt.
4.4.2 Execution Time Model
Figure 4-1. Execution Timeline for Two Periodic Tasks
In this case, both task A and B meet their deadlines and we have more than 16% (1 ms every 6 ms) of the CPU idle. Suppose we now increase the amount of processing that task B must perform very slightly, say to 1.
Execution time should be expressed in instruction cycles, whereas the period should be expressed in microseconds. Worst-case execution time must be accompanied by a precise description of the run-time assumptions required to reproduce this upper bound.
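To relate the two units, a system integrator might convert a worst-case cycle count and a period in microseconds into a fraction of the CPU. The sketch below is one way to do that; the function name cpu_load_fraction and the cycles_per_us parameter (instruction cycles the target executes per microsecond) are assumptions introduced for illustration.

```c
/* Fraction of the CPU used by one periodic algorithm.
 * worst_case_cycles: worst-case execution time per period, in instruction cycles
 * period_us:         period, in microseconds
 * cycles_per_us:     instruction cycles the target executes per microsecond
 *                    (assumed to be known for the device and clock in use) */
double cpu_load_fraction(unsigned long worst_case_cycles,
                         double period_us,
                         double cycles_per_us)
{
    return (double)worst_case_cycles / (period_us * cycles_per_us);
}
```

For example, 100000 cycles every 1000 microseconds on a device executing 200 cycles per microsecond uses half of the CPU. A combined load below 1.0 is necessary but, as the heading "MIPS Is Not Enough" suggests, it does not by itself show that every deadline is met.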
Chapter 5
DSP-Specific Guidelines
This chapter provides guidelines for creating eXpressDSP-compliant algorithms for various DSP families.
5.1 CPU Register Types
[Figure labels: Read-only, Read-write, Global, Local, Scratch, Preserve, Init]
DSP algorithms are often written in assembly language and, as a result, they will take full advantage of the instruction set.
5.2 Use of Floating Point
5.3 TMS320C6xxx Rules and Guidelines
5.3.1 Endian Byte Ordering
5.3.2 Data Models
5.3.3 Program Model
It is important to note that the use of global registers by algorithms is permitted. However, like self-modifying code, their use must be invisible to clients.
5.3.4 Register Conventions
5.3.5 Status Register
In addition, no algorithm may ever directly manipulate the cache control registers. It is important to realize that eXpressDSP-compliant algorithms may be placed in on-chip program memory by the system developer.
CSR Field   Use                       Type
EN          Current CPU endian mode.  Read-only (global)
PWRD        Power-Down modes          Not accessible (global)
PCC         Program Cache Control     Not accessible (global)
DCC         Data Cache Control.
5.3.6 Interrupt Latency
5.4 TMS320C54xx Rules and Guidelines
5.4.1 Data Models
5.4.2 Program Models
There are, of course, cases where it would be desirable that the core run-time support is accessible with near calls.
5.4.3 Register Conventions
5.4.4 Status Registers
This section describes the rules and guidelines that apply to the use of the TMS320C54xx on-chip registers. As described above, there are several different register types.
ST1 Field Name   Use              Type
INTM             Interrupt mask   Preserve(g.
5.4.5 Interrupt Latency
5.5 TMS320C55x Rules and Guidelines
5.5.1 Stack Architecture
5.5.2 Data Models
5.5.3 Program Models
5.5.4 Relocatability
Rule 32 All C55x algorithms must access all static and global data as far data; also, the algorithms should be instantiable in a large memory model. Only the large memory model is supported for the program memory.
5.5.5 Register Conventions
If the algorithm does not use B-bus, then the first column must be zero. If there is more than one block that is accessed by the B-bus, then all the block numbers must be specified in the second column as shown in the above example.
5.5.6 Status Bits
The C55xx contains four status registers: ST0, ST1, ST2 and ST3.
ST0 Field Name   Use                     Type
ACOV2            Overflow flag for AC2   Scratch (local)
ACOV3 .
ST3 Field Name   Use                     Type
HOMY             Host only access mode   Read-only (global)
HOMX             Host only access mode   Read-only (global)
HOMR             Shared access mode      Read-only (global.
5.6 TMS320C24xx Guidelines
5.6.1 General
TMS320 DSP Standard Algorithms vs. DCS Modules
The C24xx family of DSPs are classified as DSP controllers, and consequently are mainly focused on the “Digital Control Space.
Register    Use                             Type
AR6 - AR7   C compiler Register variable.
5.6.5 Status Registers
5.6.6 Interrupt Latency
5.7 TMS320C28x Rules and Guidelines
5.7.1 Data Models
5.7.2 Program Models
5.7.3 Register Conventions
5.7.4 Status Registers
Only large memory model is supported for the program memory.
ST1 Field Name   Use                          Type
ARP              Auxiliary register pointer   Scratch (local)
XF               XF pin status                Read Only (global)
M0M1MAP          M0 and M1 mapping .
5.7.5 Interrupt Latency
Chapter 6
Use of the DMA Resource
The direct memory access (DMA) controller performs asynchronously scheduled data transfers in the background while the CPU continues to execute instructions.
6.1 Overview
6.2 Algorithm and Framework
Rule 6 states that "Algorithms must never directly access any peripheral device. This includes but is not limited to on-chip DMAs, timers, I/O devices, and cache control registers.
6.3 Requirements for the Use of the DMA Resource
6.4 Logical Channel
through the logical DMA channels it acquires through the IDMA2 protocol. A detailed description of these APIs can be found in the TMS320 DSP Algorithm Standard API Reference (SPRU360).
6.5 Data Transfer Properties
[Figure: a frame made of elements Elem0 to Elem3, labeled with element size, element index, and gaps between elements; and a block of N frames (Frame 0 to Frame N-1), labeled with frame index and number of frames = N]
6.
6.7 Abstract Interface
DMA Guideline 1 The data transfer should complete before the CPU operations executing in parallel. However, we can never guarantee that the data transfers are complete before data are accessed by the CPU, even if the algorithm is designed in such a way (e.
6.8 Resource Characterization
DMA Rule 3 Each of the IDMA2 or IDMA3 methods implemented by an algorithm must be independently relocatable. The pragma directive must be used to place each method in appropriate subsections to enable independent relocatability of the methods by the system integrator.
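A minimal sketch of how such placement might look with the TI compiler's CODE_SECTION pragma follows. The module prefix XYZ, the method names, their bodies, and the subsection names are all illustrative assumptions rather than names taken from this standard. The pragmas are wrapped in a check for __TI_COMPILER_VERSION__ so that other compilers simply skip them.

```c
/* Hypothetical module "XYZ"; method and subsection names are illustrative. */
#ifdef __TI_COMPILER_VERSION__
/* With TI compilers the pragma must appear before the symbol is declared. */
#pragma CODE_SECTION(XYZ_sketchDmaGetChannelCnt, ".text:dmaGetChannelCnt")
#pragma CODE_SECTION(XYZ_sketchDmaChangeChannels, ".text:dmaChangeChannels")
#endif

/* Each method lands in its own subsection, so a linker command file can
 * place (or omit) the methods independently. */
int XYZ_sketchDmaGetChannelCnt(void)
{
    return 2; /* this sketch pretends the algorithm needs two logical channels */
}

int XYZ_sketchDmaChangeChannels(int channelCount)
{
    return channelCount == 2 ? 0 : -1; /* 0 = accepted, -1 = rejected */
}
```

Placing each method in a separately named subsection lets the system integrator decide, at link time, which methods go into fast on-chip memory and which can live elsewhere.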
6.9 Runtime APIs
6.10 Strong Ordering of DMA Transfer Requests
For example, in the table above, the "process" operation is using two logical channels. On logical channel 0, it performs on average 5 data transfers and a maximum of 7 data transfers.
6.11 Submitting DMA Transfer Requests
6.12 Device Independent DMA Optimization Guideline
The specification of the ACPY2 interface strives to perform a delicate trade-off between allowing high performance and requiring error checking at run time.
6.13 C6xxx Specific DMA Rules and Guidelines
6.13.1 Cache Coherency Issues for Algorithm Producers
[Figure: cache coherency between the L2 cache, external memory, and DMA, with X = old and Y = new]
[Figure: a cache line in the L2 cache, external memory, and DMA, with X = old and Y = new]
6.14 C55x Specific DMA Rules and Guidelines
6.14.1 Supporting Packed/Burst Mode DMA Transfers
DMA Rule 7 is a rule for the client application writer.
6.14.2 Minimizing Logical Channel Reconfiguration Overhead
6.14.3 Addressing Automatic Endianism Conversion Issues
6.15 Inter-Algorithm Synchronization
6.15.1 Non-Preemptive System
Some common C55x DMA devices impose additional restrictions that affect when a channel needs to be reconfigured.
[Figure: CPU context and DMA context timelines for Algorithm A and Algorithm B, with events 1 to 6; legend: DMA/CPU idle, CPU context switch, CPU/DMA active]
[Figure: CPU context and DMA context timelines for Algorithm A and Algorithm B, with events 1 to 6; legend: DMA/CPU idle, CPU context switch, CPU/DMA active]
algorithm.
Events
1. Algorithm A requests a data transfer by calling ACPY2_start().
It is important to notice that preemptive systems might have groups of algorithms that execute with the same priority. A well-designed DMA manager would assign the same physical channels to algorithms at the same priority level to avoid the scenarios described in Section 6.
Appendix A
Rules and Guidelines
This appendix gathers together all rules and guidelines into one compact reference.
A.1 General Rules
Recall that rules must be followed in order for software to be eXpressDSP-compliant. Guidelines, on the other hand, are strong suggestions that should be obeyed but may be violated by eXpressDSP-compliant software.
A.2 Performance Characterization Rules
A.3 DMA Rules
Rule 25: All C6x algorithms must be supplied in little-endian format. (See Section 5.3.1)
Rule 26: All C6x algorithms must access all static and global data as far data.
A.4 General Guidelines
DMA Rule 3: Each of the IDMA2 methods implemented by an algorithm must be independently relocatable. (See Section 6.7)
DMA Rule 4: All algorithms must state the maximum number of concurrent DMA transfers for each logical channel.
A.5 DMA Guidelines
Guideline 12: All C6x algorithms should be supplied in both little- and big-endian formats. (See Section 5.
Appendix B
Core Run-Time APIs
This appendix enumerates all acceptable core run-time APIs that may be referenced by an eXpressDSP-compliant algorithm.
B.1 TI C-Language Run-Time Support Library
B.2 DSP/BIOS Run-time Support Library
Recall that only a subset of the DSP/BIOS and the TI C run-time support library functions are allowed to be referenced from an eXpressDSP-compliant algorithm.
Appendix C
Bibliography
This appendix lists sources for additional information.
C.1 Books
Dialogic, Media Stream Processing Unit; Developer's Guide, 05-1221-001-01 September 1998.
ISO/IEC JTC1/SC22 N 2388 dated January 1997, Request for SC22 Working Groups to Review DIS 2382-07.
C.2 URLS
Appendix D
Glossary
D.1 Glossary of Terms
Abstract Interface: An interface defined by a C header whose functions are specified by a structure of function pointers. By convention these interface headers begin with the letter 'i' and the interface name begins with 'I'.
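The definition above can be made concrete with a small sketch: an interface expressed as a structure of function pointers, with two implementations that a client uses through the same table. The interface name IFILT, the implementations, and the helper run_through are hypothetical names chosen for illustration, following the convention that the interface name begins with 'I'.

```c
/* Hypothetical abstract interface "IFILT": a table of function pointers. */
typedef struct IFILT_Fxns {
    int (*apply)(int sample);
} IFILT_Fxns;

/* Two hypothetical implementations of the same interface. */
static int double_apply(int sample) { return 2 * sample; }
static int negate_apply(int sample) { return -sample; }

const IFILT_Fxns DOUBLE_IFILT = { double_apply };
const IFILT_Fxns NEGATE_IFILT = { negate_apply };

/* Client code depends only on the table, not on a concrete implementation. */
int run_through(const IFILT_Fxns *fxns, int sample)
{
    return fxns->apply(sample);
}
```

Because the client only sees the function table, an integrator can swap one implementation for another without changing or recompiling the client.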
Endian: Refers to which bytes are most significant in multi-byte data types. In big-endian architectures, the leftmost bytes (those with a lower address) are most significant. In little-endian architectures, the rightmost bytes are most significant.
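As a small illustration of this definition, the sketch below inspects the lowest-addressed byte of a known multi-byte value to tell which ordering the host uses. The function name is_little_endian is hypothetical, and the check is meant as a host-side sketch, not as something the standard prescribes.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the lowest-addressed byte of a multi-byte integer holds the
 * least significant part (little-endian), 0 otherwise (big-endian). */
int is_little_endian(void)
{
    uint32_t word = 1u;
    unsigned char first_byte;
    memcpy(&first_byte, &word, 1); /* copy the byte at the lowest address */
    return first_byte == 1;
}
```

On a big-endian machine the lowest address holds the most significant byte, which is zero for the value 1, so the function returns 0.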
Scheduling: The process of deciding what thread should execute next on a particular CPU. It is usually also taken as involving the context switch to that thread.
Scheduling Latency: The maximum time that a "ready" thread can be delayed by a lower priority thread.
IMPORTANT NOTICE Texas Instruments Incorporated and its subsidiaries (TI) reserve the right to make corrections, modifications, enhancements, improvements, and other changes to its products and services at any time and to discontinue any product or service without notice.