Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Developer’s Manual
March, 2003
Order Number: 273411-003
Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document.
Contents
1 Introduction
1.1 Intel® 80200 Processor based on Intel® XScale™ Microarchitecture High-Level Overview
3.2.2.1 Page (P) Attribute Bit
3.2.2.2 Cacheable (C), Bufferable (B), and eXtension (X) Bits
6.2.3.3 Write Miss Policy
6.2.3.4 Write-Back Versus Write-Through
9.3 Programmer Model
12.5.3 Instruction Fetch Latency Mode
12.5.
13.11.6.4 DBG.V
13.11.
14.4.10 Miscellaneous Instruction Timing
14.4.
B.4.1 Instruction Cache
C.2.2 TAP Pins
Figures
1-1 Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Features
Tables
2-1 Multiply with Internal Accumulate Format
9-1 Interrupt Control Register (CP13 register 0)
14-14 Semaphore Instruction Timings
1 Introduction
1.1 Intel® 80200 Processor based on Intel® XScale™ Microarchitecture High-Level Overview
The Intel® 80200 processor based on Intel® XScale™ microarchitecture is the next generation in the Intel® StrongARM* processor family (compliant with ARM* Architecture V5TE).
1.1.2 Features
Figure 1-1 shows the major functional blocks of the Intel® 80200 processor. The following sections give a brief, high-level overview of these blocks.
1.1.2.2 Memory Management
The Intel® 80200 processor implements the Memory Management Unit (MMU) Architecture specified in the ARM Architecture Reference Manual.
1.1.2.6 Power Management
The Intel® 80200 processor supports two low power modes: idle and sleep. These modes are discussed in Section 8.
1.2 Terminology and Conventions
1.2.1 Number Representation
All numbers in this document can be assumed to be base 10 unless designated otherwise.
1.3 Other Relevant Documents
• Intel® 80200 Processor based o.
2 Programming Model
This chapter describes the programming model of the Intel® 80200 processor based on Intel® XScale™ microarchitecture, namely the implementation options and extensions to the ARM* Version 5 architecture.
2.2.4 ARM* DSP-Enhanced Instruction Set
The Intel® 80200 processor implements the ARM DSP-enhanced instruction set, which is a set of instructions that boost the performance of signal processing applications.
2.3 Extensions to ARM* Architecture
The Intel® 80200 processor made a few extensions to the ARM Version 5 architecture to meet the needs of various markets and design requirements.
2.3.1.1 Multiply With Internal Accumulate Format
A new multiply format has been created to define operations on 40-bit accumulators.
MIA does not support unsigned multiplication; all values in Rs and Rm are interpreted as signed data values.
The MIAxy instruction performs one 16-bit signed multiply and accumulates the result into a single 40-bit accumulator.
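The multiply-accumulate described above can be modeled in a few lines. This is a minimal sketch, not the manual's definition: the function names are hypothetical, the idea that `x` and `y` select the bottom (`"B"`) or top (`"T"`) 16 bits of Rm and Rs is an assumption not stated in this excerpt, and wrapping the result modulo 2^40 is likewise assumed.

```python
ACC_MASK = (1 << 40) - 1  # the accumulator is 40 bits wide


def to_signed16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def select_half(reg, half):
    """Assumed operand selection: 'B' = bits [15:0], 'T' = bits [31:16]."""
    return to_signed16(reg >> 16 if half == "T" else reg)


def mia_xy(acc, rm, rs, x, y):
    """One signed 16x16 multiply, accumulated into a 40-bit value (wraps, by assumption)."""
    product = select_half(rm, x) * select_half(rs, y)
    return (acc + product) & ACC_MASK
```

For example, `mia_xy(10, 0x0003FFFF, 0x00020005, "T", "T")` multiplies 3 by 2 and adds the product to the accumulator value 10.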
2.3.1.2 Internal Accumulator Access Format
The Intel® 80200 processor defines a new instruction format for accessing internal accumulators in CP0.
The MAR instruction moves the value in register RdLo to bits[31:0] of the 40-bit accumulator (acc0) and moves bits[7:0] of the value in register RdHi into bits[39:32] of acc0.
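The bit placement performed by MAR can be written out as a small model. This is an illustrative sketch only; the function name `mar` and the use of a plain integer for acc0 are choices made here, not part of the manual.

```python
def mar(rd_lo, rd_hi):
    """Build the 40-bit acc0 value from two 32-bit register values."""
    low = rd_lo & 0xFFFFFFFF   # RdLo goes to acc0 bits [31:0]
    high = rd_hi & 0xFF        # only bits [7:0] of RdHi are used
    return (high << 32) | low  # RdHi[7:0] lands in acc0 bits [39:32]
```

Note that the upper 24 bits of RdHi do not reach the accumulator: `mar(0x12345678, 0xABCDEF9A)` yields `0x9A12345678`.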
2.3.2 New Page Attributes
The Intel® 80200 processor extends the page attributes defined by the C and B bits in the page descriptors with an additional X bit.
The P bit controls ECC. The TEX (Type Extension) field is present in several of the descriptor types. In the Intel® 80200 processor, only the LSB of this field is used; this is called the X bit.
2.3.3 Additions to CP15 Functionality
To accommodate the functionality in the Intel® 80200 processor, registers in CP15 and CP14 have been added or augmented.
2.3.4 Event Architecture
2.3.4.1 Exception Summary
Table 2-11 shows all the exceptions that the Intel® 80200 processor may generate, and the attributes of each.
2.3.4.3 Prefetch Aborts
The Intel® 80200 processor detects three types of prefetch aborts: Instruction MMU abort, external abort on an instruction access, and an instruction cache parity error.
2.3.4.4 Data Aborts
Two types of data aborts exist in the Intel® 80200 processor: precise and imprecise.
Imprecise data aborts
• A data cache parity error is imprecise; the extended Status field of the Fault Status Register is set to 0b11000.
Multiple Data Aborts
Multiple data aborts may be detected by hardware, but only the highest priority one is reported.
3 Memory Management
This chapter describes the memory management unit implemented in the Intel® 80200 processor based on Intel® XScale™ microarchitecture, which is compliant with the ARM* Architecture V5TE.
3.2 Architecture Model
3.2.1 Version 4 vs. Version 5
ARM* MMU Version 5 Architecture introduces the support of tiny pages, which are 1 KByte in size.
3.2.2.4 Data Cache and Write Buffer
All of these descriptor bits affect the behavior of the Data Cache and the Write Buffer.
3.2.2.5 Details on Data Cache and Write Buffer Behavior
If the MMU is disabled, all data accesses are non-cacheable and non-bufferable.
3.3 Interaction of the MMU, Instruction Cache, and Data Cache
The MMU, instruction cache, and data/mini-data cache may be enabled/disabled independently.
3.4 Control
3.4.1 Invalidate (Flush) Operation
The entire instruction and data TLB can be invalidated at the same time with one command or they can be invalidated separately.
3.4.3 Locking Entries
Individual entries can be locked into the instruction and data TLBs. See Table 7-14, “Cache Lockdown Functions” on page 7-14 for the exact commands.
Note: Care must be exercised here when allowing exceptions to occur during this routine whose handlers may have data that lies in a page that is trying to be locked into the TLB.
3.4.4 Round-Robin Replacement Algorithm
The line replacement algorithm for the TLBs is round-robin; there is a round-robin pointer that keeps track of the next entry to replace.
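The idea of a round-robin pointer can be illustrated with a toy model. This is a sketch under stated assumptions rather than a description of the hardware: the class name, the starting position of the pointer, the direction in which it advances, and the absence of locked entries are all hypothetical simplifications.

```python
class RoundRobinTlb:
    """Toy TLB whose replacement victim is chosen by a round-robin pointer."""

    def __init__(self, num_entries):
        self.entries = [None] * num_entries
        self.pointer = 0  # assumption: the real reset position is not modeled

    def insert(self, translation):
        """Replace the entry under the pointer, then advance the pointer."""
        victim = self.pointer
        self.entries[victim] = translation
        self.pointer = (self.pointer + 1) % len(self.entries)  # wrap around
        return victim
```

With four entries, six consecutive insertions replace entries 0, 1, 2, 3, 0, 1 in that order, regardless of how recently each entry was used.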
4 Instruction Cache
The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) instruction cache enhances performance by reducing the number of instruction fetches from external memory.
4.2 Operation
4.2.1 Operation When Instruction Cache is Enabled
When the cache is enabled, it compares every instruction request address against the addresses of instructions that it is currently holding.
4.2.3 Fetch Policy
An instruction-cache “miss” occurs when the requested instruction is not found in the instruction fetch buffers or instruction cache; a fetch request is then made to external memory.
4.2.5 Parity Protection
The instruction cache is protected by parity to ensure data integrity. Each instruction cache word has 1 parity bit.
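A single parity bit per 32-bit word can detect any odd number of flipped bits. The sketch below shows the general technique; it assumes even parity (the stored bit makes the total count of ones even), which this excerpt does not specify, and the function names are hypothetical.

```python
def parity_bit(word):
    """Even-parity bit for a 32-bit word: 1 if the word has an odd number of ones."""
    word &= 0xFFFFFFFF
    return bin(word).count("1") & 1


def word_is_consistent(word, stored_parity):
    """True when the stored parity bit still matches the word's contents."""
    return parity_bit(word) == stored_parity
```

Flipping one bit of a word changes its parity, so `word_is_consistent` returns False for the corrupted value; flipping two bits goes undetected, which is the known limit of single-bit parity.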
4.2.6 Instruction Fetch Latency
Because the Intel® 80200 proces.
4.3 Instruction Cache Control
4.3.1 Instruction Cache State at RESET
After reset, the instruction cache is always disabled, unlocked, and invalidated (flushed).
4.3.3 Invalidating the Instruction Cache
The entire instruction cache, along with the fetch buffers, is invalidated by writing to coprocessor 15, register 7.
4.3.4 Locking Instructions in the Instruction Cache
Software has the ability to lock performance critical routines into the instruction cache.
Software can lock down several different routines located at different memory locations. This may cause some sets to have more locked lines than others as shown in Figure 4-2.
5 Branch Target Buffer
The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) uses dynamic branch prediction to reduce the penalties associated with changing the flow of program execution.
5.1.1 Reset
After Processor Reset, the BTB is disabled and all entries are invalidated.
5.2 BTB Control
5.2.1 Disabling/Enabling
The BTB is always disabled out of Reset. Software can enable the BTB through a bit in a coprocessor register (see Section 7.
6 Data Cache
The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) data cache enhances performance by reducing the number of data accesses to and from external memory.
Figure 6-1. Data Cache Organization (way 0 through way 31; 32-byte cache lines)
6.1.2 Mini-Data Cache Overview
The mini-data cache is a 2-Kbyte, 2-way set associative cache; this means there are 32 sets with each set containing 2 ways.
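The stated geometry implies a 32-byte line (2048 bytes / 32 sets / 2 ways), matching the 32-byte lines mentioned elsewhere in this chapter. The sketch below checks that arithmetic and shows one conventional way an address could be split into tag, set index, and byte offset. The exact address bits the hardware uses for the set index are an assumption here, and `split_address` is a hypothetical helper.

```python
CACHE_BYTES = 2048
NUM_SETS = 32
NUM_WAYS = 2
LINE_BYTES = CACHE_BYTES // (NUM_SETS * NUM_WAYS)  # works out to 32 bytes


def split_address(addr):
    """Split an address into (tag, set_index, offset), assuming the usual layout."""
    offset = addr % LINE_BYTES                   # byte within the line
    set_index = (addr // LINE_BYTES) % NUM_SETS  # assumed: bits just above the offset
    tag = addr // (LINE_BYTES * NUM_SETS)        # remaining upper bits
    return tag, set_index, offset
```

Under this layout, two addresses that differ by a multiple of 1024 bytes map to the same set and compete for its two ways.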
6.1.3 Write Buffer and Fill Buffer Overview
The Intel® 80200 processor employs an eight entry write buffer, each entry containing 16 bytes.
6.2 Data Cache and Mini-Data Cache Operation
The following discussions refer to the data cache and mini-data cache as one cache (data/mini-data) since their behavior is the same when accessed.
6.2.3.2 Read Miss Policy
The following sequence of events occurs when a cacheable (see Section 6.2.3.1, “Cacheability” on page 6-5) load operation misses the cache: 1.
6.2.3.3 Write Miss Policy
A write operation that misses the cache requests a 32-byte cache line from external memory if the access is cacheable and write allocation is specified in the page.
6.2.4 Round-Robin Replacement Algorithm
The line replacement algorithm for the data cache is round-robin.
6.3 Data Cache and Mini-Data Cache Control
6.
6.3.3.1 Global Clean and Invalidate Operation
A simple software routine is used to globally clean the data cache.
The line-allocate command will not operate on the mini Data Cache, so system software must clean this cache by reading 2 KByte of contiguous unused data into it.
6.4 Re-configuring the Data Cache as Data RAM
Software has the ability to lock tags associated with 32-byte lines in the data cache, thus creating the appearance of data RAM.
Example 6-3. Locking Data into the Data Cache
; configured with C=1 and B=1
; R0 is the number of 32-byte lines to lock into the data cache.
Example 6-4. Creating Data RAM
; R1 contains the virtual address of a region of memory to configure as data RAM,
; which is aligned on a 32-byte boundary.
Tags can be locked into the data cache by enabling the data cache lock mode bit located in coprocessor 15, register 9. (See Table 7-14, “Cache Lockdown Functions” on page 7-14 for the exact command.
6.5 Write Buffer/Fill Buffer Operation and Control
See Section 1.2.2, “Terminology and Acronyms” on page 1-5 for a definition of coalescing.
7 Configuration
This chapter describes the System Control Coprocessor (CP15) and coprocessor 14 (CP14). CP15 configures the MMU, caches, buffers and other system attributes. Where possible, the definition of CP15 follows the definition in the first generation Intel® StrongARM* products.
The format of MRC and MCR is shown in Table 7-1.
The format of LDC and STC is shown in Table 7-2. LDC and STC follow the programming notes in the ARM Architecture Reference Manual.
7.2 CP15 Registers
Table 7-3 lists the CP15 registers implemented in the Intel® 80200 processor.
Table 7-3.
7.2.1 Register 0: ID and Cache Type Registers
Register 0 houses two read-only registers that are used for part identification: an ID register and a cache type register.
Bits   Access                         Description
5:3    Read / Write Ignored           Instruction cache associativity = 0b101 = 32-way
2      Read-as-Zero / Write Ignored   Reserved
1:0    Read / Write Ignored           Instruction cache line length = 0b10 = 8 words/line
Table 7-5.
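The three fields listed in this fragment can be pulled out of a register value with shifts and masks. This is a minimal sketch: only bits 5:0 are covered because only those rows appear in this excerpt, the function name is hypothetical, and the associativity field is returned as a raw code rather than interpreted.

```python
def decode_icache_fields(cache_type):
    """Extract (associativity_code, reserved_bit, line_length_code) from bits 5:0."""
    assoc_code = (cache_type >> 3) & 0b111  # bits 5:3
    reserved = (cache_type >> 2) & 0b1      # bit 2, read-as-zero
    length_code = cache_type & 0b11         # bits 1:0
    return assoc_code, reserved, length_code


# Only the encoding given in the table fragment is listed here.
LINE_LENGTH_WORDS = {0b10: 8}
```

For a value whose low six bits are `0b101010`, the decoder returns `(0b101, 0, 0b10)`, and `LINE_LENGTH_WORDS[0b10]` gives 8 words (32 bytes) per line.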
7.2.2 Register 1: Control and Auxiliary Control Registers
Register 1 is .
The mini-data cache attribute bits, in the Intel® 80200 processo.
7.2.3 Register 2: Translation Table Base Register
7.2.4 Register 3: Domain Access Control Register
7.2.5 Register 4: Reserved
Register 4 is reserved.
7.2.6 Register 5: Fault Status Register
The Fault Status Register (FSR) indicates which fault has occurred, which could be either a prefetch abort or a data abort.
7.2.8 Register 7: Cache Functions
All the functions defined in the first generation of Intel® StrongARM* appear here.
Other items to note about the line-allocate command are:
• It forces all pending memory operations to complete.
7.2.9 Register 8: TLB Operations
Disabling/enabling the MMU has no effect on the contents of either TLB: valid entries stay valid, locked items remain locked.
7.2.10 Register 9: Cache Lock Down
Register 9 is used for locking down entries into the instruction cache and data cache.
7.2.11 Register 10: TLB Lock Down
Register 10 is used for locking down entries into the instruction TLB and data TLB. (The protocol for locking down entries can be found in Chapter 3, “Memory Management”.
7.2.13 Register 13: Process ID
The Intel® 80200 processor supports the remapping of virtual addresses through a Process ID (PID) register.
7.2.14 Register 14: Breakpoint Registers
The Intel® 80200 processor.
7.2.15 Register 15: Coprocessor Access Register
This register is selected when opcode_2 = 0 and CRm = 1. This register controls access rights to all the coprocessors in the system except for CP15 and CP14.
Table 7-20. Coprocessor Access Register
7.3 CP14 Registers
Table 7-21 lists the CP14 registers implemented in the Intel® 80200 processor.
7.3.3 Registers 6-7: Clock and Power Management
These registers contain functions for managing the core clock and power.
7.3.4 Registers 8-15: Software Debug
Software debug is supported by address breakpoint registers (Coprocessor 15, register 14), serial communication over the JTAG interface and a trace buffer.
8 System Management
This chapter describes the clocking and power management features of the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) along with reset details.
The Intel® 80200 processor supports low voltage operation with a supply as low as 0.95 V. At lower voltages, not all CCLK configurations are available.
8.2 Processor Reset
The RESET# pin must be asserted when CLK and power are applied to the processor. CLK, MCLK, and power must be present and stable before RESET# can be deasserted.
8.2.2 Reset Effect on Outputs
After RESETOUT# is asserted, the processor’s output pins are driven to a well-defined state.
8.3 Power Management
The Intel® 80200 processor provides low power modes: idle and sleep, which are listed in increasing power saving order.
The JTAG clock must be stopped during sleep mode.
9 Interrupts
9.1 Introduction
The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) supports a variety of external and internal interrupt sources.
9.3 Programmer Model
Software has access to three registers in the ICU. INTCTL is used to enable or disable (mask) individual interrupts.
9.3.1 INTCTL
INTCTL is used to specify what interrupts are disabled (masked).
9.3.2 INTSRC
The Interrupt Source register (INTSRC) indicates which interrupts are active. This register may be used by an ISR to determine quickly the source of an interrupt.
9.3.3 INTSTR
Systems may have differing priorities for the various interrupt cases; the ICU allows system designers to associate each internal interrupt source with one of the two internal interrupts: FIQ and IRQ.
10 External Bus
10.1 General Description
The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) bus is a split bus, with separate request and data buses.
An alternate configuration with a separate memory bus is also possible, shown in Figure 10-2. All signals on this bus, data and request, are sampled on the rising edge of MCLK.
10.2 Signal Description
Table 10-1.
10.2.1 Request Bus
The request bus issues read or write requests from the Intel® 80200 processor or other bus master to the chipset or memory controller.
In addition to the alignment constraints listed above, read transactions never cross a 32-byte boundary, and write transactions never cross a 16-byte boundary.
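Whether a linear byte range stays inside one aligned block is a simple integer check. The sketch below illustrates the boundary rule only; the helper name is hypothetical, and it treats the address as the first byte of a contiguous range, which says nothing about how burst requests on the real bus encode their starting address.

```python
READ_BOUNDARY = 32   # bytes, from the rule for read transactions
WRITE_BOUNDARY = 16  # bytes, from the rule for write transactions


def crosses_boundary(first_byte, num_bytes, boundary):
    """True if the range [first_byte, first_byte + num_bytes) spans two aligned blocks."""
    first_block = first_byte // boundary
    last_block = (first_byte + num_bytes - 1) // boundary
    return first_block != last_block
```

An 8-byte range starting at 0x24C spills from the 16-byte block at 0x240 into the one at 0x250, so it could not be carried by a single write under this rule; the same range fits within one 32-byte block.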
10.2.2 Data Bus
Some time after a request is made on the request bus, data must be transferred for that request on the data bus.
10.2.3 Critical Word First
The CWF signal is only used during read bursts of eight words (Len = 6). CWF needs to be driven at the same time as DValid of the first data cycle of the transaction.
There are eight byte enables (BE#) associated with the D bus. Each byte enable corresponds to one byte of the bus. During a write cycle, the byte enable for each byte that is being written is asserted low.
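The active-low convention can be made concrete with a small helper that computes the eight enable levels for a write. This is a sketch under assumptions: the mapping of bit n of the result to BE#[n] and to the byte at offset n within the 8-byte bus word is assumed (a little endian view), and `byte_enables` is a hypothetical name.

```python
BUS_BYTES = 8  # eight byte enables, one per byte of the bus


def byte_enables(addr, num_bytes):
    """Return the 8 BE# levels as a bit mask; a 0 bit means 'asserted' (active low)."""
    lane = addr % BUS_BYTES
    if num_bytes < 1 or lane + num_bytes > BUS_BYTES:
        raise ValueError("write does not fit in one bus word")
    written = ((1 << num_bytes) - 1) << lane  # 1 bits mark the bytes being written
    return ~written & 0xFF                    # invert: written bytes drive BE# low
```

A 4-byte write at 0x240 gives `0xF0` (lanes 0 to 3 low), and a single-byte write at 0x243 gives `0xF7` (only lane 3 low).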
10.2.5 Multimaster Support
Simple multimaster support is supplied with the Hold pin.
A simpler but lower performance method would be to assert Hold.
10.2.6 Abort
If for any reason a request made by the Intel® 80200 processor cannot be completed, it must be aborted.
10.2.7 ECC
Software running on the Intel® 80200 processor may configure pages in memory as being ECC protected.
10.2.8 Big Endian System Configuration
The Intel® 80200 processor supports execution in a big endian system. A system is said to be big endian if multi-byte values are accessed with the MSB at lower addresses.
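The definition of big endian can be made concrete with a short sketch. It uses only Python's standard `struct` module and is independent of the processor; the value 0x12345678 is an arbitrary example.

```python
import struct

value = 0x12345678  # most significant byte is 0x12, least significant is 0x78

big = struct.pack(">I", value)     # big endian: MSB stored at the lowest address
little = struct.pack("<I", value)  # little endian: LSB stored at the lowest address

# Index 0 stands for the lowest address of the four-byte value.
lowest_address_big = big[0]
lowest_address_little = little[0]
```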
10.3 Examples
All examples assume a 64-bit bus, in a little endian system.
10.3.1 Simple Read Word
In Figure 10-4, a read request for one word at address 0x240 is issued at time 10 ns.
10.3.2 Read Burst, No Critical Word First
In Figure 10-5 the request goes out the same as the last example, with the address 0x248 this time and the length 0x6, indicating an eight word cache line fill.
10.3.3 Read Burst, Critical Word First Data Return
Figure 10-6 is the same as the last with one difference: CWF is asserted high on the first data cycle of the return data.
10.3.4 Word Write
Figure 10-7 shows a 32-bit write request to address 0x240. W/R# is high when ADS# is asserted low.
10.3.5 Two Word Coalesced Write
In Figure 10-8, two store byte instructions from the instruction stream have been coalesced into a single write command in the write buffer.
10.3.5.1 Write Burst
Figure 10-9 shows a four word write caused by the eviction of a half cache line. In this case, the Len is 0x5, indicating four words.
10.3.6 Write Burst, Coalesced
Figure 10-10 shows a four word cache write caused by store requests coalesced in a write buffer.
10.3.7 Pipelined Accesses
The example in Figure 10-11 demonstrates the four deep pipelined nature of this bus.
10.3.8 Locked Access
An example of a locked access is shown in Figure 10-12. Here the processor is doing an atomic read/write to address 0x240, denoted as A in the figure.
10.3.9 Aborted Access
As discussed in Section 10.2.6, “Abort” on page 10-11, any request from the Intel® 80200 processor can be aborted by the chipset or memory.
10.3.10 Hold
Figure 10-14 shows an example of hold being asserted to stop new transactions being issued.
11 Bus Controller
11.1 Introduction
The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) Bus Controller Unit (BCU) is responsible for accessing off-chip memory.
11.3 Error Handling
The BCU is able to detect and respond to two classes of errors: bus aborts and ECC errors.
11.3.2 ECC Errors
An ECC error occurs when the BCU reads data and notices that the associated ECC bits do not match the data.
Error reporting may be enabled with the BCUCTL register, described in Section 11.4.1. If enabled, single bit errors cause the BCU to assert an interrupt to the Interrupt Controller Unit (ICU).
11.4 Programmer Model
The BCU registers reside in Coprocessor 13 (CP13). They may be accessed/manipulated with the MCR, MRC, STC, and LDC instructions.
BCUCTL.TP allows software to determine if the BCU has any pending memory transactions. This may be used to ensure that all memory operations have completed before attempting to modify system state.
When ECC is enabled, the BCU only generates an interrupt on a single-bit error if BCUCTL.SR is set. When ECC is enabled, the BCU always generates an abort on a multi-bit error.
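The two ECC rules can be summarized as a small decision function. This is a sketch of the stated behavior only, assuming ECC is enabled; the function name, the string results, and the `error_bits` argument are hypothetical, and data correction is not modeled.

```python
def ecc_response(error_bits, sr_set):
    """Return the BCU reaction to an ECC error, assuming ECC is enabled.

    error_bits: number of bits in error (0, 1, or more than 1).
    sr_set:     state of BCUCTL.SR.
    The result strings are illustrative labels, not register values.
    """
    if error_bits == 0:
        return "none"
    if error_bits == 1:
        # Single-bit errors raise an interrupt only when BCUCTL.SR is set.
        return "interrupt" if sr_set else "none"
    # Multi-bit errors always produce an abort, regardless of BCUCTL.SR.
    return "abort"
```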
BCUMOD.AF affects the behavior of the BCU when it is reading a 32-byte block (a cache line-fill). If this bit is ‘0’, then the BCU always emits the 32-byte aligned address of the cache line when requesting it.
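For the BCUMOD.AF = 0 case, the emitted address is simply the requested address with its low five bits cleared. The sketch below shows that arithmetic; the function name is hypothetical, and only the AF = 0 behavior is modeled.

```python
CACHE_LINE_BYTES = 32  # a cache line-fill reads one 32-byte block


def line_fill_address_af0(address):
    """Return the 32-byte aligned address of the cache line that
    contains `address` (the BCUMOD.AF = 0 behavior)."""
    return address & ~(CACHE_LINE_BYTES - 1)
```

For example, a miss at 0x248 would be requested as 0x240 under this rule.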
11.4.2 ECC Error Registers
The contents of these registers should only be considered valid if the corresponding bit in register BCUCTL is set.
The BCU does not write to these ELOGx/ECARx registers unless the corresponding BCUCTL.Ex bit is cleared, either by reset or by software.
12 Performance Monitoring
This chapter describes the performance monitoring facility of the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE).
12.2 Clock Counter (CCNT; CP14 - Register 1)
The format of CCNT is shown in Table 12-1.
12.3 Performance Count Registers (PMN0 - PMN1; CP14 - Register 2 and 3, Respectively)
There are two 32-bit event counters; their format is shown in Table 12-2.
12.4 Performance Monitor Control Register (PMNC)
The perf.
12.4.1 Managing PMNC
The following are a few notes about cont.
12.5 Performance Monitoring Events
Table 12-4 lists events that may be monitored by the PMU. Each of the Performance Monitor Count Registers (PMN0 and PMN1) can count any listed event.
Some typical combinations of counted events are listed in this section and summarized in Table 12-5. In this section, we call such an event combination a mode.
12.5.2 Data Cache Efficiency Mode
PMN0 totals the number.
12.5.4 Data/Bus Request Buffer Full Mode
The Data Cache has buffers available to service cache misses or uncacheable accesses.
PMN1 counts the number of writeback operations emitted by the data cache.
12.6 Multiple Performance Monitoring Run Statistics
Even t.
12.7 Examples
In this example, the events selected with the Instruction Cache Efficiency mode are monitored and CCNT is used to measure total execution time.
13 Software Debug
This chapter describes software debug and related features in the Intel® 80200 processor based on Intel® XScale™ microarchit.
13.3 Introduction
The Intel® 80200 processor debug unit, when used with a debugger application, allows software running on an Intel® 80200 processor target to be debugged.
13.4 Debug Control and Status Register (DCSR)
The DCSR register is the main control register for the debug unit.
13.4.1 Global Enable Bit (GE)
The Global Enable bit disables and enables all debug functionality (except the reset vector trap).
13.4.3 Vector Trap Bits (TF, TI, TD, TA, TS, TU, TR)
The Vector Trap bits allow instruction breakpoints to be set on exception vectors without using up any of the breakpoint registers.
13.5 Debug Exceptions
A debug exception causes the processor to re-direct execution to a debug event handling routine.
During Halt mode, software running on the Intel® 80200 processor cannot access DCSR, or any of the hardware breakpoint registers, unless the processor is in Special Debug State (SDS), described below.
13.5.2 Monitor Mode
In monitor mode, the processor handles debug exceptions like normal ARM exceptions.
13.6 HW Breakpoint Resources
The Intel® 80200 processor debug architecture defines two instruction and two data breakpoint registers, denoted IBCR0, IBCR1, DBR0, and DBR1.
13.6.2 Data Breakpoints
The Intel® 80200 processor debug architecture defines two data breakpoint registers (DBR0, DBR1).
When DBR1 is programmed as a data address mask, it is used in conjunction with the address in DBR0.
13.8 Transmit/Receive Control Register (TXRXCTRL)
Communicat.
13.8.1 RX Register Ready Bit (RR)
The debugger and debug handler use the RR bit to synchronize accesses to RX.
13.8.2 Overflow Flag (OV)
The Overflow flag is a sticky flag that is set when the debugger writes to the RX register while the RR bit is set.
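The interaction between RR and OV can be pictured with a small software model. This is a hedged sketch rather than a description of the hardware: the class and method names are hypothetical, and two behaviors are assumptions made for the example, namely that a handler read of RX clears RR and that a write arriving while RR is set is discarded.

```python
class RxModel:
    """Toy model of the RX register with its RR and OV flags."""

    def __init__(self):
        self.rx = 0
        self.rr = False  # RX Register Ready
        self.ov = False  # Overflow flag (sticky)

    def debugger_write(self, data):
        """Debugger writes a 32-bit word to RX."""
        if self.rr:
            # Source behavior: writing while RR is set sets the sticky OV flag.
            self.ov = True
            return  # ASSUMPTION: the new data is dropped
        self.rx = data & 0xFFFFFFFF
        self.rr = True

    def handler_read(self):
        """Debug handler reads RX. ASSUMPTION: the read clears RR."""
        self.rr = False
        return self.rx
```

In this model OV stays set after the handler reads RX, which reflects the "sticky" wording; how OV is cleared is not modeled.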
13.8.4 TX Register Ready Bit (TR)
The debugger and debug handler use the TR bit to synchronize accesses to the TX register.
13.9 Transmit Register (TX)
The TX register is the debug handler transmit buffer. The debug handler sends data to the debugger through this register.
13.11 Debug JTAG Access
There are four JTAG instructions used by the debugger during software debug: LDIC, SELDCSR, DBGTX and DBGRX.
13.11.2 SELDCSR JTAG Register
Placing the “SELDCSR” JT.
13.11.2.1 DBG.HLD_RST
The debugger uses DBG.HLD_RST when loading code into the instruction cache during a processor reset.
13.11.2.2 DBG.BRK
DBG.BRK allows the debugger to generate an external debug break and asynchronously re-direct execution to a debug handling routine.
13.11.4 DBGTX JTAG Register
The DBGTX JTAG instruction selects the Debug JTAG Data register (Figure 13-3).
13.11.6 DBGRX JTAG Register
The DBGRX JTAG instruction selects the DBGRX JTAG Data register. The debugger uses the DBGRX data register to send data or commands to the debug handler.
13.11.6.1 RX Write Logic
The RX write logic (Figure 13-6) serves 4 functions: 1) Enable the debugger write to RX - the logic ensures only new, valid data from the debugger is written to RX.
13.11.6.2 DBGRX Data Register
The bits in the DBGRX data register (Figure 13-6) are used by the debugger to send data to the processor.
13.11.6.4 DBG.V
The debugger sets this bit to indicate the data scanned into DBG_SR[34:3] is valid data to write to RX.
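Because the 32-bit payload occupies DBG_SR[34:3], a host-side tool has to shift the word into place before scanning it in. The sketch below shows only that placement; the helper names are hypothetical, and the positions of the control bits (including DBG.V) are deliberately left untouched because they are not given in this excerpt.

```python
DATA_SHIFT = 3          # payload starts at bit 3 of DBG_SR
DATA_MASK = 0xFFFFFFFF  # 32 payload bits: DBG_SR[34:3]


def place_rx_data(scan_value, data):
    """Return scan_value with DBG_SR[34:3] replaced by the 32-bit data word.
    Bits outside [34:3] are preserved unchanged."""
    cleared = scan_value & ~(DATA_MASK << DATA_SHIFT)
    return cleared | ((data & DATA_MASK) << DATA_SHIFT)


def extract_rx_data(scan_value):
    """Return the 32-bit payload held in DBG_SR[34:3]."""
    return (scan_value >> DATA_SHIFT) & DATA_MASK
```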
13.12 Trace Buffer
The 256 entry trace buffer provides the ability to capture control flow information to be used for debugging an application.
When the trace buffer is enabled, reading and writing to either checkpoint register has unpredictable results. When the trace buffer is disabled, writing to a checkpoint register sets the register to the value written.
13.13 Trace Buffer Entries
Trace buffer entries consist of either one or five bytes. Most entries are one byte messages indicating the type of control flow change.
13.13.1.1 Exception Message Byte
When any kind of exception occurs, an exception message is placed in the trace buffer.
13.13.1.2 Non-exception Message Byte
Non-exception message bytes are used for direct branches, indirect branches, and rollovers.
13.13.1.3 Address Bytes
Only indirect branch entries contain address bytes in addition to the message byte. Indirect branch entries always have four address bytes indicating the target of that indirect branch.
13.13.2 Trace Buffer Usage
The Intel® 80200 processor trace buffer is 256 bytes in length. The first byte read from the buffer represents the oldest trace history information in the buffer.
As the trace buffer is read, the oldest entries are read first.
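The oldest-first read order matches the behavior of a fixed-size FIFO. The following sketch models a 256-byte buffer with Python's `collections.deque`; the class name is hypothetical, and the overwrite-oldest behavior when the buffer is full is an assumption of this model, not a statement about every trace buffer mode.

```python
from collections import deque

TRACE_BUFFER_BYTES = 256  # the trace buffer is 256 bytes in length


class TraceBufferModel:
    """Toy model: a 256-byte buffer that is read oldest byte first."""

    def __init__(self):
        # ASSUMPTION: once full, each new byte displaces the oldest one.
        self._bytes = deque(maxlen=TRACE_BUFFER_BYTES)

    def record(self, byte):
        """Append one trace byte (message or address byte)."""
        self._bytes.append(byte & 0xFF)

    def read_all(self):
        """Return the buffer contents, oldest byte first."""
        return list(self._bytes)
```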
13.14 Downloading Code in the ICache
On the Intel® 80200 processor, a 2K mini instruction cache, physically separate from the 32K main instruction cache, can be used as an on-chip instruction RAM.
13.14.2 LDIC JTAG Data Register
The LDIC JTAG Data Register is selected when the LDIC JTAG instruction is in the JTAG IR.
13.14.3 LDIC Cache Functions
The Intel® 80200 processor supports four cache functions that can be executed through JTAG.
All packets are 33 bits in length.
13.14.4 Loading IC During Reset
Code can be downloaded into the instruction cache through JTAG during a processor reset.
13.14.4.1 Loading IC During Cold Reset for Debug
Figure 13-12 shows the actions necessary to download code into the instruction cache during a cold reset for debug.
An external host should take the following steps to load code.
13.14.4.2 Loading IC During a Warm Reset for Debug
Loading the instruction cache during a warm reset may be a slightly different situation than during a cold reset.
If it is necessary to download code into the instruction cache then: 2) Assert TRST#. This clears the Halt Mode bit allowing the instruction cache to be invalidated.
13.14.5 Dynamically Loading IC After Reset
An external host can load code into the instruction cache “on the fly” or “dynamically”.
The following steps describe the details for downloading cod.
13.14.5.1 Dynamic Code Download Synchronization
The following pieces of code are necessary in the debug handler to implement the synchronization used during dynamic code download.
13.14.6 Mini Instruction Cache Overview
The mini instruction cache is a smaller version of the main instruction cache (refer to Chapter 4 for more details on the main instruction cache).
13.15 Halt Mode Software Protocol
This section describes the overall debug process in Halt Mode. It describes how to start and end a debug session and details for implementing a debug handler.
13.15.1.2 Placing the Handler in Memory
The debug handler is not required to be placed at a specific pre-defined address.
13.15.2 Implementing a Debug Handler
The debugger uses the debug handler to examine or modify processor state by sending commands and reading data through JTAG.
13.15.2.3 Dynamic Debug Handler
On the Intel® 80200 processor, the debug handler and override vector tables reside in the 2 KB mini instruction cache, separate from the main instruction cache.
2. Using the Main IC
The steps for downloading dynamic functions into the main instruction cache are similar to those for downloading into the mini instruction cache.
13.15.2.4 High-Speed Download
Special debug hardware has been added to support a high-speed download mode to increase the performance of downloads to system memory (vs.
13.15.3 Ending a Debug Session
Prior to ending a debug session, the.
13.16 Software Debug Notes/Errata
1. Trace buffer message count value on data aborts: LDR to non-PC that aborts gets counted in the exception message.
14 Performance Considerations
This chapter describes relevant performance considerations that compiler writers, application programmers and system designer.
14.2 Branch Prediction
The Intel® 80200 processor implements dynamic branch prediction for the ARM* instructions B and BL and for the Thumb* instruction B.
14.4 Instruction Latencies
The latencies for all the inst.
• Minimum Resource Latency
The minimum cycle distance.
14.4.3 Data Processing Instruction Timings
Table 14-5.
14.4.4 Multiply Instruction Timings
Table 14-7.
Mnemonic  Rs Value (Early Termination)  S-Bit Value  Minimum Issue Latency  Minimum Result Latency  Minimum Resource Latency
UMULL     Rs[31:15] = 0x00000           0            1                      RdLo = 2; RdHi = 3      2
                                        1            3                      3                       3
          Rs[31:27] = 0x00              0            1                      RdLo = 3; RdHi = 4      3
                                        1            4                      4                       4
          all others                    0            1                      RdLo = 4; RdHi = 5      4
                                        1            5                      5                       5
1.
14.4.5 Saturated Arithmetic Instructions
14.4.6 Status Register Access Instructions
14.4.7 Load/Store Instructions
Table 14-10.
14.4.8 Semaphore Instructions
14.4.9 Coprocessor Instructions
14.4.10 Miscellaneous Instruction Timing
14.
A Compatibility: Intel® 80200 Processor vs. SA-110
This appendix highlights the differences between the first generation Intel® StrongARM* technology (SA-110) and the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE).
A.3.6 Performance Differences
There exist significant performance differences in program execution between SA-110 and the Intel® 80200 processor.
B Optimization Guide
B.1 Introduction
This appendix contains optimization techniques for achieving the highest performance from the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE).
B.2 Intel® 80200 Processor Pipeline
One of the biggest differences between the Intel® 80200 processor and first-generation Intel® StrongARM* processors is the pipeline.
B.2.1.2. Intel® 80200 Processor Pipeline Organization
The Intel® 80200 processor single-issue superpipeline consists of a main execution pipeline, MAC pipeline, and a memory access pipeline.
B.2.1.3. Out Of Order Completion
Sequential consistency of ins.
B.2.2 Instruction Flow Through the Pipeline
The Intel® 80200 processor pipeline issues a single instruction per clock cycle.
B.2.3 Main Execution Pipeline
B.2.3.1. F1 / F2 (Instruction Fetch) Pipestages
The job of the instruction fetch stages F1 and F2 is to present the next instruction to be executed to the ID stage.
B.2.3.3. RF (Register File / Shifter) Pipestage
The main function of the RF pipestage is to read and write to the register file unit (RFU).
B.2.4 Memory Pipeline
The memory pipeline consists of two stages, D1 and D2. The data cache unit, or DCU, consists of the data-cache array, mini-data cache, fill buffers, and writebuffers.
B.3 Basic Optimizations
This chapter outlines optimizations specific to ARM architecture. These optimizations have been modified to suit the Intel® 80200 processor architecture where needed.
B.3.1.2. Optimizing Branches
Branches decrease application performance by indirectly causing pipeline stalls.
P2 Percentage of times we are likely to incur a branch mispredic.
B.3.1.3. Optimizing Complex Expressions
Conditional instructions should also be used to improve the code generated for complex expressions such as the C shortcut evaluation feature.
B.3.2 Bit Field Manipulation
The Intel® 80200 processor shift and logical operations provide a useful way of manipulating bit fields.
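The shift-and-mask idiom behind bit field manipulation can be sketched in a few lines. The example below is generic and not taken from the manual: the helper names are hypothetical, and words are treated as 32-bit unsigned values.

```python
WORD_MASK = 0xFFFFFFFF  # model a 32-bit register


def extract_field(word, lsb, width):
    """Return the `width`-bit field of `word` that starts at bit `lsb`
    (a right shift followed by a mask)."""
    return (word >> lsb) & ((1 << width) - 1)


def insert_field(word, lsb, width, value):
    """Return `word` with the field at [lsb + width - 1 : lsb] replaced
    by `value` (clear the old field, then OR in the shifted new value)."""
    mask = ((1 << width) - 1) << lsb
    return ((word & ~mask) | ((value << lsb) & mask)) & WORD_MASK
```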
B.3.3 Optimizing the Use of Immediate Values
The Intel® 80200 processor MOV or MVN instruction should be used when loading an immediate (constant) value into a register.
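Whether a constant can be loaded with a single MOV or MVN depends on the ARM data-processing immediate encoding, in which an operand is an 8-bit value rotated right by an even number of bits. That encoding rule comes from the ARM architecture rather than from the text above, and the helper names in this sketch are hypothetical.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def is_arm_immediate(value):
    """True if `value` is an 8-bit constant rotated right by an even
    amount, i.e. encodable as an ARM data-processing immediate."""
    value &= 0xFFFFFFFF
    # Undo each possible even rotation and see whether 8 bits remain.
    return any(rol32(value, rot) < 0x100 for rot in range(0, 32, 2))


def single_instruction_load(value):
    """Return 'MOV' or 'MVN' if one instruction can load `value`,
    otherwise None (the constant needs another strategy)."""
    value &= 0xFFFFFFFF
    if is_arm_immediate(value):
        return "MOV"
    if is_arm_immediate(~value & 0xFFFFFFFF):
        return "MVN"  # MVN loads the bitwise complement of its operand
    return None
```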
B.3.4 Optimizing Integer Multiply and Divide
Multiplication by an integer constant should be optimized to make use of the shift operation whenever possible.
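The idea of replacing a constant multiply with shifts can be illustrated by decomposing the constant into its set bits: each set bit contributes one shifted copy of the operand. This is a generic sketch with a hypothetical function name, not the manual's code sequence, and it handles non-negative constants only.

```python
def multiply_by_shifts(x, constant):
    """Compute x * constant using only shifts and adds.

    Each set bit k of `constant` adds (x << k) to the result,
    e.g. x * 10 = (x << 3) + (x << 1).
    """
    if constant < 0:
        raise ValueError("constant must be non-negative")
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += x << shift
        constant >>= 1
        shift += 1
    return result
```

Constants with few set bits, such as 9 or 10, need only one or two shifted additions, which is where the technique pays off.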
B-16 Marc h, 2 003 Develo per ’ s Manual Intel ® 80200 P rocesso r based o n Intel ® XScale ™ Microarchitecture Optimizat ion Guide B.3.5 Effective Use of Addressing Modes The Intel ® 80200 pr ocessor provide s a variety of addres sing modes that mak e indexing an array of objects high ly efficient.
B.4 Cache and Prefetch Optimizations
This chapter considers how to use the various cache memories in all their modes and then examines when and how to use prefetch to improve execution efficiencies.
B.4.1.4. Locking Code into the Instruction Cache
One very important instruction cache feature is the ability to lock code into the instruction cache.
B.4.2 Data and Mini Cache
The Intel® 80200 processor allows the user to define memory regions whose cache policies can be set by the user (see Section 6.
B.4.2.3. Read Allocate and Read-write Allocate Memory Regions
Most of the regular data and the stack for your application should be allocated to a read-write allocate region.
B.4.2.5. Mini-data Cache
The mini-data cache is best used for data structures that have short temporal lives and/or cover vast amounts of data space.
B.4.2.6. Data Alignment
Cache lines begin on 32-byte address boundaries.
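To make the 32-byte boundary concrete, the following C sketch rounds an address up to the next cache-line boundary and counts how many lines an object touches. The helper names and the `CACHE_LINE` macro are hypothetical; only the 32-byte line size is taken from the text.

```c
#include <stdint.h>

#define CACHE_LINE 32u  /* line size stated in the text */

/* Round addr up to the next 32-byte boundary (unchanged if already aligned). */
uintptr_t align_up_to_line(uintptr_t addr)
{
    return (addr + (CACHE_LINE - 1u)) & ~(uintptr_t)(CACHE_LINE - 1u);
}

/* Number of cache lines touched by `size` bytes starting at `addr`. */
unsigned lines_touched(uintptr_t addr, unsigned size)
{
    if (size == 0)
        return 0;
    uintptr_t first = addr / CACHE_LINE;
    uintptr_t last = (addr + size - 1u) / CACHE_LINE;
    return (unsigned)(last - first + 1u);
}
```

A 32-byte object that starts on a line boundary occupies one line, while the same object starting 16 bytes later straddles two.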
B.4.2.7. Literal Pools
The Intel® 80200 processor does not have a single instruction that can move all literals (a constant or address) to a register.
B.4.3 Cache Considerations
B.
B.4.4 Prefetch Considerations
The Intel® 80200 processor has a true prefetch load instruction (PLD). The purpose of this instruction is to preload data into the data and mini-data caches.
The Intel® 80200 processor needs seven bus clocks to process a memory request to the SDRAM (N processor).
B.4.4.2. Prefetch Loop Scheduling
When adding prefetch to a loop that operates on arrays, it may be advantageous to prefetch ahead one, two, or more iterations.
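Prefetching ahead in an array loop might look like the C sketch below. It uses the GCC/Clang built-in `__builtin_prefetch` as a stand-in for PLD; whether the compiler actually emits PLD depends on the target and options. The distance `PREFETCH_AHEAD` is an assumed tuning value, not a figure from the manual.

```c
#include <stddef.h>

/* How many elements ahead to prefetch: an assumed value to be tuned. */
#define PREFETCH_AHEAD 16

/* Sum an array while requesting data a fixed distance ahead of the
   element currently being consumed. */
long sum_with_prefetch(const int *data, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        /* Stay inside the array; read access, keep in cache. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 3);
#endif
        sum += data[i];
    }
    return sum;
}
```

The prefetch changes only timing, never the result, so the function returns the same sum with or without it.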
B.4.4.6. Bandwidth Limitations
Overuse of prefetches can usurp resources and degrade performance.
B.4.4.7. Cache Memory Considerations
Stride, the way data structures are walked through, can affect the temporal quality of the data and reduce or increase cache conflicts.
on a 32-byte boundary, modifications to the Year2Date fields are likely to use two write buffers when the data is written out to memory.
B.4.4.8. Cache Blocking
Cache blocking techniques, such as strip-mining, are used to improve temporal locality of the data.
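One way to picture strip-mining is the C sketch below, where two passes over an array are applied one strip at a time so the second pass finds its data still cached. The function name and the `STRIP` size are assumptions for illustration; a real strip size would be chosen to fit the cache in use.

```c
#include <stddef.h>

/* Elements per strip: an assumed value, to be sized to the data cache. */
#define STRIP 256

/* Scale every element by k, then accumulate the scaled values.
   Both passes run over one strip before moving to the next. */
long scale_then_sum(int *a, size_t n, int k)
{
    long total = 0;
    for (size_t base = 0; base < n; base += STRIP) {
        size_t end = (base + STRIP < n) ? base + STRIP : n;
        for (size_t i = base; i < end; i++)
            a[i] *= k;          /* first pass over the strip */
        for (size_t i = base; i < end; i++)
            total += a[i];      /* second pass reuses the same strip */
    }
    return total;
}
```

Without the outer strip loop, the second pass would start only after the whole array had been written, by which time the early elements may have been evicted.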
B.4.4.10. Pointer Prefetch
Not all looping constructs contain induction variables. However, prefetching techniques can still be applied.
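A linked-list walk is a typical loop without an induction variable. The sketch below prefetches the next node while the current one is processed, again using the GCC/Clang built-in `__builtin_prefetch` as a stand-in for PLD. The `struct node` layout and function name are hypothetical.

```c
#include <stddef.h>

/* Hypothetical list node used only for this illustration. */
struct node {
    int value;
    struct node *next;
};

/* Sum the list, requesting the next node before working on the current one. */
long sum_list(const struct node *p)
{
    long sum = 0;
    while (p != NULL) {
#if defined(__GNUC__) || defined(__clang__)
        if (p->next != NULL)
            __builtin_prefetch(p->next, 0, 3);
#endif
        sum += p->value;
        p = p->next;
    }
    return sum;
}
```

The benefit depends on how much work is done per node: with very little work, the prefetch has no time to complete before the pointer is followed.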
B.4.4.11. Loop Interchange
As mentioned earlier, the sequence in which data is accessed affects cache thrashing.
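The effect of access order can be illustrated with a two-dimensional array in C, which is stored row by row. In the sketch below, the first function walks down columns with a large stride, and the second is the interchanged form that walks consecutive addresses. The array dimensions and function names are assumptions.

```c
#define ROWS 64
#define COLS 64

/* Column-first walk: consecutive accesses are COLS ints apart, so each
   access tends to land in a different cache line. */
long sum_column_first(int m[ROWS][COLS])
{
    long sum = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            sum += m[i][j];
    return sum;
}

/* Interchanged loops: the inner loop walks consecutive addresses, so each
   fetched cache line is fully used before moving on. */
long sum_row_first(int m[ROWS][COLS])
{
    long sum = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            sum += m[i][j];
    return sum;
}
```

Both functions compute the same result; only the order of memory accesses differs.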
B.4.4.13. Prefetch to Reduce Register Pressure
Prefetch can be used to reduce register pressure.
B.5 Instruction Scheduling
This chapter discusses instruction scheduling optimizations.
; all other registers are in use
sub r1, r6, r7
mul r3, r6, r2
.
B.5.1.1. Scheduling Load and Store Double (LDRD/STRD)
The Intel® 80200 processor introduces two new double word instructions: LDRD and STRD.
B.5.1.2. Scheduling Load and Store Multiple (LDM/STM)
LDM and STM instructions have an issue latency of 2-20 cycles depending on the number of registers being loaded or stored.
B.5.2 Scheduling Data Processing Instructions
Most Intel® 80200 processor data processing instructions have a result latency of 1 cycle.
B.5.3 Scheduling Multiply Instructions
Multiply instructions can cause pipeline stalls due to either resource conflicts or result latencies.
B.5.4 Scheduling SWP and SWPB Instructions
The SWP and SWPB instructions have a 5 cycle issue latency. As a result of this latency, the instruction following the SWP/SWPB instruction would stall for 4 cycles.
B.5.5 Scheduling the MRA and MAR Instructions (MRRC/MCRR)
T.
B.5.6 Scheduling the MIA and MIAPH Instructions
The MIA instruction has an issue latency of 1 cycle.
B.5.7 Scheduling MRS and MSR Instructions
The MRS instruction has an issue latency of 1 cycle and a result latency of 2 cycles.
B.6 Optimizing C Libraries
Many of the standard C library routines can benefit greatly by being optimized for the Intel® 80200 processor architecture.
C Test Features
The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) implements Design For Test (DFT) techniques to ensure quality and reliability.
C.2.1 Boundary Scan Architecture
Boundary scan test logic consists of a Boundary-Scan register and support logic.
C.2.2 TAP Pins
The Intel® 80200 processor TAP is composed of four input connections (TMS, TCK, TRST# and TDI) and one output connection (TDO).
C.2.3 Instruction Register (IR)
The instruction register holds instruction codes shifted through the Test Data Input (TDI) pin.
Table C-3. IEEE Instructions
Instruction / Requisite   Opcode   Description
extest IEEE 1149.
C.2.4 TAP Test Data Registers
The Intel® 80200 processor contains a device identification register and two test data registers (Bypass and RUNBIST).
C.2.5 TAP Controller
The TAP controller is a 16-state synchronous finite state machine that controls the sequence of test logic operations.
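The 16-state machine can be modeled as a small transition table. The C sketch below follows the standard IEEE 1149.1 TAP state diagram, where the next state is selected by the value of TMS sampled on the rising edge of TCK. The enum and function names are hypothetical, and the model covers only state sequencing, not the registers themselves.

```c
/* The 16 TAP controller states of IEEE 1149.1. */
enum tap_state {
    TLR, RTI,
    SEL_DR, CAP_DR, SHIFT_DR, EXIT1_DR, PAUSE_DR, EXIT2_DR, UPDATE_DR,
    SEL_IR, CAP_IR, SHIFT_IR, EXIT1_IR, PAUSE_IR, EXIT2_IR, UPDATE_IR
};

/* tap_next[state][tms] is the state entered on the next rising TCK edge. */
static const enum tap_state tap_next[16][2] = {
    [TLR]       = { RTI,      TLR       },
    [RTI]       = { RTI,      SEL_DR    },
    [SEL_DR]    = { CAP_DR,   SEL_IR    },
    [CAP_DR]    = { SHIFT_DR, EXIT1_DR  },
    [SHIFT_DR]  = { SHIFT_DR, EXIT1_DR  },
    [EXIT1_DR]  = { PAUSE_DR, UPDATE_DR },
    [PAUSE_DR]  = { PAUSE_DR, EXIT2_DR  },
    [EXIT2_DR]  = { SHIFT_DR, UPDATE_DR },
    [UPDATE_DR] = { RTI,      SEL_DR    },
    [SEL_IR]    = { CAP_IR,   TLR       },
    [CAP_IR]    = { SHIFT_IR, EXIT1_IR  },
    [SHIFT_IR]  = { SHIFT_IR, EXIT1_IR  },
    [EXIT1_IR]  = { PAUSE_IR, UPDATE_IR },
    [PAUSE_IR]  = { PAUSE_IR, EXIT2_IR  },
    [EXIT2_IR]  = { SHIFT_IR, UPDATE_IR },
    [UPDATE_IR] = { RTI,      SEL_DR    },
};

/* Advance the controller by one TCK cycle with the given TMS level. */
enum tap_state tap_step(enum tap_state s, int tms)
{
    return tap_next[s][tms ? 1 : 0];
}
```

Starting from Test-Logic-Reset, the TMS sequence 0, 1, 1, 0, 0 reaches Shift-IR, and holding TMS high for five clocks returns the controller to Test-Logic-Reset from any state.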
C.2.5.1. Test Logic Reset State
In this state, test logic is disabled to allow normal operation of the Intel® 80200 processor.
C.2.5.5. Shift-DR State
In this controller state, the test data r.
C.2.5.9. Update-DR State
The Boundary-Scan register is provided with a latched parallel output.
C.2.5.13. Exit1-IR State
This is a temporary state. If TMS is held high on the rising edge of TCK, the controller enters the Update-IR state, which terminates the scanning process.
C.2.5.17. Boundary-Scan Example
In the example that follows, two command actions are described. The example starts in the reset state; a new instruction is loaded and executed.
Figure C-3. JTAG Example
Figure C-4. Timing Diagram Illustrating the Loading of Instruct.
Figure C-5. Timing Diagram Illustrating the Loading of Data Regis.