Compaq Network Card 21264 User Manual

21264/EV68A Microprocessor  
Hardware Reference Manual  
Part Number: DS–0038B–TE  
This manual is directly derived from the internal 21264/EV68A Specifications, Revi-  
sion 1.1. You can access this hardware reference manual in PDF format from the  
following site:  
ftp://ftp.compaq.com/pub/products/alphaCPUdocs  
Revision/Update Information:  
Revision 1.1, March 2002  
Compaq Computer Corporation  
Shrewsbury, Massachusetts  
Download from Www.Somanuals.com. All Manuals Search And Download.  
21264/EV68A Hardware Reference Manual  
iii  
Download from Www.Somanuals.com. All Manuals Search And Download.  
iv  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
21264/EV68A Hardware Reference Manual  
v
Download from Www.Somanuals.com. All Manuals Search And Download.  
vi  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Fault Reset Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  
7–8  
21264/EV68A Hardware Reference Manual  
vii  
Download from Www.Somanuals.com. All Manuals Search And Download.  
9–5  
Power Supply Sequencing and Avoiding Potential Failure Mechanisms . . . . . . . . . . . . . . .  
viii  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  
D–14  
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  
D–15  
21264/EV68A Hardware Reference Manual  
ix  
Download from Www.Somanuals.com. All Manuals Search And Download.  
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  
D–24  
x
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Figures  
21264/EV68A Hardware Reference Manual  
xi  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Power-Up Timing Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  
7–3  
xii  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Tables  
21264/EV68A Hardware Reference Manual  
xiii  
Download from Www.Somanuals.com. All Manuals Search And Download.  
xiv  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
21264/EV68A Hardware Reference Manual  
xv  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Preface  
Audience  
Content  
This manual is for system designers and programmers who use the Alpha 21264/  
EV68A microprocessor (referred to as the 21264/EV68A).  
This manual contains the following chapters and appendixes:  
Chapter 1, Introduction, introduces the 21264/EV68A and provides an overview of the  
Alpha architecture.  
Chapter 2, Internal Architecture, describes the major hardware functions and the inter-  
nal chip architecture. It describes performance measurement facilities, coding rules, and  
design examples.  
Chapter 3, Hardware Interface, lists and describes the internal hardware interface sig-  
nals, and provides mechanical data and packaging information, including signal pin  
lists.  
Chapter 4, Cache and External Interfaces, describes the external bus functions and  
transactions, lists bus commands, and describes the clock functions.  
Chapter 5, Internal Processor Registers, lists and describes the internal processor regis-  
ter set.  
Chapter 6, Privileged Architecture Library Code, describes the privileged architecture  
library code (PALcode).  
Chapter 7, Initialization and Configuration, describes the initialization and configura-  
tion sequence.  
Chapter 8, Error Detection and Error Handling, describes error detection and error han-  
dling.  
Chapter 9, Electrical Data, provides electrical data and describes signal integrity issues.  
Chapter 10, Thermal Management, provides information about thermal management.  
Chapter 11, Testability and Diagnostics, describes chip and system testability features.  
Appendix A, Alpha Instruction Set, summarizes the Alpha instruction set.  
Appendix B, 21264/EV68A Boundary-Scan Register, presents the BSDL description  
of the 21264/EV68A boundary-scan register.  
21264/EV68A Hardware Reference Manual  
xvii  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Appendix C, Serial Icache Load Predecode Values, provides a pointer to the Alpha  
Motherboards Software Developer’s Kit (SDK), which contains this information.  
Appendix D, PALcode Restrictions and Guidelines, lists restrictions and guidelines  
that must be adhered to when generating PALcode.  
Appendix E, 21264/EV68A-to-Bcache Pin Interface, provides the pin interface  
between the 21264/EV68A and Bcache SSRAMs.  
The Glossary lists and defines terms associated with the 21264/EV68A.  
An Index is provided at the end of the document.  
Documentation Included by Reference  
The companion volume to this manual, the Alpha Architecture Reference Manual,  
Fourth Edition, can be accessed from the following website: ftp.compaq.com/  
pub/products/alphaCPUdocs.  
xviii  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Terminology and Conventions  
This section defines the abbreviations, terminology, and other conventions used  
throughout this document.  
Abbreviations  
Binary Multiples  
The abbreviations K, M, and G (kilo, mega, and giga) represent binary multiples  
and have the following values.  
10  
K
M
G
=
=
=
2
2
2
(1024)  
20  
30  
(1,048,576)  
(1,073,741,824)  
For example:  
10  
2KB  
=
2 kilobytes  
4 megabytes  
8 gigabytes  
2 kilopixels  
4 megapixels  
=
=
=
=
=
2 × 2 bytes  
20  
4MB  
=
=
=
=
4 × 2 bytes  
30  
8GB  
8 × 2 bytes  
10  
2K pixels  
4M pixels  
2 × 2 pixels  
20  
4 × 2 pixels  
Register Access  
The abbreviations used to indicate the type of access to register fields and bits have  
the following definitions:  
Abbreviation Meaning  
IGN  
Ignore  
Bits and fields specified are ignored on writes.  
MBZ  
Must Be Zero  
Software must never place a nonzero value in bits and fields specified as  
MBZ. A nonzero read produces an Illegal Operand exception. Also, MBZ  
fields are reserved for future use.  
RAZ  
RC  
Read As Zero  
Bits and fields return a zero when read.  
Read Clears  
Bits and fields are cleared when read. Unless otherwise specified, such bits  
cannot be written.  
RES  
RO  
Reserved  
Bits and fields are reserved by Compaq and should not be used; however,  
zeros can be written to reserved fields that cannot be masked.  
Read Only  
The value may be read by software. It is written by hardware. Software write  
operations are ignored.  
RO,n  
Read Only, and takes the value n at power-on reset.  
The value may be read by software. It is written by hardware. Software write  
operations are ignored.  
21264/EV68A Hardware Reference Manual  
xix  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Abbreviation Meaning  
RW  
Read/Write  
Bits and fields can be read and written.  
RW,n  
W1C  
Read/Write, and takes the value n at power-on reset.  
Bits and fields can be read and written.  
Write One to Clear  
If read operations are allowed to the register, then the value may be read by  
software. If it is a write-only register, then a read operation by software  
returns an UNPREDICTABLE result. Software write operations of a 1 cause  
the bit to be cleared by hardware. Software write operations of a 0 do not  
modify the state of the bit.  
W1S  
Write One to Set  
If read operations are allowed to the register, then the value may be read by  
software. If it is a write-only register, then a read operation by software  
returns an UNPREDICTABLE result. Software write operations of a 1 cause  
the bit to be set by hardware. Software write operations of a 0 do not modify  
the state of the bit.  
WO  
Write Only  
Bits and fields can be written but not read.  
WO,n  
Write Only, and takes the value n at power-on reset.  
Bits and fields can be written but not read.  
Sign extension  
SEXT(x) means x is sign-extended to the required size.  
Addresses  
Unless otherwise noted, all addresses and offsets are hexadecimal.  
Aligned and Unaligned  
The terms aligned and naturally aligned are interchangeable and refer to data objects  
that are powers of two in size. An aligned datum of size 2n is stored in memory at a  
byte address that is a multiple of 2n; that is, one that has n low-order zeros. For ex-  
ample, an aligned 64-byte stack frame has a memory address that is a multiple of 64.  
A datum of size 2n is unaligned if it is stored in a byte address that is not a multiple of  
2n.  
Bit Notation  
Multiple-bit fields can include contiguous and noncontiguous bits contained in square  
brackets ([]). Multiple contiguous bits are indicated by a pair of numbers separated by a  
colon [:]. For example, [9:7,5,2:0] specifies bits 9,8,7,5,2,1, and 0. Similarly, single bits  
are frequently indicated with square brackets. For example, [27] specifies bit 27. See  
also Field Notation.  
Caution  
Cautions indicate potential damage to equipment or loss of data.  
xx  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Data Units  
The following data unit terminology is used throughout this manual.  
Term  
Words  
Bytes  
Bits  
Other  
Byte  
½
1
1
2
4
8
8
16  
32  
64  
Word  
Longword  
Quadword  
2
Dword  
2 longword  
4
Do Not Care (X)  
A capital X represents any valid value.  
External  
Unless otherwise stated, external means not contained in the chip.  
Field Notation  
The names of single-bit and multiple-bit fields can be used rather than the actual bit  
numbers (see Bit Notation). When the field name is used, it is contained in square  
brackets ([]). For example, RegisterName[LowByte] specifies RegisterName[7:0].  
Note  
Notes emphasize particularly important information.  
Numbering  
All numbers are decimal or hexadecimal unless otherwise indicated. The prefix 0x indi-  
cates a hexadecimal number. For example, 19 is decimal, but 0x19 and 0x19A are hexa-  
decimal (also see Addresses). Otherwise, the base is indicated by a subscript; for  
example, 100 is a binary number.  
2
Ranges and Extents  
Ranges are specified by a pair of numbers separated by two periods (..) and are inclu-  
sive. For example, a range of integers 0..4 includes the integers 0, 1, 2, 3, and 4.  
Extents are specified by a pair of numbers in square brackets ([]) separated by a colon  
(:) and are inclusive. Bit fields are often specified as extents. For example, bits [7:3]  
specifies bits 7, 6, 5, 4, and 3.  
Register Figures  
The gray areas in register figures indicate reserved or unused bits and fields.  
Bit ranges that are coupled with the field name specify the bits of the named field that  
are included in the register. The bit range may, but need not necessarily, correspond to  
the bit Extent in the register. See the explanation above Table 5–1 for more information.  
Signal Names  
The following examples describe signal-name conventions used in this document.  
21264/EV68A Hardware Reference Manual  
xxi  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
AlphaSignal[n:n]  
Boldface, mixed-case type denotes signal names that are  
assigned internal and external to the 21264/EV68A (that  
is, the signal traverses a chip interface pin).  
AlphaSignal_x[n:n] When a signal has high and low assertion states, a lower-  
case italic x represents the assertion states. For example,  
SignalName_x[3:0] represents SignalName_H[3:0] and  
SignalName_L[3:0].  
UNDEFINED  
Operations specified as UNDEFINED may vary from moment to moment, implementa-  
tion to implementation, and instruction to instruction within implementations. The  
operation may vary in effect from nothing to stopping system operation.  
UNDEFINED operations may halt the processor or cause it to lose information. How-  
ever, UNDEFINED operations must not cause the processor to hang, that is, reach an  
unhalted state from which there is no transition to a normal state in which the machine  
executes instructions.  
UNPREDICTABLE  
UNPREDICTABLE results or occurrences do not disrupt the basic operation of the pro-  
cessor; it continues to execute instructions in its normal manner. Further:  
Results or occurrences specified as UNPREDICTABLE may vary from moment to  
moment, implementation to implementation, and instruction to instruction within  
implementations. Software can never depend on results specified as UNPREDICT-  
ABLE.  
An UNPREDICTABLE result may acquire an arbitrary value subject to a few con-  
straints. Such a result may be an arbitrary function of the input operands or of any  
state information that is accessible to the process in its current access mode.  
UNPREDICTABLE results may be unchanged from their previous values.  
Operations that produce UNPREDICTABLE results may also produce exceptions.  
An occurrence specified as UNPREDICTABLE may happen or not based on an  
arbitrary choice function. The choice function is subject to the same constraints as  
are UNPREDICTABLE results and, in particular, must not constitute a security  
hole.  
Specifically, UNPREDICTABLE results must not depend upon, or be a function of,  
the contents of memory locations or registers that are inaccessible to the current  
process in the current access mode.  
Also, operations that may produce UNPREDICTABLE results must not:  
Write or modify the contents of memory locations or registers to which the cur-  
rent process in the current access mode does not have access, or  
Halt or hang the system or any of its components.  
For example, a security hole would exist if some UNPREDICTABLE result  
depended on the value of a register in another process, on the contents of processor  
temporary registers left behind by some previously running process, or on a  
sequence of actions of different processes.  
xxii  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
X
Do not care. A capital X represents any valid value.  
21264/EV68A Hardware Reference Manual  
xxiii  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Download from Www.Somanuals.com. All Manuals Search And Download.  
1
Introduction  
This chapter provides a brief introduction to the Alpha architecture, Compaq’s RISC  
(reduced instruction set computing) architecture designed for high performance. The  
chapter then summarizes the specific features of the Alpha 21264/EV68A microproces-  
sor (hereafter called the 21264/EV68A) that implements the Alpha architecture. Appen-  
dix A provides a list of Alpha instructions.  
The companion volume to this document, the Alpha Architecture Reference Manual,  
Fourth Edition, contains the complete architecture information.  
1.1 The Architecture  
The Alpha architecture is a 64-bit load and store RISC architecture designed with par-  
ticular emphasis on speed, multiple instruction issue, multiple processors, and software  
migration from many operating systems.  
All registers are 64 bits long and all operations are performed between 64-bit registers.  
All instructions are 32 bits long. Memory operations are either load or store operations.  
All data manipulation is done between registers.  
The Alpha architecture supports the following data types:  
8-, 16-, 32-, and 64-bit integers  
IEEE 32-bit and 64-bit floating-point formats  
VAX architecture 32-bit and 64-bit floating-point formats  
In the Alpha architecture, instructions interact with each other only by one instruction  
writing to a register or memory location and another instruction reading from that regis-  
ter or memory location. This use of resources makes it easy to build implementations  
that issue multiple instructions every CPU cycle.  
The 21264/EV68A uses a set of subroutines, called privileged architecture library code  
(PALcode), that is specific to a particular Alpha operating system implementation and  
hardware platform. These subroutines provide operating system primitives for context  
switching, interrupts, exceptions, and memory management. These subroutines can be  
invoked by hardware or CALL_PAL instructions. CALL_PAL instructions use the  
function field of the instruction to vector to a specified subroutine. PALcode is written  
in standard machine code with some implementation-specific extensions to provide  
direct access to low-level hardware functions. PALcode supports optimizations for mul-  
tiple operating systems, flexible memory-management implementations, and multi-  
instruction atomic sequences.  
21264/EV68A Hardware Reference Manual  
Introduction  
1–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
The Architecture  
The Alpha architecture performs byte shifting and masking with normal 64-bit, regis-  
ter-to-register instructions. The 21264/EV68A performs single-byte and single-word  
load and store instructions.  
1.1.1 Addressing  
The basic addressable unit in the Alpha architecture is the 8-bit byte. The 21264/  
EV68A supports a 48-bit or 43-bit virtual address (selectable under IPR control).  
Virtual addresses as seen by the program are translated into physical memory addresses  
by the memory-management mechanism. The 21264/EV68A supports a 44-bit physical  
address.  
1.1.2 Integer Data Types  
Alpha architecture supports the four integer data types listed in Table 1–1.  
Table 1–1 Integer Data Types  
Data Type  
Description  
Byte  
A byte is 8 contiguous bits that start at an addressable byte boundary.  
A byte is an 8-bit value.  
Word  
A word is 2 contiguous bytes that start at an arbitrary byte boundary.  
A word is a 16-bit value.  
Longword  
A longword is 4 contiguous bytes that start at an arbitrary byte boundary. A  
longword is a 32-bit value.  
Quadword  
A quadword is 8 contiguous bytes that start at an arbitrary byte boundary.  
Note:  
Alpha implementations may impose a significant performance penalty  
when accessing operands that are not naturally aligned. Refer to the Alpha  
Architecture Handbook, Version 4 for details.  
1.1.3 Floating-Point Data Types  
The 21264/EV68A supports the following floating-point data types:  
Longword integer format in floating-point unit  
Quadword integer format in floating-point unit  
IEEE floating-point formats  
S_floating  
T_floating  
VAX floating-point formats  
F_floating  
G_floating  
D_floating (limited support)  
1–2  
Introduction  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
21264/EV68A Microprocessor Features  
1.2 21264/EV68A Microprocessor Features  
The 21264/EV68A microprocessor is a superscalar pipelined processor. It is packaged  
in a 587-pin PGA carrier and has removable application-specific heat sinks. A number  
of configuration options allow its use in a range of system designs ranging from  
extremely simple uniprocessor systems with minimum component count to high-per-  
formance multiprocessor systems with very high cache and memory bandwidth.  
The 21264/EV68A can issue four Alpha instructions in a single cycle, thereby minimiz-  
ing the average cycles per instruction (CPI). A number of low-latency and/or high-  
throughput features in the instruction issue unit and the onchip components of the mem-  
ory subsystem further reduce the average CPI.  
The 21264/EV68A and associated PALcode implements IEEE single-precision and  
double-precision, VAX F_floating and G_floating data types, and supports longword  
(32-bit) and quadword (64-bit) integers. Byte (8-bit) and word (16-bit) support is pro-  
vided by byte-manipulation instructions. Limited hardware support is provided for the  
VAX D_floating data type.  
Other 21264/EV68A features include:  
The ability to issue up to four instructions during each CPU clock cycle.  
A peak instruction execution rate of four times the CPU clock frequency.  
An onchip, demand-paged memory-management unit with translation buffer, which,  
when used with PALcode, can implement a variety of page table structures and trans-  
lation algorithms. The unit consists of a 128-entry, fully-associative data translation  
buffer (DTB) and a 128-entry, fully-associative instruction translation buffer (ITB),  
with each entry able to map a single 8KB page or a group of 8, 64, or 512 8KB  
pages. The allocation scheme for the ITB and DTB is round-robin. The size of each  
translation buffer entry’s group is specified by hint bits stored in the entry. The  
DTB and ITB implement 8-bit address space numbers (ASN), MAX_ASN=255.  
Two onchip, high-throughput pipelined floating-point units, capable of executing  
both VAX and IEEE floating-point data types.  
An onchip, 64KB virtually-addressed instruction cache with 8-bit ASNs  
(MAX_ASN=255).  
An onchip, virtually-indexed, physically-tagged dual-read-ported, 64KB data  
cache.  
Supports a 48-bit or 43-bit virtual address (program selectable).  
Supports a 44-bit physical address.  
An onchip I/O write buffer with four 64-byte entries for I/O write transactions.  
An onchip, 8-entry victim data buffer.  
An onchip, 32-entry load queue.  
An onchip, 32-entry store queue.  
An onchip, 8-entry miss address file for cache fill requests and I/O read  
transactions.  
An onchip, 8-entry probe queue, holding pending system port probe commands.  
21264/EV68A Hardware Reference Manual  
Introduction  
1–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
21264/EV68A Microprocessor Features  
An onchip, duplicate tag array used to maintain level 2 cache coherency.  
A 64-bit data bus with onchip parity and error correction code (ECC) support.  
Support for an external second-level (Bcache) cache. The size and some timing  
parameters of the Bcache are programmable.  
An internal clock generator providing a high-speed clock used by the 21264/  
EV68A, and two clocks for use by the CPU module.  
Onchip performance counters to measure and analyze CPU and system perfor-  
mance.  
Chip and module level test support, including an instruction cache test interface to  
support chip and module level testing.  
A 2.0-V external interface.  
Refer to Chapter 9 for 21264/EV68A dc and ac electrical characteristics. Refer to the  
Alpha Architecture Handbook, Version 4, Appendix E, for waivers and any other  
implementation-dependent information.  
1–4  
Introduction  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
2
Internal Architecture  
This chapter provides both an overview of the 21264/EV68A microarchitecture and a sys-  
tem designer’s view of the 21264/EV68A implementation of the Alpha architecture. The  
combination of the 21264/EV68A microarchitecture and privileged architecture library  
code (PALcode) defines the chip’s implementation of the Alpha architecture. If a certain  
piece of hardware seems to be “architecturally incomplete,” the missing functionality is  
implemented in PALcode. Chapter 6 provides more information on PALcode.  
This chapter describes the major functional hardware units and is not intended to be a  
detailed hardware description of the chip. It is organized as follows:  
21264/EV68A microarchitecture  
Pipeline organization  
Instruction issue and retire rules  
Load instructions to R31/F31 (software-directed instruction prefetch)  
Special cases of Alpha instruction execution  
Memory and I/O address space  
Miss address file (MAF) and load-merging rules  
Instruction ordering  
Replay traps  
I/O write buffer and the WMB instruction  
Performance measurement support  
Floating-point control register  
AMASK and IMPLVER instruction values  
Design examples  
2.1 21264/EV68A Microarchitecture  
The 21264/EV68A microprocessor is a high-performance third-generation implementa-  
tion of the Compaq Alpha architecture. The 21264/EV68A consists of the following  
sections, as shown in Figure 2–1:  
Instruction fetch, issue, and retire unit (Ibox)  
Integer execution unit (Ebox)  
21264/EV68A Hardware Reference Manual  
Internal Architecture  
2–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
21264/EV68A Microarchitecture  
Floating-point execution unit (Fbox)  
Onchip caches (Icache and Dcache)  
Memory reference unit (Mbox)  
External cache and system interface unit (Cbox)  
Pipeline operation sequence  
2.1.1 Instruction Fetch, Issue, and Retire Unit  
The instruction fetch, issue, and retire unit (Ibox) consists of the following subsections:  
Virtual program counter logic  
Branch predictor  
Instruction-stream translation buffer (ITB)  
Instruction fetch logic  
Register rename maps  
Integer and floating-point issue queues  
Exception and interrupt logic  
Retire logic  
2.1.1.1 Virtual Program Counter Logic  
The virtual program counter (VPC) logic maintains the virtual addresses for instruc-  
tions that are in flight. There can be up to 80 instructions, in 20 successive fetch slots, in  
flight between the register rename mappers and the end of the pipeline. The VPC logic  
contains a 20-entry table to store these fetched VPC addresses.  
2–2  
Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
21264/EV68A Microarchitecture  
Figure 2–1 21264/EV68A Block Diagram  
Instruction Cache  
Ibox  
Four  
Instructions  
Physical  
Address  
Virtual Address  
Fetch Unit  
ITB  
VPC  
Queue  
Next Address  
Predecode  
Retire  
Unit  
Decode and  
Rename Registers  
128  
Branch  
Predictor  
Integer Issue Queue  
(20 Entries)  
FP Issue Queue  
(15 Entries)  
Cbox  
Cache  
Data  
Probe  
128  
Queue  
Ebox  
Fbox  
Cache  
Index  
Duplicate  
Tag Store  
Address  
ALU 0  
(L0)  
INT  
UNIT  
0
INT  
UNIT  
1
Address  
ALU 1  
(L1)  
FP  
ADD  
DIV  
FP  
MUL  
20  
(U0)  
(U1)  
SQRT  
System  
Bus  
IOWB  
64  
Victim  
Buffer  
Integer Registers 0  
(80 Registers)  
Integer Registers 1  
(80 Registers)  
FP Registers  
(72 Registers)  
System  
Address  
15  
Arbiter  
Data  
Mbox  
DTB  
Load  
Store  
Queue  
Miss Address  
File  
(Dual-ported, 128-entry) Queue  
128  
Physical  
Address  
Data  
Dual-Ported Data Cache  
FM-05642-AI4  
2.1.1.2 Branch Predictor  
The branch predictor is composed of three units: the local, global, and choice predic-  
tors. Figure 2–2 shows how the branch predictor generates the predicted branch  
address.  
21264/EV68A Hardware Reference Manual  
Internal Architecture  
2–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
21264/EV68A Microarchitecture  
Figure 2–2 Branch Predictor  
Local  
Global  
Choice  
Predictor  
Predictor  
Predictor  
Predicted  
Branch  
Address  
FM-05810.AI4  
Local Predictor  
The local predictor uses a 2-level table that holds the history of individual branches.  
The 2-level table design approaches the prediction accuracy of a larger single-level  
table while requiring fewer total bits of storage. Figure 2–3 shows how the local pre-  
dictor generates a prediction. Bits [11:2] of the VPC of the current branch are used as  
the index to a 1K entry table in which each entry is a 10-bit value. This 10-bit value is  
used as the index to a 1K entry table of 3-bit saturating counters. The value of the satu-  
rating counter determines the predication, taken/not-taken, of the current branch.  
Figure 2–3 Local Predictor  
VPC[11:2]  
Local  
History  
Table  
1K x 10  
10  
10  
Local  
Index  
Predictor  
1K x 3  
+/-  
3
3
1
Local Branch Prediction  
FM-05811.AI4  
Global Predictor  
The global predictor is indexed by a global history of all recent branches. The global  
predictor correlates the local history of the current branch with all recent branches. Fig-  
ure 2–4 shows how the global predictor generates a prediction. The global path history  
is comprised of the taken/not-taken state of the 12 most-recent branches. These 12  
states are used to form an index into a 4K entry table of 2-bit saturating counters. The  
value of the saturating counter determines the predication, taken/not-taken, of the cur-  
rent branch.  
2–4  
Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
21264/EV68A Microarchitecture  
Figure 2–4 Global Predictor  
Global  
Path  
History  
12  
Global  
Index  
Predictor  
4K x 2  
+/-  
2
2
1
Global Branch Prediction  
FM-05812.AI4  
Choice Predictor  
The choice predictor monitors the history of the local and global predictors and chooses  
the best of the two predictors for a particular branch. Figure 2–5 shows how the choice  
predictor generates its choice of the result of the local or global prediction. The 12-bit  
global path history (see Figure 2–4) is used to index a 4K entry table of 2-bit saturating  
counters. The value of the saturating counter determines the choice between the outputs  
of the local and global predictors.  
Figure 2–5 Choice Predictor  
Global  
Path  
History  
12  
2
Choice  
Predictor  
4K x 2  
Choice Prediction  
2
1
FM-05813.AI4  
2.1.1.3 Instruction-Stream Translation Buffer  
The Ibox includes a 128-entry, fully-associative instruction-stream translation buffer  
(ITB) that is used to store recently used instruction-stream (Istream) address transla-  
tions and page protection information. Each of the entries in the ITB can map 1, 8, 64,  
or 512 contiguous 8KB pages. The allocation scheme is round-robin.  
The ITB supports an 8-bit ASN and contains an ASM bit. The Icache is virtually  
addressed and contains the access-check information, so the ITB is accessed only for  
Istream references that miss in the Icache.  
Istream transactions to I/O address space are UNDEFINED.  
21264/EV68A Hardware Reference Manual  
Internal Architecture  
2–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
21264/EV68A Microarchitecture  
2.1.1.4 Instruction Fetch Logic  
The instruction prefetcher (predecode) reads an octaword, containing up to four natu-  
rally aligned instructions per cycle, from the Icache. Branch prediction and line predic-  
tion bits accompany the four instructions. The branch prediction scheme operates most  
efficiently when only one branch instruction is contained among the four fetched  
instructions. The line prediction scheme attempts to predict the Icache line that the  
branch predictor will generate, and is described in Section 2.2.  
An entry from the subroutine return prediction stack, together with set prediction bits  
for use by the Icache stream controller, are fetched along with the octaword. The Icache  
stream controller generates fetch requests for additional Icache lines and stores the  
Istream data in the Icache. There is no separate buffer to hold Istream requests.  
2.1.1.5 Register Rename Maps  
The instruction prefetcher forwards instructions to the integer and floating-point regis-  
ter rename maps. The rename maps perform the two functions listed here:  
Eliminate register write-after-read (WAR) and write-after-write (WAW) data  
dependencies while preserving true read-after-write (RAW) data dependencies, in  
order to allow instructions to be dynamically rescheduled.  
Provide a means of speculatively executing instructions before the control flow  
previous to those instructions is resolved. Both exceptions and branch  
mispredictions represent deviations from the control flow predicted by the  
instruction prefetcher.  
The map logic translates each instruction’s operand register specifiers from the virtual  
register numbers in the instruction to the physical register numbers that hold the corre-  
sponding architecturally-correct values. The map logic also renames each instruction’s  
destination register specifier from the virtual number in the instruction to a physical  
register number chosen from a list of free physical registers, and updates the register  
maps.  
The map logic can process four instructions per cycle. It does not return the physical  
register, which holds the old value of an instruction’s virtual destination register, to the  
free list until the instruction has been retired, indicating that the control flow up to that  
instruction has been resolved.  
If a branch mispredict or exception occurs, the map logic backs up the contents of the  
integer and floating-point register rename maps to the state associated with the instruc-  
tion that triggered the condition, and the prefetcher restarts at the appropriate VPC. At  
most, 20 valid fetch slots containing up to 80 instructions can be in flight between the  
register maps and the end of the machine’s pipeline, where the control flow is finally  
resolved. The map logic is capable of backing up the contents of the maps to the state  
associated with any of these 80 instructions in a single cycle.  
The register rename logic places instructions into an integer or floating-point issue  
queue, from which they are later issued to functional units for execution.  
2.1.1.6 Integer Issue Queue  
The 20-entry integer issue queue (IQ), associated with the integer execution units  
(Ebox), issues the following types of instructions at a maximum rate of four per cycle:  
2–6  
Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
21264/EV68A Microarchitecture  
Integer operate  
Integer conditional branch  
Unconditional branch – both displacement and memory format  
Integer and floating-point load and store  
PAL-reserved instructions: HW_MTPR, HW_MFPR, HW_LD, HW_ST,  
HW_RET  
Integer-to-floating-point (ITOFx) and floating-point-to-integer (FTOIx)  
Each queue entry asserts four request signals—one for each of the Ebox subclusters. A  
queue entry asserts a request when it contains an instruction that can be executed by the  
subcluster, if the instruction’s operand register values are available within the subclus-  
ter.  
There are two arbiters—one for the upper subclusters and one for the lower subclusters.  
(Subclusters are described in Section 2.1.2.) Each arbiter picks two of the possible 20  
requesters for service each cycle. A given instruction only requests upper subclusters or  
lower subclusters, but because many instructions can only be executed in one type or  
another this is not too limiting.  
For example, load and store instructions can only go to lower subclusters and shift  
instructions can only go to upper subclusters. Other instructions, such as addition and  
logic operations, can execute in either upper or lower subclusters and are statically  
assigned before being placed in the IQ.  
The IQ arbiters choose between simultaneous requesters of a subcluster based on the  
age of the request—older requests are given priority over newer requests. If a given  
instruction requests both lower subclusters, and no older instruction requests a lower  
subcluster, then the arbiter assigns subcluster L0 to the instruction. If a given instruction  
requests both upper subclusters, and no older instruction requests an upper subcluster,  
then the arbiter assigns subcluster U1 to the instruction. This asymmetry between the  
upper and lower subcluster arbiters is a circuit implementation optimization with negli-  
gible overall performance effect.  
2.1.1.7 Floating-Point Issue Queue  
The 15-entry floating-point issue queue (FQ) associated with the Fbox issues the fol-  
lowing instruction types:  
Floating-point operates  
Floating-point conditional branches  
Floating-point stores  
Floating-point register to integer register transfers (FTOIx)  
Each queue entry has three request lines—one for the add pipeline, one for the multiply  
pipeline, and one for the two store pipelines. There are three arbiters—one for each of  
the add, multiply, and store pipelines. The add and multiply arbiters pick one requester  
per cycle, while the store pipeline arbiter picks two requesters per cycle, one for each  
store pipeline.  
21264/EV68A Hardware Reference Manual  
Internal Architecture  
2–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
21264/EV68A Microarchitecture  
The FQ arbiters pick between simultaneous requesters of a pipeline based on the age of  
the request—older requests are given priority over newer requests. Floating-point store  
instructions and FTOIx instructions in even-numbered queue entries arbitrate for one  
store port. Floating-point store instructions and FTOIx instructions in odd-numbered  
queue entries arbitrate for the second store port.  
Floating-point store instructions and FTOIx instructions are queued in both the integer  
and floating-point queues. They wait in the floating-point queue until their operand reg-  
ister values are available. They subsequently request service from the store arbiter.  
Upon being issued from the floating-point queue, they signal the corresponding entry in  
the integer queue to request service. Upon being issued from the integer queue, the  
operation is completed.  
2.1.1.8 Exception and Interrupt Logic  
There are two types of exceptions: faults and synchronous traps. Arithmetic exceptions  
are precise and are reported as synchronous traps.  
The four sources of interrupts are listed as follows:  
Level-sensitive hardware interrupts sourced by the IRQ_H[5:0] pins  
Edge-sensitive hardware interrupts generated by the serial line receive pin,  
performance counter overflows, and hardware corrected read errors  
Software interrupts sourced by the software interrupt request (SIRR) register  
Asynchronous system traps (ASTs)  
Interrupt sources can be individually masked. In addition, AST interrupts are qualified  
by the current processor mode.  
2.1.1.9 Retire Logic  
The Ibox fetches instructions in program order, executes them out of order, and then  
retires them in order. The Ibox retire logic maintains the architectural state of the  
machine by retiring an instruction only if all previous instructions have executed with-  
out generating exceptions or branch mispredictions. Retiring an instruction commits the  
machine to any changes the instruction may have made to the software-visible state.  
The three software-visible states are listed as follows:  
Integer and floating-point registers  
Memory  
Internal processor registers (including control/status registers and translation  
buffers)  
The retire logic can sustain a maximum retire rate of eight instructions per cycle, and  
can retire up to as many as 11 instructions in a single cycle.  
2.1.2 Integer Execution Unit  
The integer execution unit (Ebox) is a 4-path integer execution unit that is implemented  
as two functional-unit “clusters” labeled 0 and 1. Each cluster contains a copy of an 80-  
entry, physical-register file and two “subclusters”, named upper (U) and lower (L). Fig-  
ure 2–6 shows the integer execution unit. In the figure, iop_wr is the cross-cluster bus  
for moving integer result values between clusters.  
2–8  
Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
21264/EV68A Microarchitecture  
Figure 2–6 Integer Execution Unit—Clusters 0 and 1  
iop_wr  
iop_wr  
U0  
Register  
L0  
U1  
Register  
L1  
iop_wr  
iop_wr  
Load/Store Data  
Load/Store Data  
eff_VA  
eff_VA  
FM-05643.AI4  
Most instructions have 1-cycle latency for consumers that execute within the same clus-  
ter. Also, there is another 1-cycle delay associated with producing a value in one cluster  
and consuming the value in the other cluster. The instruction issue queue minimizes the  
performance effect of this cross-cluster delay. The Ebox contains the following  
resources:  
Four 64-bit adders that are used to calculate results for integer add instructions  
(located in U0, U1, L0, and L1)  
The adders in the lower subclusters that are used to generate the effective virtual  
address for load and store instructions (located in L0 and L1)  
Four logic units  
Two barrel shifters and associated byte logic (located in U0 and U1)  
Two sets of conditional branch logic (located in U0 and U1)  
Two copies of an 80-entry register file  
One pipelined multiplier (located in U1) with 7-cycle latency for all integer multiply  
operations  
One fully-pipelined unit (located in U0), with 3-cycle latency, that executes the fol-  
lowing instructions:  
CTLZ, CTPOP, CTTZ  
PERR, MINxxx, MAXxxx, UNPKxx, PKxx  
21264/EV68A Hardware Reference Manual  
Internal Architecture  
2–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
21264/EV68A Microarchitecture  
The Ebox has 80 register-file entries that contain storage for the values of the 31 Alpha  
integer registers (the value of R31 is not stored), the values of 8 PALshadow registers,  
and 41 results written by instructions that have not yet been retired.  
Ignoring cross-cluster delay, the two copies of the Ebox register file contain identical  
values. Each copy of the Ebox register file contains four read ports and six write ports.  
The four read ports are used to source operands to each of the two subclusters within a  
cluster. The six write ports are used as follows:  
Two write ports are used to write results generated within the cluster.  
Two write ports are used to write results generated by the other cluster.  
Two write ports are used to write results from load instructions. These two ports  
are also used for FTOIx instructions.  
2.1.3 Floating-Point Execution Unit  
The floating-point execution unit (Fbox) has two paths. The Fbox executes both VAX  
and IEEE floating-point instructions. It supports IEEE S_floating-point and T_floating-  
point data types and all rounding modes. It also supports VAX F_floating-point and  
G_floating-point data types, and provides limited support for D_floating-point format.  
The basic structure of the floating-point execution unit is shown in Figure 2–7.  
Figure 2–7 Floating-Point Execution Units  
Floating-Point  
Execution Units  
FP Mul  
Reg  
FP Add  
FP Div  
SQRT  
LK98-0004A  
The Fbox contains the following resources:  
72-entry physical register file  
Fully-pipelined multiplier with 4-cycle latency  
Fully-pipelined adder with 4-cycle latency  
Nonpipelined divide unit associated with the adder pipeline  
Nonpipelined square root unit associated with the adder pipeline  
The 72 Fbox register file entries contain storage for the values of the 31 Alpha floating-  
point registers (F31 is not stored) and 41 values written by instructions that have not  
been retired.  
2–10 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
21264/EV68A Microarchitecture  
The Fbox register file contains six reads ports and four write ports. Four read ports are  
used to source operands to the add and multiply pipelines, and two read ports are used  
to source data for store instructions. Two write ports are used to write results generated  
by the add and multiply pipelines, and two write ports are used to write results from  
floating-point load instructions.  
2.1.4 External Cache and System Interface Unit  
The interface for the system and external cache (Cbox) controls the Bcache and system  
ports. It contains the following structures:  
Victim address file (VAF)  
Victim data file (VDF)  
I/O write buffer (IOWB)  
Probe queue (PQ)  
Duplicate Dcache tag (DTAG)  
2.1.4.1 Victim Address File and Victim Data File  
The victim address file (VAF) and victim data file (VDF) together form an 8-entry vic-  
tim buffer used for holding:  
Dcache blocks to be written to the Bcache  
Istream cache blocks from memory to be written to the Bcache  
Bcache blocks to be written to memory  
Cache blocks sent to the system in response to probe commands  
2.1.4.2 I/O Write Buffer  
The I/O write buffer (IOWB) consists of four 64-byte entries and associated address  
and control logic used for buffering I/O write data between the store queue and the sys-  
tem port.  
2.1.4.3 Probe Queue  
The probe queue (PQ) is an 8-entry queue that holds pending system port cache probe  
commands and addresses.  
2.1.4.4 Duplicate Dcache Tag Array  
The duplicate Dcache tag (DTAG) array holds a duplicate copy of the Dcache tags and  
is used by the Cbox when processing Dcache fills, Icache fills, and system port probes.  
2.1.5 Onchip Caches  
The 21264/EV68A contains two onchip primary-level caches.  
2.1.5.1 Instruction Cache  
The instruction cache (Icache) is a 64KB virtual-addressed, 2-way set-predict cache.  
Set prediction is used to approximate the performance of a 2-set cache without slowing  
the cache access time. Each Icache block contains:  
16 Alpha instructions (64 bytes)  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                       
21264/EV68A Microarchitecture  
Virtual tag bits [47:15]  
8-bit address space number (ASN) field  
1-bit address space match (ASM) bit  
1-bit PALcode bit to indicate physical addressing  
Valid bit  
Data and tag parity bits  
Four access-check bits for the following modes: kernel, executive, supervisor, and  
user (KESU)  
Additional predecoded information to assist with instruction processing and fetch  
control  
2.1.5.2 Data Cache  
The data cache (Dcache) is a 64KB, 2-way set-associative, virtually indexed, physically  
tagged, write-back, read/write allocate cache with 64-byte blocks. During each cycle  
the Dcache can perform one of the following transactions:  
Two quadword (or shorter) read transactions to arbitrary addresses  
Two quadword write transactions to the same aligned octaword  
Two non-overlapping less-than-quadword writes to the same aligned quadword  
One sequential read and write transaction from and to the same aligned octaword  
Each Dcache block contains:  
64 data bytes and associated quadword ECC bits  
Physical tag bits  
Valid, dirty, shared, and modified bits  
Tag parity bit calculated across the tag, dirty, shared, and modified bits  
One bit to control round-robin set allocation (one bit per two cache blocks)  
The Dcache contains two sets, each with 512 rows containing 64-byte blocks per row  
(that is, 32K bytes of data per set). The 21264/EV68A requires two additional bits of  
virtual address beyond the bits that specify an 8KB page, in order to specify a Dcache  
row index. A given virtual address might be found in four unique locations in the  
Dcache, depending on the virtual-to-physical translation for those two bits. The 21264/  
EV68A prevents this aliasing by keeping only one of the four possible translated  
addresses in the cache at any time.  
2.1.6 Memory Reference Unit  
The memory reference unit (Mbox) controls the Dcache and ensures architecturally  
correct behavior for load and store instructions. The Mbox contains the following struc-  
tures:  
Load queue (LQ)  
Store queue (SQ)  
2–12 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Pipeline Organization  
Miss address file (MAF)  
Dstream translation buffer (DTB)  
2.1.6.1 Load Queue  
The load queue (LQ) is a reorder buffer for load instructions. It contains 32 entries and  
maintains the state associated with load instructions that have been issued to the Mbox,  
but for which results have not been delivered to the processor and the instructions  
retired. The Mbox assigns load instructions to LQ slots based on the order in which  
they were fetched from the Icache, then places them into the LQ after they are issued by  
the IQ. The LQ helps ensure correct Alpha memory reference behavior.  
2.1.6.2 Store Queue  
The store queue (SQ) is a reorder buffer and graduation unit for store instructions. It  
contains 32 entries and maintains the state associated with store instructions that have  
been issued to the Mbox, but for which data has not been written to the Dcache and the  
instruction retired. The Mbox assigns store instructions to SQ slots based on the order  
in which they were fetched from the Icache and places them into the SQ after they are  
issued by the IQ. The SQ holds data associated with store instructions issued from the  
IQ until they are retired, at which point the store can be allowed to update the Dcache.  
The SQ also helps ensure correct Alpha memory reference behavior.  
2.1.6.3 Miss Address File  
The 8-entry miss address file (MAF) holds physical addresses associated with pending  
Icache and Dcache fill requests and pending I/O space read transactions.  
2.1.6.4 Dstream Translation Buffer  
The Mbox includes a 128-entry, fully associative Dstream translation buffer (DTB) used  
to store Dstream address translations and page protection information. Each of the entries  
in the DTB can map 1, 8, 64, or 512 contiguous 8KB pages. The allocation scheme is  
round-robin. The DTB supports an 8-bit ASN and contains an ASM bit.  
2.1.7 SROM Interface  
The serial read-only memory (SROM) interface provides the initialization data load  
path from a system SROM to the Icache. Refer to Chapter 7 for more information.  
2.2 Pipeline Organization  
The 7-stage pipeline provides an optimized environment for executing Alpha instruc-  
tions. The pipeline stages (0 to 6) are shown in Figure 2–8 and described in the follow-  
ing paragraphs.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–13  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                       
Pipeline Organization  
Figure 2–8 Pipeline Organization  
0
1
2
3
4
5
6
ALU  
Shifter  
Branch  
Predictor  
Integer  
Register  
Rename  
Map  
ALU Shifter  
Multiplier  
Integer  
Issue  
Queue  
(20)  
Integer  
Register  
File  
System  
Bus  
(64 Bits)  
Address  
Address  
ALU  
ALU  
Four  
Instructions  
64KB  
Data  
Cache  
Bus  
Interface  
Unit  
Cache  
Bus  
(128 Bits)  
Instruction  
Cache  
(64KB)  
(2-Set)  
Floating-Point  
Add, Divide,  
and Square Root  
Floating-  
Point  
Issue  
Queue  
(15)  
Floating-  
Point  
Register  
File  
Floating-  
Point  
Register  
Rename  
Map  
Floating-Point  
Multiply  
Physical  
Address  
(44 Bits)  
FM-05575.AI4  
Stage 0 Instruction Fetch  
The branch predictor uses a branch history algorithm to predict a branch instruction tar-  
get address.  
Up to four aligned instructions are fetched from the Icache, in program order. The  
branch prediction tables are also accessed in this cycle. The branch predictor uses tables  
and a branch history algorithm to predict a branch instruction target address for one  
branch or memory format JSR instruction per cycle. Therefore, the prefetcher is limited  
to fetching through one branch per cycle. If there is more than one branch within the  
fetch line, and the branch predictor predicts that the first branch will not be taken, it will  
predict through subsequent branches at the rate of one per cycle, until it predicts a taken  
branch or predicts through the last branch in the fetch line.  
The Icache array also contains a line prediction field, the contents of which are applied  
to the Icache in the next cycle. The purpose of the line predictor is to remove the pipe-  
line bubble which would otherwise be created when the branch predictor predicts a  
branch to be taken. In effect, the line predictor attempts to predict the Icache line which  
the branch predictor will generate. On fills, the line predictor value at each fetch line is  
initialized with the index of the next sequential fetch line, and later retrained by the  
branch predictor if necessary.  
Stage 1 — Instruction Slot  
The Ibox maps four instructions per cycle from the 64KB 2-way set-predict Icache.  
Instructions are mapped in order, executed dynamically, but are retired in order.  
2–14 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Pipeline Organization  
In the slot stage, the branch predictor compares the next Icache index that it generates to  
the index that was generated by the line predictor. If there is a mismatch, the branch  
predictor wins—the instructions fetched during that cycle are aborted, and the index  
predicted by the branch predictor is applied to the Icache during the next cycle. Line  
mispredictions result in one pipeline bubble.  
The line predictor takes precedence over the branch predictor during memory format  
calls or jumps. If the line predictor was trained with a true (as opposed to predicted)  
memory format call or jump target, then its contents take precedence over the target  
hint field associated with these instructions. This allows dynamic calls or jumps to be  
correctly predicted.  
The instruction fetcher produces the full VPC address during the fetch stage of the pipe-  
line. The Icache produces the tags for both Icache sets 0 and 1 each time it is accessed.  
That enables the fetcher to separate set mispredictions from true Icache misses. If the  
access was caused by a set misprediction, the instruction fetcher aborts the last two  
fetched slots and refetches the slot in the next cycle. It also retrains the appropriate set  
prediction bits.  
The instruction data is transferred from the Icache to the integer and floating-point reg-  
ister map hardware during this stage. When the integer instruction is fetched from the  
Icache and slotted into the IQ, the slot logic determines whether the instruction is for  
the upper or lower subclusters. The slot logic makes the decision based on the  
resources needed by the (up to four) integer instructions in the fetch block. Although all  
four instructions need not be issued simultaneously, distributing their resource usage  
improves instruction loading across the units. For example, if a fetch block contains  
two instructions that can be placed in either cluster followed by two instructions that  
must execute in the lower cluster, the slot logic would designate that combination as  
EELL and slot them as UULL. Slot combinations are described in Section 2.3.2 and  
Stage 2 Map  
Instructions are sent from the Icache to the integer and floating-point register maps dur-  
ing the slot stage and register renaming is performed during the map stage. Also, each  
instruction is assigned a unique 8-bit number, called an inum, which is used to identify  
the instruction and its program order with respect to other instructions during the time  
that it is in flight. Instructions are considered to be in flight between the time they are  
mapped and the time they are retired.  
Mapped instructions and their associated inums are placed in the integer and floating-  
point queues by the end of the map stage.  
Stage 3 Issue  
The 20-entry integer issue queue (IQ) issues instructions at the rate of four per cycle.  
The 15-entry floating-point issue queue (FQ) issues floating-point operate instructions,  
conditional branch instructions, and store instructions, at the rate of two per cycle. Nor-  
mally, instructions are deleted from the IQ or FQ two cycles after they are issued. For  
example, if an instruction is issued in cycle n, it remains in the FQ or IQ in cycle n+1  
but does not request service, and is deleted in cycle n+2.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–15  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Instruction Issue Rules  
Stage 4 Register Read  
Instructions issued from the issue queues read their operands from the integer and float-  
ing-point register files and receive bypass data.  
Stage 5 — Execute  
The Ebox and Fbox pipelines begin execution.  
Stage 6 Dcache Access  
Memory reference instructions access the Dcache and data translation buffers. Nor-  
mally load instructions access the tag and data arrays while store instructions only  
access the tag arrays. Store data is written to the store queue where it is held until the  
store instruction is retired. Most integer operate instructions write their register results  
in this cycle.  
2.2.1 Pipeline Aborts  
The abort penalty as given is measured from the cycle after the fetch stage of the  
instruction which triggers the abort to the fetch stage of the new target, ignoring any  
Ibox pipeline stalls or queuing delay that the triggering instruction might experience.  
Table 2–1 lists the timing associated with each common source of pipeline abort.  
Table 2–1 Pipeline Abort Delay (GCLK Cycles)  
Penalty  
Abort Condition  
(Cycles)  
Comments  
Branch misprediction  
7
Integer or floating-point conditional branch  
misprediction.  
JSR misprediction  
Mbox order trap  
Other Mbox replay traps  
DTB miss  
8
Memory format JSR or HW_RET.  
14  
Load-load order or store-load order.  
13  
13  
ITB miss  
7
Integer arithmetic trap  
12  
Floating-point arithmetic  
trap  
13+latency  
Add latency of instruction. See Section 2.3.3 for  
instruction latencies.  
2.3 Instruction Issue Rules  
This section defines instruction classes, the functional unit pipelines to which they are  
issued, and their associated latencies.  
2–16 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                               
Instruction Issue Rules  
2.3.1 Instruction Group Definitions  
Table 2–2 lists the instruction class, the pipeline assignments, and the instructions  
included in the class.  
Table 2–2 Instruction Name, Pipeline, and Types  
Class  
Name  
Pipeline  
L0, L1  
L0, L1  
L0, L1  
Instruction Type  
ild  
All integer load instructions  
All floating-point load instructions  
All integer store instructions  
fld  
ist  
fst  
FST0, FST1, L0, L1 All floating-point store instructions  
lda  
L0, L1, U0, U1  
LDA, LDAH  
WH64, ECB, WMB  
RPCC  
mem_misc L1  
rpcc  
rx  
L1  
L1  
RS, RC  
mxpr  
L0, L1  
HW_MTPR, HW_MFPR  
(depends on IPR)  
icbr  
jsr  
U0, U1  
L0  
Integer conditional branch instructions  
BR, BSR, JMP, CALL, RET, COR, HW_RET,  
CALL_PAL  
iadd  
ilog  
L0, U0, L1, U1  
L0, U0, L1, U1  
U0, U1  
Instructions with opcode 10 , except CMPBGE  
16  
AND, BIC, BIS, ORNOT, XOR, EQV, CMPBGE  
ishf  
Instructions with opcode 12  
16  
cmov  
imul  
imisc  
L0, U0, L1, U1  
U1  
Integer CMOV — either cluster  
Integer multiply instructions  
U0  
CTLZ, CTPOP, CTTZ, PERR, MINxxx, MAXxxx,  
PKxx, UNPKxx  
fcbr  
FA  
FA  
Floating-point conditional branch instructions  
fadd  
All floating-point operate instructions except multiply,  
divide, square root, and conditional move instructions  
fmul  
FM  
FA  
Floating-point multiply instruction  
Floating-point CMOV—first half  
Floating-point CMOV— second half  
Floating-point divide instruction  
fcmov1  
fcmov2  
fdiv  
FA  
FA  
fsqrt  
FA  
Floating-point square root instruction  
TRAP, EXCB, UNOP - LDQ_U R31, 0(Rx)  
nop  
None  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–17  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Instruction Issue Rules  
Table 2–2 Instruction Name, Pipeline, and Types (Continued)  
Class  
Name  
Pipeline  
Instruction Type  
ftoi  
FST0, FST1, L0, L1 FTOIS, FTOIT  
itof  
L0, L1  
FM  
ITOFS, ITOFF, ITOFT  
mx_fpcr  
Instructions that move data from the floating-point  
control register  
2.3.2 Ebox Slotting  
Instructions that are issued from the IQ, and could execute in either upper or lower  
Ebox subclusters, are slotted to one pair or the other during the pipeline mapping stage  
based on the instruction mixture in the fetch line. The codes that are used in Table 2–3  
are as follows:  
U—The instruction only executes in an upper subcluster.  
L—The instruction only executes in a lower subcluster.  
E—The instruction could execute in either an upper or lower subcluster.  
Table 2–3 defines the slotting rules. The table field Instruction Class 3, 2, 1 and 0 iden-  
tifies each instruction’s location in the fetch line by the value of bits [3:2] in its PC.  
Table 2–3 Instruction Group Definitions and Pipeline Unit  
Instruction Class  
3 2 1 0  
Slotting  
3 2 1 0  
Instruction Class  
3 2 1 0  
Slotting  
3 2 1 0  
E E E E  
E E E L  
E E E U  
E E L E  
E E L L  
E E L U  
E E U E  
E E U L  
E E U U  
E L E E  
E L E L  
E L E U  
E L L E  
E L L L  
E L L U  
E L U E  
E L U L  
U L U L  
U L U L  
U L L U  
U L L U  
U U L L  
U L L U  
U L U L  
U L U L  
L L U U  
U L U L  
U L U L  
U L L U  
U L L U  
U L L L  
U L L U  
U L U L  
U L U L  
L L L L  
L L L U  
L L U E  
L L U L  
L L U U  
L U E E  
L U E L  
L U E U  
L U L E  
L U L L  
L U L U  
L U U E  
L U U L  
L U U U  
U E E E  
U E E L  
U E E U  
L L L L  
L L L U  
L L U U  
L L U L  
L L U U  
L U L U  
L U U L  
L U L U  
L U L U  
L U L L  
L U L U  
L U U L  
L U U L  
L U U U  
U L U L  
U L U L  
U L L U  
2–18 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Instruction Issue Rules  
Table 2–3 Instruction Group Definitions and Pipeline Unit (Continued)  
Instruction Class  
3 2 1 0  
Slotting  
3 2 1 0  
Instruction Class  
3 2 1 0  
Slotting  
3 2 1 0  
E L U U  
E U E E  
E U E L  
E U E U  
E U L E  
E U L L  
E U L U  
E U U E  
E U U L  
E U U U  
L E E E  
L E E L  
L E E U  
L E L E  
L E L L  
L E L U  
L E U E  
L E U L  
L E U U  
L L E E  
L L E L  
L L E U  
L L L E  
L L U U  
L U L U  
L U U L  
L U L U  
L U L U  
U U L L  
L U L U  
L U U L  
L U U L  
L U U U  
L U L U  
L U U L  
L U L U  
L U L U  
L U L L  
L U L U  
L U U L  
L U U L  
L L U U  
L L U U  
L L U L  
L L U U  
L L L U  
U E L E  
U E L L  
U E L U  
U E U E  
U E U L  
U E U U  
U L E E  
U L E L  
U L E U  
U L L E  
U L L L  
U L L U  
U L U E  
U L U L  
U L U U  
U U E E  
U U E L  
U U E U  
U U L E  
U U L L  
U U L U  
U U U E  
U U U L  
U U U U  
U L L U  
U U L L  
U L L U  
U L U L  
U L U L  
U L U U  
U L U L  
U L U L  
U L L U  
U L L U  
U L LL  
U L L U  
U L U L  
U L U L  
U L U U  
U U L L  
U U L L  
U U L U  
U U L L  
U U L L  
U U L U  
U U U L  
U U U L  
U U U U  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–19  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Instruction Issue Rules  
2.3.3 Instruction Latencies  
After an instruction is placed in the IQ or FQ, its issue point is determined by the avail-  
ability of its register operands, functional unit(s), and relationship to other instructions  
in the queue. There are register producer-consumer dependencies and dynamic func-  
tional unit availability dependencies that affect instruction issue. The mapper removes  
register producer-producer dependencies.  
The latency to produce a register result is generally fixed. The one exception is for load  
instructions that miss the Dcache. Table 2–4 lists the latency, in cycles, for each  
instruction class.  
Table 2–4 Instruction Class Latency in Cycles  
Class  
Latency  
Comments  
ild  
3
13+  
Dcache hit.  
Dcache miss, latency with 6-cycle Bcache. Add additional Bcache loop latency if  
Bcache latency is greater than 6 cycles.  
fld  
4
Dcache hit.  
14+  
Dcache miss, latency with 6-cycle Bcache. Add additional Bcache loop latency if  
Bcache latency is greater than 6 cycles.  
lda  
1
Possible 1-cycle Ebox cross-cluster delay.  
Does not produce register value.  
Does not produce register value.  
Does not produce register value.  
Possible 1-cycle cross-cluster delay.  
mem_misc —  
ist  
1
fst  
rpcc  
rx  
1
mxpr  
1 or 3  
HW_MFPR:  
Ebox IPRs = 1.  
Ibox and Mbox IPRs = 3.  
HW_MTPR does not produce a register value.  
Conditional branch. Does not produce register value.  
Unconditional branch. Does not produce register value.  
icbr  
3
ubr  
jsr  
3
iadd  
ilog  
1
Possible 1-cycle Ebox cross-cluster delay.  
Possible 1-cycle Ebox cross-cluster delay.  
Possible 1-cycle Ebox cross-cluster delay.  
Only consumer is cmov2. Possible 1-cycle Ebox cross-cluster delay.  
Possible 1-cycle Ebox cross-cluster delay.  
Possible 1-cycle Ebox cross-cluster delay.  
Possible 1-cycle Ebox cross-cluster delay.  
Does not produce register value.  
1
ishf  
1
cmov1  
cmov2  
imul  
imisc  
fcbr  
1
1
7
3
2–20 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Instruction Retire Rules  
Table 2–4 Instruction Class Latency in Cycles (Continued)  
Class  
Latency  
Comments  
fadd  
4
6
Consumer other than fst or ftoi.  
Consumer fst or ftoi.  
Measured from when an fadd is issued from the FQ to when an fst or ftoi is issued  
from the IQ.  
fmul  
4
6
Consumer other than fst or ftoi.  
Consumer fst or ftoi.  
Measured from when an fmul is issued from the FQ to when an fst or ftoi is issued  
from the IQ.  
fcmov1  
fcmov2  
4
Only consumer is fcmov2.  
4
6
Consumer other than fst.  
Consumer fst or ftoi.  
Measured from when an fcmov2 is issued from the FQ to when an fst or ftoi is  
issued from the IQ.  
fdiv  
12  
9
15  
12  
Single precision - latency to consumer of result value.  
Single precision - latency to using divider again.  
Double precision - latency to consumer of result value.  
Double precision - latency to using divider again.  
fsqrt  
18  
15  
33  
30  
Single precision - latency to consumer of result value.  
Single precision - latency to using unit again.  
Double precision - latency to consumer of result value.  
Double precision - latency to using unit again.  
ftoi  
itof  
nop  
3
4
Does not produce register value.  
2.4 Instruction Retire Rules  
An instruction is retired when it has been executed to completion, and all previous  
instructions have been retired. The execution pipeline stage in which an instruction  
becomes eligible to be retired depends upon the instruction’s class.  
Table 2–5 gives the minimum retire latencies (assuming that all previous instructions  
have been retired) for various classes of instructions.  
Table 2–5 Minimum Retire Latencies for Instruction Classes  
Instruction Class  
Integer conditional branch  
Integer multiply  
Retire Stage  
Comments  
7
7/13  
7
Latency is 13 cycles for the MUL/V instruction.  
Integer operate  
Memory  
10  
11  
11  
Floating-point add  
Floating-point multiply  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–21  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Retire of Operate Instructions into R31/F31  
Table 2–5 Minimum Retire Latencies for Instruction Classes (Continued)  
Instruction Class  
Retire Stage  
Comments  
Floating-point DIV/SQRT  
11 + latency  
Add latency of unit reuse for the instruction indicated in Table  
2–4. For example, latency for a single-precision fdiv would be  
11 plus 9 from Table 2–4. Latency is 11 if hardware detects that  
no exception is possible (see Section 2.4.1).  
Floating-point conditional  
branch  
11  
10  
Branch instruction mispredict is reported in stage 7.  
BSR/JSR  
JSR instruction mispredict is reported in stage 8.  
2.4.1 Floating-Point Divide/Square Root Early Retire  
The floating-point divider and square root unit can detect that, for many combinations  
of source operand values, no exception can be generated. Instructions with these oper-  
ands can be retired before the result is generated. When detected, they are retired with  
the same latency as the FP add class. Early retirement is not possible for the following  
instruction/operand/architecture state conditions:  
Instruction is not a DIV or SQRT.  
SQRT source operand is negative.  
Divide operand exponent_a is 0.  
Either operand is NaN or INF.  
Divide operand exponent_b is 0.  
Trapping mode is /I (inexact).  
INE status bit is 0.  
Early retirement is also not possible for divide instructions if the resulting exponent has  
any of the following characteristics (EXP is the result exponent):  
DIVT, DIVG: (EXP >= 3FF ) OR (EXP <= 2 )  
16 16  
DIVS, DIVF: (EXP >= 7F ) OR (EXP <= 382 )  
16  
16  
2.5 Retire of Operate Instructions into R31/F31  
Many instructions that have R31 or F31 as their destination are retired immediately  
upon decode (stage 3). These instructions do not produce a result and are removed from  
the pipeline as well. They do not occupy a slot in the issue queues and do not occupy a  
functional unit. Table 2–6 lists these instructions and some of their characteristics. The  
instruction type in Table 2–6 is from Table C-6 in Appendix C of the Alpha Architecture  
Handbook, Version 4.  
2–22 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Load Instructions to R31 and F31  
Table 2–6 Instructions Retired Without Execution  
Instruction Type  
Notes  
INTA, INTL, INTM, INTS  
FLTI, FLTL, FLTV  
All with R31 as destination.  
All with F31 as destination. MT_FPCR is not included  
because it has no destination—it is never removed from the  
pipeline.  
LDQ_U  
MISC  
All with R31 as destination.  
TRAPB and EXCB are always removed. Others are never  
removed.  
FLTS  
All (SQRT, ITOF) with F31 as destination.  
2.6 Load Instructions to R31 and F31  
This section describes how the 21264/EV68A processes software-directed prefetch  
transactions and load instructions with a destination of R31 and F31.  
Prefetches allocate a MAF entry. How the MAF entry is allocated is what distinguishes  
the type of prefetch. A normal prefetch is equivalent to a normal load MAF (that is, a  
MAF entry that puts the block into the Dcache in a readable state). A prefetch with  
modify intent is equivalent to a normal store MAF (that is, a MAF entry that puts the  
block into the Dcache in a writeable state). A prefetch, evict next, is equivalent to a nor-  
mal load MAF, with the additional behavior described in Section 2.6.3.  
A prefetch is not performed if the prefetch hits in the Dcache (as if it were a normal  
load).  
Load operations to R31 and F31 may generate exceptions. These exceptions must be  
dismissed by PALcode.  
The following sections describe the operational prefetch behavior of these instructions.  
2.6.1 Normal Prefetch: LDBU, LDF, LDG, LDL, LDT, LDWU, HW_LDL Instructions  
The 21264/EV68A processes these instructions as normal cache line prefetches. If the  
load instruction hits the Dcache, the instruction is dismissed, otherwise the addressed  
cache block is allocated into the Dcache.  
The HW_LDL instruction construct equates to the HW_LD instruction with the LEN  
field clear. See Table 6–3.  
2.6.2 Prefetch with Modify Intent: LDS Instruction  
The 21264/EV68A processes an LDS instruction, with F31 as the destination, as a  
prefetch with modify intent transaction (ReadBlkMod command). If the transaction hits  
a dirty Dcache block, the instruction is dismissed. Otherwise, the addressed cache block  
is allocated into the Dcache for write access, with its dirty and modified bits set.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–23  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Special Cases of Alpha Instruction Execution  
2.6.3 Prefetch, Evict Next: LDQ and HW_LDQ Instructions  
The 21264/EV68A processes this instruction like a normal prefetch transaction (Read-  
BlkSpec command), with one exception—if the load misses the Dcache, the addressed  
cache block is allocated into the Dcache, but the Dcache set allocation pointer is left  
pointing to this block. The next miss to the same Dcache line will evict the block. For  
example, this instruction might be used when software is reading an array that is known  
to fit in the offchip Bcache, but will not fit into the onchip Dcache. In this case, the  
instruction ensures that the hardware provides the desired prefetch function without dis-  
placing useful cache blocks stored in the other set within the Dcache.  
The HW_LDQ instruction construct equates to the HW_LD instruction with the LEN  
field set. See Table 6–3.  
2.7 Special Cases of Alpha Instruction Execution  
This section describes the mechanisms that the 21264/EV68A uses to process irregular  
instructions in the Alpha instruction set, and cases in which the 21264/EV68A pro-  
cesses instructions in a non-intuitive way.  
2.7.1 Load Hit Speculation  
The latency of integer load instructions that hit in the Dcache is three cycles. Figure 2–  
9 shows the pipeline timing for these integer load instructions. In Figure 2–9:  
Symbol  
Meaning  
Q
R
E
Issue queue  
Register file read  
Execute  
D
B
Dcache access  
Data bus active  
Figure 2–9 Pipeline Timing for Integer Load Instructions  
Hit  
Cycle Number  
ILD  
1
2
3
4
D
Q
5
B
R
Q
6
7
8
Q
R
E
Instruction 1  
Instruction 2  
FM-05814.AI4  
There are two cycles in which the IQ may speculatively issue instructions that use load  
data before Dcache hit information is known. Any instructions that are issued by the IQ  
within this 2-cycle speculative window are kept in the IQ with their requests inhibited  
until the load instruction’s hit condition is known, even if they are not dependent on the  
load operation. If the load instruction hits, then these instructions are removed from the  
queue. If the load instruction misses, then the execution of these instructions is aborted  
and the instructions are allowed to request service again.  
2–24 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Special Cases of Alpha Instruction Execution  
For example, in Figure 2–9, instruction 1 and instruction 2 are issued within the specu-  
lative window of the load instruction. If the load instruction hits, then both instructions  
will be deleted from the queue by the start of cycle 7—one cycle later than normal for  
instruction 1 and at the normal time for instruction 2. If the load instruction misses, both  
instructions are aborted from the execution pipelines and may request service again in  
cycle 6.  
IQ-issued instructions are aborted if issued within the speculative window of an integer  
load instruction that missed in the Dcache, even if they are not dependent on the load  
data. However, if software misses are likely, the 21264/EV68A can still benefit from  
scheduling the instruction stream for Dcache miss latency. The 21264/EV68A includes  
a saturating counter that is incremented when load instructions hit and is decremented  
when load instructions miss. When the upper bit of the counter equals zero, the integer  
load latency is increased to five cycles and the speculative window is removed. The  
counter is 4 bits wide and is incremented by 1 on a hit and is decremented by two on a  
miss.  
Since load instructions to R31 do not produce a result, they do not create a speculative  
window when they execute and, therefore, never waste IQ-issue cycles if they miss.  
Floating-point load instructions that hit in the Dcache have a latency of four cycles. Fig-  
ure 2–10 shows the pipeline timing for floating-point load instructions. In Figure 2–10:  
Symbol  
Meaning  
Q
R
E
Issue queue  
Register file read  
Execute  
D
B
Dcache access  
Data bus active  
Figure 2–10 Pipeline Timing for Floating-Point Load Instructions  
Hit  
Cycle Number  
FLD  
1
2
3
4
5
B
Q
6
7
8
Q
R
E
D
Instruction 1  
Instruction 2  
R
Q
FM-05815.AI4  
The speculative window for floating-point load instructions is one cycle wide.  
FQ-issued instructions that are issued within the speculative window of a floating-point  
load instruction that has missed, are only aborted if they depend on the load being suc-  
cessful.  
For example, in Figure 2–10 instruction 1 is issued in the speculative window of the  
load instruction.  
If instruction 1 is not a user of the data returned by the load instruction, then it is  
removed from the queue at its normal time (at the start of cycle 7).  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–25  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Special Cases of Alpha Instruction Execution  
If instruction 1 is dependent on the load instruction data and the load instruction hits,  
instruction 1 is removed from the queue one cycle later (at the start of cycle 8). If the  
load instruction misses, then instruction 1 is aborted from the Fbox pipeline and may  
request service again in cycle 7.  
2.7.2 Floating-Point Store Instructions  
Floating-point store instructions are duplicated and loaded into both the IQ and the FQ  
from the mapper. Each IQ entry contains a control bit, fpWait, that when set prevents  
that entry from asserting its requests. This bit is initially set for each floating-point store  
instruction that enters the IQ, unless it was the target of a replay trap. The instruction’s  
FQ clone is issued when its Ra register is about to become clean, resulting in its IQ  
clone’s fpWait bit being cleared and allowing the IQ clone to issue and be executed by  
the Mbox. This mechanism ensures that floating-point store instructions are always  
issued to the Mbox, along with the associated data, without requiring the floating-point  
register dirty bits to be available within the IQ.  
2.7.3 CMOV Instruction  
For the 21264/EV68A, the Alpha CMOV instruction has three operands, and so pre-  
sents a special case. The required operation is to move either the value in register Rb or  
the value from the old physical destination register into the new destination register,  
based upon the value in Ra. Since neither the mapper nor the Ebox and Fbox data paths  
are otherwise required to handle three operand instructions, the CMOV instruction is  
decomposed by the Ibox pipeline into two 2-operand instructions:  
The Alpha architecture instruction  
CMOV Ra, Rb  
Rc  
Becomes the 21264/EV68A instructions  
CMOV1 Ra, oldRc  
newRc1  
CMOV2 newRc1, Rb  
newRc2  
The first instruction, CMOV1, tests the value of Ra and records the result of this test in  
a 65th bit of its destination register, newRc1. It also copies the value of the old physical  
destination register, oldRc, to newRc1.  
The second instruction, CMOV2, then copies either the value in newRc1 or the value in  
Rb into a second physical destination register, newRc2, based on the CMOV predicate  
bit stored in newRc1.  
In summary, the original CMOV instruction is decomposed into two dependent instruc-  
tions that each use a physical register from the free list.  
To further simplify this operation, the two component instructions of a CMOV instruc-  
tion are driven through the mappers in successive cycles. Hence, if a fetch line contains  
n CMOV instructions, it takes n+1 cycles to run that fetch line through the mappers.  
For example, the following fetch line:  
ADD CMOVx SUB CMOVy  
Results in the following three map cycles:  
ADD CMOVx1  
CMOVx2 SUB CMOVy1  
CMOVy2  
2–26 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Memory and I/O Address Space Instructions  
The Ebox executes integer CMOV instructions as two distinct 1-cycle latency opera-  
tions. The Fbox add pipeline executes floating-point CMOV instructions as two distinct  
4-cycle latency operations.  
2.8 Memory and I/O Address Space Instructions  
This section provides an overview of the way the 21264/EV68A processes memory and  
I/O address space instructions.  
The 21264/EV68A supports, and internally recognizes, a 44-bit physical address space  
that is divided equally between memory address space and I/O address space. Memory  
address space resides in the lower half of the physical address space (PA[43]=0)  
and I/O address space resides in the upper half of the physical address space  
(PA[43]=1).  
The IQ can issue any combination of load and store instructions to the Mbox at the rate  
of two per cycle. The two lower Ebox subclusters, L0 and L1, generate the  
48-bit effective virtual address for these instructions.  
An instruction is defined to be newer than another instruction if it follows that instruc-  
tion in program order and is older if it precedes that instruction in program order.  
2.8.1 Memory Address Space Load Instructions  
The Mbox begins execution of a load instruction by translating its virtual address to a  
physical address using the DTB and by accessing the Dcache. The Dcache is virtually  
indexed, allowing these two operations to be done in parallel. The Mbox puts informa-  
tion about the load instruction, including its physical address, destination register, and  
data format, into the LQ.  
If the requested physical location is found in the Dcache (a hit), the data is formatted  
and written into the appropriate integer or floating-point register. If the location is not in  
the Dcache (a miss), the physical address is placed in the miss address file (MAF) for  
processing by the Cbox. The MAF performs a merging function in which a new miss  
address is compared to miss addresses already held in the MAF. If the new miss address  
points to the same Dcache block as a miss address in the MAF, then the new miss  
address is discarded.  
When Dcache fill data is returned to the Dcache by the Cbox, the Mbox satisfies the  
requesting load instructions in the LQ.  
2.8.2 I/O Address Space Load Instructions  
Because I/O space load instructions may have side effects, they cannot be performed  
speculatively. When the Mbox receives an I/O space load instruction, the Mbox places  
the load instruction in the LQ, where it is held until it retires. The Mbox replays retired  
I/O space load instructions from the LQ to the MAF in program order, at a rate of one  
per GCLK cycle.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–27  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Memory and I/O Address Space Instructions  
The Mbox allocates a new MAF entry to an I/O load instruction and increases I/O band-  
width by attempting to merge I/O load instructions in a merge register. Table 2–7 shows  
the rules for merging data. The columns represent the load instructions replayed to the  
MAF while the rows represent the size of the load in the merge register.  
Table 2–7 Rules for I/O Address Space Load Instruction Data Merging  
Merge Register/  
Replayed Instruction  
Load Byte/Word Load Longword  
Load Quadword  
Byte/Word  
No merge  
No merge  
No merge  
No merge  
No merge  
Longword  
Merge up to 32 bytes  
No merge  
No merge  
Quadword  
Merge up to 64 bytes  
In summary, Table 2–7 shows some of the following rules:  
Byte/word load instructions and different size load instructions are not allowed to  
merge.  
A stream of ascending non-overlapping, but not necessarily consecutive, longword  
load instructions are allowed to merge into naturally aligned 32-byte blocks.  
A stream of ascending non-overlapping, but not necessarily consecutive, quadword  
load instructions are allowed to merge into naturally aligned 64-byte blocks.  
Merging of quadwords can be limited to naturally-aligned 32-byte blocks based on  
the Cbox WRITE_ONCE chain 32_BYTE_IO field.  
Issued MB, WMB, and I/O load instructions close the I/O register merge window.  
To minimize latency, the merge window is also closed when a timer detects no I/O  
store instruction activity for 1024 cycles.  
After the Mbox I/O register has closed its merge window, the Cbox sends I/O read  
requests offchip in the order that they were received from the Mbox.  
2.8.3 Memory Address Space Store Instructions  
The Mbox begins execution of a store instruction by translating its virtual address to a  
physical address using the DTB and by probing the Dcache. The Mbox puts informa-  
tion about the store instruction, including its physical address, its data and the results of  
the Dcache probe, into the store queue (SQ).  
If the Mbox does not find the addressed location in the Dcache, it places the address  
into the MAF for processing by the Cbox. If the Mbox finds the addressed location in a  
Dcache block that is not dirty, then it places a ChangeToDirty request into the MAF.  
A store instruction can write its data into the Dcache when it is retired, and when the  
Dcache block containing its address is dirty and not shared. SQ entries that meet these  
two conditions can be placed into the writable state. These SQ entries are placed into  
the writable state in program order at a maximum rate of two entries per cycle. The  
Mbox transfers writable store queue entry data from the SQ to the Dcache in program  
order at a maximum rate of two entries per cycle. Dcache lines associated with writable  
store queue entries are locked by the Mbox. System port probe commands cannot evict  
these blocks until their associated writable SQ entries have been transferred into the  
Dcache. This restriction assists in STx_C instruction and Dcache ECC processing.  
2–28 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Memory and I/O Address Space Instructions  
SQ entry data that has not been transferred to the Dcache may source data to newer load  
instructions. The Mbox compares the virtual Dcache index bits of incoming load  
instructions to queued SQ entries, and sources the data from the SQ, bypassing the  
Dcache, when necessary.  
2.8.4 I/O Address Space Store Instructions  
The Mbox begins processing I/O space store instructions, like memory space store  
instructions, by translating the virtual address and placing the state associated with the  
store instruction into the SQ.  
The Mbox replays retired I/O space store entries from the SQ to the IOWB in program  
order at a rate of one per GCLK cycle. The Mbox never allows queued I/O space store  
instructions to source data to subsequent load instructions.  
The Cbox maximizes I/O bandwidth when it allocates a new IOWB entry to an I/O  
store instruction by attempting to merge I/O store instructions in a merge register. Table  
2–8 shows the rules for I/O space store instruction data merging. The columns represent  
the load instructions replayed to the IOWB while the rows represent the size of the store  
in the merge register.  
Table 2–8 Rules for I/O Address Space Store Instruction Data Merging  
Merge Register/  
Store  
Replayed Instruction  
Byte/Word  
Store Longword  
No merge  
Store Quadword  
No merge  
Byte/Word  
Longword  
Quadword  
No merge  
No merge  
No merge  
Merge up to 32 bytes  
No merge  
No merge  
Merge up to 64 bytes  
Table 2–8 shows some of the following rules:  
Byte/word store instructions and different size store instructions are not allowed to  
merge.  
A stream of ascending non-overlapping, but not necessarily consecutive, longword  
store instructions are allowed to merge into naturally aligned 32-byte blocks.  
A stream of ascending non-overlapping, but not necessarily consecutive, quadword  
store instructions are allowed to merge into naturally aligned 64-byte blocks.  
Merging of quadwords can be limited to naturally-aligned 32-byte blocks based on  
the Cbox WRITE_ONCE chain 32_BYTE_IO field.  
Issued MB, WMB, and I/O load instructions close the I/O register merge window.  
To minimize latency, the merge window is also closed when a timer detects no I/O  
store instruction activity for 1024 cycles.  
After the IOWB merge register has closed its merge window, the Cbox sends I/O space  
store requests offchip in the order that they were received from the Mbox.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–29  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
MAF Memory Address Space Merging Rules  
2.9 MAF Memory Address Space Merging Rules  
Because all memory transactions are to 64-byte blocks, efficiency is improved by merg-  
ing several small data transactions into a single larger data transaction. Table 2–9 lists  
the rules the 21264/EV68A uses when merging memory transactions into 64-byte natu-  
rally aligned data block transactions. Rows represent the merged instruction in the  
MAF and columns represent the new issued transaction.  
Table 2–9 MAF Merging Rules  
MAF/New LDx  
STx  
STx_C  
WH64  
ECB  
Istream  
LDx  
Merge  
STx  
Merge  
Merge  
STx_C  
WH64  
ECB  
Merge  
Merge  
Merge  
Istream  
Merge  
In summary, Table 2–9 shows that only like instruction types, with the exception of  
load instructions merging with store instructions, are merged.  
2.10 Instruction Ordering  
In the absence of explicit instruction ordering, such as with MB or WMB instructions,  
the 21264/EV68A maintains a default instruction ordering relationship between pairs of  
load and store instructions.  
The 21264/EV68A maintains the default memory data instruction ordering as shown in  
Table 2–10 (assume address X and address Y are different).  
Table 2–10 Memory Reference Ordering  
First Instruction in Pair  
Load memory to address X  
Load memory to address X  
Store memory to address X  
Store memory to address X  
Load memory to address X  
Load memory to address X  
Store memory to address X  
Store memory to address X  
Second Instruction in Pair Reference Order  
Load memory to address X  
Load memory to address Y  
Store memory to address X  
Store memory to address Y  
Store memory to address X  
Store memory to address Y  
Load memory to address X  
Load memory to address Y  
Maintained (litmus test 1)  
Not maintained  
Maintained  
Maintained  
Maintained  
Not maintained  
Maintained  
Not maintained  
2–30 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Replay Traps  
The 21264/EV68A maintains the default I/O instruction ordering as shown in Table 2–  
11 (assume address X and address Y are different).  
Table 2–11 I/O Reference Ordering  
First Instruction in Pair  
Load I/O to address X  
Load I/O to address X  
Store I/O to address X  
Store I/O to address X  
Load I/O to address X  
Load I/O to address X  
Store I/O to address X  
Store I/O to address X  
Second Instruction in Pair Reference Order  
Load I/O to address X  
Load I/O to address Y  
Store I/O to address X  
Store I/O to address Y  
Store I/O to address X  
Store I/O to address Y  
Load I/O to address X  
Load I/O to address Y  
Maintained  
Maintained  
Maintained  
Maintained  
Maintained  
Not maintained  
Maintained  
Not maintained  
2.11 Replay Traps  
There are some situations in which a load or store instruction cannot be executed due to  
a condition that occurs after that instruction issues from the IQ or FQ. The instruction is  
aborted (along with all newer instructions) and restarted from the fetch stage of the  
pipeline. This mechanism is called a replay trap.  
2.11.1 Mbox Order Traps  
Load and store instructions may be issued from the IQ in a different order than they  
were fetched from the Icache, while the architecture dictates that Dstream memory  
transactions to the same physical bytes must be completed in order. Usually, the Mbox  
manages the memory reference stream by itself to achieve architecturally correct  
behavior, but the two cases in which the Mbox uses replay traps to manage the memory  
stream are load-load and store-load order traps.  
2.11.1.1 Load-Load Order Trap  
The Mbox ensures that load instructions that read the same physical byte(s) ultimately  
issue in correct order by using the load-load order trap. The Mbox compares the  
address of each load instruction, as it is issued, to the address of all load instructions in  
the load queue. If the Mbox finds a newer load instruction in the load queue, it invokes  
a load-load order trap on the newer instruction. This is a replay trap that aborts the tar-  
get of the trap and all newer instructions from the machine and refetches instructions  
starting at the target of the trap.  
2.11.1.2 Store-Load Order Trap  
The Mbox ensures that a load instruction ultimately issues after an older store instruc-  
tion that writes some portion of its memory operand by using the store-load order trap.  
The Mbox compares the address of each store instruction, as it is issued, to the address  
of all load instructions in the load queue. If the Mbox finds a newer load instruction in  
the load queue, it invokes a store-load order trap on the load instruction. This is a replay  
trap. It functions like the load-load order trap.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–31  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                     
I/O Write Buffer and the WMB Instruction  
The Ibox contains extra hardware to reduce the frequency of the store-load trap. There  
is a 1-bit by 1024-entry VPC-indexed table in the Ibox called the stWait table. When an  
Icache instruction is fetched, the associated stWait table entry is fetched along with the  
Icache instruction. The stWait table produces 1 bit for each instruction accessed from  
the Icache. When a load instruction gets a store-load order replay trap, its associated bit  
in the stWait table is set during the cycle that the load is refetched. Hence, the trapping  
load instruction’s stWait bit will be set the next time it is fetched.  
The IQ will not issue load instructions whose stWait bit is set while there are older unis-  
sued store instructions in the queue. A load instruction whose stWait bit is set can be  
issued the cycle immediately after the last older store instruction is issued from the  
queue. All the bits in the stWait table are unconditionally cleared every 16384 cycles, or  
every 65536 cycles if I_CTL[ST_WAIT_64K] is set.  
2.11.2 Other Mbox Replay Traps  
The Mbox also uses replay traps to control the flow of the load queue and store queue,  
and to ensure that there are never multiple outstanding misses to different physical  
addresses that map to the same Dcache or Bcache line. Unlike the order traps, however,  
these replay traps are invoked on the incoming instruction that triggered the condition.  
2.12 I/O Write Buffer and the WMB Instruction  
The I/O write buffer (IOWB) consists of four 64-byte entries with the associated  
address and control logic used to buffer I/O write data between the store queue (SQ)  
and the system port.  
2.12.1 Memory Barrier (MB/WMB/TB Fill Flow)  
The Cbox CSR SYSBUS_MB_ENABLE bit determines if MB instructions produce  
external system port transactions. When the SYSBUS_MB_ENABLE bit equals 0, the  
Cbox CSR MB_CNT[3:0] field contains the number of pending uncommitted transac-  
tions. The counter will increment for each of the following commands:  
RdBlk, RdBlkMod, RdBlkI  
RdBlkSpec (valid), RdBlkModSpec (valid), RdBlkSpecI (valid)  
RdBlkVic, RdBlkModVic, RdBlkVicI  
CleanToDirty, SharedToDirty, STChangeToDirty, InvalToDirty  
FetchBlk, FetchBlkSpec (valid), Evict  
RdByte, RdLw, RdQw, WrByte, WrLW, WrQW  
The counter is decremented with the C (commit) bit in the Probe and SysDc commands  
(see Section 4.7.7). Systems can assert the C bit in the SysDc fill response to the com-  
mands that originally incremented the counter, or attached to the last probe seen by that  
command when it reached the system serialization point. If the number of uncommitted  
transactions reaches 15 (saturating the counter), the Cbox will stall MAF and IOWB  
processing until at least one of the pending transactions has been committed. Probe pro-  
cessing is not interrupted by the state of this counter.  
2–32 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
I/O Write Buffer and the WMB Instruction  
2.12.1.1 MB Instruction Processing  
When an MB instruction is fetched in the predicted instruction execution path, it stalls  
in the map stage of the pipeline. This also stalls all instructions after the MB, and con-  
trol of instruction flow is based upon the value in Cbox CSR SYSBUS_MB_ENABLE  
as follows:  
If Cbox CSR SYSBUS_MB_ENABLE is clear, the Cbox waits until the IQ is  
empty and then performs the following actions:  
a. Sends all pending MAF and IOWB entries to the system port.  
b. Monitors Cbox CSR MB_CNT[3:0], a 4-bit counter of outstanding committed  
events. When the counter decrements from one to zero, the Cbox marks the  
youngest probe queue entry.  
c. Waits until the MAF contains no more Dstream references and the SQ, LQ, and  
IOWB are empty.  
When all of the above have occurred and a probe response has been sent to the sys-  
tem for the marked probe queue entry, instruction execution continues with the  
instruction after the MB.  
If Cbox CSR SYSBUS_MB_ENABLE is set, the Cbox waits until the IQ is empty  
and then performs the following actions:  
a. Sends all pending MAF and IOWB entries to the system port  
b. Sends the MB command to the system port  
c. Waits until the MB command is acknowledged, then marks the youngest entry  
in the probe queue  
d. Waits until the MAF contains no more Dstream references and the SQ, LQ, and  
IOWB are empty  
When all of the above have occurred and a probe response has been sent to the sys-  
tem for the marked probe queue entry, instruction execution continues with the  
instruction after the MB.  
Because the MB instruction is executed speculatively, MB processing can begin  
and the original MB can be killed. In the internal acknowledge case, the MB may  
have already been sent to the system interface, and the system is still expected to  
respond to the MB.  
2.12.1.2 WMB Instruction Processing  
Write memory barrier (WMB) instructions are issued into the Mbox store-queue, where  
they wait until they are retired and all prior store instructions become writable. The  
Mbox then stalls the writable pointer and informs the Cbox. The Cbox closes the IOWB  
merge register and responds in one of the following two ways:  
If Cbox CSR SYSBUS_MB_ENABLE is clear, the Cbox performs the following  
actions:  
a. Stalls further MAF and IOWB processing.  
b. Monitors Cbox CSR MB_CNT[3:0], a 4-bit counter of outstanding committed  
events. When the counter decrements from one to zero, the Cbox marks the  
youngest probe queue entry.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–33  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
I/O Write Buffer and the WMB Instruction  
c. When a probe response has been sent to the system for the marked probe queue  
entry, the Cbox considers the WMB to be satisfied.  
If Cbox CSR SYSBUS_MB_ENABLE is set, the Cbox performs the following  
actions:  
a. Stalls further MAF and IOWB processing.  
b. Sends the MB command to the system port.  
c. Waits until the MB command is acknowledged by the system with a SysDc  
MBDone command, then sends acknowledge and marks the youngest entry in  
the probe queue.  
d. When a probe response has been sent to the system for the marked probe queue  
entry, the Cbox considers the WMB to be satisfied.  
2.12.1.3 TB Fill Flow  
Load instructions (HW_LDs) to a virtual page table entry (VPTE) are processed by the  
21264/EV68A to avoid litmus test problems associated with the ordering of memory  
transactions from another processor against loading of a page table entry and the subse-  
quent virtual-mode load from this processor.  
Consider the sequence shown in Table 2–12. The data could be in the Bcache. Pj should  
fetch datai if it is using PTEi.  
Table 2–12 TB Fill Flow Example Sequence 1  
Pi  
Pj  
Write Datai  
MB  
Load/Store datai  
<TB miss>  
Write PTEi  
Load-PTE  
<write TB>  
Load/Store (restart)  
Also consider the related sequence shown in Table 2–13. In this case, the data could be  
cached in the Bcache; Pj should fetch datai if it is using PTEi.  
Table 2–13 TB Fill Flow Example Sequence 2  
Pi  
Pj  
Write Datai  
MB  
Istream read datai  
<TB miss>  
Write PTEi  
Load-PTE  
<write TB>  
Istream read (restart) - will miss the Icache  
The 21264/EV68A processes Dstream loads to the PTE by injecting, in hardware, some  
memory barrier processing between the PTE transaction and any subsequent load or  
store instruction. This is accomplished by the following mechanism:  
1. The integer queue issues a HW_LD instruction with VPTE.  
2–34 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Performance Measurement Support—Performance Counters  
2. The integer queue issues a HW_MTPR instruction with a DTB_PTE0, that is data-  
dependent on the HW_LD instruction with a VPTE, and is required in order to fill  
the DTBs. The HW_MTPR instruction, when queued, sets IPR scoreboard bits [4]  
and [0].  
3. When a HW_MTPR instruction with a DTB_PTE0 is issued, the Ibox signals the  
Cbox indicating that a HW_LD instruction with a VPTE has been processed. This  
causes the Cbox to begin processing the MB instruction. The Ibox prevents any  
subsequent memory operations being issued by not clearing the IPR scoreboard bit  
[0]. IPR scoreboard bit [0] is one of the scoreboard bits associated with the  
HW_MTPR instruction with DTB_PTE0.  
4. When the Cbox completes processing the MB instruction (using one of the above  
sequences, depending upon the state of SYSBUS_MB_ENABLE), the Cbox sig-  
nals the Ibox to clear IPR scoreboard bit [0].  
The 21264/EV68A uses a similar mechanism to process Istream TB misses and fills to  
the PTE for the Istream.  
1. The integer queue issues a HW_LD instruction with VPTE.  
2. The IQ issues a HW_MTPR instruction with an ITB_PTE that is data-dependent  
upon the HW_LD instruction with VPTE. This is required in order to fill the ITB.  
The HW_MTPR instruction, when queued, sets IPR scoreboard bits [4] and [0].  
3. The Cbox issues a HW_MTPR instruction for the ITB_PTE and signals the Ibox  
that a HW_LD/VPTE instruction has been processed, causing the Cbox to start pro-  
cessing the MB instruction. The Mbox stalls Ibox fetching from when the HW_LD/  
VPTE instruction finishes until the probe queue is drained.  
4. When the 21264/EV68A is finished (SYS_MB selects one of the above sequences),  
the Cbox directs the Ibox to clear IPR scoreboard bit [0]. Also, the Mbox directs the  
Ibox to start prefetching.  
Inserting MB instruction processing within the TB fill flow is only required for multi-  
processor systems. Uniprocessor systems can disable MB instruction processing by  
deasserting Ibox CSR I_CTL[TB_MB_EN].  
2.13 Performance Measurement Support—Performance Counters  
The 21264/EV68A provides hardware support for two methods of obtaining program  
performance feedback information. The two methods do not require program modifica-  
tion. The first method offers similar capabilities to earlier microprocessor performance  
counters. The second method supports the new ProfileMe way of statistically sampling  
individual instructions during program execution to develop a model of program execu-  
tion. Both methods use the same hardware registers.  
See Section 6.10 for information about counter control.  
2.14 Floating-Point Control Register  
The floating-point control register (FPCR) is shown in Figure 2–11.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–35  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Floating-Point Control Register  
Figure 2–11 Floating-Point Control Register  
63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47  
0
SUM  
INED  
UNFD  
UNDZ  
DYN  
IOV  
INE  
UNF  
OVF  
DZE  
INV  
OVFD  
DZED  
INVD  
DNZ  
LK99-0050A  
The floating-point control register fields are described in Table 2–14.  
Table 2–14 Floating-Point Control Register Fields  
Name  
Extent  
Type  
Description  
SUM  
[63]  
RW  
Summary bit. Records bit-wise OR of FPCR exception bits.The summary bit is  
not directly modified by writes to bit 63 of the FPCR, but is indirectly modified  
by changes to FPCR bits 57–52.  
INED  
[62]  
[61]  
RW  
RW  
Inexact Disable. If this bit is set and a floating-point instruction that enables  
trapping on inexact results generates an inexact value, the result is placed in the  
destination register and the trap is suppressed.  
UNFD  
Underflow Disable. The 21264/EV68A hardware cannot generate IEEE com-  
pliant denormal results. UNFD is used in conjunction with UNDZ as follows:  
UNFD UNDZ  
Result  
0
1
1
X
0
Underflow trap.  
Trap to supply a possible denormal result.  
1
Underflow trap suppressed. Destination is written  
with a true zero (+0.0).  
UNDZ  
[60]  
RW  
Underflow to zero. When UNDZ is set together with UNFD, underflow traps  
are disabled and the 21264/EV68A places a true zero in the destination register.  
See UNFD, above.  
2–36 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
AMASK and IMPLVER Instruction Values  
Table 2–14 Floating-Point Control Register Fields (Continued)  
Name  
Extent  
Type  
Description  
DYN  
[59:58] RW  
Dynamic rounding mode. Indicates the rounding mode to be used by an IEEE  
floating-point instruction when the instruction specifies dynamic rounding  
mode:  
Bits Meaning  
00  
01  
10  
11  
Chopped  
Minus infinity  
Normal  
Plus infinity  
IOV  
[57]  
[56]  
[55]  
[54]  
[53]  
[52]  
[51]  
RW  
RW  
RW  
RW  
RW  
RW  
RW  
Integer overflow. A CVTGQ, CVTTQ, or CVTQL overflowed the destination  
precision.  
INE  
Inexact result. A floating-point arithmetic or conversion operation gave a result  
that differed from the mathematically exact result.  
UNF  
OVF  
DZE  
INV  
Underflow. A floating-point arithmetic or conversion operation gave a result  
that underflowed the destination exponent.  
Overflow. A floating-point arithmetic or conversion operation gave a result that  
overflowed the destination exponent.  
Divide by zero. An attempt was made to perform a floating-point divide with a  
divisor of zero.  
Invalid operation. An attempt was made to perform a floating-point arithmetic  
operation and one or more of its operand values were illegal.  
OVFD  
Overflow disable. If this bit is set and a floating-point arithmetic operation gen-  
erates an overflow condition, then the appropriate IEEE nontrapping result is  
placed in the destination register and the trap is suppressed.  
DZED  
INVD  
[50]  
[49]  
RW  
RW  
Division by zero disable. If this bit is set and a floating-point divide by zero is  
detected, the appropriate IEEE nontrapping result is placed in the destination  
register and the trap is suppressed.  
Invalid operation disable. If this bit is set and a floating-point operate generates  
an invalid operation condition and 21264/EV68A is capable of producing the  
correct IEEE nontrapping result, that result is placed in the destination register  
and the trap is suppressed.  
DNZ  
[48]  
RW  
Denormal operands to zero. If this bit is set, treat all Denormal operands as a  
signed zero value with the same sign as the Denormal operand.  
1
Reserved [47:0]  
1
Alpha architecture FPCR bit 47 (DNOD) is not implemented by the 21264/EV68A.  
2.15 AMASK and IMPLVER Instruction Values  
The AMASK and IMPLVER instructions return the supported architecture extensions  
and processor type , respectively.  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–37  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Design Examples  
2.15.1 AMASK  
The 21264/EV68A returns the AMASK instruction values provided in Table 2–15. The  
I_CTL register reports the 21264/EV68A pass level (see I_CTL[CHIP_ID], Section  
Table 2–15 21264/EV68A AMASK Values  
21264/EV68A Pass Level  
AMASK Feature Mask Value  
1307  
See I_CTL[CHIP_ID], Table 5–11  
16  
The AMASK bit definitions provided in Table 2–15 are defined in Table 2–16.  
Table 2–16 AMASK Bit Assignments  
Bit  
Meaning  
0
Support for the byte/word extension (BWX)  
The instructions that comprise the BWX extension are LDBU, LDWU, SEXTB,  
SEXTW, STB, and STW.  
1
Support for the square-root and floating-point convert extension (FIX)  
The instructions that comprise the FIX extension are FTOIS, FTOIT, ITOFF, ITOFS,  
ITOFT, SQRTF, SQRTG, SQRTS, and SQRTT.  
2
8
Support for the count extension (CIX)  
The instructions that comprise the CIX extension are CTLZ, CTPOP, and CTTZ.  
Support for the multimedia extension (MVI)  
The instructions that comprise the MVI extension are MAXSB8, MAXSW4,  
MAXUB8, MAXUW4, MINSB8, MINSW4, MINUB8, MINUW4, PERR, PKLB,  
PKWB, UNPKBL, and UNPKBW.  
9
Support for precise arithmetic trap reporting in hardware. The trap PC is the same as  
the instruction PC after the trapping instruction is executed.  
12  
Support for using a prefetch with modify intent to improve the performance of the  
first attempt to acquire a lock. When clear, indicates possible prefetch error with  
locks, described in waiver 10 to the Alpha Architecture and in the prefetch section of  
the appropriate processor (21264/EV6 and 21264/EV67) documents.  
2.15.2 IMPLVER  
For the 21264/EV68A, the IMPLVER instruction returns the value 2.  
2.16 Design Examples  
The 21264/EV68A can be designed into many different uniprocessor and multiproces-  
sor system configurations. Figures 2–12 and 2–13 illustrate two possible configura-  
tions. These configurations employ additional system/memory controller chipsets.  
Figure 2–12 shows a typical uniprocessor system with a second-level cache. This sys-  
tem configuration could be used in standalone or networked workstations.  
2–38 Internal Architecture  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Design Examples  
Figure 2–12 Typical Uniprocessor Configuration  
Duplicate  
Tag Store  
(Optional)  
L2 Cache  
21272 Core  
Logic Chipset  
21264  
Tag  
Store  
Tag  
Address  
Out  
Control  
Chips  
Address  
DRAM  
Arrays  
Address  
In  
Data Slice  
Chips  
Data  
Store  
Data  
Address  
Data  
Host PCI  
Bridge Chip  
Data  
64-bit PCI Bus  
FM-05573-EV67  
Figure 2–13 shows a typical multiprocessor system, each processor with a second-level  
cache. Each interface controller must employ a duplicate tag store to maintain cache  
coherency. This system configuration could be used in a networked database server  
application.  
Figure 2–13 Typical Multiprocessor Configuration  
21272 Core  
Logic Chipset  
21264  
DRAM  
Arrays  
L2  
Cache  
Address  
Data  
Control  
Chip  
21264  
Data Slice  
Chips  
DRAM  
Arrays  
L2  
Cache  
Address  
Data  
Host PCI  
Bridge Chip  
Host PCI  
Bridge Chip  
64-bit PCI Bus  
64-bit PCI Bus  
FM-05574-EV67  
21264/EV68A Hardware Reference Manual  
Internal Architecture 2–39  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Download from Www.Somanuals.com. All Manuals Search And Download.  
3
Hardware Interface  
This chapter contains the 21264/EV68A microprocessor logic symbol and provides  
information about signal names, their function, and their location. This chapter also  
describes the mechanical specifications of the 21264/EV68A. It is organized as fol-  
lows:  
The 21264/EV68A logic symbol  
The 21264/EV68A signal names and functions  
Lists of the signal pins, sorted by name and PGA location  
The specifications for the 21264/EV68A mechanical package  
The top and bottom views of the 21264/EV68A pinouts  
3.1 21264/EV68A Microprocessor Logic Symbol  
Figure 3–1 show the logic symbol for the 21264/EV68A chip.  
21264/EV68A Hardware Reference Manual  
Hardware Interface  
3–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
21264/EV68A Microprocessor Logic Symbol  
Figure 3–1 21264/EV68A Microprocessor Logic Symbol  
21264  
Bcache Interface  
BcAdd_H[23:4]  
System Interface  
SysAddIn_L[14:0]  
SysAddInClk_L  
BcData_H[127:0]  
BcCheck_H[15:0]  
BcDataInClk_H[7:0]  
BcDataOutClk_[3:0]  
BcDataOE_L  
SysAddOut_L[14:0]  
SysAddOutClk_L  
SysVref  
SysData_L[63:0]  
SysCheck_L[7:0]  
SysDataInClk_H[7:0]  
SysDataOutClk_L[7:0]  
SysDataInValid_L  
SysDataOutValid_L  
SysFillValid_L  
BcDataWr_L  
BcTag_H[42:20]  
BcTagInClk_H  
BcTagOutClk_x  
BcVref  
BcTagDirty_H  
BcTagParity_H  
BcTagShared_H  
BcTagValid_H  
BcTagOE_L  
BcTagWr_L  
BcLoad_L  
Clocks  
ClkIn_x  
FrameClk_x  
EV6Clk_x  
PLL_VDD  
2.5 V  
Miscellaneous  
IRQ_H[5:0]  
ClkFwdRst_H  
SromData_H  
Tms_H  
Trst_L  
Tick_H  
Tdi_H  
PllBypass_H  
MiscVref  
Reset_L  
SromClk_H  
SromOE_L  
TestStat_H  
Tdo_H  
DCOK_H  
FM-05646b  
3–2  
Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
21264/EV68A Signal Names and Functions  
3.2 21264/EV68A Signal Names and Functions  
Table 3–1 defines the 21264/EV68A signal types referred to in this section.  
Table 3–1 Signal Pin Types Definitions  
Signal Type  
Inputs  
Definition  
I_DC_REF  
I_DA  
Input DC reference pin  
Input differential amplifier receiver  
Input clock pin  
I_DA_CLK  
Outputs  
O_OD  
Open drain output driver  
Open drain driver for test pins  
Push/pull output driver  
O_OD_TP  
O_PP  
O_PP_CLK  
Bidirectional  
B_DA_OD  
B_DA_PP  
Other  
Push/pull output clock driver  
Bidirectional differential amplifier receiver with open drain output  
Bidirectional differential amplifier receiver with push/pull output  
1
Spare  
Reserved to COMPAQ  
NoConnect  
No connection — Do not connect to these pins for any revision of the  
21264/EV68A. These pins must float.  
1
All Spare connections are Reserved to COMPAQ to maintain compatibility between  
passes of the chip. Designers should not use these pins.  
Table 3–2 lists all signal pins in alphabetic order and provides a full functional descrip-  
tion of the pins. Table 3–4 lists the signal pins and their corresponding pin grid array  
(PGA) locations in alphabetic order for the signal type. Table 3–5 lists the pin grid array  
locations in alphabetical order.  
Table 3–2 21264/EV68A Signal Descriptions  
Signal  
Type  
Count Description  
BcAdd_H[23:4]  
BcCheck_H[15:0]  
BcData_H[127:0]  
BcDataInClk_H[7:0]  
O_PP  
20  
16  
128  
8
These signals provide the index to the Bcache.  
B_DA_PP  
B_DA_PP  
I_DA  
ECC check bits for BcData_H[127:0].  
Bcache data signals.  
Bcache data input clocks. These clocks are used with high  
speed SDRAMs, such as DDRs, that provide a clock-out with  
data-output pins to optimize Bcache read bandwidths. The  
21264/EV68A internally synchronizes the data to its logic with  
clock forward receive circuits similar to the system interface.  
BcDataOE_L  
O_PP  
1
Bcache data output enable. The 21264/EV68A asserts this sig-  
nal during Bcache read operations.  
21264/EV68A Hardware Reference Manual  
Hardware Interface  
3–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                         
21264/EV68A Signal Names and Functions  
Table 3–2 21264/EV68A Signal Descriptions (Continued)  
Signal  
Type  
Count Description  
BcDataOutClk_H[3:0] O_PP  
8
Bcache data output clocks. These free-running clocks are dif-  
BcDataOutClk_L[3:0]  
ferential copies of the Bcache clock and are derived from the  
21264/EV68A GCLK. Their period is a multiple of the GCLK  
and is fixed for all operations. They can be configured so that  
their rising edge lags BcAdd_H[23:4] by 0 to 2 GCLK cycles.  
The 21264/EV68A synchronizes tag output information with  
these clocks.  
BcDataWr_L  
O_PP  
1
Bcache data write enable. The 21264/EV68A asserts this signal  
when writing data to the Bcache data arrays.  
BcLoad_L  
O_PP  
1
Bcache burst enable.  
Bcache tag bits.  
BcTag_H[42:20]  
BcTagDirty_H  
B_DA_PP  
B_DA_PP  
23  
1
Tag dirty state bit. During cache write operations, the 21264/  
EV68A will assert this signal if the Bcache data has been mod-  
ified.  
BcTagInClk_H  
I_DA  
1
Bcache tag input clock. The 21264/EV68A uses this input  
clock to latch the tag information on Bcache read operations.  
This clock is used with high-speed SDRAMs, such as DDRs,  
that provide a clock-out with data-output pins to optimize  
Bcache read bandwidths. The 21264/EV68A internally syn-  
chronizes the data to its logic with clock forward receive cir-  
cuits similar to the system interface.  
BcTagOE_L  
O_PP  
O_PP  
1
2
Bcache tag output enable. This signal is asserted by the 21264/  
EV68A for Bcache read operations.  
BcTagOutClk_H  
BcTagOutClk_L  
Bcache tag output clock. These clocks “echo” the clock-for-  
warded BcDataOutClk_x[3:0] clocks.  
BcTagParity_H  
BcTagShared_H  
B_DA_PP  
B_DA_PP  
1
1
Tag parity state bit.  
Tag shared state bit. The 21264/EV68A will write a 1 on this  
signal line if another agent has a copy of the cache line.  
BcTagValid_H  
BcTagWr_L  
B_DA_PP  
O_PP  
1
1
Tag valid state bit. If set, this line indicates that the cache line  
is valid.  
Tag RAM write enable. The 21264/EV68A asserts this signal  
when writing a tag to the Bcache tag arrays.  
BcVref  
I_DC_REF  
I_DA  
1
1
Bcache tag reference voltage.  
ClkFwdRst_H  
Systems assert this synchronous signal to wake up a powered-  
down 21264/EV68A. The ClkFwdRst_H signal is clocked  
into a 21264/EV68A register by the captured FrameClk_x  
signals. Systems must ensure that the timing of this signal  
meets 21264/EV68A requirements (see Section 4.7.2).  
ClkIn_H  
ClkIn_L  
I_DA_CLK  
I_DA  
2
1
2
Differential input signals provided by the system.  
DCOK_H  
dc voltage OK. Must be deasserted until dc voltage reaches  
proper operating level. After that, DCOK_H is asserted.  
EV6Clk_H  
EV6Clk_L  
O_PP_CLK  
Provides an external test point to measure phase alignment of  
the PLL.  
3–4  
Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                 
21264/EV68A Signal Names and Functions  
Table 3–2 21264/EV68A Signal Descriptions (Continued)  
Signal  
Type  
Count Description  
FrameClk_H  
FrameClk_L  
I_DA_CLK  
2
6
A skew-controlled differential 50% duty cycle copy of the sys-  
tem clock. It is used by the 21264/EV68A as a reference, or  
framing, clock.  
IRQ_H[5:0]  
I_DA  
These six interrupt signal lines may be asserted by the system.  
The response of the 21264/EV68A is determined by the system  
software.  
MiscVref  
I_DC_REF  
I_DA  
1
1
Voltage reference for the miscellaneous pins  
(see Table 3–3).  
PllBypass_H  
When asserted, this signal will cause the two input clocks  
(ClkIn_x) to be applied to the 21264/EV68A internal circuits,  
instead of the 21264/EV68A global clock (GCLK).  
PLL_VDD  
Reset_L  
2.5 V  
I_DA  
1
1
2.5-V dedicated power supply for the 21264/EV68A PLL.  
System reset. This signal protects the 21264/EV68A from  
damage during initial power-up. It must be asserted until  
DCOK_H is asserted. After that, it is deasserted and the  
21264/EV68A begins its reset sequence.  
SromClk_H  
O_OD_TP  
1
Serial ROM clock. Supplies the clock that causes the SROM to  
advance to the next bit. The cycle time for this clock is 256  
times the cycle time of the GCLK (internal 21264/EV68A  
clock).  
SromData_H  
SromOE_L  
I_DA  
1
Serial ROM data. Input data line from the SROM.  
O_OD_TP  
I_DA  
1
Serial ROM enable. Supplies the output enable to the SROM.  
SysAddIn_L[14:0]  
15  
Time-multiplexed command/address/ID/Ack from system to  
the 21264/EV68A.  
SysAddInClk_L  
I_DA  
1
Single-ended forwarded clock from system for  
SysAddIn_L[14:0] and SysFillValid_L.  
SysAddOut_L[14:0]  
SysAddOutClk_L  
O_OD  
15  
1
Time-multiplexed command/address/ID/mask from the 21264/  
EV68A to the system bus.  
O_OD  
Single-ended forwarded clock output for  
SysAddOut_L[14:0].  
SysCheck_L[7:0]  
SysData_L[63:0]  
B_DA_OD  
8
Quadword ECC check bits for SysData_L[63:0].  
B_DA_OD 64  
Data bus for memory and I/O data.  
SysDataInClk_H[7:0] I_DA  
SysDataInValid_L I_DA  
SysDataOutClk_L[7:0] O_OD  
8
1
8
1
1
Single-ended system-generated clocks for clock forwarded  
input system data.  
When asserted, marks a valid data cycle for data transfers to  
the 21264/EV68A.  
Single-ended 21264/EV68A-generated clocks for clock for-  
warded output system data.  
SysDataOutValid_L  
SysFillValid_L  
I_DA  
I_DA  
When asserted, marks a valid data cycle for data transfers from  
the 21264/EV68A.  
When asserted, this bit indicates validation for the cache fill  
delivered in the previous system SysDc command.  
21264/EV68A Hardware Reference Manual  
Hardware Interface  
3–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                       
21264/EV68A Signal Names and Functions  
Table 3–2 21264/EV68A Signal Descriptions (Continued)  
Signal  
Type  
Count Description  
SysVref  
Tck_H  
I_DC_REF  
I_DA  
1
1
1
1
1
System interface reference voltage.  
IEEE 1149.1 test clock.  
Tdi_H  
I_DA  
IEEE 1149.1 test data-in signal.  
IEEE 1149.1 test data-out signal.  
Tdo_H  
TestStat_H  
O_OD_TP  
O_OD_TP  
Test status pin. System reset drives the test status pin low.  
The TestStat_H pin is forced high at the start of the Icache  
BiST. If the Icache BiST passes, the pin is deasserted at the end  
of the BiST operation; otherwise, it remains high.  
The 21264/EV68A generates a timeout reset signal if an  
instruction is not retired within one billion cycles.  
The 21264/EV68A signals the timeout reset event by output-  
ting a 256 GCLK cycle wide pulse on TestStat_H.  
Tms_H  
Trst_L  
I_DA  
I_DA  
1
1
IEEE 1149.1 test mode select signal.  
IEEE 1149.1 test access port (TAP) reset signal.  
Table 3–3 lists signals by function and provides an abbreviated description.  
Table 3–3 21264/EV68A Signal Descriptions by Function  
Signal  
Type  
Count Description  
BcVref Domain  
BcAdd_H[23:4]  
BcCheck_H[15:0]  
BcData_H[127:0]  
BcDataInClk_H[7:0]  
BcDataOE_L  
O_PP  
20  
16  
128  
8
Bcache index.  
B_DA_PP  
B_DA_PP  
I_DA  
ECC check bits for BcData_H[127:0].  
Bcache data.  
Bcache data input clocks.  
Bcache data output enable.  
Bcache data output clocks.  
O_PP  
1
BcDataOutClk_H[3:0] O_PP  
8
BcDataOutClk_L[3:0]  
BcDataWr_L  
BcLoad_L  
O_PP  
1
Bcache data write enable.  
Bcache burst enable.  
Bcache tag bits.  
O_PP  
1
BcTag_H[42:20]  
BcTagDirty_H  
BcTagInClk_H  
BcTagOE_L  
B_DA_PP  
B_DA_PP  
I_DA  
23  
1
Tag dirty state bit.  
1
Bcache tag input clock.  
Bcache tag output enable.  
Bcache tag output clocks.  
O_PP  
1
BcTagOutClk_H  
BcTagOutClk_L  
O_PP  
2
BcTagParity_H  
BcTagShared_H  
BcTagValid_H  
BcTagWr_L  
B_DA_PP  
B_DA_PP  
B_DA_PP  
O_PP  
1
1
1
1
Tag parity state bit.  
Tag shared state bit.  
Tag valid state bit.  
Tag RAM write enable.  
3–6  
Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
21264/EV68A Signal Names and Functions  
Table 3–3 21264/EV68A Signal Descriptions by Function (Continued)  
Signal  
Type  
Count Description  
BcVref  
I_DC_REF  
1
Tag data input reference voltage.  
SysVref Domain  
SysAddIn_L[14:0]  
SysAddInClk_L  
I_DA  
I_DA  
15  
1
Time-multiplexed SysAddIn, system-to-21264/EV68A.  
Single-ended forwarded clock from system for  
SysAddIn_L[14:0] and SysFillValid_L.  
SysAddOut_L[14:0]  
SysAddOutClk_L  
SysCheck_L[7:0]  
SysData_L[63:0]  
O_OD  
15  
1
Time-multiplexed SysAddOut, 21264/EV68A-to-system.  
Single-ended forwarded-clock.  
O_OD  
B_DA_OD  
8
Quadword ECC check bits for SysData_L[63:0].  
Data bus for memory and I/O data.  
B_DA_OD 64  
SysDataInClk_H[7:0] I_DA  
SysDataInValid_L I_DA  
SysDataOutClk_L[7:0] O_OD  
8
1
8
1
Single-ended system-generated clocks for clock forwarded  
input system data.  
When asserted, marks a valid data cycle for data transfers to  
the 21264/EV68A.  
Single-ended 21264/EV68A-generated clocks for clock for-  
warded output system data.  
SysDataOutValid_L  
I_DA  
When asserted, marks a valid data cycle for data transfers  
from the 21264/EV68A.  
SysFillValid_L  
SysVref  
I_DA  
1
1
Validation for fill given in previous SysDC command.  
System interface reference voltage.  
I_DC_REF  
Clocks and PLL  
ClkIn_H  
ClkIn_L  
I_DA_CLK  
O_PP_CLK  
I_DA_CLK  
2
2
2
Differential input signals provided by the system.  
EV6Clk_H  
EV6Clk_L  
Provides an external test point to measure phase alignment of  
the PLL.  
FrameClk_H  
FrameClk_L  
A skew-controlled differential 50% duty cycle copy of the  
system clock. It is used by the 21264/EV68A as a reference,  
or framing, clock.  
PLL_VDD  
2.5 V  
I_DA  
1
1
2.5-V dedicated power supply for the 21264/EV68A PLL.  
MiscVref Domain  
ClkFwdRst_H  
Systems assert this synchronous signal to wake up a powered-  
down 21264/EV68A. The ClkFwdRst_H signal is clocked  
into a 21264/EV68A register by the captured FrameClk_x  
signals.  
DCOK_H  
I_DA  
1
dc voltage OK. Must be deasserted until dc voltage reaches  
proper operating level. After that, DCOK_H is asserted.  
IRQ_H[5:0]  
MiscVref  
I_DA  
6
1
1
These six interrupt signal lines may be asserted by the system.  
Reference voltage for miscellaneous pins.  
I_DC_REF  
I_DA  
PllBypass_H  
When asserted, this signal will cause the input clocks  
(ClkIn_x) to be applied to the 21264/EV68A internal circuits,  
instead of the 21264/EV68A’s global clock (GCLK).  
21264/EV68A Hardware Reference Manual  
Hardware Interface  
3–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Pin Assignments  
Table 3–3 21264/EV68A Signal Descriptions by Function (Continued)  
Signal  
Type  
Count Description  
Reset_L  
I_DA  
1
System reset. This signal protects the 21264/EV68A from  
damage during initial power-up. It must be asserted until  
DCOK_H is asserted. After that, it is deasserted and the  
21264/EV68A begins its reset sequence.  
SromClk_H  
SromData_H  
SromOE_L  
Tck_H  
O_OD_TP  
I_DA  
1
1
1
1
1
1
1
1
1
Serial ROM clock.  
Serial ROM data.  
O_OD_TP  
I_DA  
Serial ROM enable.  
IEEE 1149.1 test clock.  
Tdi_H  
I_DA  
IEEE 1149.1 test data-in signal.  
IEEE 1149.1 test data-out signal.  
Test status pin.  
Tdo_H  
O_OD_TP  
O_OD_TP  
I_DA  
TestStat_H  
Tms_H  
IEEE 1149.1 test mode select signal.  
IEEE 1149.1 test access port (TAP) reset signal.  
Trst_L  
I_DA  
3.3 Pin Assignments  
The 21264/EV68A package has 587 pins aligned in a pin grid array (PGA) design.  
There are 380 functional signal pins, 1 dedicated 2.5-V pin for the PLL, 112 ground  
VSS pins, and 94 VDD pins. Table 3–4 lists the signal pins and their corresponding pin  
grid array (PGA) locations in alphabetical order for the signal type. Table 3–5 lists the  
pin grid array locations in alphabetical order.  
Table 3–4 Pin List Sorted by Signal Name  
Signal Name  
BcAdd_H_10  
BcAdd_H_13  
BcAdd_H_16  
BcAdd_H_19  
BcAdd_H_22  
BcAdd_H_5  
PGA Location Signal Name  
PGA Location Signal Name  
PGALocation  
C31  
B30  
H28  
E31  
BcAdd_H_11  
BcAdd_H_14  
BcAdd_H_17  
BcAdd_H_20  
BcAdd_H_23  
BcAdd_H_6  
D30  
G29  
D32  
B36  
E33  
BcAdd_H_12  
BcAdd_H_15  
BcAdd_H_18  
BcAdd_H_21  
BcAdd_H_4  
A33  
B34  
A35  
C35  
E27  
H30  
B28  
A29  
F28  
BcAdd_H_7  
G27  
BcAdd_H_8  
C29  
AB4  
E45  
BcAdd_H_9  
BcCheck_H_0  
BcCheck_H_11  
BcCheck_H_14  
BcCheck_H_3  
BcCheck_H_6  
BcCheck_H_9  
BcData_H_10  
BcData_H_102  
BcData_H_105  
F2  
BcCheck_H_1  
BcCheck_H_12  
BcCheck_H_15  
BcCheck_H_4  
BcCheck_H_7  
BcData_H_0  
BcData_H_100  
BcData_H_103  
BcCheck_H_10  
BcCheck_H_13  
BcCheck_H_2  
BcCheck_H_5  
BcCheck_H_8  
BcData_H_1  
BcData_H_101  
BcData_H_104  
AW1  
AC45  
AT2  
AB42  
M8  
BD10  
AT44  
BC11  
AU43  
AA3  
L3  
BB36  
M38  
BC37  
B10  
D42  
H42  
D10  
D44  
G45  
H40  
L43  
3–8  
Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Pin Assignments  
Table 3–4 Pin List Sorted by Signal Name (Continued)  
PGA Location Signal Name PGA Location Signal Name  
Signal Name  
BcData_H_106  
BcData_H_109  
BcData_H_111  
BcData_H_114  
BcData_H_117  
BcData_H_12  
BcData_H_122  
BcData_H_125  
BcData_H_13  
BcData_H_16  
BcData_H_19  
BcData_H_21  
BcData_H_24  
BcData_H_27  
BcData_H_3  
PGALocation  
L45  
BcData_H_107  
BcData_H_11  
BcData_H_112  
BcData_H_115  
BcData_H_118  
BcData_H_120  
BcData_H_123  
BcData_H_126  
BcData_H_14  
BcData_H_17  
BcData_H_2  
N45  
M2  
BcData_H_108  
T44  
U45  
AA43  
AE41  
AL43  
T2  
BcData_H_110  
BcData_H_113  
BcData_H_116  
BcData_H_119  
BcData_H_121  
BcData_H_124  
BcData_H_127  
BcData_H_15  
BcData_H_18  
BcData_H_20  
BcData_H_23  
BcData_H_26  
BcData_H_29  
BcData_H_31  
BcData_H_34  
BcData_H_37  
BcData_H_4  
W45  
AD44  
AK44  
AR45  
BA45  
BB42  
BD40  
Y4  
AC43  
AG45  
AM42  
AP40  
BB44  
BA37  
V2  
AV42  
BC41  
U1  
AC1  
AG1  
AL3  
AY2  
BB4  
C5  
AD2  
A5  
AE3  
AK2  
AP2  
AW5  
BE5  
BE7  
B40  
BcData_H_22  
BcData_H_25  
BcData_H_28  
BcData_H_30  
BcData_H_33  
BcData_H_36  
BcData_H_39  
BcData_H_41  
BcData_H_44  
BcData_H_47  
BcData_H_5  
AR1  
BB2  
BB8  
BB10  
C37  
BcData_H_32  
BcData_H_35  
BcData_H_38  
BcData_H_40  
BcData_H_43  
BcData_H_46  
BcData_H_49  
BcData_H_51  
BcData_H_54  
BcData_H_57  
BcData_H_6  
G33  
C41  
G41  
K44  
P42  
C43  
E43  
F44  
C3  
N41  
U43  
AB44  
E3  
BcData_H_42  
BcData_H_45  
BcData_H_48  
BcData_H_50  
BcData_H_53  
BcData_H_56  
BcData_H_59  
BcData_H_61  
BcData_H_64  
BcData_H_67  
BcData_H_7  
M44  
V44  
Y42  
AE43  
AJ45  
AP44  
AU41  
H6  
AD42  
AF42  
AN45  
AW45  
BA43  
BD42  
C11  
BcData_H_52  
BcData_H_55  
BcData_H_58  
BcData_H_60  
BcData_H_63  
BcData_H_66  
BcData_H_69  
BcData_H_71  
BcData_H_74  
BcData_H_77  
BcData_H_8  
AK42  
AN41  
AY44  
BC43  
BE41  
C9  
BcData_H_62  
BcData_H_65  
BcData_H_68  
BcData_H_70  
BcData_H_73  
BcData_H_76  
BcData_H_79  
BcData_H_81  
BcData_H_84  
BcData_H_87  
BB38  
A7  
B6  
B4  
D4  
E1  
G5  
D2  
BcData_H_72  
BcData_H_75  
BcData_H_78  
BcData_H_80  
BcData_H_83  
BcData_H_86  
BcData_H_89  
H4  
G1  
N5  
L1  
N1  
U3  
W5  
W1  
J3  
AB2  
AF4  
AN1  
BA1  
AC3  
AJ3  
BcData_H_82  
BcData_H_85  
BcData_H_88  
AD4  
AK4  
AU5  
AM4  
21264/EV68A Hardware Reference Manual  
Hardware Interface  
3–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Pin Assignments  
Table 3–4 Pin List Sorted by Signal Name (Continued)  
Signal Name  
PGA Location Signal Name  
PGA Location Signal Name  
PGALocation  
BcData_H_9  
K2  
BcData_H_90  
BcData_H_93  
BcData_H_96  
BcData_H_99  
BcDataInClk_H_2  
BcDataInClk_H_5  
BcDataOE_L  
BA3  
BA9  
A39  
B42  
AH2  
U39  
A27  
BcData_H_91  
BC3  
BC9  
D36  
E7  
BcData_H_92  
BD6  
AY12  
A41  
R3  
BcData_H_94  
BcData_H_95  
BcData_H_97  
BcData_H_98  
BcDataInClk_H_0  
BcDataInClk_H_3  
BcDataInClk_H_6  
BcDataInClk_H_1  
BcDataInClk_H_4  
BcDataInClk_H_7  
BC5  
AH44  
F38  
AY40  
BcDataOutClk_H_0 J5  
BcDataOutClk_H_3 AR43  
BcDataOutClk_L_2 K42  
BcDataOutClk_H_1 AU3  
BcDataOutClk_L_0 K4  
BcDataOutClk_L_3 AT42  
BcDataOutClk_H_2 J43  
BcDataOutClk_L_1 AV4  
BcDataWr_L  
BcTag_H_21  
BcTag_H_24  
BcTag_H_27  
BcTag_H_30  
BcTag_H_33  
BcTag_H_36  
BcTag_H_39  
BcTag_H_42  
BcTagOE_L  
BcTagParity_H  
BcTagWr_L  
ClkIn_H  
D26  
BcLoad_L  
BcTag_H_22  
BcTag_H_25  
BcTag_H_28  
BcTag_H_31  
BcTag_H_34  
BcTag_H_37  
BcTag_H_40  
BcTagDirty_H  
BcTagOutClk_H  
BcTagShared_H  
BcVref  
F26  
BcTag_H_20  
BcTag_H_23  
BcTag_H_26  
BcTag_H_29  
BcTag_H_32  
BcTag_H_35  
BcTag_H_38  
BcTag_H_41  
BcTagInClk_H  
BcTagOutClk_L  
BcTagValid_H  
ClkFwdRst_H  
DCOK_H  
E13  
H16  
A11  
B12  
D14  
E15  
A13  
G17  
C15  
H18  
D16  
B16  
C17  
A17  
E19  
B18  
A19  
F20  
D20  
E21  
C21  
D22  
H22  
C23  
G19  
H24  
C25  
D24  
B22  
G23  
B24  
E25  
F18  
BE11  
AY18  
AV16  
BE13  
BC15  
BB14  
BD12  
V38  
AM8  
AM6  
AW15  
AW17  
BB16  
BD2  
BD16  
AT4  
ClkIn_L  
AN7  
AL7  
BA15  
AV18  
AV22  
AV8  
AJ1  
EV6Clk_H  
FrameClk_L  
IRQ_H_2  
EV6Clk_L  
IRQ_H_0  
FrameClk_H  
IRQ_H_1  
IRQ_H_3  
IRQ_H_4  
IRQ_H_5  
MiscVref  
NoConnect  
NoConnect  
Reset_L  
PLL_VDD  
Spare  
PllBypass_H  
Spare  
Spare  
Spare  
BE9  
AJ43  
E39  
Spare  
F8  
Spare  
BD4  
T4  
Spare  
Spare  
AR3  
BA39  
BC17  
BC29  
BD24  
AY28  
Spare  
Spare  
Spare  
Spare  
BC21  
BE17  
BB24  
BE23  
BE29  
SromClk_H  
SysAddIn_L_0  
SysAddIn_L_11  
SysAddIn_L_14  
SysAddIn_L_4  
AW19  
BD30  
AV24  
AW23  
AW27  
SromData_H  
SysAddIn_L_1  
SysAddIn_L_12  
SysAddIn_L_2  
SromOE_L  
SysAddIn_L_10  
SysAddIn_L_13  
SysAddIn_L_3  
3–10 Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Pin Assignments  
Table 3–4 Pin List Sorted by Signal Name (Continued)  
Signal Name  
SysAddIn_L_5  
SysAddIn_L_8  
SysAddOut_L_0  
SysAddOut_L_11  
SysAddOut_L_14  
SysAddOut_L_4  
SysAddOut_L_7  
SysAddOutClk_L  
SysCheck_L_2  
SysCheck_L_5  
SysData_L_0  
PGA Location Signal Name  
PGA Location Signal Name  
PGALocation  
BA27  
AY26  
AW33  
AW29  
BB30  
BA33  
AV30  
BD34  
AK8  
AA41  
F14  
SysAddIn_L_6  
SysAddIn_L_9  
SysAddOut_L_1  
SysAddOut_L_12  
SysAddOut_L_2  
SysAddOut_L_5  
SysAddOut_L_8  
SysCheck_L_0  
SysCheck_L_3  
SysCheck_L_6  
SysData_L_1  
BD28  
BC25  
BE39  
BC31  
BD36  
AY32  
BB32  
L7  
SysAddIn_L_7  
BE27  
BB26  
BE33  
AV28  
BC35  
BE35  
BA31  
AA5  
L39  
SysAddInClk_L  
SysAddOut_L_10  
SysAddOut_L_13  
SysAddOut_L_3  
SysAddOut_L_6  
SysAddOut_L_9  
SysCheck_L_1  
SysCheck_L_4  
SysCheck_L_7  
SysData_L_10  
SysData_L_13  
SysData_L_16  
SysData_L_19  
SysData_L_21  
SysData_L_24  
SysData_L_27  
SysData_L_3  
BA13  
AM40  
G13  
AY34  
P6  
SysData_L_11  
SysData_L_14  
SysData_L_17  
SysData_L_2  
T8  
SysData_L_12  
SysData_L_15  
SysData_L_18  
SysData_L_20  
SysData_L_23  
SysData_L_26  
SysData_L_29  
SysData_L_31  
SysData_L_34  
SysData_L_37  
SysData_L_4  
V8  
V6  
W7  
Y6  
AB8  
AE5  
AH8  
AP8  
AV6  
H12  
AC7  
F12  
AD8  
AH6  
AL5  
SysData_L_22  
SysData_L_25  
SysData_L_28  
SysData_L_30  
SysData_L_33  
SysData_L_36  
SysData_L_39  
SysData_L_41  
SysData_L_44  
SysData_L_47  
SysData_L_5  
AJ7  
AR7  
AV10  
AV12  
F34  
AT8  
AW11  
AW13  
H34  
SysData_L_32  
SysData_L_35  
SysData_L_38  
SysData_L_40  
SysData_L_43  
SysData_L_46  
SysData_L_49  
SysData_L_51  
SysData_L_54  
SysData_L_57  
SysData_L_6  
F32  
G35  
F40  
G39  
K38  
J41  
H10  
M40  
T38  
N39  
SysData_L_42  
SysData_L_45  
SysData_L_48  
SysData_L_50  
SysData_L_53  
SysData_L_56  
SysData_L_59  
SysData_L_61  
SysData_L_7  
P40  
V40  
W41  
AB38  
AD38  
AJ39  
AN39  
AT38  
AV36  
K8  
W39  
AC39  
AF40  
AL41  
AP38  
F6  
Y40  
G7  
SysData_L_52  
SysData_L_55  
SysData_L_58  
SysData_L_60  
SysData_L_63  
SysData_L_9  
AH38  
AK38  
AR39  
AY38  
AV34  
N7  
SysData_L_62  
SysData_L_8  
AW35  
M6  
SysDataInClk_H_0 D8  
SysDataInClk_H_3 AY6  
SysDataInClk_H_6 AG41  
SysDataOutClk_L_0 G11  
SysDataOutClk_L_3 AY8  
SysDataInClk_H_1 P4  
SysDataInClk_H_4 E37  
SysDataInClk_H_7 AV40  
SysDataOutClk_L_1 U7  
SysDataOutClk_L_4 H36  
SysDataInClk_H_2 AF6  
SysDataInClk_H_5 R43  
SysDataInValid_L  
BD22  
SysDataOutClk_L_2 AG7  
21264/EV68A Hardware Reference Manual  
Hardware Interface 3–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Pin Assignments  
Table 3–4 Pin List Sorted by Signal Name (Continued)  
Signal Name  
PGA Location Signal Name  
PGA Location Signal Name  
PGALocation  
SysDataOutClk_L_5 R41  
SysDataOutValid_L BB22  
SysDataOutClk_L_6 AH40 SysDataOutClk_L_7 AW39  
SysFillValid_L  
Tdi_H  
BC23  
BA21  
BD18  
SysVref  
Tdo_H  
Trst_L  
BA25  
BB20  
AY20  
Tck_H  
BE19  
BA19  
TestStat_H  
Tms_H  
Table 3–5 Pin List Sorted by PGA Location  
PGA Location Signal Name PGA Location Signal Name  
PGA Location Signal Name  
A11  
BcTag_H_22  
BcTag_H_36  
BcAdd_H_15  
BcData_H_98  
BcCheck_H_9  
SysCheck_L_1  
BcCheck_H_1  
SysData_L_16  
SysData_L_49  
SysData_L_17  
BcData_H_82  
SysData_L_18  
BcData_H_49  
SysData_L_51  
BcData_H_19  
A13  
BcTag_H_26  
BcDataOE_L  
BcAdd_H_19  
BcData_H_2  
SysCheck_L_5  
BcData_H_80  
BcCheck_H_5  
BcData_H_16  
BcData_H_112  
BcData_H_17  
BcData_H_48  
BcData_H_18  
SysData_L_19  
BcData_H_50  
SysDataInClk_H_6  
BcDataInClk_H_2  
BcDataInClk_H_6  
Spare  
A17  
BcTag_H_33  
A19  
A27  
A29  
BcAdd_H_6  
A33  
A35  
A39  
BcData_H_96  
BcData_H_65  
BcData_H_111  
SysData_L_48  
BcData_H_47  
BcData_H_81  
BcCheck_H_13  
SysData_L_50  
BcData_H_113  
BcData_H_114  
BcData_H_83  
SysDataInClk_H_2  
BcData_H_115  
SysData_L_52  
SysData_L_20  
BcData_H_84  
BcData_H_51  
SysData_L_55  
BcData_H_116  
SysData_L_54  
EV6Clk_L  
A41  
A5  
A7  
AA3  
AA5  
AB4  
AB8  
AC39  
AC7  
AD4  
AD8  
AE43  
AF40  
AG1  
AG7  
AH40  
AH8  
AJ39  
AJ7  
AA41  
AB2  
AB42  
AC1  
AC43  
AD2  
AD42  
AE3  
AE5  
AF42  
AG41  
AA43  
AB38  
AB44  
AC3  
AC45  
AD38  
AD44  
AE41  
AF4  
AF6  
AG45  
AH38  
AH6  
AJ3  
SysDataOutClk_L_2 AH2  
SysDataOutClk_L_6 AH44  
SysData_L_21  
SysData_L_53  
SysData_L_22  
BcData_H_85  
SysCheck_L_2  
BcData_H_117  
BcData_H_87  
EV6Clk_H  
AJ1  
AJ43  
AK2  
AK42  
AL3  
Spare  
AJ45  
AK38  
AK44  
AL41  
AL7  
BcData_H_20  
BcData_H_52  
BcData_H_21  
SysData_L_23  
SysCheck_L_6  
ClkIn_H  
AK4  
AK8  
AL43  
AM4  
AM6  
AN39  
AN7  
AP40  
AL5  
AM40  
AM8  
AN41  
AP2  
AM42  
AN1  
AN45  
AP38  
AP8  
BcData_H_118  
BcData_H_86  
BcData_H_53  
SysData_L_57  
SysData_L_24  
SysData_L_56  
ClkIn_L  
BcData_H_55  
BcData_H_23  
BcData_H_54  
BcData_H_120  
AP44  
3–12 Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Pin Assignments  
Table 3–5 Pin List Sorted by PGA Location (Continued)  
PGA Location Signal Name PGA Location Signal Name  
PGA Location Signal Name  
AR1  
BcData_H_22  
BcDataOutClk_H_3 AR45  
BcCheck_H_2 AT38  
AR3  
Spare  
AR39  
AR7  
SysData_L_58  
SysData_L_25  
Spare  
AR43  
AT2  
BcData_H_119  
SysData_L_59  
BcCheck_H_14  
BcData_H_57  
SysData_L_28  
IRQ_H_3  
AT4  
AT42  
AU3  
BcDataOutClk_L_3 AT44  
BcDataOutClk_H_1 AU41  
AT8  
SysData_L_26  
BcCheck_H_6  
SysData_L_30  
MiscVref  
AU43  
AV12  
AV22  
AV30  
AV4  
AU5  
BcData_H_88  
FrameClk_H  
AV10  
AV18  
AV28  
AV36  
AV42  
AW1  
AV 16  
AV 24  
AV 34  
AV 40  
AV 8  
SysAddIn_L_11  
SysData_L_63  
SysDataInClk_H_7  
PLL_VDD  
SysAddOut_L_13  
SysData_L_61  
BcData_H_122  
BcCheck_H_10  
FrameClk_L  
SysAddIn_L_14  
SysAddOut_L_0  
BcData_H_56  
DCOK_H  
SysAddOut_L_7  
BcDataOutClk_L_1  
SysData_L_27  
SysData_L_29  
IRQ_H_2  
AV6  
AW11  
AW17  
AW27  
AW35  
AW5  
AY2  
AW13  
AW19  
AW29  
AW39  
AY12  
AY20  
AY32  
AY40  
AY8  
SysData_L_31  
SromClk_H  
AW15  
AW23  
AW33  
SysAddIn_L_4  
SysData_L_62  
BcData_H_26  
BcData_H_24  
SysAddIn_L_2  
SysData_L_60  
SysDataInClk_H_3  
BcTag_H_23  
BcTagParity_H  
BcAdd_H_10  
BcData_H_68  
BcData_H_67  
IRQ_H_0  
SysAddOut_L_11  
SysDataOutClk_L_7 AW45  
BcData_H_95  
Trst_L  
AY18  
AY26  
AY34  
AY44  
SysAddIn_L_8  
SysCheck_L_7  
BcData_H_58  
BcData_H_0  
BcTag_H_35  
BcAdd_H_4  
BcAdd_H_20  
BcData_H_99  
SysCheck_L_3  
Tdi_H  
AY28  
AY38  
AY6  
SysAddOut_L_5  
BcDataInClk_H_7  
SysDataOutClk_L_3 B10  
B12  
B16  
BcTag_H_31  
B18  
B22  
B24  
BcTagValid_H  
BcAdd_H_18  
BcData_H_34  
BcData_H_89  
TestStat_H  
B28  
B30  
B34  
B36  
B4  
B40  
B42  
B6  
BA1  
BA13  
BA21  
BA3  
BA15  
BA25  
BA31  
BA39  
BA9  
BA19  
BA27  
BA33  
BA43  
BB10  
BB2  
SysVref  
SysAddIn_L_5  
SysAddOut_L_4  
BcData_H_59  
BcData_H_30  
BcData_H_25  
SysAddIn_L_10  
SysAddOut_L_8  
BcData_H_27  
BcData_H_28  
SromData_H  
BcData_H_90  
BcData_H_126  
BcData_H_121  
NoConnect  
SysAddOut_L_9  
Spare  
BA37  
BA45  
BB14  
BB20  
BB26  
BB36  
BB42  
BC11  
BC21  
BcData_H_93  
IRQ_H_5  
BB16  
BB22  
BB30  
BB38  
BB44  
BC15  
BC23  
Tdo_H  
SysDataOutValid_L  
SysAddOut_L_14  
BcData_H_62  
BcData_H_123  
IRQ_H_4  
BB24  
BB32  
BB4  
SysAddInClk_L  
BcCheck_H_15  
BcData_H_124  
BcCheck_H_3  
Spare  
BB8  
BC17  
SysFillValid_L  
21264/EV68A Hardware Reference Manual  
Hardware Interface 3–13  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Pin Assignments  
Table 3–5 Pin List Sorted by PGA Location (Continued)  
PGA Location Signal Name  
PGA Location Signal Name  
PGA Location Signal Name  
BC25  
BC31  
BC41  
BC9  
BD16  
BD22  
BD30  
BD4  
BD6  
BE17  
BE27  
BE35  
BE5  
C11  
C21  
C29  
C35  
C43  
D10  
D2  
SysAddIn_L_9  
SysAddOut_L_12  
BcData_H_125  
BcData_H_94  
Reset_L  
BC29  
BC35  
BC43  
BD10  
BD18  
BD24  
BD34  
BD40  
BE11  
BE19  
BE29  
BE39  
BE7  
C15  
SysAddIn_L_1  
SysAddOut_L_3  
BcData_H_60  
BcCheck_H_11  
Tms_H  
BC3  
BC37  
BC5  
BD12  
BD2  
BD28  
BD36  
BD42  
BE13  
BE23  
BE33  
BE41  
BE9  
C17  
C25  
C31  
C41  
C9  
BcData_H_91  
BcCheck_H_7  
BcDataInClk_H_3  
PllBypass_H  
NoConnect  
SysDataInValid_L  
SysAddIn_L_0  
Spare  
SysAddIn_L_12  
SysAddOutClk_L  
BcData_H_127  
ClkFwdRst_H  
Tck_H  
SysAddIn_L_6  
SysAddOut_L_2  
BcData_H_61  
IRQ_H_1  
BcData_H_92  
SromOE_L  
SysAddIn_L_13  
SysAddOut_L_10  
BcData_H_63  
Spare  
SysAddIn_L_7  
SysAddOut_L_6  
BcData_H_29  
BcData_H_64  
BcTag_H_40  
BcAdd_H_8  
SysAddIn_L_3  
SysAddOut_L_1  
BcData_H_31  
BcTag_H_28  
BcTagDirty_H  
BcData_H_4  
BcData_H_33  
BcData_H_3  
BcTag_H_24  
BcTag_H_38  
BcDataWr_L  
BcData_H_97  
BcData_H_101  
BcTag_H_20  
BcTag_H_39  
BcData_H_5  
SysDataInClk_H_4  
BcCheck_H_12  
SysData_L_0  
BcTag_H_37  
SysData_L_32  
SysData_L_36  
Spare  
BcTag_H_32  
BcTagOutClk_H  
BcAdd_H_12  
BcData_H_35  
BcData_H_66  
BcTag_H_30  
BcTag_H_41  
BcAdd_H_11  
BcData_H_69  
SysDataInClk_H_0  
BcTag_H_25  
BcTagWr_L  
BcAdd_H_16  
Spare  
C23  
C3  
BcAdd_H_22  
BcData_H_36  
BcData_H_1  
BcData_H_71  
BcTagOutClk_L  
BcAdd_H_17  
BcData_H_100  
BcData_H_7  
BcTag_H_34  
BcAdd_H_5  
C37  
C5  
D14  
D20  
D26  
D36  
D44  
E13  
D16  
D22  
D30  
D4  
D24  
D32  
D42  
E1  
D8  
E15  
E25  
E31  
E39  
E7  
E19  
E21  
E27  
E3  
E33  
BcAdd_H_23  
BcData_H_37  
SysData_L_2  
BcCheck_H_0  
BcAdd_H_9  
E37  
E43  
E45  
BcDataInClk_H_0  
BcVref  
F12  
F14  
F18  
F2  
F20  
F26  
BcLoad_L  
F28  
F32  
F34  
SysData_L_33  
BcData_H_39  
BcData_H_73  
BcTag_H_27  
BcAdd_H_7  
SysData_L_35  
F38  
BcDataInClk_H_4  
SysData_L_6  
F40  
F44  
F6  
F8  
G1  
G11  
G19  
G29  
SysDataOutClk_L_0 G13  
SysData_L_1  
BcTagShared_H  
BcData_H_32  
G17  
G27  
G35  
BcTagInClk_H  
BcAdd_H_14  
G23  
G33  
3–14 Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Pin Assignments  
Table 3–5 Pin List Sorted by PGA Location (Continued)  
PGA Location Signal Name PGA Location Signal Name  
PGA Location Signal Name  
G39  
G5  
SysData_L_37  
BcData_H_70  
SysData_L_3  
BcTag_H_42  
BcAdd_H_21  
BcData_H_72  
BcData_H_6  
G41  
G7  
BcData_H_38  
SysData_L_5  
BcTag_H_21  
BcTagOE_L  
SysData_L_34  
BcData_H_102  
BcData_H_8  
G45  
H10  
H18  
H28  
H36  
H42  
J41  
BcData_H_104  
SysData_L_4  
H12  
H22  
H30  
H4  
H16  
H24  
H34  
H40  
J3  
BcTag_H_29  
BcAdd_H_13  
SysDataOutClk_L_4  
BcData_H_103  
SysData_L_39  
BcData_H_9  
H6  
J43  
K38  
K44  
L3  
BcDataOutClk_H_2 J5  
K4  
BcDataOutClk_H_0 K2  
BcDataOutClk_L_0 K42  
SysData_L_38  
BcData_H_40  
BcData_H_10  
BcData_H_106  
BcCheck_H_4  
SysData_L_8  
SysData_L_41  
BcData_H_74  
SysData_L_42  
BcDataInClk_H_1  
BcData_H_12  
BcData_H_108  
BcData_H_77  
BcData_H_109  
Spare  
BcDataOutClk_L_2  
BcData_H_75  
BcData_H_105  
BcData_H_11  
BcData_H_42  
BcData_H_76  
BcData_H_107  
SysDataInClk_H_1  
SysData_L_10  
SysDataInClk_H_5  
Spare  
K8  
SysData_L_7  
SysCheck_L_4  
SysCheck_L_0  
SysData_L_40  
BcCheck_H_8  
BcData_H_41  
SysData_L_9  
BcData_H_43  
L1  
L39  
L7  
L43  
M2  
M44  
N1  
L45  
M38  
M6  
N39  
N5  
M40  
M8  
N41  
N7  
N45  
P4  
P40  
R3  
P42  
R41  
T38  
T8  
P6  
SysDataOutClk_L_5 R43  
T2  
SysData_L_43  
SysData_L_11  
T4  
T44  
U3  
U1  
BcData_H_13  
BcData_H_44  
BcData_H_14  
BcData_H_45  
BcData_H_79  
BcData_H_110  
BcData_H_15  
SysData_L_15  
U39  
U7  
BcDataInClk_H_5  
U43  
U45  
V38  
V6  
SysDataOutClk_L_1 V2  
V40  
V8  
SysData_L_44  
SysData_L_12  
SysData_L_45  
SysData_L_14  
BcData_H_46  
V44  
SysData_L_13  
SysData_L_46  
BcData_H_78  
SysData_L_47  
W1  
W45  
Y4  
W39  
W5  
Y40  
W41  
W7  
Y42  
Y6  
21264/EV68A Hardware Reference Manual  
Hardware Interface 3–15  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Pin Assignments  
Table 3–6 lists the 21264/EV68A ground and power (VSS and VDD, respectively) pin  
list.  
Table 3–6 Ground and Power (VSS and VDD) Pin List  
Signal  
PGA Location  
VSS  
A15  
A21  
A25  
A3  
A31  
AE1  
AL1  
A37  
A43  
A9  
AA1  
AG3  
AA39  
AG39  
AA45 AA7  
AG43 AG5  
AR41 AR5  
AC41 AC5  
AE39 AE45 AE7  
AL39 AL45 AN3  
AJ41  
AU1  
AJ5  
AN43 AN5  
AW21 AW25 AW3 AW31  
AU39 AU45 AU7  
AW37 AW41 AW43 AW7  
BA35 BA41 BA5 BA7  
BC45 BC7  
AW9  
BC1  
AY14 BA11 BA17 BA23 BA29  
BC13 BC19 BC27 BC33 BC39  
BE15 BE21 BE25 BE3  
BE31 BE37 BE43 C1  
C13  
E23  
G31  
N3  
C19  
E29  
G37  
N43  
W43  
C27  
E35  
G43  
R1  
C33  
E41  
G9  
C39  
E5  
C45  
E9  
C7  
G15  
J45  
R7  
DS8  
G21  
J7  
E11  
G25  
L41  
U41  
E17  
G3  
L5  
J1  
J39  
R5  
R39  
R45  
T42  
U5  
W3  
VDD  
A23  
AB40 AB6  
AD40 AD6  
AM2  
AF2  
AF38 AF44 AF8  
AP42 AP6  
AH4  
AH42 AK40 AK6  
AT6 AV14 AV2  
AY22 AY24 AY30 AY36 AY4  
B32 B38 B44 B8 BB12 BB18 BB28 BB34 BB40 BB6  
BD14 BD20 BD26 BD32 BD38 BD44 BD8  
AM38 AM44 AP4  
AT40  
AV20 AV26 AV32 AV38 AV44 AY10 AY16  
AY42 B14 B2 B20 B26  
D12  
F30  
H44  
T6  
D18  
F36  
K40  
V4  
D28  
F4  
D34  
F42  
M4  
Y2  
D40  
H14  
M42  
Y38  
D6  
H2  
P2  
F10  
H20  
P38  
Y8  
F16  
H26  
P44  
F22  
H32  
P8  
F24  
H38  
T40  
K6  
V42  
Y44  
3–16 Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Mechanical Specifications  
3.4 Mechanical Specifications  
This section shows the 21264/EV68A mechanical package dimensions without a heat  
sink. For heat sink information and dimensions, refer to Chapter 10.  
Figure 3–2 shows the package physical dimensions without a heat sink.  
Figure 3–2 Package Dimensions  
1.27 mm (.050 in) Typ  
4.32 mm (.170 in) Typ  
2.54 mm (.100 in) Typ  
1.377 mm (.055 in) Typ  
BE  
BC  
BA  
AW  
AU  
AR  
AN  
AL  
AJ  
AG  
AE  
AC  
AA  
W
Standoff (4x)  
587x 1.40 mm (.055 in) Typ  
BD  
BB  
AY  
AV  
AT  
AP  
AM  
AK  
AH  
AF  
AD  
AB  
Y
1.27 mm (.050 in) Typ  
Lid  
1/4-20 Stud (2x)  
V
U
T
R
P
27.94 mm  
(1.100 in)  
7.62 mm (.300 in) Typ  
N
M
L
K
J
H
G
F
E
D
C
B
A
.13 mm  
(.005 in) R  
02 04 06 08 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44  
01 03 05 07 09 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45  
27.94 mm  
(1.100 in)  
1.905 mm (.075 in) Typ  
59.94 mm (2.360 in) Typ  
29.62 mm  
(1.180 in) Typ  
29.62 mm  
(1.180 in) Typ  
25.40 mm  
(1.000 in) Typ  
53.85 mm  
(2.120 in) Typ  
FM-05662.AI4  
21264/EV68A Hardware Reference Manual  
Hardware Interface 3–17  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
21264/EV68A Packaging  
3.5 21264/EV68A Packaging  
Figure 3–3 shows the 21264/EV68A pinout from the top view with pins facing down.  
Figure 3–3 21264/EV68A Top View (Pin Down)  
3–18 Hardware Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
21264/EV68A Packaging  
Figure 3–4 shows the 21264/EV68A pinout from the bottom view with pins facing up.  
Figure 3–4 21264/EV68A Bottom View (Pin Up)  
21264/EV68A Hardware Reference Manual  
Hardware Interface 3–19  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Download from Www.Somanuals.com. All Manuals Search And Download.  
4
Cache and External Interfaces  
This chapter describes the 21264/EV68A cache and external interface, which includes  
the second-level cache (Bcache) interface and the system interface. It also describes  
locks, interrupt signals, and ECC/parity generation. It is organized as follows:  
Introduction to the external interfaces  
Physical address considerations  
Bcache structure  
Victim data buffer  
Cache coherency  
Lock mechanism  
System port  
Bcache port  
Interrupts  
Chapter 3 lists and defines all 21264/EV68A hardware interface signal pins. Chapter 9  
describes the 21264/EV68A hardware interface electrical requirements.  
4.1 Introduction to the External Interfaces  
A 21264/EV68A-based system can be divided into three major sections:  
21264/EV68A microprocessor  
Second-level Bcache  
System interface logic  
Optional duplicate tag store  
Optional lock register  
Optional victim buffers  
The 21264/EV68A external interface is flexible and mandates few design rules, allow-  
ing a wide range of prospective systems. The external interface is composed of the  
Bcache interface and the system interface.  
Input clocks must have the same frequency as their corresponding output clock. For  
example, the frequency of SysAddInClk_L must be the same as  
SysAddOutClk_L.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces  
4–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Introduction to the External Interfaces  
The Bcache interface includes a 128-bit bidirectional data bus, a 20-bit unidirec-  
tional address bus, and several control signals.  
The BcDataOutClk_x[3:0] clocks are free-running and are derived from the  
internal GCLK. The period of BcDataOutClk_x[3:0] is a programmable mul-  
tiple of GCLK.  
The Bcache turns the BcDataOutClk_x[3:0] clocks around and returns them  
to the 21264/EV68A as BcDataInClk_H[7:0]. Likewise, BcTagOutClk_x  
returns as BcTagInClk_H.  
The Bcache interface supports a 64-byte block size.  
The system interface includes a 64-bit bidirectional data bus, two 15-bit  
unidirectional address buses, and several control signals.  
The SysAddOutClk_L clock is free-running and is derived from the internal  
GCLK. The period of SysAddOutClk_L is a programmable multiple of  
GCLK.  
The SysAddInClk_L clock is a turned-around copy of SysAddOutClk_L.  
Figure 4–1 shows a simplified view of the external interface. The function and purpose  
of each signal is described in Chapter 3.  
4–2  
Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Introduction to the External Interfaces  
Figure 4–1 21264/EV68A System and Bcache Interfaces  
SysAddIn_L[14:0]  
SysAddInClk_L  
SysAddOut_L[14:0]  
SysAddOutClk_L  
SysVref  
SysData_L[63:0]  
SysCheck_L[7:0]  
SysDataInClk_H[7:0]  
SysDataOutClk_L[7:0]  
SysDataInValid_L  
SysDataOutValid_L  
SysFillValid_L  
BcAdd_H[23:4]  
[23:4]  
Data  
[23:6]  
[23:6]  
Status  
21264  
Tag  
System  
BcLoad_L  
BcData_H[127:0]  
BcCheck_H[15:0]  
BcDataInClk_H[7:0]  
BcDataOutClk_x[3:0]  
BcDataOE_L  
BcDataWr_L  
BcTag_H[42:20]  
BcTagInClk_H  
BcTagOutClk_ x  
BcVref  
BcTagWr_L  
BcTagOE_L  
BcTagValid_H  
BcTagDir ty_H  
BcTagShared_H  
BcTagParity_H  
IRQ_H[5:0]  
FM-05818B-EV67  
4.1.1 System Interface  
This section introduces the system (external) bus interface. The system interface is  
made up of two unidirectional 15-bit address buses, 64 bidirectional data lines, eight  
bidirectional check bits, two single-ended unidirectional clocks, and a few control pins.  
The 15-bit address buses provide time-shared address/command/ID in two or four  
GCLK cycles. The Cbox controls the system interface.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces  
4–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Physical Address Considerations  
4.1.1.1 Commands and Addresses  
The system sends probe and data movement commands to the 21264/EV68A. The  
21264/EV68A can hold up to eight probe commands from the system. The system con-  
trols the number of outstanding probe commands and must ensure that the 21264/  
EV68A 8-entry probe queue does not overflow.  
The Cbox contains an 8-entry miss buffer (MAF) and an 8-entry victim buffer (VAF).  
A miss occurs when the 21264/EV68A probes the Bcache but does not find the  
addressed block. The 21264/EV68A can queue eight cache misses to the system in its  
MAF.  
4.1.2 Second-Level Cache (Bcache) Interface  
The 21264/EV68A Cbox provides control signals and an interface for a second-level  
cache, the Bcache. The 21264/EV68A supports a Bcache from 1MB to 16MB, with 64-  
byte blocks. A 128-bit data bus is used for transfers between the 21264/EV68A and the  
Bcache. The Bcache must be comprised of synchronous static RAMs (SSRAMs) and  
must contain either one, two, or three internal registers. All Bcache control and address  
pins are clocked synchronously on Bcache cycle boundaries. The Bcache clock rate  
varies as a multiple of the CPU clock cycle in half-cycle increments from 1.5 to 4.0,  
and in full-cycle increments of 5, 6, 7, and 8 times the CPU clock cycle. The 1.5 multi-  
ple is only available in dual-data mode.  
4.2 Physical Address Considerations  
The 21264/EV68A supports a 44-bit physical address space that is divided equally  
between memory space and I/O space. Memory space resides in the lower half of the  
physical address space (PA[43] = 0) and I/O space resides in the upper half of the phys-  
ical address space (PA[43] = 1). The 21264/EV68A recognizes these spaces internally.  
The 21264/EV68A-generated external references to memory space are always of a  
fixed 64-byte size, though the internal access granularity is byte, word, longword, or  
quadword. All 21264/EV68A-generated external references to memory or I/O space  
are physical addresses that are either successfully translated from a virtual address or  
produced by PALcode. Speculative execution may cause a reference to nonexistent  
memory. Systems must check the range of all addresses and report nonexistent  
addresses to the 21264/EV68A.  
Table 4–1 describes the translation of internal references to external interface refer-  
ences. The first column lists the instructions used by the programmer, including load  
(LDx) and store (STx) instructions of several sizes. The column headings are described  
here:  
DcHit (block was found in the Dcache)  
DcW (block was found in a writable state in the Dcache)  
BcHit (block was found in the Bcache)  
BcW (block was found in a writable state in the Bcache)  
Status and Action (status at end of instruction and action performed by the 21264/  
EV68A)  
4–4  
Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Physical Address Considerations  
Prefetches (LDL, LDF, LDG, LDT, LDBU, LDWU) to R31 use the LDx flow, and  
prefetch with modify intent (LDS) uses the STx flow. If the prefetch target is addressed  
to I/O space, the upper address bit is cleared, converting the address to memory space  
(PA[42:6] ). Notes follow the table.  
Table 4–1 Translation of Internal References to External Interface Reference  
Instruction  
DcHit DcW  
BcHit BcW  
Status and Action  
LDx Memory  
LDx Memory  
LDx Memory  
LDx I/O  
1
0
0
X
1
0
0
1
1
0
0
0
X
0
1
X
X
X
X
X
X
X
X
1
X
1
X
X
X
X
X
X
X
X
X
1
Dcache hit, done.  
Bcache hit, done.  
0
Miss, generate RdBlk command.  
RdBytes, RdLWs, or RdQWs based on size.  
Dcache hit, Istream serviced from Dcache.  
Bcache hit, Istream serviced from Bcache.  
Miss, generate RdBlkI command.  
Store Dcache hit and writable, done.  
Store hit and not writable, set dirty flow (note 1).  
Store Bcache hit and writable, done.  
Store hit and not writable, set-dirty flow (note 1).  
Miss, generate RdBlkMod command.  
WrBytes, WrLWs, or WrQWs based on size.  
Fail STx_C.  
X
X
1
Istream Memory  
Istream Memory  
Istream Memory  
STx Memory  
STx Memory  
STx Memory  
STx Memory  
STx Memory  
STx I/O  
0
X
X
1
0
X
X
X
X
X
0
1
0
0
X
X
X
X
X
X
X
X
X
STx_C Memory  
STx_C Memory  
STx_C I/O  
STx_C hit and not writable, set dirty flow (note 1).  
X
Always succeed and WrQws or WrLws are generated,  
based on the size.  
WH64 Memory  
WH64 Memory  
WH64 Memory  
WH64 Memory  
WH64 Memory  
WH64 I/O  
1
1
0
0
0
X
1
X
X
1
X
X
1
Hit, done.  
0
WH64 hit not writable, set dirty flow (note 1).  
WH64 hit dirty, done.  
X
X
X
X
1
0
WH64 hit not writable, set dirty flow (note 1).  
Miss, generate InvalToDirty command (note 2).  
0
X
X
X
NOP the instruction. WH64 is UNDEFINED for I/O  
space.  
ECB Memory  
ECB I/O  
X
X
X
X
X
X
X
X
Generate evict command (note 3).  
NOP the instruction. ECB instruction is UNDEFINED  
for I/O space.  
MB/WMB  
X
X
X
X
Generate MB command (note 4). Also see Section 3.2.5.  
TBFill Flows  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces  
4–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Physical Address Considerations  
Table 4–1 notes:  
1. Set Dirty Flow: Based on the Cbox CSR SET_DIRTY_ENABLE[2:0], SetDirty  
requests can be either internally acknowledged (called a SetModify) or sent to the  
system environment for processing. When externally acknowledged, the shared sta-  
tus information for the cache block is also broadcast. The commands sent exter-  
nally are SharedToDirty or CleanToDirty. Based on the Cbox CSR  
ENABLE_STC_COMMAND[0], the external system can be informed of a STx_C  
generating a SetDirty using the STCChangeToDirty command. See Table 4–16 for  
more information.  
2. InvalToDirty: Based on the Cbox CSR INVAL_TO_DIRTY_ENABLE[1:0], Inval-  
ToDirty requests can be either internally acknowledged or sent to the system envi-  
ronment as InvalToDirty commands. This Cbox CSR provides the ability to convert  
WH64 instructions to RdModx operations. See Table 4–15 for more information.  
3. Evict: There are two aspects to the commands that are generated by an ECB  
instruction: first, those commands that are generated to notify the system of an evict  
being performed; second, those commands that are generated by any victim that is  
created by servicing the ECB.  
If Cbox CSR ENABLE_EVICT[0] is clear, no command is issued by the  
21264/EV68A on the external interface to notify the system of an evict being  
performed. If Cbox CSR ENABLE_EVICT[0] is set, the 21264/EV68A issues  
an Evict command on the system interface only if a Bcache index match to the  
ECB address is found in the 21264/EV68A cache system.  
Note that whenever ENABLE_EVICT[0] is true (in the write-many chain),  
BC_CLEAN_VICTIM must also be true (in the write-once chain). Otherwise,  
the 21264/EV68A could respond miss to a probe, rather than hit, before an  
Evict command has been sent off chip, but after the Evict command has  
removed a (clean) block from the internal caches and the Bcache. That behav-  
ior might cause systems that maintain an external duplicate copy of the Bcache  
tags to become confused, because the system could receive the probe response  
indicating the miss before it receives the Evict command.  
The 21264/EV68A can issue the commands CleanVictimBlk and WrVictimBlk  
for a victim that is created by an ECB. CleanVictimBlk is issued only if Cbox  
CSR BC_CLEAN_VICTIM is set and there is a Bcache index match valid but  
not dirty in the 21264/EV68A cache system. WrVictimBlk is issued for any  
Bcache match of the ECB address that is dirty in the 21264/EV68A cache sys-  
tem.  
4. MB: Based on the Cbox CSR SYSBUS_MB_ENABLE, the MB command can be  
sent to the pins.  
Each of these CSRs is programmed appropriately, based on the cache coherence proto-  
col used by the system environment. For example, uniprocessor systems would prefer  
to internally acknowledge most of these transactions. In contrast, multiprocessor sys-  
tems may require notification and control of any change in cache state. The 21264/  
EV68A and the external system must cooperate to maintain cache coherence. Section  
4.5 explains the 21264/EV68A part of the cache coherency protocol.  
4–6  
Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Bcache Structure  
4.3 Bcache Structure  
The 21264/EV68A Cbox provides control signals and an interface for a second-level  
cache (Bcache).  
The 21264/EV68A supports a Bcache from 1MB to 16MB, with 64-byte blocks. A 128-  
bit bidirectional data bus is used for transfers between the 21264/EV68A and the  
Bcache. The Bcache is fully synchronous and the synchronous static RAMs (SSRAMs)  
must contain either one, two, or three internal registers. All Bcache control and address  
pins are clocked synchronously on Bcache cycle boundaries. The Bcache clock rate  
varies as a multiple of the CPU clock cycle in half-cycle increments from 1.5 to 4.0,  
and in full-cycle increments of 5, 6, 7, and 8 times the CPU clock cycle. The 1.5 multi-  
ple is only available in dual-data mode.  
4.3.1 Bcache Interface Signals  
Figure 4–2 shows the 21264/EV68A system interface signals.  
Figure 4–2 21264/EV68A Bcache Interface Signals  
BcData_H[127:0]  
21264  
BcCheck_H[15:0]  
BcDataInClk_H[7:0]  
BcDataOutClk_x[3:0]  
BcDataOE_L  
BcDataWr_L  
BcAdd_H[23:4]  
BcTag_H[42:20]  
BcTagInClk_H  
BcTagOutClk_x  
BcVref  
BcTagDirty_H  
BcTagParity_H  
BcTagShared_H  
BcTagValid_H  
BcTagOE_L  
BcTagWr_L  
BcLoad_L  
FM-05650  
4.3.2 System Duplicate Tag Stores  
The 21264/EV68A provides Bcache state support for systems with and without dupli-  
cate tag stores, and will take different actions on this basis. The system sets the Cbox  
CSR DUP_TAG_ENA[0], indicating that it has a duplicate tag store for the Bcache.  
Systems using the DUP_TAG_ENA[0] bit must also use the Cbox CSR  
BC_CLEAN_VICTIM[0] bit to avoid deadlock situations.  
Systems using a Bcache duplicate tag store can accelerate system performance by:  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces  
4–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Victim Data Buffer  
Issuing probes and SysDc fill commands to the 21264/EV68A out-of-order with  
respect to their order at the system serialization point  
Filtering out all probe misses from the 21264/EV68A cache system  
If a probe misses in the 21264/EV68A cache system (Bcache miss and VAF miss), the  
21264/EV68A stalls probe processing with the expectation that a SysDc fill will allo-  
cate this block. Because of this, in duplicate tag mode, the 21264/EV68A can never  
generate a probe miss response.  
When Cbox CSR DUP_TAG_ENA[0] equals 0, the 21264/EV68A delivers a miss  
response for probes that do not hit in its cache system.  
4.4 Victim Data Buffer  
The 21264/EV68A has eight victim data buffers (VDBs). They have the following  
properties:  
The VDBs are used for both victims (fills that are replacing dirty cache blocks) and  
for system probes that require data movement. The CleanVictimBlk command  
(optional) assigns and uses a VDB.  
Each VDB has two valid bits that indicate the buffer is valid for a victim or valid  
for a probe or valid for both a victim and a probe. Probe commands that match the  
address of a victim address file (VAF) entry with an asserted probe-valid bit (P)  
will stall the 21264/EV68A probe queue. No ProbeResponses will be returned until  
the P bit is clear.  
The release victim buffer (RVB) bit, when asserted, causes the victim valid bit, on  
the victim data buffer (VDB) specified in the ID field, to be cleared. The RVB bit  
will also clear the IOWB when systems move data on I/O write transactions. In this  
case, ID[3] equals one.  
The release probe buffer (RPB) bit, when asserted (with a WriteData or Release-  
Buffer SysDc command), clears the P bit in the victim buffer entry specified in the  
ID field.  
Read data commands and victim write commands use IDs 0-7, while IDs 8-11 are  
used to address the four I/O write buffers.  
4.5 Cache Coherency  
This section describes the basics and protocols of the 21264/EV68A cache coherency  
scheme.  
4.5.1 Cache Coherency Basics  
The 21264/EV68A systems maintain the cache hierarchy shown in Figure 4–3.  
4–8  
Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Cache Coherency  
Figure 4–3 Cache Subset Hierarchy  
System  
Main Memory  
Bcache  
Dcache  
Icache  
FM-05824.AI4  
The following tasks must be performed to maintain cache coherency:  
Istream data from memory spaces may be cached in the Icache and Bcache. Icache  
coherence is not maintained by hardware—it must be maintained by software using  
the CALL_PAL IMB instruction.  
The 21264/EV68A maintains the Dcache as a subset of the Bcache. The Dcache is  
set-associative but is kept a subset of the larger externally implemented direct-  
mapped Bcache.  
System logic must help the 21264/EV68A to keep the Bcache coherent with main  
memory and other caches in the system.  
The 21264/EV68A requires the system to allow only one change to a block at a  
time. This means that if the 21264/EV68A gains the bus to read or write a block, no  
other node on the bus should be allowed to access that block until the data has been  
moved.  
The 21264/EV68A provides hardware mechanisms to support several cache coher-  
ency protocols. The protocols can be separated into two classes: write invalidate  
cache coherency protocol and flush cache coherency protocol.  
4.5.2 Cache Block States  
Table 4–2 lists the cache block states supported by the 21264/EV68A.  
Table 4–2 21264/EV68A-Supported Cache Block States  
State Name  
Invalid  
Description  
The 21264/EV68A does not have a copy of the block.  
Clean  
This 21264/EV68A holds a read-only copy of the block, and no other agent in the system  
holds a copy. Upon eviction, the block is not written to memory.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces  
4–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Cache Coherency  
Table 4–2 21264/EV68A-Supported Cache Block States (Continued)  
State Name  
Description  
Clean/Shared  
This 21264/EV68A holds a read-only copy of the block, and at least one other agent in the  
system may hold a copy of the block. Upon eviction, the block is not written to memory.  
Dirty  
This 21264/EV68A holds a read-write copy of the block, and must write it to memory after it  
is evicted from the cache. No other agent in the system holds a copy of the block.  
Dirty/Shared  
This 21264/EV68A holds a read-only copy of the dirty block, which may be shared with  
another agent. The block must be written back to memory when it is evicted.  
4.5.3 Cache Block State Transitions  
Cache block state transitions are reflected by 21264/EV68A-generated commands to  
the system. Cache block state transitions can also be caused by system-generated com-  
mands to the 21264/EV68A (probes). Probes control the next state for the cache block.  
The next state can be based on the previous state of the cache block. Table 4–3 lists the  
next state for the cache block.  
Table 4–3 Cache Block State Transitions  
Next State  
Action Based on Probe Hit  
No change  
Do not update cache state. Useful for DMA transactions that sample data but  
do not want to update tag state.  
Clean  
Independent of previous state, update next state to Clean.  
Clean/Shared  
Independent of previous state, update next state to Clean/Shared. This transac-  
tion is useful for systems that update memory on probe hits.  
T1:  
Based on the dirty bit, make the block clean or dirty shared. This transaction  
is useful for systems that do not update memory on probe hits.  
Clean Clean/Shared  
Dirty  
Dirty/Shared  
T3:  
If the block is Clean or Dirty/Shared, change to Clean/Shared. If the block is  
Dirty, change to Invalid. This transaction is useful for systems that use the  
Dirty/Shared state as an exclusive state.  
Clean Clean/Shared  
Dirty Invalid  
Dirty/Shared Clean/Shared  
The cache state transitions caused by 21264/EV68A-generated commands are under the  
full control of the system environment using the SysDc (system data control) com-  
mands. Table 4–4 lists these commands.  
Table 4–4 System Responses to 21264/EV68A Commands  
Response Type  
21264/EV68A Action  
SysDc ReadData  
Fill block with the associated data and update tag with clean cache status.  
Fill block with the associated data and update tag with dirty cache status.  
Fill block with the associated data and update tag with shared cache status.  
Fill block with the associated data and update tag with dirty/shared status.  
Fill block with all-ones reference pattern and update tag with invalid status.  
SysDc ReadDataDirty  
SysDc ReadDataShared  
SysDc ReadDataShared/Dirty  
SysDc ReadDataError  
SysDc ChangeToDirtySuccess Unconditionally update block with dirty cache status.  
SysDc ChangeToDirtyFail Do not update cache status and fail any associated STx_C instructions.  
4–10 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                             
Cache Coherency  
4.5.4 Using SysDc Commands  
Note the following:  
The conventional response for RdBlk commands is SysDc ReadData or ReadD-  
ataShared.  
The conventional response for a RdBlkMod command is SysDc ReadDataDirty.  
The conventional response for ChangeToDirty commands is  
ChangeToDirtySuccess or ChangeToDirtyFail.  
However, the system environment is not limited to these responses. Table 4–5 shows all  
21264/EV68A commands, system responses, and the 21264/EV68A reaction. The  
21264/EV68A commands are described in the following list:  
Rdx commands are generated by load or Istream references.  
RdBlkModx commands are generated by store references.  
The ChxToDirty command group includes CleanToDirty, SharedToDirty, and STC-  
ChangeToDirty commands, which are generated by store references that hit in the  
21264/EV68A cache system.  
InvalToDirty commands are generated by WH64 instructions that miss in the  
21264/EV68A cache system.  
FetchBlk and FetchBlkSpec are noncached references to memory space that have  
missed in the 21264/EV68A cache system.  
Rdiox commands are noncached references to I/O address space.  
Evict and STCChangeToDirty commands are generated by ECB and STx_C  
instructions, respectively.  
Table 4–5 shows the system responses to 21264/EV68A commands and 21264/EV68A  
reactions.  
Table 4–5 System Responses to 21264/EV68A Commands and Reactions  
21264/EV68A  
CMD  
SysDc  
21264/EV68A Action  
Rdx  
ReadData  
ReadDataShared  
This is a normal fill. The cache block is filled and marked clean or  
shared based on SysDc.  
Rdx  
ReadDataShared/Dirty The cache block is filled and marked dirty/shared. Succeeding store  
commands cannot update the block without external reference.  
Rdx  
Rdx  
ReadDataDirty  
ReadDataError  
The cache block is filled and marked dirty.  
The cache block access was to NXM address space. The 21264/  
EV68A delivers an all-ones pattern to any load command and evicts  
the block from the cache (with associated victim processing). The  
cache block is marked invalid.  
Rdx  
ChangeToDirtySuccess Both SysDc responses are illegal for read commands.  
ChangeToDirtyFail  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                   
Cache Coherency  
Table 4–5 System Responses to 21264/EV68A Commands and Reactions (Continued)  
21264/EV68A  
CMD  
SysDc  
21264/EV68A Action  
RdBlkModx  
ReadData  
ReadDataShared  
The cache block is filled and marked with a nonwritable status. If the  
store instruction that generated the RdBlkModx command is still  
ReadDataShared/Dirty active (not killed), the 21264/EV68A will retry the instruction, gener-  
ating the appropriate ChangeToDirty command. Succeeding store  
commands cannot update the block without external reference.  
RdBlkModx  
RdBlkModx  
RdBlkModx  
ReadDataDirty  
The 21264/EV68A performs a normal fill response, and the cache  
block becomes writable.  
ChangeToDirtySuccess Both SysDc responses are illegal for read/modify commands.  
ChangeToDirtyFail  
ReadDataError  
The cache block command was to NXM address space. The 21264/  
EV68A delivers an all-ones pattern to any dependent load command,  
forces a fail action on any pending store commands to this block, and  
any store to this block is not retried. The Cbox evicts the cache block  
from the cache system (with associated victim processing). The cache  
block is marked invalid.  
ChxToDirty  
ChxToDirty  
ReadData  
ReadDataShared  
The original data in the Dcache is replaced with the filled data. The  
block is not writable, so the 21264/EV68A will retry the store instruc-  
ReadDataShared/Dirty tion and generate another ChxToDirty class command. To avoid a  
potential livelock situation, the STC_ENABLE CSR bit must be set.  
Any STx_C instruction to this block is forced to fail. In addition, a  
Shared/Dirty response causes the 21264/EV68A to generate a victim  
for this block upon eviction.  
ReadDataDirty  
The data in the Dcache is replaced with the filled data. The block is  
writable, so the store instruction that generated the original command  
can update this block. Any STx_C instruction to this block is forced  
to fail. In addition, the 21264/EV68A generates a victim for this  
block upon eviction.  
ChxToDirty  
ChToDirty  
ReadDataError  
Impossible situation. The block must be cached to generate a ChxTo-  
Dirty command. Caching the block is not possible because all NXM  
fills are filled noncached.  
ChangeToDirtySuccess Normal response. ChangeToDirtySuccess makes the block writable.  
The 21264/EV68A retries the store instruction and updates the  
Dcache. Any STx_C instruction associated with this block is allowed  
to succeed.  
ChxToDirty  
ChangeToDirtyFail  
The MAF entry is retired. Any STx_C instruction associated with the  
block is forced to fail. If a STx instruction generated this block, the  
21264/EV68A retries and generates either a RdBlkModx (because the  
reference that failed the ChangeToDirty also invalidated the cache by  
way of an invalidating probe) or another ChxToDirty command.  
InvalToDirty  
InvalToDirty  
InvalToDirty  
ReadData  
ReadDataShared  
ReadDataShared/Dirty  
The block is not writable, so the 21264/EV68A will retry the WH64  
instruction and generate a ChxToDirty command.  
ReadDataError  
The 21264/EV68A doesn’t send InvalToDirty commands offchip  
speculatively. This NXM condition is a hard error. Systems should  
perform a machine check.  
ReadDataDirty  
The block is writable. Done.  
ChangeToDirtySuccess  
4–12 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
Cache Coherency  
Table 4–5 System Responses to 21264/EV68A Commands and Reactions (Continued)  
21264/EV68A  
CMD  
SysDc  
21264/EV68A Action  
InvalToDirty  
ChangeToDirtyFail  
Illegal. InvalToDirty instructions must provide a cache block.  
Fetchx  
Rdiox  
ReadData  
ReadDataShared  
ReadDataShared/Dirty 21264/EV68A cache system.  
ReadDataDirty  
The 21264/EV68A delivers the data block, independent of its  
status, to waiting load instructions and does not cache the block in the  
Fetchx  
Rdiox  
Evict  
ReadDataError  
ReadDataError  
ChangeToDirtyFail  
The cache block address was to an NXM address space. The 21264/  
EV68A delivers the all-ones patterns to any dependent load instruc-  
tions and does not cache the block in the 21264/EV68A cache system.  
The cache block access was to NXM address space. The 21264/  
EV68A delivers an all-ones pattern to any load command and does  
not cache the block in the 21264/EV68A cache system.  
Retiring the MAF entry is the only legal response.  
STCChangeTo ReadDataX  
Dirty ChangeToDirtyFail  
All fill and ChangeToDirtyFail responses will fail the STx_C require-  
ments.  
STCChangeTo ChangeToDirtySuccess The STx_C instruction succeeds.  
Dirty  
MB  
MBDone  
Acknowledgment for MB.  
The 21264/EV68A sends a WrVictimBlk command to the system when it evicts a Dirty  
or Dirty/Shared cache block. The 21264/EV68A may be configured to send a CleanVic-  
timBlk to the system (by way of the Cbox CSR BC_CLEAN_VICTIM[0]) when evict-  
ing a clean or shared block. Both commands allocate buffers in the VAF (victim address  
file). This buffer is a coherent part of the 21264/EV68A cache system. Write data con-  
trol and deallocation of the VAF can be directly controlled by using the SysDc Write-  
Data and ReleaseBuffer commands.  
4.5.5 Dcache States and Duplicate Tags  
Each Dcache block contains an extra state bit (modified bit), beyond those required to  
support the cache protocol. If set, this bit indicates that the associated block should be  
written to the Bcache when it is evicted from the Dcache. The modified bit is set in two  
cases:  
1. When a block is filled into the Dcache from memory its modified bit is set, ensur-  
ing that it also gets written back into the Bcache at some future time.  
2. When the processor writes to a dirty Dcache block the modified bit is set, indicating  
it should be written to the Bcache when evicted.  
The contents of the modified bit are functionally invisible to the external cache environ-  
ment, but knowledge of the bits function is useful to programmers optimizing the  
scheduling of the Bcache data bus.  
The Cbox contains a duplicate copy of the Dcache tag array. In contrast to the Dcache  
tag array (DTAG), which is virtually indexed, the Cbox copy of the Dcache tag array  
(CTAG) is physically-indexed. The Cbox uses the CTAG array entries in the following  
situations.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–13  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Lock Mechanism  
1. When the Mbox requests a Dcache fill, the Cbox uses the CTAG array entry to find  
if the Dcache already contains the requested physical address in another virtually-  
indexed Dcache line. If it does, the Cbox invalidates that cache line after first writ-  
ing the data back to the Bcache if it was in the modified state. The Cbox also checks  
to see if the Dcache contains an address different from the requested address, but  
maps to the same Bcache line. If it does, the Dcache line is evicted in order to keep  
the Dcache a subset of the Bcache.  
2. When the Ibox requests an Icache fill, the Cbox uses the CTAG array entries to find  
if the Dcache contains the requested physical address in the modified state. If it  
does, the Cbox forces the line to be written back to the Bcache before servicing the  
Icache fill request. The Cbox also checks to see if the Dcache contains an address  
different from the requested address but which maps to the same Bcache line. In  
this case the Istream request will miss the Bcache, and the Cbox will  
service the request by launching a noncached Fetch command to the system port  
and will not put the Istream block into the Bcache. This mechanism allows the  
21264/EV68A to use a cache resident lock flag for LDx_L/STx_C instructions.  
3. The Cbox uses the CTAG array entries to find whether probe addresses are held in  
the Dcache without interrupting load/store instruction processing in the processor  
core.  
4.6 Lock Mechanism  
The 21264/EV68A does not contain a dedicated lock register, nor are system compo-  
nents required to do so.  
When a load-lock (LDx_L) instruction executes, data is accessed from the Dcache or  
Bcache. If there is a cache miss, data is accessed from memory with a RdBlk command.  
Its associated cache line is filled into the Dcache in the clean state, if it is not already  
there.  
When the store-conditional (STx_C) instruction executes, it is allowed to succeed if its  
associated cache line is still present in the Dcache and can be made writable; otherwise,  
it fails.  
This algorithm is successful because another agent in the system writing to the cache  
line between the load-lock and the store-conditional cache line would make the cache  
line invalid. This mechanism’s coherence is based on the following four items:  
1. LDx_L instructions are processed in-order in relation to the associated STx_C.  
2. Once a block is locked by way of an LDx_L instruction, no internal agent can evict  
the block from the Dcache as a side-effect of its processing.  
3. Any external agent that intends to update the contents of the stored block must use  
an invalidating probe command to inform the 21264/EV68A.  
4. The system is the only agent with sufficient information to manage the tasks of fair-  
ness and liveness. However, to enable these tasks, the 21264/EV68A only generates  
external commands for nonspeculative STx_C instructions, and once given a suc-  
cess indication from the system, must faithfully update the Dcache with the STx_C  
value.  
The system is entirely responsible for item number three. The 21264/EV68A plays an  
active role in items one, two, and four.  
4–14 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Lock Mechanism  
4.6.1 In-Order Processing of LDx_L/STx_C Instructions  
The 21264/EV68A uses the stWait logic in the IQ to ensure that LDx_L/STx_C pairs  
are issued in order. The stWait logic treats an Ldx_L instruction like Stx instructions.  
STx_C instructions are always loaded into the IQ with their associate stWait bit set.  
Thus, a STx_C instruction is not issued until the older LDx_L is out of the IQ.  
4.6.2 Internal Eviction of LDx_L Blocks  
The 21264/EV68A prevents the eviction of cache blocks in the Dcache due to either of  
the following references:  
Istream references with a Bcache index that matches the Dcache block and a  
Bcache tag that mismatches the Dcache block.  
To avoid evictions of LDx_L blocks, Istream references that match the index of a  
block in the Dcache are converted to noncached references.  
Ldx or Stx references with a Dcache index that matches the block.  
In the Alpha architecture, Dstream references between a LDx_L/STx_C pair force  
the value of the STx_C success flag to be UNPREDICTABLE. The 21264/EV68A  
forces all STx_C instructions that interrupt an LDx_L/STx_C pair to fail in pro-  
gram order.  
There should be no Dstream references between LDx_L/STx_C pairs; however, the  
out-of-order nature of the 21264/EV68A can introduce Dstream references between  
LDx_L/STx_C pairs. To prevent load or store instructions older than the LDx_L  
from evicting the LDx_L cache block, the Mbox invokes a replay trap on the  
incoming load or store instruction, which also aborts the LDx_L. These instructions  
are issued in program order in the next iteration of the trap retry down the pipeline.  
To prevent newer load or store instructions from evicting the locked cache line, the  
Ibox ensures that a STx_C is issued before any newer load or store instruction by  
placing the STx_C into the IQ and stalling all subsequent instructions in the map  
stage of the pipe until the IQ is empty.  
Branch instructions between the LDx_L/STx_C pair may be mispredicted, intro-  
ducing load and store instructions that evict the locked cache block. To prevent that  
from happening, there is a bit in the instruction fetcher that is set for a LDx_L refer-  
ence and cleared on any other memory reference. When this bit is set, the branch  
predictor predicts all branches to fall through.  
4.6.3 Liveness and Fairness  
To prevent a livelock condition, the 21264/EV68A processes the STx_C as follows:  
1. If a STx_C misses the Dcache, then no system port transaction is started and the  
STx_C fails.  
2. If a STx_C hits a block that is not dirty, then a ChangeToDirty (Shared or Clean) is  
launched after the STx_C retires and all older store queue entries are in the writable  
state. This ensures that once the ChangeToDirty command is launched on behalf of  
the STx_C, the STx_C will be executed to completion if the ChangeToDirty com-  
mand succeeds.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–15  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
System Port  
If the ChangeToDirty command succeeds, the STx_C enters the writable state, and the  
Mbox locks the Dcache line. The Mbox does not release the Dcache line until the  
STx_C data is transferred to the Dcache. This ensures that no other agent, by way of a  
probe, can take the block before the STx_C can update the locked block.  
4.6.4 Managing Speculative Store Issues with Multiprocessor Systems  
The 21264/EV68A provides two mechanisms to manage an inherent potential side  
effect of speculative execution with multiprocessor systems — a livelock condition  
caused by a speculative store that misses in one processor affecting the execution of a  
LDx_L/STx_C pair in another processor. The potential livelock condition in multipro-  
cessor systems can be effectively controlled by placing processors in a conservative  
mode, where speculative store MAFs are blocked. The 21264/EV68A manages conser-  
vative mode with the Mbox IPR, M_CTL[SMC], described in Table 5–19.  
M_CTL[SMC] can be set to place the 21264/EV68A in full-time conservative  
mode.  
M_CTL[SMC] can be set to place the 21264/EV68A in periodic conservative  
mode, timed by two counters: an 8-bit primary counter that tracks branch mispre-  
dicts and conditional branch retires, and a backup counter that places the 21264/  
EV68A in conservative mode for a period of 16K cycles every 2 million cycles.  
The 8-bit counter is enabled by placing M_CTL[SMC] in periodic conservative  
mode. The backup counter takes effect whenever the 8-bit counter is enabled. Fur-  
ther, the backup counter can be reset to 0 by clearing a previously set  
M_CTL[SMC], allowing synchronization between processors.  
4.7 System Port  
The system port is the 21264/EV68A’s connection to either a memory or I/O controller  
or to a shared multiprocessor system controller. System port interface signals are shown  
The system port supports transactions between the 21264/EV68A and the system. Sys-  
tems must receive and drive signals that are asserted low. Transaction commands are  
communicated on signal lines SysAddOut_L[14:0] (21264/EV68A-to-system) and  
SysAddIn_L[14:0] (system-to-21264/EV68A). Transaction data is transferred on a  
bidirectional data bus over pins SysData_L[63:0] with ECC on pins SysCheck_L[7:0].  
4–16 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
System Port  
Figure 4–4 System Interface Signals  
SysAddIn_L[14:0]  
SysAddInClk_L  
21264  
SysAddOut_L[14:0]  
SysAddOutClk_L  
SysVref  
SysData_L[63:0]  
SysCheck_L[7:0]  
SysDataInClk_H[7:0]  
SysDataOutClk_L[7:0]  
SysDataInValid_L  
SysDataOutValid_L  
SysFillValid_L  
IRQ_H[5:0]  
FM-05652-EV67  
4.7.1 System Port Pins  
Table 3–1 defines the 21264/EV68A signal types referred to in this section. Table 4–6  
lists the system port pin groups along with their type, number, and functional descrip-  
tion.  
Table 4–6 System Port Pins  
Pin Name  
Type  
Count Description  
IRQ_H[5:0]  
I_DA  
6
These six interrupt signal lines may be asserted by the sys-  
tem.  
SysAddIn_L[14:0]  
SysAddInClk_L  
I_DA  
I_DA  
15  
1
Time-multiplexed SysAddIn, system-to-21264/EV68A.  
Single-ended forwarded clock from system for  
SysAddIn_L[14:0] and SysFillValid_L.  
SysAddOut_L[14:0]  
SysAddOutClk_L  
SysVref  
O_OD  
15  
1
Time-multiplexed SysAddOut, 21264/EV68A-to-system.  
Single-ended forwarded clock.  
O_OD  
I_DC_REF  
B_DA_OD  
1
System interface reference voltage.  
SysCheck_L[7:0]  
SysData_L[63:0]  
SysDataInClk_H[7:0]  
8
Quadword ECC check bits for SysData_L[63:0].  
Data bus for memory and I/O data.  
B_DA_OD 64  
I_DA  
8
1
8
1
1
Single-ended system-generated clocks for clock forwarded  
input system data.  
SysDataInValid_L  
I_DA  
When asserted, marks a valid data cycle for data transfers to  
the 21264/EV68A.  
SysDataOutClk_L[7:0] O_OD  
Single-ended 21264/EV68A-generated clocks for clock for-  
warded output system data.  
SysDataOutValid_L  
SysFillValid_L  
I_DA  
I_DA  
When asserted, marks a valid data cycle for data transfers  
from the 21264/EV68A.  
Validation for fill given in previous SysDc command.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–17  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
System Port  
4.7.2 Programming the System Interface Clocks  
The system forwarded clocks are free running and derived from the 21264/EV68A  
GCLK. The period of the system forwarded clocks is controlled by three Cbox CSRs,  
based on the bit-rate ratio (similar to the Bcache bit-rate ratio) except that all transfers  
are dual-data.  
SYS_CLK_LD_VECTOR[15:0]  
SYS_BPHASE_LD_VECTOR[3:0]  
SYS_FDBK_EN[7:0]  
Table 4–7 lists the programming values used to program the system interface clocks.  
Table 4–7 Programming Values for System Interface Clocks  
1
1
1
System Transfer  
1.5X-DD  
2.0X-DD  
2.5X-DD  
3.0X-DD  
3.5X-DD  
4.0X-DD  
5.0X-DD  
6.0X-DD  
7.0X-DD  
SYS_CLK_LD_VECTOR SYS_BPHASE_LD_VECTOR SYS_FDBK_EN  
9249  
3333  
8C63  
71C7  
C387  
0F0F  
7C1F  
F03F  
C07F  
00FF  
5
0
5
0
A
0
0
0
0
0
02  
01  
02  
10  
04  
01  
40  
10  
04  
01  
8.0X-DD  
1
These are hexadecimal values.  
In addition to programming of the clock CSRs, the data-sample/drive Cbox CSRs at the  
pads have to be set appropriately. Table 4–8 shows the programmed values for these  
system CSRs. In Table 4–8, each system forwarded clock is the inversion of the low-  
assertion signal at the corresponding pin.  
Table 4–8 Program Values for Data-Sample/Drive CSRs  
CBOX CSR  
Description  
SYS_DDM_FALL_EN[0]  
Enables the update of 21264/EV68A system outputs based on the falling edge  
of the system forwarded clock. (Always asserted)  
SYS_DDM_RISE_EN[0]  
Enables the update of 21264/EV68A system outputs based on the rising edge  
of the system forwarded clock. (Always asserted)  
SYS_DDM_RD_FALL_EN[0]  
Enables the sampling of incoming data on the falling edge of the incoming  
forwarded clock. (Always asserted)  
4–18 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                     
System Port  
Table 4–8 Program Values for Data-Sample/Drive CSRs (Continued)  
CBOX CSR  
Description  
SYS_DDM_RD_RISE_EN[0]  
Enables the sampling of incoming data on the rising edge of the incoming  
forwarded clock. (Always asserted)  
SYS_DDMF_ENABLE  
SYS_DDMR_ENABLE  
Enables the falling edge of the system forwarded clock. (Always asserted)  
Enables the rising edge of the system forwarded clock. (Always asserted)  
Table 4–9 lists the program values for CSR SYS_FRAME_LD_VECTOR[4:0] that set  
the ratio between the forwarded clocks and the frame clock.  
Table 4–9 Forwarded Clocks and Frame Clock Ratio  
1
Clock Ratio  
Transfer Mode  
Value  
1:1  
2:1  
2:1  
4:1  
4:1  
4:1  
All  
00  
3.0X, 3.5X, 8.0X  
1E  
1.5X, 2.0X, 2.5X 4.0X, 5.0X, 6.0X 7.0X  
1F  
8X  
15  
1.5X, 4.0X, 5.0X, 6.0X, 7.0X  
3.0X, 3.5X  
0B  
14  
4:1  
2.0X, 2.5X  
0A  
1
These are hexadecimal values.  
4.7.3 21264/EV68A-to-System Commands  
This section describes the 21264/EV68A-to-system commands format and operation.  
The command, address, ID, and mask bits are transmitted in four consecutive cycles on  
SysAddOut_L[14:0]. The 21264/EV68A sends the command information in one of the  
two following modes as selected by the Cbox CSR bit.  
Bank interleave on cache block boundary mode—SYSBUS_FORMAT[0] = 0  
Page hit mode—SYSBUS_FORMAT[0] = 1  
The physical address (PA) bits arrangements for the two modes is shown in Tables 4–10  
and 4–11. The purpose of the two modes is to give the system the PA bits that allow it to  
select the memory bank and drive the RAS address as soon as possible.  
4.7.3.1 Bank Interleave on Cache Block Boundary Mode  
Table 4–10 shows the command format for the bank interleave on cache block bound-  
ary mode of operation (21264/EV68A-to-system).  
Table 4–10 Bank Interleave on Cache Block Boundary Mode of Operation  
SysAddOut_L[14:2]  
Command[4:0] PA[34:28]  
SysAddOut_L[1] SysAddOut_L[0]  
Cycle 1 M1  
PA[36]  
PA[38]  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–19  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
System Port  
Table 4–10 Bank Interleave on Cache Block Boundary Mode of Operation (Continued)  
SysAddOut_L[14:2]  
SysAddOut_L[1] SysAddOut_L[0]  
Cycle 2  
PA[27:22], PA[12:6]  
PA[35]  
PA[40]  
PA[39]  
PA[37]  
PA[42]  
PA[41]  
Cycle 3 M2  
Cycle 4 RV  
Mask[7:0]  
CH  
ID[2:0]  
PA[21:13], PA[5:3]  
4.7.3.2 Page Hit Mode  
Table 4–11 shows the command format for page hit mode (21264/EV68A-to-system).  
Table 4–11 Page Hit Mode of Operation  
SysAddOut_L[14:2]  
Command[4:0] PA[31:25]  
PA[24:12]  
CH  
PA[34:32], PA[11:3]  
SysAddOut_L[1] SysAddOut_L[0]  
Cycle 1 M1  
Cycle 2  
PA[32]  
PA[11]  
PA[35]  
PA[36]  
PA[33]  
PA[34]  
PA[37]  
PA[38]  
Cycle 3 M2  
Cycle 4 RV  
Mask[7:0]  
ID[2:0]  
Table 4–12 describes the field definitions for Tables 4–10 and 4–11.  
Table 4–12 21264/EV68A-to-System Command Fields Definitions  
SysAddOut Field  
Definition  
M1  
When set, reports a miss to the system for the oldest probe.  
When clear, has no meaning.  
Command[4:0]  
SysAddOut[1:0]  
The 5-bit command field is defined in Table 4–14.  
This field is needed for systems with greater than 32GB of memory, up to a maximum of 8  
Terabyte (8TB). Cost-focused systems can tie these bits high and use a 13-bit command/  
address field.  
M2  
When set, reports that the oldest probe has missed in cache. Also, this bit is set for system-  
to-21264/EV68A probe commands that hit but have no data movement (see the CH bit,  
below).  
When clear, has no meaning.  
M1 and M2 are not asserted simultaneously. Reporting probe results as soon as possible is  
critical to high-speed operation, so when a result is known the 21264/EV68A uses the ear-  
liest opportunity to send an M signal to the system. M bit assertion can occur either in a  
valid command or a NZNOP.  
ID[2:0]  
RV  
The ID number for the MAF, VDB, or WIOB associated with the command.  
If set, validates this command.  
In speculative read mode (optional), RV = 1 validates the command and RV = 0 indicates  
a NOP.  
For all nonspeculative commands RV = 1.  
Mask[7:0]  
CH  
The byte, LW, or QW mask field for the corresponding I/O commands.  
The cache hit bit is asserted, along with M2, when probes with no data movement hit in  
the Dcache or Bcache. This response can be generated by a probe that explicitly indicates  
no data movement or a ReadIfDirty command that hits on a valid but clean or shared  
block.  
4–20 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
System Port  
System designers can minimize pin count for systems with a small memory by config-  
uring both the bank interleave on cache block boundary mode and the page hit mode  
formats into a short bus format. The pin SysAddOut_L[1] and/or SysAddOut_L[0]  
are not used (selected by Cbox CSR SYS_BUS_SIZE[1:0]). Table 4–13 lists the values  
for SYSBUS_FORMAT and SYS_BUS_SIZE[1:0] and shows the maximum physical  
memory size.  
Table 4–13 Maximum Physical Address for Short Bus Format  
SYSBUS_  
FORMAT  
SYSBUS_  
SIZE[1:0]  
Maximum PA Comment  
0
0
0
0
1
1
1
1
00  
01  
10  
11  
00  
01  
10  
11  
42  
Bank interleave + full address  
36  
Bank interleave + SysAddOut_L[0] unused  
Illegal combination  
Illegal  
34  
Bank interleave + both SysAddOut_L[1:0] are used for I/O  
Page hit mode + full address  
38  
36  
Page hit mode + SysAddOut_L[0] unused  
Illegal combination  
Illegal  
34  
Page hit mode + both SysAddOut_L[1:0] are unused  
Because addresses above the maximum PA are not visible to the external system, any  
memory transaction generated to addresses above the maximum PA are detected and  
converted to transactions to NXM (nonexistent memory) and processed internally by  
the 21264/EV68A.  
4.7.4 21264/EV68A-to-System Commands Descriptions  
Table 4–14 describes the 21264/EV68A-to-system commands.  
Table 4–14 21264/EV68A-to-System Commands Descriptions  
Command  
Command  
[4:0]  
Function  
NOP  
00000  
The 21264/EV68A drives this command on idle cycles during reset. After  
the clock forward reset period, the first NZNOP is generated and this  
command is no longer generated.  
ProbeResponse  
00001  
Returns probe status and ID number of the VDB entry holding the  
requested cache block.  
NZNOP  
00010  
00011  
This nonzero NOP helps to parse the command packet.  
VDBFlushRequest  
VDB flush request. The 21264/EV68A sends this command to the system  
when an internally generated transaction Bcache index matches a Bcache  
victim or probe in the VDB. The system should flush VDB entries  
associated with all probe and WrVictimBlk transactions that occurred  
before this command.  
1
MB  
00111  
10000  
Indicates an MB was issued, optional when Cbox CSR  
SYSBUS_MB_ENA[0] is set.  
ReadBlk  
Memory read.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–21  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                       
System Port  
Table 4–14 21264/EV68A-to-System Commands Descriptions (Continued)  
Command  
Command  
ReadBlkMod  
ReadBlkI  
[4:0]  
10001  
10010  
10011  
10100  
10101  
10110  
10111  
11000  
11001  
11010  
00100  
00101  
00110  
01000  
01001  
01010  
01100  
01101  
01110  
11100  
11101  
Function  
Memory read with modify intent.  
Memory read for Istream.  
FetchBlk  
Noncached memory read.  
2
ReadBlkSpec  
Speculative memory read (optional).  
Speculative memory read with modify intent (optional).  
Memory read for Istream (optional).  
Speculative memory noncached ReadBlk (optional).  
Memory read with a victim (optional).  
Memory read with modify intent, with a victim (optional).  
Memory read for Istream with a victim (optional).  
Write-back of dirty block.  
2
ReadBlkModSpec  
2
ReadBlkSpecI  
2
FetchBlkSpec  
3
ReadBlkVic  
3
ReadBlkModVic  
3
ReadBlkVicI  
WrVictimBlk  
CleanVictimBlk  
Address of a clean victim (optional).  
Invalidate evicted block at the given Bcache index (optional).  
I/O read, byte mask.  
4
Evict  
ReadBytes  
ReadLWs  
ReadQWs  
WrBytes  
WrLWs  
I/O read, longword mask.  
I/O read, quadword mask.  
I/O write, byte mask.  
I/O write, longword mask.  
WrQWs  
I/O write, quadword mask.  
6
CleanToDirty  
Sets a block dirty that was previously clean (optional for duplicate tags).  
6
SharedToDirty  
Sets a block dirty that was previously shared (optional for multiprocessor  
systems).  
6
STCChangeToDirty  
11110  
Sets a block dirty that was previously clean or shared for a STx_C  
instruction (optional for multiprocessor systems).  
3,5  
InvalToDirtyVic  
11011  
11111  
Invalid to dirty with a victim (optional).  
5
InvalToDirty  
WH64 Acts like a ReadBlkMod without the fill cycles (optional).  
4–22 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                               
System Port  
Table 4–14 footnotes:  
1. Systems can optionally enable MB instructions to the external system by asserting  
Cbox CSR SYSBUS_MB_ENABLE. This mode is described in Section 2.12.1.  
2. To minimize load-to-use memory latency, systems can optionally enable specula-  
tive transactions to memory space by asserting the Cbox CSR  
SPEC_READ_ENABLE[0]. If the Cbox system command queue is empty, a  
bypass between the Bcache interface and the system interface is enabled (in combi-  
nation with this mode). When the next new transaction is delivered by the Mbox,  
the Cbox starts MAF memory references to the system interface before the results  
of Bcache hit is known. The RV bit is deasserted on a Bcache hit, or in  
BC_RDVICTIM[0] mode (see footnote 3, below), and for Bcache miss transactions  
that generate a victim (clean or dirty). Otherwise, the RV bit is asserted.  
3. Systems can optionally enable RdBlkVic, RdBlkModVic, and InvalToDirtyVic  
commands using Cbox CSR BC_RDVICTIM[0]. In this mode of operation  
RdBlkxVic command cycles are always followed immediately by the WrVictimBlk  
commands. Also, when CleanVictimBlk commands are enabled, they  
immediately follow RdBlkVic, RdBlkModVic, and InvalToDirtyVic commands.  
4. Systems can optionally enable Evict commands by asserting the Cbox CSR  
ENABLE_EVICT. In this mode, all ECB instructions will generate an Evict com-  
mand, and in combination with BC_RDVICTIM[0] mode, the WriteVictim or  
CleanVictim (when Cbox CSR BC_CLEAN_VICTIM[0] is asserted) is associated  
with the Evict command is atomically sent after the Evict command.  
5. Optionally, systems can enable InvalToDirty commands by programming Cbox  
CSR INVAL_TO_DIRTY_ENABLE[1:0]. Table 4–15 shows how to program  
INVAL_TO_DIRTY_ENABLE[1:0].  
6. Optionally, systems can enable CleanToDirty or SharedToDirty commands by  
using Cbox CSR SET_DIRTY_ENABLE[2:0]. These three bits control the Cbox  
action upon a block that was hit in the Dcache with a status of dirty/shared, clean/  
shared, or clean respectively.  
Table 4–15 Programming INVAL_TO_DIRTY_ENABLE[1:0]  
INVAL_TO_DIRTY_ENABLE[1:0] Cbox Action  
X0  
WH64 instructions are converted to RdModx commands at the interface.  
Beyond this point, no other agent sees the WH64 instruction. This mode is  
useful for microprocessors that do not want to support InvalToDirty transac-  
tions.  
01  
11  
WH64 instructions are enabled, but they are acknowledged within the  
21264/EV68A.  
WH64 instructions are enabled, and generate InvalToDirty transactions at  
the 21264/EV68A pins.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–23  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                   
System Port  
Table 4–16 Programming SET_DIRTY_ENABLE[2:0]  
SET_DIRTY_ENABLE  
[2,0] (DS,CS,C)  
Cbox Action  
000  
001  
010  
Everything acknowledged internally (uniprocessor).  
Only clean blocks generate external acknowledge (CleanToDirty commands only).  
Only clean/shared blocks generate external acknowledge (SharedToDirty command  
only).  
011  
100  
Clean and clean/shared blocks generate external acknowledge.  
Only dirty/shared blocks generate external acknowledge (SharedToDirty commands  
only).  
101  
110  
111  
Only dirty/shared and clean blocks generate external acknowledge.  
Only dirty/shared and clean/shared blocks generate external acknowledge.  
All transactions generate external acknowledge.  
Systems that require an explicit indication of ChangeToDirty status changes initi-  
ated by STx_C instructions can assert Cbox CSR STC_ENABLE[0]. When this  
register field = 000, CleanToDirty and SharedToDirty commands are used. The dis-  
tinction between a ChangeToDirty command generated by a STx_C instruction and  
one generated by a STx instruction is important to systems that want to service  
ChangeToDirty commands with dirty data from a source processor. In this case, the  
distinction between a locked exclusive instruction and a normal instruction is criti-  
cal to avoid livelock for a LDx_L/STx_C sequence.  
4.7.5 ProbeResponse Commands (Command[4:0] = 00001)  
The 21264/EV68A responds to system probes that did not miss with a 4-cycle transfer  
on SysAddOut_L[14:0]. As shown in Table 4–14, the Command[4:0] field for a Prob-  
eResponse command equals 00001. Table 4–17 shows the format of the 21264/EV68A  
ProbeResponse command.  
Table 4–17 21264/EV68A ProbeResponse Command  
SysAddOut_L[14:2]  
SysAddOut_L[1] SysAddOut_L[0]  
Cycle 1  
Cycle 2  
0
0
00001 Status[1:0]  
DM  
X
VS VDB X  
X
X
[2:0]  
0
MS MAF X  
[2:0]  
Cycle 3  
Cycle 4  
X
X
X
X
X
4–24 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
System Port  
Table 4–18 describes the ProbeResponse command fields.  
Table 4–18 ProbeResponse Fields Descriptions  
ProbeResponse Field Description  
Command[4:0]  
DM  
The value 00001 identifies the command as a ProbeResponse.  
Indicates that data movement should occur (copy of probe valid bit). See Section 4.4.  
Write victim sent bit.  
VS  
VDB[2:0]  
ID number of the VDB entry containing the requested cache block. This field is valid  
when either the DM bit or the VS bit equals 1.  
MS  
MAF address sent.  
MAF[2:0]  
This field indicates the SharedToDirty, CleanToDirty, or  
STCChangetoDirty MAF entry that matched the full probe address.  
Status[1:0]  
Result of probe:  
Status[1:0] Probe state  
00  
01  
10  
11  
HitClean  
HitShared  
HitDirty  
HitSharedDirty  
The system uses the SysDc signal lines to retrieve data for probes that requested a cache  
block from the 21264/EV68A. See Section 4.7.7.2 for more information about 2-cycle  
data transfer commands. Probes that respond with M1, M2, or CH=1 will not be  
reported to the system in a probe response command.  
4.7.6 SysAck and 21264/EV68A-to-System Commands Flow Control  
Controlling the flow of 21264/EV68A-to-system commands is a joint task of the  
21264/EV68A and the system. The flow is controlled using the A bit, which is asserted  
by the system, and the Cbox CSR SYSBUS_ACK_LIMIT[4:0] counter. The counter  
has the following properties:  
The 21264/EV68A increments its command-outstanding counter when it sends a  
command to the system. The 21264/EV68A decrements the counter by one each  
time the A bit (SysAddIn_L[14]) is asserted in a system-to-21264/EV68A com-  
mand. The A bit is transmitted during cycle four of a probe mode command or dur-  
ing cycle two of a SysDc command.  
The 21264/EV68A stops sending new commands when the counter hits the maxi-  
mum count specified by Cbox CSR SYSBUS_ACK_LIMIT[4:0]. When this  
counter is programmed to zero, the CMD_ACK count is ignored (unlimited com-  
mands are allowed in-flight).  
Because RdBlkxVic and WrVictimBlk commands are atomic when the CSR  
BC_RDVICTIM[0] is set, the 21264/EV68A does not send a RdBlkxVic command  
if the SYSBUS_ACK_LIMIT[4:0] is equal to one less than the maximum outstand-  
ing count. The limit cannot be programmed with a value of one when RdBlkxVic  
commands are enabled unless the Cbox CSR RDVIC_ACK_INHIBIT command is  
also asserted (see Table 5–24).  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–25  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
System Port  
There is no mechanism for the system to reject a 21264/EV68A-to-system com-  
mand. ProbeResponse, VDBFlushReq, NOP, NZNOP, and RdBlkxSpec (with a  
clear RV bit) commands do not require a response from the system. Systems must  
provide adequate resources for responses to all probes sent to the 21264/EV68A.  
Systems that program the Cbox CSR BC_RDVICTIM[0] to immediately follow  
victim write transactions with read transactions and allocate combined resources  
for the pair, may find it useful to increment the SYSBUS_ACK_LIMIT[4:0]  
counter only once for the pair. These systems may assert Cbox CSR  
RDVIC_ACK_INHIBIT, which does not increment the  
SYSBUS_ACK_LIMIT[4:0] count for RdBlkVic, RdBlkModVic, and RdBlkVicI  
commands.  
Systems that maintain victim data buffers may find it useful to limit the number of  
outstanding WrVictimBlk commands. This can be accomplished by using the Cbox  
CSR SYSBUS_VIC_LIMIT[2:0]. When the number of outstanding WrVictim  
commands or CleanVictim commands reaches this programmed limit, the Cbox  
stops generating victim commands on the system port. Because victim and read  
commands are atomic when BC_RDVICTIM[0] = 1, the RdBlkxVic commands are  
stalled when the victim limit is reached. Programming the  
SYSBUS_VIC_LIMIT[2:0] to zero disables this limit.  
4.7.7 System-to-21264/EV68A Commands  
The system can send either probes (4-cycle) or data movement (2-cycle) commands to  
the 21264/EV68A. Signal pin SysAddIn_L[14] in the first command cycle indicates  
the type of command being sent (1 = probe, 0 = data transfer). Sections 4.7.7.1 and  
4.7.7.2 describe the formats of the two types of commands.  
4.7.7.1 Probe Commands (Four Cycles)  
Probes are always 4-cycle commands that contain a field to indicate a valid SysDc com-  
mand. The format of the 4-cycle command is shown below.  
Note:  
The SysAddIn_L[1:0] signal lines are optional and are used for memory  
designs greater than 32GB. The position of the address bits matches the  
selected format of the SysAddOut bus. The example below shows the bank  
interleave format.  
Table 4–19 shows the format of the system-to-21264/EV68A probe commands.  
Table 4–19 System-to-21264/EV68A Probe Commands  
SysAddIn_L[14:2]  
Cycle 1 1 Probe[4:0]  
PA[27:22], PA[12:6]  
SysDc[4:0] RVB RPB  
PA[21:13], PA[5:3]  
SysAddIn_L[1] SysAddIn_L[0]  
PA[34:28]  
PA[36]  
PA[35]  
PA[40]  
PA[39]  
PA[38]  
PA[37]  
PA[42]  
PA[41]  
Cycle 2  
Cycle 3 0  
Cycle 4 C  
A
ID[3:0]  
4–26 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
System Port  
Table 4–20 describes the system-to-21264/EV68A probe commands fields descriptions.  
Table 4–20 System-to-21264/EV68A Probe Commands Fields Descriptions  
SysAddIn_L[14:0]  
Field  
Description  
Probe[4:0]  
SysDc[4:0]  
Probe type and next tag state (see Tables 4–21 and 4–22).  
Controls data movement in and out of the 21264/EV68A. See Table 4–24 for a list of data  
movement types.  
RVB  
RPB  
A
Clears the victim or I/O write buffer (IOWB) valid bit specified in ID[3:0].  
Clears probe valid bit specified in ID[2:0].  
Command acknowledge. When set, the 21264/EV68A decrements its command outstand-  
ing counter (SYSBUS_ACK_LIMIT[4:0]).  
ID[3:0]  
C
Identifies the victim data buffer (VDB) number or the I/O write buffer (IOWB) number.  
Bit [3] is only asserted for the IOWB.  
Commit bit. This bit decrements the uncommitted event counter (MB_CNTR) used for  
MB acknowledge.  
The probe command field Probe[4:0] has two sections, Probe[4:3] and Probe[2:0].  
Table 4–21 lists the data movement selected by Probe[4:3].  
Table 4–21 Data Movement Selection by Probe[4:3]  
Probe[4:3]  
Data Movement Function  
00  
01  
10  
11  
NOP  
Read if hit, supply data to system if block is valid.  
Read if dirty, supply data to system if block is valid/dirty.  
Read anyway, supply data to the system at index of probe.  
Table 4–22 lists the next cache block state selected by Probe[2:0].  
Table 4–22 Next Cache Block State Selection by Probe[2:0]  
Probe[2:0]  
000  
Next Tag State  
NOP  
001  
Clean  
010  
Clean/Shared  
1
011  
Transition3 : Clean  
Clean/Shared  
Invalid  
Dirty  
Dirty/Shared  
Clean/Shared  
100  
Dirty/Shared  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–27  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
System Port  
Table 4–22 Next Cache Block State Selection by Probe[2:0] (Continued)  
Probe[2:0]  
101  
Next Tag State  
Invalid  
2
110  
Transition1 : Clean  
Clean/Shared  
Dirty/Shared  
Dirty  
111  
Reserved  
1
Transition3 is useful in nonduplicate tag systems that want to give writable status to the reader and do  
not know if the block is clean or dirty.  
2
Transition1 is useful in nonduplicate tag systems that do not update memory on ReadBlk hits to a  
dirty block in another processor.  
The 21264/EV68A holds pending probe commands in a 8-entry deep probe queue. The  
system must count the number of probes that have been sent and ensure that the probes  
do not overrun the 21264/EV68A queue. The 21264/EV68A removes probes from the  
internal probe queue when the probe response is sent.  
The 21264/EV68A expects to hit in cache on a probe response, so it always fetches a  
cache block from the Bcache on system probes. This can become a performance prob-  
lem for systems that do not monitor the Bcache tags, so the 21264/EV68A provides  
Cbox CSR PRB_TAG_ONLY[0], which only accesses Bcache tags for system probes.  
For a Bcache hit, the 21264/EV68A retries the probe reference to get the associated  
data. In this mode, the 21264/EV68A has a cache-hit counter that maintains some his-  
tory of past cache hits in order to fetch the data with the tag in the cases where streamed  
transactions are being performed to the host processor.  
4.7.7.2 Data Transfer Commands (Two Cycles)  
Data transfer commands use a 2-cycle format on SysAddIn_L[14:0]. The SysDc[4:0]  
field indicates success or failure for ChangeToDirty and MB commands, and error con-  
ditions as shown in Table 4–24.  
The pattern of data is controlled by the SysDataInValid_L and SysDataOutValid_L  
signals. These signals are valid each cycle of data transfer, indicating any gaps in the  
data cycle pattern. The SysDataInValid_L and SysDataOutValid_L signals are  
described in Section 4.7.8.4. Table 4–23 shows the format of the data transfer com-  
mand.  
Table 4–23 Data Transfer Command Format  
SysAddIn_L[14:2]  
SysAddIn_L[1] SysAddIn_L[0]  
Cycle 1  
Cycle 2  
0
SysDc[4:0]  
RVB  
RPB  
X
A
ID[3:0]  
X
X
X
X
C
4–28 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
System Port  
Table 4–24 describes the SysDc[4:0] field.  
Table 4–24 SysDc[4:0] Field Description  
SysDc[4:0] Command  
NOP  
SysDc[4:0]  
00000  
Description  
NOP, SysData is ignored by the 21264/EV68A.  
ReadDataError  
00001  
Data is returned for read commands. The system drives the SysData  
bus, I/O, or memory NXM.  
ChangeToDirtySuccess  
ChangeToDirtyFail  
00100  
00101  
No data. SysData is ignored by the 21264/EV68A. This command is  
also used for the InvalToDirty response.  
No data. SysData is ignored by the 21264/EV68A. This command is  
also used for the Evict response.  
MBDone  
00110  
00111  
Memory barrier operation completed.  
ReleaseBuffer  
Command to alert the 21264/EV68A that the RVB, RPB, and ID  
field are valid.  
ReadData  
(System Wrap)  
100xx  
Data returned for read commands. The system drives SysData. The  
system uses SysDc[1:0] to control the wrap order. See Section  
4.7.8.6 for a description of the data wrapping scheme.  
ReadDataDirty  
(System Wrap)  
101xx  
110xx  
Data is returned for Rdx and RdModx commands. The ending tag  
status is dirty. The system uses SysDc[1:0] to define the wrap order.  
ReadDataShared  
(System Wrap)  
Data is returned for read commands. The system drives the data. The  
tag is marked shared. The system uses SysDc[1:0] to control the  
wrap order.  
ReadDataShared/Dirty  
(System Wrap)  
111xx  
010xx  
Data is returned for the RdBlk command. The ending tag status is  
Shared/Dirty. The system uses SysDc[1:0] to control the wrap order.  
WriteData  
Data is sent for 21264/EV68A write commands or system probes.  
The 21264/EV68A drives during the SysData cycles. The lower two  
bits of the command specify the octaword address around which the  
21264/EV68A wraps the data.  
The A bit in the first cycle indicates that the command is acknowledged. When A = 1, the  
21264/EV68A decrements its command outstanding counter, but the A bit is not neces-  
sarily related to the current SysDc command.  
Probe commands can combine a SysDc command along with MBDone. In that event,  
the probe is considered ahead of the SysDc command. If the SysDc command allows  
the 21264/EV68A to retire an instruction before an MB, or allows the 21264/EV68A  
itself to retire an MB (SysDc is MBDone), that MB will not complete until the probe is  
executed.  
The system can select the ending cache status for a cache fill operation by specifying  
the status in one of the following SysDc commands:  
ReadData (Clean)  
ReadDataShared (Clean/Shared)  
ReadDataDirty (Dirty)  
ReadDataShared/Dirty (Shared/Dirty)  
The system returns ReadDataShared or ReadData for ReadBlk commands, and ReadD-  
ataDirty for a ReadMod command. However, other combinations are possible, but  
should be used only after a careful study of the situation.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–29  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
System Port  
The ChangeToDirtySuccess and ChangeToDirtyFail commands cannot be issued in the  
shadow of SysDc cache fill commands (ReadDataError, ReadData, ReadDataDirty,  
ReadDataShared, and ReadDataShared/Dirty). Each cache fill command allocates eight  
cycles on the SysData bus. Systems are required to ensure that any future SysDc com-  
mands do not cause conflicts with those eight SysData bus cycles. In addition, the sys-  
tem must not issue ChangeToDirtySuccess or ChangeToDirtyFail commands in the six  
SysAddrIn cycles after any of the ReadDatax commands because doing so will over-  
load internal MAF resources in the 21264/EV68A.  
Because of an internal 21264/EV68A constraint, a minimum memory latency of  
4 × BCACHE_CLK_PERIOD is imposed. This latency is measured from A3 of the out-  
going command (the last cycle) to the delivery of the SysDc command to the processor.  
4.7.8 Data Movement In and Out of the 21264/EV68A  
There are two modes of operation for data movement in and out of the 21264/EV68A:  
fast mode and fast mode disable. The data movement mode is selected using Cbox CSR  
FAST_MODE_DISABLE[0]. Fast data mode allows movement of data from the  
21264/EV68A to bypass protocol and achieve the lowest possible latency for probe’s  
data, write victim data, and I/O write data. Rules and conditions for the two modes are  
listed and described in Sections 4.7.8.2 and 4.7.8.3. Before discussing data movement  
operation, 21264/EV68A clock basics are described in Section 4.7.8.1.  
4.7.8.1 21264/EV68A Clock Basics  
The 21264/EV68A uses a clock forwarding technique to achieve very high bandwidth  
on its pin interfaces. The clock forwarding technique has three main principles:  
1. Local point-to-point transfers can be made safely, and at very high bandwidth, if the  
sender can provide the receiver with a forward clock (FWD_CLK) to latch the  
transmitted data at the receiver.  
The SysAddOutClk_L and SysDataOutClk_L[7:0] pins provide the forward-  
ing clocks for transfers out of the 21264/EV68A.  
The SysAddInClk_L and SysDataInClk_H[7:0] pins provide the forwarding  
clocks for transfers into the 21264/EV68A.  
2. If only one state element was used to capture the transmitted data, and the skew  
between the two clock systems was greater then the bit-rate of the transfer, the data  
valid time of the transmitted data would not be sufficient to safely transfer the  
latched data into the receivers clock domain. In order to avoid this problem, the  
receiver provides a queue that is manipulated in the transmitter’s time domain.  
Using this queue, the data valid window of the transmitted data is extended (to an  
arbitrary size based on the queue size), and the transfer to the receiver’s clock  
domain can be safely made by delaying the unloading of this queue element beyond  
the skew between the two clock domains. The internal clock that unloads this queue  
is labelled INT_FWD_CLK. INT_FWD_CLK is timed at both the rising and fall-  
ing edges of the external clock, thus appearing to run at twice the external clock’s  
frequency.  
3. The first two points provide the steady state basis for clock forwarded transfers;  
however, both the sender and receiver must be correctly initialized to enable coher-  
ent and predictable transfers. This clock initialization is performed during system  
initialization using the ClkFwdRst_H and FrameClk_H signals.  
4–30 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
System Port  
If both the sender and the receiver are sampling at the same rate, these three principles  
are sufficient to safely make point-to-point transfers using clock forwarding. However,  
it is often desirable for systems to align clock-forwarded transactions on a slower  
SYSCLK that is the basis of all non-processor system transactions.  
The 21264/EV68A supports three ratios for SYSCLK to INT_FWD_CLK:  
one-to-one (1-1), two-to-one (2-1), and four-to-one (4-1). Using one of these ratios, the  
21264/EV68A starts transactions on SYSCLK boundaries. This ratio is programmed  
into the 21264/EV68A using the Cbox CSR SYS_FRAME_LD_VECTOR[4:0]. This  
ratio is independent of the frequency of FrameClk_H.  
For data movement, the 21264/EV68A reacts to SysDc commands when they are  
resolved into the 21264/EV68A’s clock domain. This occurs when the 21264/EV68A’s  
INT_FWD_CLK unloads the SysDc command from the clock forwarding queue. This  
moment is determined by the amount of delay programmed into the clock forwarding  
silo (by way of Cbox CSR SYS_RCV_MUX_CNT_PRESET[1:0]). Thus, all the tim-  
ing relationships are relative to this unload point in time, which will be referred to as  
the point the command is perceived by 21264/EV68A.  
4.7.8.2 Fast Data Mode  
1
The 21264/EV68A is the default driver of the bidirectional SysData bus . As the  
21264/EV68A is processing WrVictim, ProbeResponse (only the hit case), and IOWB  
commands to the system, accompanying data is made available at the clock-forwarded  
bus.  
Because there is a bandwidth difference between address (4 cycles) and data (8 cycles)  
transfers, the 21264/EV68A tries to fully use fast data mode by delaying the next  
SysAddOut write command until a fast data mode slot is available on the SysDataOut  
bus.  
SysDc commands (cache fill or explicit write commands) that collide with the fast data  
on the SysData bus have higher priority, and so may interrupt the successful completion  
of the fast transfer. Systems are responsible for detecting and replaying all interrupted  
fast transfers. There are no gaps in a fast transfer and no data wrapping (the first cycle  
contains QW0, addressed by PA[5:3] = 000).  
The system must release victim buffers, and probe buffers and IOWB entries by send-  
ing a SysDc command with the appropriate RVB/RPB bit for both successful fast data  
transfers and for transfers that have been replayed. Fast data transfers have two parts:  
1. SysAddOut command with the probe response, WrVictim, or Wr(I/O)  
2. Data  
The command precedes data by at least one SYSCLK period. Table 4–25 shows the  
number of SYSCLK cycles between SysAddOut and SysData for all system clock  
ratios (clock forwarded bit times) and system framing clock multiples.  
1
The SysData bus contains SysData_L[63:0] and SysCheck_L[7:0].  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–31  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
System Port  
Table 4–25 SYSCLK Cycles Between SysAddOut and SysData  
GCLK/INT_FWD_CLK (Data Rate Ratio)  
2.0X 2.5X 3.0X 3.5X 4.0X 5.0X 6.0X 7.0X 8.0X  
System framing clock ratio 1.5X  
1
2
4
4
2
1
3
2
1
2
1
1
2
2
1
1
2
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
Figure 4–5 show a simple example of a fast transfer. The data rate ratio is 1.5X with a  
4:1 SYSCLK to INT_FWD_CLK ratio.  
Figure 4–5 Fast Transfer Timing Example  
Response  
Probe  
SysAddOut_L[14:0]  
D0  
D1  
D2  
D3  
D4  
D5  
D6  
SysData_L[63:0]  
SYSCLK  
SysAddOutClk_L  
INT_FWD_CLK  
GCLK  
FM05822B.AI4  
In fast data mode, movement of data into the 21264/EV68A requires turning around the  
SysData bus that is being actively driven by the 21264/EV68A. Given a SysDc fill  
command (ReadDataError, ReadData, ReadDataShared, ReadDataShared/Dirty, Read-  
DataDirty), the 21264/EV68A responds as follows:  
1. Three GCLK cycles after perceiving the SysDc fill command, the 21264/EV68A  
turns off its drivers, interrupting any ongoing fast data write transactions.  
2. The 21264/EV68A drivers stay off until the last piece of fill data is received, or a  
new SysDc write command overrides the current SysDc fill command. It is the  
responsibility of the external system to schedule SysDc fill or write commands so  
that there is no conflict on the SysData bus.  
3. The 21264/EV68A samples fill data in the GCLK clock domain, 10 +  
SYSDC_DELAY GCLK cycles after perceiving the SysDc fill command. The  
Cbox CSR SYSDC_DELAY[4:0] provides GCLK granularity for precisely placing  
fills into the processor pipeline discussed in Section 2.2.  
4–32 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
System Port  
Table 4–26 shows four example configurations and shows their use of the  
SYSDC_DELAY[4:0].  
Table 4–26 Cbox CSR SYSDC_DELAY[4:0] Examples  
1
System  
Bit Rate  
1.5X  
2.0X  
2.5X  
4X  
System Framing Clock Ratio  
SYSDC_DELAY  
System 1  
System 2  
System 3  
4:1  
2:1  
2:1  
2:1  
5 (3 SYSCLK cycles)  
2 (3 SYSCLK cycles)  
0 (2 SYSCLK cycles)  
6 (2 SYSCLK cycles)  
System 4  
1
The system framing clock ratio is the number of INT_FWD_CLK cycles per  
SYSCLK cycles.  
System 1 has six GCLKs to every SYSCLK and only sends 4-cycle commands to the  
21264/EV68A. Thus, a period of three SYSCLKs between the SysDc command and  
data leaves a period of 15 GCLKs between SysDc and data (SysDc is in the middle of  
the 4-cycle command). A SYSDC_DELAY[4:0] of five would align sampling and  
receipt of SysData.  
System 2 has four GCLKs in every SYSCLK, so leading data by three SYSCLK cycles,  
and programming the SYSDC_DELAY[4:0] to two, aligns sampling and receiving.  
Timing for systems 3 and 4 is derived in a similar manner.  
Note:  
The maximum valid value for SYSDC_DELAY must be less than the min-  
imum number of GCLK cycles between two consecutive SYSDC com-  
mands to the 21264/EV68A.  
If a fast data transfer is interrupted and fails to complete, the system must use the con-  
ventional protocol to send a SysDc WriteData command to the 21264/EV68A, remov-  
ing the desired data buffer. Section 4.7.8.3 describes the timing events for transferring  
data from the 21264/EV68A to the system.  
4.7.8.3 Fast Data Disable Mode  
The system controls all data movement to and from the 21264/EV68A. Movement of  
data into and out of the 21264/EV68A is preceded by a SysDc command. The 21264/  
EV68A drivers are only enabled for the duration of an 8-cycle transfer of data from the  
21264/EV68A to the system. Systems must ensure that there is no overlap of enabled  
drivers and that there is adequate settle time on the SysData bus.  
Given a SysDc fill command, the 21264/EV68A samples data 10 + SYSDC_DELAY  
GCLK cycles after the command is perceived within the 21264/EV68A clock domain.  
Because there is no linkage with the output driver, fills into the 21264/EV68A are not  
affected by the SYS_RCV_MUX_PRESET[1:0] value.  
In both modes, given a SysDc write command, the 21264/EV68A looks for the next  
SYSCLK edge 8.5 cycles after perceiving the SysDc write command in its clock  
domain. Because the SysDc write command must be perceived before its use, SysDc  
write commands are dependent upon the amount of delay introduced by Cbox CSR  
SYS_RCV_MUX_CNT_PRESET[1:0].  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–33  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
System Port  
Table 4–27 lists information for the four timing examples. In Table 4–27, note the fol-  
lowing:  
SysDc write commands are not affected by the SYSDC_DELAY parameter.  
The SYS_RCV_MUX_PRESET adds delay at the rate of one INT_FWD_CLK at a  
time. For example, adding the delay of one bit time to system 1 adds 1.5 GCLK  
cycles to the delay and drives the SysDc write command-to-data relationship from  
one to two SYSCLKs.  
For write transfers, the 21264/EV68A drivers are enabled on the preceding GCLK  
BPHASE, before the start of a write transfer, and disabled on the succeeding GCLK  
BPHASE at the end of the write transfer. The write data is enveloped by the 21264/  
EV68A drivers to guarantee that every data transfer has the same data valid win-  
dow.  
Table 4–27 Four Timing Examples  
1
System  
Bit Rate  
1.5X  
2.0X  
2.5X  
4X  
System Framing Clock Ratio  
Write Data  
2 SYSCLKs  
3 SYSCLKs  
2 SYSCLKs  
2 SYSCLKs  
System 1  
System 2  
System 3  
4:1  
2:1  
2:1  
2:1  
System 4  
1
The system framing clock ratio is the number of INT_FWD_CLK cycles per  
SYSCLK cycles.  
The four examples described here assume no skew for the 2.0X and 4.0X cases and one  
bit time of skew for the 1.5X and 2.5X cases.  
For system 1, the distance between SysDc and the first SYSCLK is nine GCLK cycles  
but the additional delay of one bit time (1.5 GCLKs) puts the actual delay after perceiv-  
ing the SysDc command to 7.5 GCLKS, which misses the 8.5 cycle constraint. There-  
fore, the 21264/EV68A drives data two SYSCLKs after receiving the SysDc write  
command.  
For system two, the distance between SysDc and the second SYSCLK is eight GCLK  
cycles, which also misses the 8.5 cycle constraint, so the 21264/EV68A drives data  
three SYSCLK cycles after receiving the SysDc write command (12 cycles).  
The other two cases are derived in a similar manner.  
4.7.8.4 SysDataInValid_L and SysDataOutValid_L  
The SysDataValid signals (SysDataInValid_L and SysDataOutValid_L) are driven by  
the system and control the rate of data delivery to and from the 21264/EV68A.  
SysDataInValid_L  
The SysDataInValid_L signal controls the flow of data into the 21264/EV68A, and  
may be used to introduce an arbitrary number of cycles between octaword transfers into  
the 21264/EV68A. The rules for using SysDataInValid_L follow:  
4–34 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
System Port  
1. The SysDataInValid_L signal must be asserted for both cycles of a SysDc fill  
command, and two quadwords of data must be delivered to the 21264/EV68A in  
succeeding bit-clock cycles with the appropriate timing in reference to the SysDc  
fill command (SYSDC_DELAY + 10 CPU cycles).  
2. Any number of bubble cycles can be introduced within the fill by deasserting  
SysDataInValid_L between octaword transfers.  
3. The transfer of fill data can continue by asserting SysDataInValid_L for at least  
two bit-clock cycles, and delivering data SYSDC_DELAY + 10 CPU cycles after  
the assertion of SysDataInValid_L.  
4. The 21264/EV68A must see SysDataInValid_L asserted for eight data cycles in  
order to complete a fill. When the eighth cycle of an asserted SysDataInValid_L is  
perceived by the 21264/EV68A, the transfer is complete.  
5. Systems that do not use SysDataInValid_L may tie the pin to the asserted state.  
If SYSDC_DELAY is greater than the bit-time of a transfer, the SysDataInValid_L  
signal must be internally pipelined. To enable the correct sampling of  
SysDataInValid_L, the 21264/EV68A provides a delay, with Cbox CSR  
DATA_VALID_DELAY[1:0], that is equal to SYSDC_DELAY[4:0]/bit-time. For  
example, consider system 1 in Table 4–26, which has a SYSDC_DELAY of five  
GCLKs. Running at a bit-time of 1.5X, the DATA_VALID_DELAY[1,0] is pro-  
grammed with a value of three.  
SysDataOutValid_L  
Systems that use a ratio of 1:1 for SYSCLK:INT_FWD_CLK may control the flow of  
data out of the 21264/EV68A by using SysDataOutValid_L as follows:  
1. The SysDataOutValid_L pin must be asserted for at least the first cycle of the  
SysDc write command that initiates a write transfer.  
2. Any number of bubble cycles may be introduced between quadword transfers by  
deasserting SysDataOutValid_L.  
3. The 21264/EV68A must see the SysDataOutValid_L signal asserted for eight data  
cycles to complete a write transaction, and when the eighth cycle of an asserted  
SysDataOutValid_L is perceived by the 21264/EV68A, the transfer is complete.  
4.7.8.5 SysFillValid_L  
The SysFillValid_L pin, when asserted, validates the current memory and I/O data  
transfer into the 21264/EV68A. The system designer may tie this pin to the asserted  
state (validating all fills), or use it to enable or cancel fills as they progress. The 21264/  
EV68A samples SysFillValid_L at D1 time (when the 21264/EV68A samples the sec-  
ond data cycle).  
If SysFillValid_L is asserted at D1 time, the fill will continue uninterrupted. If it is not  
asserted, the 21264/EV68A cancels the fill, but expects all eight QWs of data to arrive  
at its system bus before continuing to the next fill. Also, the 21264/EV68A maintains  
the state of the MAF, expecting another valid fill to the same MAF entry. Figure 4–6  
illustrates SysFillValid_L timing.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–35  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
System Port  
Figure 4–6 SysFillValid_L Timing  
SysAddIn_L[14:0]  
SysDc  
Transport Delay on Address  
T3  
Command Receiver  
SysFillValid_L  
SysData_L[63:0]  
D0  
D1  
D2  
D3  
D4  
FM-05823B.FH8  
4.7.8.6 Data Wrapping  
All data movement between the 21264/EV68A and the system is composed of 64 bytes  
in eight cycles on the data bus. All 64 bytes of memory data are valid. This applies to  
memory read transactions, memory write transactions, and system probe read transac-  
tions. The wrap order is interleaved. The internal data bus, which delivers data to the  
functional units and the Dcache, is 16 bytes wide, and so, no transfers happen until two  
data cycles occur on the interface.  
Table 4–28 lists the rules for data wrapping. I/O read and write addresses on the  
SysAddOut bus point to the desired byte, word, LW, or QW, with a combination of  
SysAddOut_L[5:3] and the mask field [7:0].  
Table 4–28 Data Wrapping Rules  
Significant Address Mask  
Command  
Bits Type  
Rules  
ReadQW and  
WrQW  
SysAddOut_L[5:3] QW  
SysAddOut_L[5:3] contains the exact PA bits of the first  
LDQ or STQ to the block. The mask bits point to the valid  
QWs merged in ascending order.  
ReadLW and  
WrLW  
SysAddOut_L[5:3] LW  
SysAddOut_L[5:3] contain the exact PA bits of the first  
LDL or STL to the block. The mask bits point to the valid  
LWs merged in ascending order within one hexword.  
LDByte/Word SysAddOut_L[5:3] Byte  
and  
STByte/Word  
SysAddOut_L[5:3] contain the exact QW PA bits of the  
LDByte/Word or STByte/Word instruction. The mask bits  
point to the valid byte in the QW.  
The order in which data is provided to the 21264/EV68A (for a memory or I/O fill) or  
moved from the 21264/EV68A (write victims or probe reads) can be determined by the  
system. The system chooses to reflect back the same low-order address bits and the cor-  
responding octaword found in the SysAddOut field or the system chooses any other  
starting point within the block.  
SysDc commands for the ReadData, ReadDataShared, and WriteData groups require  
that systems define the position of the first QW by inserting the appropriate value of  
SysAddOut_L[5:3] into bits [1:0] of the command field. The recommended starting  
4–36 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
System Port  
point is the QW pointed to by the 21264/EV68A; however, some systems may find it  
more beneficial to begin the transfer elsewhere. The system must always indicate the  
starting point to the 21264/EV68A. The wrap order for subsequent QWs is interleaved.  
Table 4–29 defines the method for systems to specify wrap and deliver data.  
Table 4–29 System Wrap and Deliver Data  
Source/  
Destination  
SysDc[4:2]  
SysDc[1:0]  
Size  
Rules  
Memory  
Memory  
Memory  
Memory  
Memory  
I/O  
100 (ReadData)  
SysAddOut_L[5:4]  
SysAddOut_L[5:4]  
SysAddOut_L[5:4]  
SysAddOut_L[5:4]  
SysAddOut_L[5:4]  
SysAddOut_L[5:4]  
SysAddOut_L[4:3]  
SysAddOut_L[4:3]  
SysAddOut_L[5:4]  
SysAddOut_L[5:4]  
SysAddOut_L[5:4]  
Block (64 Bytes)  
Block (64 Bytes)  
Block (64 Bytes)  
Block (64 Bytes)  
Block (64 Bytes)  
QW (8-64 Bytes)  
LW(4-32 Bytes)  
Byte/Word  
See Note 1  
See Note 1  
See Note 1  
See Note 1  
See Note 1  
See Note 1  
See Note 2  
See Note 2  
See Note 1  
See Note 1  
See Note 1  
101(ReadDataDirty)  
110 (ReadDataShared)  
111(Read DataShared/Dirty)  
010 (WriteData)  
100 (ReadData)  
I/O  
100 (ReadData)  
I/O  
100 (ReadData)  
I/O  
010 (WriteData)  
010 (WriteData)  
010 (WriteData)  
QW (8-64 Bytes)  
LW(4-32 Bytes)  
Byte/Word  
I/O  
I/O  
Note 1:  
Transfers to and from the 21264/EV68A have eight data cycles for a total  
of eight quadwords. The starting point is defined by the system. The pre-  
ferred starting point is the one pointed to by SysAddOut_L[5:4]. Systems  
can insert the SysAddOut_L[5:4] into the SysDc[1:0] field of the com-  
mand. See Table 4–30 for the wrap order.  
Note 2:  
LW and byte/word read transfers differ from all other transfers. The system  
unloads only four QWs of data into eight data cycles by sending each QW  
twice (referred to as double-pumped data transfer). The first QW returned  
is determined by SysAddOut_L[4:3]. The system again may elect to  
choose its own starting point for the transfer and insert that value into  
SysDc[1:0]. See Table 4–31 for the wrap order.  
Table 4–30 defines the interleaved scheme for the wrap order.  
Table 4–30 Wrap Interleave Order  
PA Bits [5:3] of Transferred QW  
First quadword  
Second quadword  
Third quadword  
Fourth quadword  
Fifth quadword  
000  
001  
010  
011  
100  
010  
011  
000  
001  
110  
100  
101  
110  
111  
000  
110  
111  
100  
101  
010  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–37  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
System Port  
Table 4–30 Wrap Interleave Order (Continued)  
PA Bits [5:3] of Transferred QW  
Sixth quadword  
Seventh quadword  
Eighth quadword  
101  
110  
111  
111  
100  
101  
001  
010  
011  
011  
000  
001  
Table 4–31 defines the wrap order for double-pumped data transfers.  
Table 4–31 Wrap Order for Double-Pumped Data Transfers  
PA [5:3] of Transferred QW  
First quadword  
Second quadword  
Third quadword  
Fourth quadword  
Fifth quadword  
Sixth quadword  
Seventh quadword  
Eighth quadword  
x00  
x00  
x01  
x01  
x10  
x10  
x11  
x11  
x01  
x01  
x00  
x00  
x11  
x11  
x10  
x10  
x10  
x11  
x11  
x10  
x10  
x01  
x01  
x00  
x00  
x10  
x11  
x11  
x00  
x00  
x01  
x01  
4.7.9 Nonexistent Memory Processing  
Like its predecessors, the 21264/EV68A can generate references to nonexistent (NXM)  
memory or I/O space. However, unlike the earlier Alpha microprocessor implementa-  
tions, the 21264/EV68A can generate speculative references to memory space. To  
accommodate the speculative nature of the 21264/EV68A, the system must not gener-  
ate or lock error registers because of speculative references. The 21264/EV68A trans-  
lates all memory references through the translation lookaside buffer (TLB) and, in some  
cases, the 21264/EV68A may generate speculative references (instruction execution  
down mispredicted paths) to NXM space. In these cases, the system sends a SysDc  
ReadDataError and the 21264/EV68A does the following:  
Delivers an all-ones pattern to all load instructions to the NXM address  
Force-fails all store instructions to the NXM address (much like a STx_C  
failure)  
Invalidates the cache block at the same index by way of an atomic Evict  
command  
4–38 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
System Port  
Table 4–32 shows each 21264/EV68A command, with NXM addresses, and the appro-  
priate system response.  
Table 4–32 21264/EV68A Commands with NXM Addresses and System Response  
21264/EV68A  
Command NXM  
Address  
System/21264/EV68A Response  
ProbeResponse  
Probe responses for addresses to NXM space are of UNPREDICTABLE status. Although  
the final status of a ReadDataError is Invalid, the 21264/EV68A fills the block Valid/  
Clean and uses an atomic Evict command to invalidate the block. Systems that send  
probes to NXM space to the 21264/EV68A must disregard the probe result.  
RdBlk  
RdBlkSpec  
RdBlkVic  
Load references to NXM space can be speculative. In this case, systems should respond  
with a SysDc ReadDataError fill that the 21264/EV68A uses to service the original load/  
Istream command. If the original load command was speculative, the 21264/EV68A will  
remove the load instruction that generated the NXM command, and start processing  
instructions down the correctly predicted path. If the command was not speculative, there  
must be an error in the operating system mapping of a virtual address to an illegal physi-  
cal address, and the 21264/EV68A provides an all ones pattern as a signature for this bug.  
The NXM block is not cached in the Dcache or Bcache.  
RdBlkI  
RdBlkSpecI  
RdBlkVicI  
Istream references to NXM space can be speculative. In this case, systems should respond  
with a SysDc ReadDataError fill, which the 21264/EV68A will use to service and exe-  
cute the original Istream reference. If the original Istream reference was speculative, the  
21264/EV68A will remove the instructions started after the mispredicted instruction that  
generated the NXM reference, and start instruction processing down the correctly pre-  
dicted path. If the reference was not speculative, there must be an error in the operating  
system mapping of a virtual address to an illegal physical address, and the 21264/EV68A  
provides an all ones pattern as a signature for this bug. The NXM block is not cached in  
the Bcache, but can be cached in the Icache.  
RdBlkMod  
RdBlkModSpec  
RdBlkModVic  
Store instructions to NXM space initiate RdBlkMod commands. Again, speculative store  
instructions are removed. Nonspeculative store instructions are forced to fail, much like  
STx_C instructions that fail. The NXM block is not cached in the Dcache or Bcache.  
WrVictimBlk  
Dirty Victims to NXM space are illegal. Systems should perform a machine check, with  
the 21264/EV68A indicating a severe error.  
CleanVictimBlk  
The 21264/EV68A can generate CleanVictimBlk commands to NXM space if the Cbox  
CSR BC_CLEAN_VICTIM[0] bit is asserted and a SysDc ReadDataError has been gen-  
erated. Systems that use clean victims must faithfully deallocate the CleanVictim VAF  
entry.  
Evict  
If the Cbox CSR ENABLE_EVICT is asserted, the 21264/EV68A will generate Evict  
commands to NXM space. Systems may use this command to invalidate their duplicate  
tags. Systems must respond with SysDc ChangeToDirtyFail to retire the NXM MAF  
entry.  
RdBytes  
RdLWs  
RdQWs  
Load instructions to I/O space are not speculative, so an I/O reference to NXM space is  
an error. Systems must respond with ReadDataError and should generate a machine  
check to indicate an operating system error.  
WrBytes  
WrLWs  
WrQWs  
Store instructions to I/O space are not speculative, so an I/O reference to NXM space is  
an error. Systems must respond by deallocating the appropriate IOWB entries, and should  
generate a machine check to indicate an operating system error.  
FetchBlk  
FetchBlkSpec  
Loads to noncached memory in NXM space may be speculative. Systems must respond  
with a SysDc ReadDataError to retire the MAF entry.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–39  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                           
System Port  
Table 4–32 21264/EV68A Commands with NXM Addresses and System Response (Continued)  
21264/EV68A  
Command NXM  
Address  
System/21264/EV68A Response  
CleanToDirty  
SharedToDirty  
STCChangeToDirty  
ChangeToDirty commands to NXM space are impossible in the 21264/EV68A because  
all NXM references to memory space are atomically filled with an Invalid cache status.  
InvalToDirty  
InvalToDirtyVic  
InvalToDirty commands are not speculative, so InvalToDirty commands to NXM space  
indicate an operating system error. Systems should respond with a SysDc ReadDataError,  
and should generate a machine check to indicate error.  
4.7.10 Ordering of System Port Transactions  
This section describes ordering of system port transactions. The two classes of transac-  
tions are listed here:  
21264/EV68A commands and system probes  
System probes and SysDc transfers  
4.7.10.1 21264/EV68A Commands and System Probes  
This section describes the interaction of 21264/EV68A-generated commands and sys-  
tem-generated probes that reference the same cache block. Some definitions are pre-  
sented here:  
ProbeResponses generated by the 21264/EV68A respond to all system-generated  
probe commands. System-generated data transfer commands respond to all 21264/  
EV68A-generated data transfer commands.  
The victim address file (VAF) and victim data buffer (VDB) entries each have inde-  
pendent valid bits for both a victim and a probe.  
Probe results indicate a hit on a VAF/VDB and when a WrVictim command has  
been sent to the system. Systems can decide whether to move the buffer once or  
twice.  
ProbeResponses are issued in the order that the system-generated probes were  
received; however, there is no requirement for the system to retain order when issu-  
ing release buffer commands.  
Probe processing can stall inside the 21264/EV68A when the probe entry index  
matches PA[19:6] of a previous probe entry in the VAF.  
The 21264/EV68A reserves one VAF entry for probe processing, so that VAF-full  
conditions cannot stall the processing of probes at the head of the queue.  
Table 4–33 lists all interactions between pending internal 21264/EV68A commands  
and the Probe[2:0] command field, Next Cache Block State, described in Table 4–22.  
Table 4–33 shows the 21264/EV68A response to system probe and in-flight command  
interaction. In the table, note the following:  
ReadBlkVic and ReadBlkModVic commands do not appear in Table 4–33. If there  
is interaction between the probe and the victim, it is the same as a WrVictimBlk  
command.  
4–40 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
System Port  
Probes that invalidate locked blocks do not generate a ReadBlkMod command. The  
21264/EV68A fails the STx_C instruction as defined in the Alpha Architecture  
Handbook, Version 4.  
All read commands (RdBlk, RdBlkMod, Fetch, InvalToDirty) do not interact  
because the 21264/EV68A does not yet own the block.  
Table 4–33 21264/EV68A Response to System Probe and In-Flight Command Interaction  
Pending Internal  
21264/EV68A  
Command  
21264/EV68A Response to System Probe and In-Flight Command Interaction  
ReadBlk  
ReadBlkMod  
FetchBlk  
InvalToDirty  
WrVictimBlk  
This case assumes that a WrVictimBlk command has been sent to the system and another  
agent has performed a load/store instruction to the same address. The 21264/EV68A pro-  
vides VAF hit information with the probe response so that the system can manage the race  
condition between the WrVictimBlk command from this processor and a possible WrVic-  
timBlk command from the probing processor. This race condition can be managed by  
either forcing the completion of the WrVictimBlk command to memory before allowing  
the progress by the probing processor, or by killing the WrVictimBlk command in this  
processor.  
CleanToDirty  
SharedToDirty  
This case assumes that a SetDirty command has been sent to the system environment  
because of a store instruction that hit in the 21264/EV68A caches and that another proces-  
sor has performed a load/store instruction to the same address. The 21264/EV68A pro-  
vides MAF hit information so that the system can correctly respond to the Set/Dirty  
command. If the next state of the probe was Invalid (the other processor performed a store  
instruction), and the probe reached the system serialization point before the Set/Dirty  
command, the system must either fail the Set/Dirty command or provide the updated data  
from the other processor.  
STCChangeToDirty This case is similar to case 2, except that the initiating instruction for the Set/Dirty com-  
mand is a STx_C. An address match with an invalidating probe must fail the Set/Dirty  
command. Delivering the updated data from the other processor is not an option because  
of the requirements of the LDx_L/STx_C instruction pair.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–41  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Bcache Port  
4.7.10.2 System Probes and SysDc Commands  
Ordering of cache transactions at the system serialization point must be reflected in the  
21264/EV68A cache system. Table 4–34 shows the rules that a system must follow to  
control the order of cache status update within the 21264/EV68A cache structures  
(including the VAF) at the 21264/EV68A pins.  
Table 4–34 Rules for System Control of Cache Status Update Order  
First  
Second  
Rule  
Probe  
Probe  
To control the sequence of cache status updates between probes, systems  
can present the probes in order to the 21264/EV68A, and the 21264/  
EV68A will update the appropriate cache state (including the VAF) in  
order.  
Probe  
SysDc MAF  
To ensure that a probe updates the internal cache status before a SysDc  
MAF transaction (including fills and ChangeToDirtySuccess commands),  
systems must wait for the probe response before presenting the SysDc  
MAF command to the 21264/EV68A. To ensure that a probe updates a  
VAF entry before a SysDc VAF (release buffer), systems must wait for the  
probe response.  
Probe  
SysDc VAF  
Same as Probe/SysDc MAF, above.  
SysDc MAF Probe  
To ensure that a SysDc MAF command updates the 21264/EV68A cache  
system before a probe to the same address, systems must deliver the D1  
(the second QW of data delivered to the 21264/EV68A) before or in the  
same cycle as the A3 of the probe (the last cycle of the 4-cycle probe com-  
mand). This rule also applies to ChangeToDirtySuccess commands that  
have a virtual D0 and D1 transaction.  
SysDc MAF SysDc MAF  
SysDc MAF SysDc VAF  
SysDc MAF transactions can be ordered into the 21264/EV68A by order-  
ing them appropriately at the 21264/EV68A interface.  
SysDc MAF transactions and SysDc VAF transactions cannot interact  
within the 21264/EV68A because the 21264/EV68A does not generate  
MAF transactions to the same address as existing VAF transactions.  
SysDc VAF  
SysDc VAF  
SysDc VAF  
Probe  
To ensure that a SysDc VAF invalidates a VAF entry before a probe to the  
same address, the SysDc VAF command must precede the first cycle of the  
4-cycle probe command.  
SysDc MAF  
SysDc VAF  
SysDc MAF transactions and SysDc VAF transactions cannot interact  
within the 21264/EV68A because the 21264/EV68A does not generate  
MAF transactions to the same address as existing VAF transactions.  
SysDc VAF transactions can be ordered into the 21264/EV68A by order-  
ing them appropriately at the 21264/EV68A interface.  
4.8 Bcache Port  
The 21264/EV68A supports a second-level cache (Bcache) with 64-byte blocks. The  
Bcache size can be 1MB, 2MB, 4MB, 8MB, or 16MB. The Bcache port has a 144-bit  
data bus that is used for data transfers between the 21264/EV68A and the Bcache. All  
Bcache control and address signal lines are clocked synchronously on Bcache clock cycle  
boundaries.  
4–42 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Bcache Port  
The Bcache supports the following multiples of the GCLK period: 1.5X (dual-data  
mode only), 2X, 2.5X, 3X, 3.5X, 4X, 5X, 6X, 7X, and 8X. However, the 21264/EV68A  
imposes a maximum Bcache clock period based on the SYSCLK ratio. Table 4–35 lists  
the range of maximum Bcache clock periods. Section 4.7.8.2 describes fast mode.  
Table 4–35 Range of Maximum Bcache Clock Ratios  
Bcache Clock Ratio with Fast Mode  
Bcache Clock Ratio with Fast Mode  
Disabled  
SYSCLK Ratio  
Enabled  
1.5X  
4.0X  
4.0X  
5.0X  
6.0X  
7.0X  
7.0X  
8.0X  
8.0X  
8.0X  
8.0X  
7.0X  
7.0X  
8.0X  
8.0X  
8.0X  
8.0X  
8.0X  
8.0X  
8.0X  
8.0X  
2.0X  
2.5X  
3.0X  
3.5X  
4.0X  
5.0X  
6.0X  
7.0X  
8.0X  
The 21264/EV68A provides a range of programmable Cbox CSRs to manipulate the  
Bcache port pins so that a variety of industry-standard SSRAMs can communicate effi-  
ciently with the 21264/EV68A. The following SSRAMs can be used:.  
Nonburst mode Reg/Reg late-write SSRAMs  
Burst mode Reg/Reg late-write dual-data SSRAMs  
4.8.1 Bcache Port Pins  
Table 3–1 defines the 21264/EV68A signal types referred to in this section. Table 4–36  
lists the Bcache port pin groups along with their type, number, reference clock, and  
functional description.  
Table 4–36 Bcache Port Pins  
Pin Name  
Type  
Count Reference Clock  
20 Int_Index_BcClk  
Description  
BcAdd_H[23:4]  
BcCheck_H[15:0]  
O_PP  
Bcache index  
B_DA_PP 16  
Int_Data_BcClk output ECC check bits for BcData  
BcDataInClk_H input  
BcData_H[127:0]  
B_DA_PP 128  
Int_Data_BcClk output Bcache data  
BcDataInClk_H  
input  
BcDataInClk_H[7:0]  
BcDataOE_L  
I_DA  
O_PP  
8
1
NA  
Bcache data input clocks  
Int_Index_BcClk  
Bcache data output enable/chip  
select  
BcDataOutClk_H[3:0] O_PP  
BcDataOutClk_L[3:0]  
8
NA  
Bcache data clocks— high and low  
version  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–43  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                   
Bcache Port  
Table 4–36 Bcache Port Pins (Continued)  
Pin Name  
Type  
O_PP  
O_PP  
Count Reference Clock  
Description  
BcDataWr_L  
BcLoad_L  
1
1
Int_Index_BcClk  
Int_Index_BcClk  
Bcache data write enable  
Bcache burst enable  
BcTag_H[42:20]  
B_DA_PP 23  
Int_Data_BcClk output Bcache tag data  
BcTagInClk_H input  
BcTagDirty_H  
B_DA_PP  
1
Int_Data_BcClk output Bcache tag dirty bit  
BcTagInClk_H  
input  
BcTagInClk_H  
BcTagOE_L  
I_DA  
O_PP  
1
1
NA  
Tag input data reference clock  
Int_Index_BcClk  
Bcache tag output enable/chip  
select  
BcTagOutClk_H  
BcTagOutClk_L  
O_PP  
2
1
1
1
NA  
Bcache tag clock— high and low  
versions  
BcTagParity_H  
BcTagShared_H  
BcTagValid_H  
B_DA_PP  
B_DA_PP  
B_DA_PP  
Int_Data_BcClk output Bcache tag parity bit  
BcTagInClk_H input  
Int_Data_BcClk output Bcache tag shared bit  
BcTagInClk_H input  
Int_Data_BcClk output Bcache tag valid bit  
BcTagInClk_H  
input  
BcVref  
I_DC_REF  
O_PP  
1
1
NA  
Input reference voltage for tag data  
Bcache data write enable  
BcTagWr_L  
Int_Index_BcClk  
4.8.2 Bcache Clocking  
For clocking, the Bcache port pins can be divided into three groups.  
1. The Bcache index pins (address and control) are referenced to Int_Add_BcClk, an  
internal version of the Bcache forwarded clock. The index pins are valid for the  
whole period of the Int_Add_BcClk. The index pins are:  
BcAdd_H[23:4]  
BcDataOE_L  
BcDataWr_L  
BcLoad_L  
BcTagOE_L  
BcTagWr_L  
2. The data pins, when driven as outputs, are referenced to Int_Data_BcClk, another  
internal version of the Bcache forwarded clock. The data pins, when used as inputs,  
can be referenced to the incoming Bcache clocks, BcDataInClk_H[7:0] and  
BcTagInClk_H. Int_Data_BcClk can be delayed relative to Int_Add_BcClk from  
0 to 3 GCLK cycles by using Cbox CSR BC_CPU_CLK_DELAY[1:0]. The data  
pins are:  
BcCheck_H[15:0]  
BcData_H[127:0]  
BcTag_H[42:20]  
BcTagDirty_H  
BcTagParity_H  
4–44 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                 
Bcache Port  
BcTagShared_H  
BcTagValid_H  
3. The Bcache clock pins (BcDataOutClk_x[3:0] and BcTagOutClk_x) clock the  
index and data pins at the SSRAMs. These clocks can be delayed from  
Int_Data_BcClk from 0 to 2 GCLK phases (half cycles) using Cbox CSR  
BC_CPU_CLK_DELAY[1:0].  
Table 4–37 provides the BC_CPU_CLK_DELAY[1:0] values, which is the delay  
from BC_ADDRESS to BC_WRITE_DATA (and BC_CLOCK_OUT) in GCLK  
cycles.  
Table 4–37 BC_CPU_CLK_DELAY[1:0] Values  
BC_CPU_CLK_DELAY[1:0] Value  
GCLK Cycles of Delay  
0
1
2
3
0
1
2
3
In the 21264/EV68A topology, the index pins are loaded by all the SSRAMs, while the  
clock and data pins see a limit load. This arrangement requires a relatively large amount  
of delay between the index pins and the Bcache clock pins to meet the setup constraints  
at the SSRAMs. The 21264/EV68A Cbox CSRs can provide a programmable amount  
of delay between the index and clock pins by using Cbox CSRs  
BC_CPU_CLK_DELAY[1:0] and BC_CLK_DELAY[1:0].  
Table 4–38 provides the BC_CLK_DELAY[1:0] values, which is the delay from  
BC_WRITE_DATA to BC_CLOCK_OUT, in GCLK phases.  
Table 4–38 BC_CLK_DELAY[1:0] Values  
BC_CLK_DELAY[1:0] Value  
GCLK Phases  
0
1
2
3
Invalid (turns off BC_CLOCK_OUT)  
0
1
2
With BC_CPU_CLK_DELAY[1:0] and BC_CLK_DELAY[1:0], a 500-MHz 21264/  
EV68A can provide up to 8 ns (3 × 2 + 2) of delay between the index and the outgoing  
forwarded clocks. The relative loading difference between the data and the clock is  
minimal, so Cbox CSR BC_CLK_DELAY[1:0] alone is sufficient to provide the delay  
needed for the setup constraint at the Bcache data register.  
4.8.2.1 Setting the Period of the Cache Clock  
The free running Bcache clocks are derived from the 21264/EV68A GCLK. The period  
of the Bcache clocks is programmed using the following three Cbox CSRs:  
1. BC_CLK_LD_VECTOR[15:0]  
2. BC_BPHASE_LD_VECTOR[3:0]  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–45  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
Bcache Port  
3. BC_FDBK_EN[7:0]  
To program these three CSRs, the programmer must know the bit-rate of the Bcache  
data, and whether only the rising edge or both edges of the clock are used to latch data.  
For example, a 200-MHz late-write SSRAM has a data period of 5 ns. For a 2-ns  
GCLK, the READCLK_RATIO must be set to 2.5X. This part is called a 2.5X SD (sin-  
gle-data part).  
Table 4–39 shows how the three CSRs are programmed for single-data devices.  
Table 4–39 Program Values to Set the Cache Clock Period (Single-Data)  
1
1
1
Bcache Transfer  
2.0X-SD  
BC_CLK_LD_VECTOR  
BC_BPHASE_LD_VECTOR  
BC_FDBK_EN  
5555  
94A5  
9249  
4C99  
3333  
8C63  
71C7  
C387  
0F0F  
0
01  
02  
02  
04  
01  
02  
10  
04  
01  
2.5X-SD  
3
3.0X-SD  
A
C
0
3.5X-SD  
4.0X-SD  
5.0X-SD  
5
6.0X-SD  
0
7.0X-SD  
A
0
8.0X-SD  
1
These are hexadecimal values.  
With the exception of the 2.5X-SD and 3.5X-SD cases, the clock waveform generated  
by the 21264/EV68A for the forwarded clocks has a 50-50 duty cycle. In the 2.5X-SD  
case, the 21264/EV68A produces an asymmetric clock that is high for two GCLK  
phases and low for three phases. Likewise, for the 3.5X-SD case, the 21264/EV68A  
produces an asymmetric clock that is high for three GCLK phases and low for four  
GCLK phases. Also, for both of these cases, the 21264/EV68A will only start transac-  
tions on the rising edge of the GCLK and the Bcache clock. The 1.5X-SD case is not  
supported.  
A dual-data rate (DDR) SSRAM’s data rate is derived in a similar manner, except that  
because both edges of the clock are used, the SSRAM clock generated is 2X the period  
of the data. This part is called a 2.5X DDR SSRAM.  
Table 4–40 shows how the three CSRs are programmed for dual-data devices.  
Table 4–40 Program Values to Set the Cache Clock Period (Dual-Data Rate)  
Bcache  
1
1
1
Transfer  
1.5X-DD  
2.0X-DD  
2.5X-DD  
3.0X-DD  
BC_CLK_LD_VECTOR  
BC_BPHASE_LD_VECTOR BC_FDBK_EN  
9249  
3333  
8C63  
71C7  
A
0
02  
01  
02  
10  
5
0
4–46 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Bcache Port  
Table 4–40 Program Values to Set the Cache Clock Period (Dual-Data Rate) (Continued)  
Bcache  
Transfer  
1
1
1
BC_CLK_LD_VECTOR  
BC_BPHASE_LD_VECTOR BC_FDBK_EN  
3.5X-DD  
4.0X-DD  
5.0X-DD  
6.0X-DD  
7.0X-DD  
C387  
0F0F  
7C1F  
F03F  
C07F  
00FF  
A
0
0
0
0
0
04  
01  
40  
10  
04  
01  
8.0X-DD  
1
These are hexadecimal values.  
In addition to programming the clock CSRs, the data-sample/drive Cbox CSRs, at the  
pads, must be set appropriately. Table 4–41 lists these CSRs and provides their pro-  
grammed value.  
Table 4–41 Data-Sample/Drive Cbox CSRs  
CBOX CSR  
Description  
BC_DDM_FALL_EN[0]  
Enables the update of the 21264/EV68A’s Bcache outputs referenced to the  
falling edge of the Bcache forwarded clock. Dual-data RAMs assert this  
CSR.  
BC_TAG_DDM_FALL_EN[0] Enables the update of the 21264/EV68A’s Bcache tag outputs referenced to  
the falling edge of the Bcache forwarded clock. Alway deasserted.  
BC_DDM_RISE_EN[0]  
Enables the update of the 21264/EV68A’s Bcache outputs referenced to the  
rising edge of the Bcache forwarded clock. Always asserted.  
BC_TAG_DDM_RISE_EN[0]  
Enables the update of the 21264/EV68A’s Bcache tag outputs referenced to  
the rising edge of the Bcache forwarded clock. Always asserted.  
BC_DDMF_ENABLE[0]  
BC_DDMR_ENABLE[0]  
BC_FRM_CLK[0]  
Enables the rising edge of the Bcache forwarded clock. Always asserted.  
Enables the falling edge of the Bcache forwarded clock. Always asserted.  
Forces the 21264/EV68A to only start Bcache transactions on the rising edge  
of Bcache clocks that also coincide with the rising edge of GCLK. Must be  
asserted for all dual-data parts and single-data parts at 2.5X and 3.5X.  
BC_CLKFWD_ENABLE[0]  
Enables clock forward enable. Always asserted.  
4.8.3 Bcache Transactions  
The Cbox uses the programmed clock values to start data read, tag read, data write, and  
tag write transactions on the rising edge of a Bcache clock. The Cbox can also be con-  
figured to introduce a programmable number of bubbles when changing between write  
and read commands. The following three sections describe these Bcache transactions.  
4.8.3.1 Bcache Data Read and Tag Read Transactions  
The 21264/EV68A always reads four pieces of data (64 bytes) from the Bcache during  
a data read transaction, and always interrogates the tag array on the first cycle. Once  
started, data read transactions are never cancelled. Assuming that the appropriate values  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–47  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                       
Bcache Port  
have been programmed for the Bcache clock period, and with satisfactory delay param-  
eters for the SSRAM setup/hold Bcache address latch requirements, a Bcache read  
command proceeds through the 21264/EV68A Cbox as follows:  
1. When the 21264/EV68A clocks out the first address value on the Bcache index pins  
with the appropriate Int_Add_BcClk value, the Cbox loads the values of Cbox CSR  
BC_LAT_DATA_PATTERN[31:0] and Cbox CSR  
BC_LAT_TAG_PATTERN[23:0] into two shift registers, which shift during every  
GCLK cycle.  
2. The address and control pins are latched into the SSRAMs. During the next cycle,  
the SSRAMs provide data and tag information to the 21264/EV68A.  
3. Using the returning forwarded clocks (BcDataInClk_H[7:0], BcTagInClk_H), the  
data/tag information is loaded into the 21264/EV68A clock forwarding queue for  
the Bcache.  
4. Based on the value of BC_RCV_MUX_PRESET_CNT[1,0] (the unload pointer),  
the result of a Bcache write command is loaded into a 21264/EV68A GCLK  
(BPHASE) register.  
5. The Cbox CSR BC_LAT_DATA_PATTERN[31:0] and  
BC_LAT_TAG_PATTERN[23:0] contain the GCLK frequency at which the output  
of the clock forward FIFO can be consumed by the processor. This provides GCLK  
granularity for the Bcache interface, so that the 21264/EV68A can minimize  
latency to the Bcache. When the values based on these Cbox CSRs are shifted  
down to the bottom of the shift register, the processor samples the Bcache data and  
delivers it to the consumers of load data in the 21264/EV68A functional units.  
For example, when a 2.5X-SD SSRAM has a latency of eight GCLK cycles from  
BcAdd_H[23:4] to the output of Bcache FIFO, Cbox CSR  
BC_LAT_DATA_PATTERN[31:0] is programmed to 948 and Cbox CSR  
16  
BC_LAT_TAG_PATTERN[23:0] is programmed to 8 . The data pattern contains the  
16  
placement for four pieces of data and the aggregate rate of the data is 2.5X. In addition,  
bit one of the BC_LAT_DATA_PATTERN is placed at a GCLK latency of six GCLK  
cycles, which is the minimum latency supported by the 21264/EV68A. The  
BC_LAT_TAG_PATTERN contains the placement of the tag data to the 21264/EV68A.  
A shift of one to the left increases the latency of the Bcache transfer to nine GCLK  
cycles, and a shift to the right reduces the latency of the Bcache transfer to seven GCLK  
cycles.  
The Cbox performs isolated tag read transactions in response to system probe com-  
mands. In addition, when using burst-mode SSRAMs, the Cbox can combine a separate  
tag read transaction with the tail end of a data read transaction, thus optimizing Bcache  
bandwidth. A Bcache tag read transaction proceeds exactly like a Bcache data read  
transaction, except that only the BC_LAT_TAG PATTERN is used to update the tag  
shift register.  
4.8.3.2 Bcache Data Write Transactions  
During a data write transaction, the 21264/EV68A always writes four pieces of data (64  
bytes of data and 8 bytes of ECC) to the Bcache, and always writes the tag array during  
the first cycle. Once started, data write operations are never cancelled. Given the appro-  
4–48 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Bcache Port  
priate programming of the Bcache clock period and delay parameters to satisfy SSRAM  
setup/hold requirements of the Bcache address latch, a Bcache write transaction pro-  
ceeds through the Cbox as follows:  
1. The Cbox transmits the index and write control signals during an Int_Adr_BcClk  
edge.  
2. The data is placed on Bcache data, tag, and tag status pins on the appropriate  
Int_Data_BcClk edge from 0 to 7 Bcache bit-times later, based on the Cbox CSR  
BC_LATE_WRITE_NUM[2:0]. The BC_LATE_WRITE_NUM[2:0] supports the  
late-write SSRAM, which optimize Bcache data bus bandwidth by minimizing  
bubbles between read and write transactions. For example, single-data late-write  
SSRAMs would need this CSR programmed to a value of one, and dual-data late-  
write SSRAMs would need this CSR programmed to a value of two.  
3. The difference between the data delivery (Int_Data_BcClk) and forwarded clocks  
out provides the setup for the data at the Bcache data flip-flop.  
4. For Bcache writes, the 21264/EV68A drivers are enabled on the GCLK BPHASE  
preceding the start of a write transfer, and disabled on the succeeding GCLK  
BPHASE at the end of a write transfer. Thus, the write data is enveloped by the  
21264/EV68A drivers to guarantee that every data transfer has the same data-valid  
window.  
4.8.3.3 Bubbles on the Bcache Data Bus  
When changing between read and write transactions on the bidirectional bus, it is often  
necessary to introduce NOP cycles (bubbles) to allow the bus to settle and to drain the  
Bcache read pipeline. The Cbox provides two CSRS, BC_RD_WR_BUBBLES[5:0]  
and BC_WR_RD_BUBBLES[3:0], to help control the bubbles between read and write  
transactions.  
The optimum parameters for these CSRs are determined by formulas that include the  
following terms:  
Term  
Description  
bcfrm  
Bcache frame clock.  
In dual-data mode, bcfrm is twice the ratio.  
In single-data mode, the value for bcfrm is determined by whether  
the ratio is even or odd:  
When the ratio is even, bcfrm is equal to the ratio.  
When the ratio is odd, bcfrm is twice the ratio.  
For example, in single-data mode:  
Ratio  
2
Bcfrm  
2
5
2.5  
GCLK  
The processor clock.  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–49  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Bcache Port  
Term  
Description  
Ratio  
The number of GCLK cycles per peak Bcache bandwidth transfer. For example, a  
ratio of 2.5 means the peak Bcache bandwidth is 16 bytes for every 2.5 GCLK  
cycles.  
rd_wr  
wr_rd  
The minimum spacing required between the read and write indices at the data/tag  
pins, expressed as GCLK cycles.  
The minimum spacing required between the write and read indices at the data/tag  
pins, expressed as GCLK cycles.  
The Relationship Between Write-to-Read — BC_WR_RD_BUBBLES and wr_rd  
The following formulas calculate the relationship between the Cbox CSR  
BC_WR_RD_BUBBLES and wr_rd:  
wr_rd = (BC_WR_RD_BUBBLES – 1) * bcfrm  
or  
BC_WR_RD_BUBBLES = ((wr_rd + bcfrm – 1) / bcfrm) + 1  
There is never a need to use a value of 0 or 1 for BC_WR_RD_BUBBLES.  
If wr_rd = 4*ratio, then value 3 would be the minimum  
BC_WR_RD_BUBBLES value when bcfrm = 2*ratio, and value 5 would be the  
minimum BC_WR_RD_BUBBLES value when bcfrm = ratio.  
There is a special case for ratio = 2.0 in single-data mode. In this case, the for-  
mula is:  
wr_rd = (BC_WR_RD_BUBBLES – 2) * bcfrm  
The Relationship Between Read-to-Write — BC_RD_WR_BUBBLES and rd_wr  
Use the following formula to calculate the value for the Cbox CSR  
BC_RD_WR_BUBBLES that produces the minimum rd_wr restriction:  
BC_RD_WR_BUBBLES = rd_wr – 6  
Note that a value for BC_RD_WR_BUBBLES of zero really means 64 GCLK cycles.  
In that case, amend the formula. For example, it is impossible to have rd_wr = 6 in  
the 1.5x dual-data rate mode case.  
4.8.4 Pin Descriptions  
This section describes the characteristics of the Bcache interface pins.  
4–50 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Bcache Port  
4.8.4.1 BcAdd_H[23:4]  
The BcAdd_H[23:4] pins are high drive outputs that provides the index for the Bcache.  
The 21264/EV68A supports Bcache sizes of 1MB, 2MB, 4MB, 8MB, and 16MB.  
Table 4–42 lists the values to be programmed into Cbox CSRs BC_ENABLE[0] and  
BC_SIZE[3:0] to support each size of the Bcache.  
Table 4–42 Programming the Bcache to Support Each Size of the Bcache  
BC_ENABLE[0]  
BC_SIZE[3:0]  
0000  
Bcache Size  
1MB  
1
1
1
1
1
0001  
2MB  
0011  
4MB  
0111  
8MB  
1111  
16MB  
When the Cbox CSR BC_BANK_ENABLE[0] is not set, the unused BcAdd_H[23:4]  
pins are tied to zero. For example, when configured as a 4MB cache, the 21264/EV68A  
never changes BcAdd_H[23:22] from logic zero, and when BC_BANK_ENABLE[0]  
is asserted, the 21264/EV68A drives the complement of the MSB index on the next  
higher BcAdd_H pin.  
4.8.4.2 Bcache Control Pins  
The Bcache control pins (BcLoad_L, BcDataWr_L, BcDataOE_L, BcTagWr_L,  
BcTagOE_L) are controlled using Cbox CSRs BC_BURST_MODE_ENABLE[0] and  
BC_PENTIUM_MODE[0].  
Table 4–43 shows the four combinations of Bcache control pin behavior obtained using  
the two CSRs.  
Table 4–43 Programming the Bcache Control Pins  
BC_PENTIUM_MODE  
BC_BURST_MODE_ENABLE  
RAM_TYPE  
0
0
1
1
0
1
0
1
RAM_TYPE A  
RAM_TYPE B  
Unsupported  
Unsupported  
Table 4–44 lists the combination of control pin assertion for RAM_TYPE A.  
Table 4–44 Control Pin Assertion for RAM_TYPE A  
TYPE_A  
NOP RA0 RA1 RA2 RA3 NOP NOP WA0 WA1 WA2 WA3 NOP  
BcLoad_L  
H
H
H
H
H
H
L
H
L
H
L
H
L
H
H
H
H
H
H
H
H
H
H
H
L
L
L
L
H
L
H
L
H
L
H
H
H
H
H
BcDataOE_L  
BcDataWr_L  
BcTagOE_L  
BcTagWr_L  
H
L
H
H
H
H
H
H
H
H
H
L
L
L
H
H
H
H
H
H
H
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–51  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                           
Bcache Port  
Table 4–45 lists the combination of control pin assertion for RAM_TYPE B.  
Table 4–45 Control Pin Assertion for RAM_TYPE B  
TYPE_B  
NOP RA0 RA1 RA2 RA3 NOP NOP WA0 WA1 WA2 WA3 NOP  
BcLoad_L  
H
H
L
L
L
H
L
H
H
L
H
L
H
L
H
H
L
H
H
L
L
L
L
L
L
H
L
H
L
H
L
H
H
L
BcDataOE_L  
BcDataWr_L  
BcTagOE_L  
BcTagWr_L  
H
H
H
H
H
H
H
H
H
L
L
L
H
H
H
H
H
H
H
H
H
H
H
H
H
H
Table 4–46 lists the combination of control pin assertion for RAM_TYPE C.  
Table 4–46 Control Pin Assertion for RAM_TYPE C  
TYPE_C  
NOP RA0 RA1 RA2 RA3  
NOP NOP WA0 WA1 WA2 WA3 NOP  
BcLoad_L  
H
H
H
H
H
H
H
H
L
H
L
H
L
H
L
H
L
H
L
H
H
L
H
H
L
H
H
L
H
H
L
H
H
H
H
H
BcDataOE_L  
BcDataWr_L  
BcTagOE_L  
BcTagWr_L  
H
L
H
H
H
H
H
H
H
H
H
H
H
H
H
L
H
H
H
H
H
H
H
H
Table 4–47 lists the combination of control pin assertion for RAM_TYPE D.  
Table 4–47 Control Pin Assertion for RAM_TYPE D  
TYPE_D  
NOP RA0 RA1 RA2 RA3 NOP NOP WA0 WA1 WA2 WA3 NOP  
BcLoad_L  
H
H
H
H
H
L
H
L
H
L
H
L
H
L
H
L
L
H
L
H
L
H
H
L
H
H
L
H
H
L
H
H
H
H
H
BcDataOE_L  
BcDataWr_L  
BcTagOE_L  
BcTagWr_L  
H
H
H
H
H
L
H
L
H
H
H
H
H
H
H
H
H
H
H
H
H
H
H
H
H
Notes:  
1. The NOP condition for RAM_TYPE B is consistent with bursting nonPentium  
style SSRAMs.  
2. In both RAM_TYPE A and RAM_TYPE B, the pins BcDataOE_L and BcTagOE_L  
function changes from output-enable control to chip-select control.  
3. In both RAM_TYPE C and RAM_TYPE D SSRAMs, the pins BcDataOE_L and  
BcTagOE_L function as an asynchronous output enable that envelopes the Bcache  
read data by providing an extra cycle of output enable.  
Using these Cbox CSRs, late-write nonbursting and dual-data rate SSRAMs can be  
connected to the 21264/EV68A as described in Appendix E.  
4–52 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Bcache Port  
4.8.4.3 BcDataInClk_H and BcTagInClk_H  
The BcDataInClk_H[7:0] and BcTagInClk_H pins are used to capture tag data and  
data from the Bcache data and tag RAMs respectively. Dual-data rate SSRAMs provide  
a clock output with the data output pins to minimize skew between the data and clock,  
thus allowing maximum bandwidth. The 21264/EV68A internally synchronizes the  
data to its GCLK with clock forward receive circuitry similar to that in the system inter-  
face. For nonDDR SSRAMs, systems can connect the Bcache data and tag output clock  
pins to the Bcache data and tag input clock pins.  
4.8.5 Bcache Banking  
Bcache banking is possible by decoding the index MSB (as determined by Cbox CSR  
BC_SIZE[3:0]) and asserting Cbox CSR BC_BANK_ENABLE[0]. To facilitate bank-  
ing, the 21264/EV68A provides the complement of the MSB bit in the next higher  
unused index bit. For example, when configured as an 8MB cache with banking  
enabled, the 21264/EV68A drives the inversion of PA[22] on BcAdd_H[23] for use as  
a chip enable in a banked configuration. Because there is no higher index bit available  
for 16MB caches, this scheme only works for cache sizes of 1MB, 2MB, 4MB, and  
8MB.  
Setting BC_RD_RD_BUBBLE to 1 introduces one Bcache clock cycle of delay  
between consecutive read transactions, regardless of whether or not they are read trans-  
actions to the same bank.  
Setting BC_WR_WR_BUBBLE to 1 introduces one Bcache clock cycle of delay  
between consecutive write transactions, regardless of whether or not they are write  
transactions to the same bank.  
Setting BC_SJ_BANK_ENABLE to 1 introduces one Bcache clock cycle of delay  
between consecutive read transactions to a different bank (based on the MSB of the  
index), even if BC_RD_RD_BUBBLE is set to 0. No additional delay is inserted  
between consecutive read transactions to the same bank or between consecutive write  
transactions.  
4.8.6 Disabling the Bcache for Debugging  
The Bcache is a required component for a 21264/EV68A-based system. However, for  
debug purposes, the 21264/EV68A can be operated with the Bcache disabled. The  
Bcache can be disabled by clearing all of the BC_ENABLE bits in the Cbox  
WRITE_MANY CSR. When disabling the Bcache, the following additional steps must  
be taken:  
1. The various Bcache control bits in the Cbox WRITE_ONCE chain must be pro-  
grammed to a valid combination (normally the same settings that would be used if  
the Bcache were enabled).  
2. The Bcache must still be initialized (using BC_INIT mode) during the reset PAL  
flow, after which the Bcache should be left disabled.  
3. Error Detection and Correction should be disabled by clearing DC_DAT_ERR_EN  
(bit 7 of the DC_CTL IPR), or the following bits in the Cbox WRITE_ONCE chain  
must be programmed to the indicated values:  
BC_CLK_DELAY[1:0]  
BC_CPU_CLK_DELAY[1:0]  
= 0x1  
= 0x1  
21264/EV68A Hardware Reference Manual  
Cache and External Interfaces 4–53  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Interrupts  
BC_CPU_LATE_WRITE_NUM[1:0] = 0x1  
BC_LATE_WRITE_NUM[2:0]  
BC_LATE_WRITE_UPPER  
DUP_TAG_ENABLE  
= 0x0  
= 0  
= 0  
4.9 Interrupts  
The system may request interrupts by way of the IRQ_H[5:0] pins. These six interrupt  
sources are identical. They may be asynchronous, are level sensitive, and can be indi-  
vidually masked by way of the EIE field of the CM_IER IPR. The system designer  
determines how these signals are used and selects their relative priority.  
4–54 Cache and External Interfaces  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
5
Internal Processor Registers  
This chapter describes 21264/EV68A internal processor registers (IPRs). They are sep-  
arated into the following circuit logic groups: Ebox, Ibox, Mbox, and Cbox.  
The gray areas in register figures indicate reserved fields. Bit ranges that are coupled  
with the field name specify those bits in that named field that are included in the IPR.  
For example, in Figure 5–2, the field named COUNTER[31:4] contains bits 31 through  
4 of the COUNTER field from Section 5.1.1. The bit range of COUNTER[31:4] in the  
IPR is also listed in the column Extent in Table 5–2. In many cases, such as this one, the  
bit ranges correspond. However, the bit range of the named field need not always corre-  
spond to the Extent in the IPR. For example, in Figure 5–14, the field VA[47:13] resides  
in IPR IVA_FORM[37:3] under the stated conditions.  
The register contents after initialization are listed in Section 7.8.  
Table 5–1 lists the 21264/EV68A internal processor registers.  
Table 5–1 Internal Processor Registers  
MT/MF  
Issued  
Latency  
for  
Score-  
Board  
Bit  
Index  
from Ebox MFPR  
Access Pipe (Cycles)  
Register Name  
Mnemonic  
(Binary)  
Ebox IPRs  
Cycle counter  
CC  
1100 0000  
1100 0001  
1100 0010  
1100 0100  
1100 0011  
5
RW  
W0  
RO  
WO  
RO  
1L  
1L  
1L  
1L  
1L  
1
Cycle counter control  
Virtual address  
CC_CTL  
VA  
5
1
4, 5, 6, 7  
5
Virtual address control  
Virtual address format  
Ibox IPRs  
VA_CTL  
VA_FORM  
1
4, 5, 6, 7  
ITB tag array write  
ITB PTE array write  
ITB invalidate all process (ASM=0)  
ITB invalidate all  
ITB invalidate single  
ProfileMePC  
ITB_TAG  
ITB_PTE  
ITB_IAP  
ITB_IA  
0000 0000  
6
WO  
WO  
WO  
WO  
WO  
RO  
0L  
0L  
0L  
0L  
0L  
3
0000 0001 4, 0  
0000 0010  
0000 0011  
4
4
ITB_IS  
0000 0100 4, 6  
PMPC  
0000 0101  
0000 0110  
Exception address  
EXC_ADDR  
RO  
0L  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers  
5–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
Table 5–1 Internal Processor Registers (Continued)  
Index  
MT/MF  
Issued  
Latency  
for  
Score-  
Board  
Bit  
from Ebox MFPR  
Access Pipe (Cycles)  
Register Name  
Mnemonic  
IVA_FORM  
CM  
(Binary)  
Instruction VA format  
Current mode  
0000 0111  
0000 1001  
0000 1010  
0000 10xx  
0000 1100  
0000 1101  
0000 1110  
0000 1111  
0001 0000  
0001 0001  
0001 0110  
0001 0011  
5
RO  
RW  
RW  
RW  
RW  
RO  
WO  
RO  
RW  
RW  
RW  
W
0L  
0L  
0L  
0L  
0L  
3
4
3
Interrupt enable  
IER  
4
3
Interrupt enable and current mode  
Software interrupt request  
Interrupt summary  
IER_CM  
SIRR  
4
3
4
3
ISUM  
4
3
Hardware interrupt clear  
Exception summary  
PAL base address  
HW_INT_CLR  
EXC_SUM  
PAL_BASE  
I_CTL  
0L  
0L  
0L  
0L  
0L  
0L  
0L  
0L  
0L  
0L  
0L  
0L  
4
3
Ibox control  
4
3
Ibox status  
I_STAT  
4
3
Icache flush  
IC_FLUSH  
4
3
Icache flush ASM  
IC_FLUSH_ASM 0001 0010  
4
WO  
WO  
WO  
W
Clear virtual-to-physical map  
Sleep mode  
CLR_MAP  
SLEEP  
0001 0101 4, 5, 6, 7  
0001 0111  
4, 5, 6, 7  
1
Process context register  
Process context register  
Performance counter control  
Mbox IPRs  
PCTX  
01xn nnnn  
01xx xxxx  
0001 0100  
4
4
4
PCTX  
R
3
PCTR_CTL  
RW  
3
DTB tag array write 0  
DTB tag array write 1  
DTB PTE array write 0  
DTB PTE array write 1  
DTB alternate processor mode  
DTB_TAG0  
DTB_TAG1  
DTB_PTE0  
DTB_PTE1  
0010 0000 2, 6  
1010 0000 1, 5  
0010 0001 0, 4  
1010 0001 3, 7  
WO  
WO  
WO  
WO  
WO  
WO  
WO  
WO  
WO  
WO  
WO  
RO  
0L  
1L  
0L  
0L  
1L  
1L  
1L  
0L  
1L  
0L  
1L  
0L  
0L  
0L  
0L  
3
DTB_ALTMODE 0010 0110  
6
7
7
6
7
4
7
6
6
6
DTB invalidate all process (ASM = 0) DTB_IAP  
1010 0010  
1010 0011  
0010 0100  
1010 0100  
0010 0101  
1010 0101  
0010 0111  
0010 1000  
0010 1001  
0010 1010  
DTB invalidate all  
DTB_IA  
DTB invalidate single (array 0)  
DTB invalidate single (array 1)  
DTB address space number 0  
DTB address space number 1  
Memory management status  
Mbox control  
DTB_IS0  
DTB_IS1  
DTB_ASN0  
DTB_ASN1  
MM_STAT  
M_CTL  
WO  
WO  
RW  
3
Dcache control  
DC_CTL  
DC_STAT  
Dcache status  
5–2  
Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Ebox IPRs  
Table 5–1 Internal Processor Registers (Continued)  
Index  
MT/MF  
Issued  
Latency  
for  
Score-  
Board  
Bit  
from Ebox MFPR  
Access Pipe (Cycles)  
Register Name  
Mnemonic  
(Binary)  
Cbox IPRs  
Cbox data  
C_DATA  
C_SHFT  
0010 1011  
0010 1100  
6
6
RW  
WO  
0L  
0L  
3
Cbox shift control  
Ò
1
When n equals 1, that process context field is selected (FPE, PPCE, ASTRR, ASTER, ASN).  
5.1 Ebox IPRs  
This section describes the internal processor registers that control Ebox functions.  
5.1.1 Cycle Counter Register – CC  
The cycle counter register (CC) is a read-write register. The lower half of CC is a  
counter that, when enabled by way of CC_CTL[32], increments once each CPU cycle.  
The upper half of the register is 32 bits of register storage that may be used as a counter  
offset as described in the Alpha Architecture Handbook, Version 4 under Processor Cycle  
Counter (PCC) Register.  
A HW_MTPR instruction to the CC writes the upper half of the register and leaves the  
lower half unchanged. The RPCC instruction returns the full 64-bit value of the register.  
Figure 5–1 shows the cycle counter register.  
Figure 5–1 Cycle Counter Register  
63  
32 31  
0
OFFSET  
LK99-0008A  
COUNTER  
5.1.2 Cycle Counter Control Register – CC_CTL  
The cycle counter control register (CC_CTL) is a write-only register through which the  
lower half of the CC register may be written and its associated counter enabled and dis-  
abled. Figure 5–2 shows the cycle counter control register.  
Figure 5–2 Cycle Counter Control Register  
63  
33 32 31  
4
3
0
CC_ENA  
LK99-0009A  
COUNTER[31:4]  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers  
5–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Ebox IPRs  
Table 5–2 describes the CC_CTL register fields.  
Table 5–2 Cycle Counter Control Register Fields Description  
Name  
Extent  
[63:33]  
[32]  
Type  
Description  
Reserved  
CC_ENA  
WO  
Counter Enable.  
When set, this bit allows the cycle counter to increment.  
COUNTER[31:4]  
Reserved  
[31:4]  
[3:0]  
WO  
CC[31:4] may be written by way of this field. Write transactions  
to CC_CTL result in CC[3:0] being cleared.  
5.1.3 Virtual Address Register – VA  
The virtual address register (VA) is a read-only register. When a DTB miss or fault  
occurs, the associated effective virtual address is written into the VA register. VA is not  
written when a LD_VPTE gets a DTB miss or Dstream fault. Figure 5–3 shows the vir-  
tual address register.  
Figure 5–3 Virtual Address Register  
63  
0
VA[63:0]  
LK99-0010A  
5.1.4 Virtual Address Control Register – VA_CTL  
The virtual address control register (VA_CTL) is a write-only register that controls the  
way in which the faulting virtual address stored in the VA register is formatted when it  
is read by way of the VA_FORM register. It also contains control bits that affect the  
behavior of the memory pipe virtual address sign extension checkers and the behavior  
of the Ebox extract, insert, and mask instructions. Figure 5–4 shows the virtual address  
control register.  
Figure 5–4 Virtual Address Control Register  
63  
30 29  
3
2
1
0
VPTB[63:30]  
VA_FORM_32  
VA_48  
B_ENDIAN  
LK99-0014A  
5–4  
Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Ebox IPRs  
Table 5–3 describes the virtual address control register fields.  
Table 5–3 Virtual Address Control Register Fields Description  
Name  
Extent  
Type  
Description  
VPTB[63:30]  
[63:30] WO  
Virtual Page Table Base.  
See the VA_FORM register section for details.  
Reserved  
[29:3]  
[2]  
VA_FORM_32  
WO  
This bit is used to control address formatting when reading the  
VA_FORM register. See the section on the VA_FORM register for  
details.  
VA_48  
[1]  
WO,0 This bit controls the format applied to effective virtual addresses  
by the VA_FORM register and the memory pipe virtual address  
sign extension checkers. When VA_48 is clear, the 43-bit virtual  
address format is used, and when VA_48 is set, the 48-bit virtual  
address format is used.  
When VA_48 is set, the sign extension checkers generate an  
access control violation (ACV) if VA[63:0] SEXT (VA[47:0]).  
When VA_48 is clear, the sign extension checkers generate an  
ACV if VA[63:0] SEXT(VA[42:0]).  
B_ENDIAN  
[0]  
WO  
Big Endian Mode.  
When set, the shift amount (Rbv[2:0]) is inverted for EXTxx,  
INSxx, and MSKxx instructions. The lower bits of the physical  
address for Dstream accesses are inverted based upon the length  
of the reference as follows:  
Byte:  
Word:  
Longword:  
Invert bits [2:0]  
Invert bits [2:1]  
Inverts bit [2]  
5.1.5 Virtual Address Format Register – VA_FORM  
The virtual address format register (VA_FORM) is a read-only register. It contains the  
virtual page table entry address derived from the faulting virtual address stored in the  
VA register. It also contains the virtual page table base and associated control bits stored  
in the VA_CTL register.  
Figure 5–5 shows VA_FORM when VA_CTL(VA_48) equals 0 and  
VA_CTL(VA_FORM_32) equals 0.  
Figure 5–5 Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 0)  
63  
33 32  
3
2
0
VPTB[63:33]  
VA[42:13]  
LK99-0011A  
Figure 5–6 shows VA_FORM when VA_CTL(VA_48) equals 1 and  
VA_CTL(VA_FORM_32) equals 0.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers  
5–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Ibox IPRs  
Figure 5–6 Virtual Address Format Register (VA_48 = 1, VA_FORM_32 = 0)  
63  
43 42  
38 37  
3
2
0
VPTB[63:43]  
SEXT(VA[47])  
VA[47:13]  
LK99-0012A  
Figure 5–7 shows VA_FORM when VA_CTL(VA_48) equals 0 and  
VA_CTL(VA_FORM_32) equals 1.  
Figure 5–7 Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 1)  
63  
30 29  
22 21  
3
2
0
VPTB[63:30]  
VA[31:13]  
LK99-0013A  
5.2 Ibox IPRs  
This section describes the internal processor registers that control Ibox functions.  
5.2.1 ITB Tag Array Write Register – ITB_TAG  
The ITB tag array write register (ITB_TAG) is a write-only register. The ITB tag array  
is written by way of this register. A write transaction to ITB_TAG writes a register out-  
side the ITB array. When a write to the ITB_PTE register is retired, the contents of both  
the ITB_TAG and ITB_PTE registers are written into the ITB entry. The specific ITB  
entry that is written is determined by a round-robin algorithm; the algorithm writes to  
entry number 0 as the first entry after the 21264/EV68A is reset. Figure 5–8 shows the  
ITB tag array write register.  
Figure 5–8 ITB Tag Array Write Register  
63  
48 47  
13 12  
0
VA[47:13]  
LK99-0015A  
5.2.2 ITB PTE Array Write Register – ITB_PTE  
The ITB PTE array write register (ITB_PTE) is a write-only register through which the  
ITB PTE array is written. A round-robin allocation algorithm is used. A write to the  
ITB_PTE array, when retired, results in both the ITB_TAG and ITB_PTE arrays being  
written. The specific entry that is written is chosen by the round-robin algorithm  
described above. Figure 5–9 shows the ITB PTE array write register.  
5–6  
Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Ibox IPRs  
Figure 5–9 ITB PTE Array Write Register  
63  
44 43  
13 12 11 10  
9
8
7
6
5
4
3
0
PFN[43:13]  
URE  
SRE  
ERE  
KRE  
GH[1:0]  
ASM  
LK99-0016A  
5.2.3 ITB Invalidate All Process (ASM=0) Register – ITB_IAP  
The ITB invalidate all process register (ITB_IAP) is a pseudo register that, when writ-  
ten to, invalidates all ITB entries whose ASM bit is clear. An explicit write to  
IC_FLUSH_ASM is required to flush the Icache of blocks with ASM equal to zero.  
5.2.4 ITB Invalidate All Register – ITB_IA  
The ITB invalidate all register (ITB_IA) is a pseudo register that, when written to,  
invalidates all ITB entries. An explicit write to IC_FLUSH is required to flush the  
Icache.  
5.2.5 ITB Invalidate Single Register – ITB_IS  
The ITB invalidate single register (ITB_IS) is a write-only register. Writing a virtual  
page number to this register invalidates any ITB entry that meets one of the following  
criteria:  
The ITB entry’s virtual page number matches ITB_IS[47:13] (or fewer bits if gran-  
ularity hint bits are set in the ITB entry) and its ASN field matches the address  
space number supplied in PCTX[46:39].  
The ITB entry’s virtual page number matches ITB_IS[47:13] and its ASM bit is set.  
Figure 5–10 shows the ITB invalidate single register.  
Figure 5–10 ITB Invalidate Single Register  
63  
48 47  
13 12  
0
INVAL_ITB[47:13]  
LK99-0017A  
Note:  
Because the Icache is virtually indexed and tagged, it is normally not nec-  
essary to flush the Icache when paging. Therefore, a write to ITB_IS will  
not flush the Icache.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers  
5–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Ibox IPRs  
5.2.6 ProfileMe PC Register – PMPC  
The ProfileMe PC register (PMPC) is a read-only register that contains the PC of the  
last profiled instruction. Additional information is available in the I_STAT and  
PCTR_CTL register descriptions.  
Usage of PMPC in performance monitoring is described in Section 6.10.  
Figure 5–11 shows the ProfileMe PC register.  
Figure 5–11 ProfileMe PC Register  
63  
2
1
0
PC[63:2]  
PAL  
LK99-0018A  
Table 5–4 describes the ProfileMe PC register fields.  
Table 5–4 ProfileMe PC Fields Description  
Name  
Extent  
[63:2]  
[1]  
Type  
RO  
Description  
PC[63:2]  
Reserved  
PAL  
Address of the profiled instruction  
Read as zero  
RO  
[0]  
RO  
Indicates that the PC field contains a physical-mode PALmode  
address  
5.2.7 Exception Address Register – EXC_ADDR  
The exception address register (EXC_ADDR) is a read-only register that is updated by  
hardware when it encounters an exception or interrupt.  
EXC_ADDR[0] is set if the associated exception occurred in PALmode. The exception  
actions are listed here:  
If the exception was a fault or a synchronous trap, EXC_ADDR contains the PC of  
the instruction that triggered the fault or trap.  
If the exception was an interrupt, EXC_ADDR contains the PC of the next instruc-  
tion that would have executed if the interrupt had not occurred.  
Figure 5–12 shows the exception address register.  
Figure 5–12 Exception Address Register  
63  
2
1
0
PC[63:2]  
PAL  
LK99-0018A  
5–8  
Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Ibox IPRs  
5.2.8 Instruction Virtual Address Format Register — IVA_FORM  
The instruction virtual address format register (IVA_FORM) is a read-only register. It  
contains the virtual PTE address derived from the faulting virtual address stored in the  
EXC_ADDR register, and from the virtual page table base, VA_48 and VA_FORM_32  
bits, stored in the I_CTL register.  
Figure 5–13 shows IVA_FORM when I_CTL(VA_48) equals 0 and  
I_CTL(VA_FORM_32) equals 0.  
Figure 5–13 Instruction Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 0)  
63  
33 32  
3
2
0
VPTB[63:33]  
VA[42:13]  
LK99-0019A  
Figure 5–14 shows IVA_FORM when I_CTL(VA_48) equals 1 and  
I_CTL(VA_FORM_32) equals 0.  
Figure 5–14 Instruction Virtual Address Format Register (VA_48 = 1, VA_FORM_32 = 0)  
63  
43 42  
38 37  
3
2
0
VPTB[63:43]  
SEXT(VA[47])  
VA[47:13]  
LK99-0020A  
Figure 5–15 shows IVA_FORM when I_CTL(VA_48) equals 0 and  
I_CTL(VA_FORM_32) equals 1.  
Figure 5–15 Instruction Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 1)  
63  
30 29  
22 21  
3
2
0
VPTB[63:30]  
VA[31:13]  
LK99-0021A  
5.2.9 Interrupt Enable and Current Processor Mode Register – IER_CM  
The interrupt enable and current processor mode register (IER_CM) contains the inter-  
rupt enable and current processor mode bit fields. These bit fields can be written either  
individually or together with a single HW_MTPR instruction. When bits [7:2] of the  
IPR index field of a HW_MTPR instruction contain the value 000010 , this register is  
2
selected. Bits [1:0] of the IPR index indicate which bit fields are to be written: bit[1]  
corresponds to the IER field and bit[0] corresponds to the processor mode field. A  
HW_MFPR instruction to this register returns the values in both fields. Figure 5–16  
shows the interrupt enable and current processor mode register.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers  
5–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Ibox IPRs  
Figure 5–16 Interrupt Enable and Current Processor Mode Register  
63  
39 38  
33 32 31 30 29 28  
14 13 12  
5
4
3
2
0
EIEN[5:0]  
SLEN  
CREN  
PCEN[1:0]  
SIEN[15:1]  
ASTEN  
CM[1:0]  
LK99-0022A  
Table 5–5 describes the interrupt enable and current processor mode register fields.  
Table 5–5 IER_CM Register Fields Description  
Name  
Extent  
[63:39]  
[38:33]  
[32]  
Type  
Description  
Reserved  
EIEN[5:0]  
SLEN  
RW  
RW  
RW  
RW  
RW  
RW  
External Interrupt Enable  
Serial Line Interrupt Enable  
Corrected Read Error Interrupt Enable  
Performance Counter Interrupt Enables  
Software Interrupt Enables  
AST Interrupt Enable  
CREN  
[31]  
PCEN[1:0]  
SIEN[15:1]  
ASTEN  
[30:29]  
[28:14]  
[13]  
When set, enables those AST interrupt requests that are also  
enabled by the value in ASTER.  
Reserved  
CM[1:0]  
[12:5]  
[4:3]  
RW  
Current Mode  
00  
01  
10  
11  
Kernel  
Executive  
Supervisor  
User  
Reserved  
[2:0]  
5.2.10 Software Interrupt Request Register – SIRR  
The software interrupt request register (SIRR) is a read-write register containing bits to  
request software interrupts. To generate a particular software interrupt, its correspond-  
ing bits in SIRR and IER[SIER] must both be set. Figure 5–17 shows the software  
interrupt request register.  
5–10 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Ibox IPRs  
Figure 5–17 Software Interrupt Request Register  
63  
29 28  
14 13  
0
SIR[15:1]  
LK99-0023A  
Table 5–6 describes the software interrupt request register fields.  
Table 5–6 Software Interrupt Request Register Fields Description  
Name  
Extent  
[63:29]  
[28:14]  
[13:0]  
Type  
Description  
Reserved  
SIR[15:1]  
Reserved  
RW  
Software Interrupt Requests  
5.2.11 Interrupt Summary Register – ISUM  
The interrupt summary register (ISUM) is a read-only register that records all pending  
hardware, software, and AST interrupt requests that have their corresponding enable bit  
set.  
If a new interrupt (hardware, serial line, crd, or performance counters) occurs simulta-  
neously with an ISUM read, the ISUM read returns zeros. That condition is normally  
assumed to be a passive release condition. The interrupt is signaled again when the  
PALcode returns to native mode. The effects of this condition can be minimized by  
reading ISUM twice and ORing the results.  
Usage of ISUM in performance monitoring is described in Section 6.10. Figure 5–18  
shows the interrupt summary register.  
Figure 5–18 Interrupt Summary Register  
63  
39 38  
33 32 31 30 29 28  
14 13  
11 10  
9
8
5
4
3
2
0
EI[5:0]  
SL  
CR  
PC[1:0]  
SI[15:1]  
ASTU  
ASTS  
ASTE  
ASTK  
LK99-0024A  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Ibox IPRs  
Table 5–7 describes the interrupt summary register fields.  
Table 5–7 Interrupt Summary Register Fields Description  
Name  
Reserved  
EI[5:0]  
SL  
Extent  
[63:39]  
[38:33]  
[32]  
Type  
Description  
RO  
RO  
RO  
RO  
External Interrupts  
Serial Line Interrupt  
Corrected Read Error Interrupts  
CR  
[31]  
PC[1:0]  
[30:29]  
Performance Counter Interrupts  
PC0 when PC[0] is set.  
PC1 when PC[1] is set.  
SI[15:1]  
[28:14]  
[13:11]  
[10],[9]  
RO  
Software Interrupts  
Reserved  
ASTU, ASTS  
RO  
AST Interrupts  
For each processor mode, the bit is set if an associated AST  
interrupt is pending. This includes the mode’s ASTER and  
ASTRR bits and whether the processor mode value held in the  
IER_CM register is greater than or equal to the value for the  
mode.  
Reserved  
[8:5]  
ASTE, ASTK  
[4],[3]  
RO  
AST Interrupts  
For each processor mode, the bit is set if an associated AST  
interrupt is pending. This includes the mode’s ASTER and  
ASTRR bits and whether the processor mode value held in the  
IER_CM register is greater than or equal to the value for the  
mode.  
Reserved  
[2:0]  
5.2.12 Hardware Interrupt Clear Register – HW_INT_CLR  
The hardware interrupt clear register (HW_INT_CLR) is a write-only register used to  
clear edge-sensitive interrupt requests. See Section D.31 for more information about the  
PALcode restriction concerning this register. Figure 5–19 shows the hardware interrupt  
clear register.  
Figure 5–19 Hardware Interrupt Clear Register  
63  
33 32 31 30 29 28 27 26 25  
0
SL  
CR  
PC[1:0]  
MCHK_D  
FBTP  
LK99-0025A  
5–12 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Ibox IPRs  
Table 5–8 describes the hardware interrupt clear register fields.  
Table 5–8 Hardware Interrupt Clear Register Fields Description  
Name  
Extent  
[63:33]  
[32]  
Type  
Description  
Reserved  
SL  
W1C  
W1C  
W1C  
W1C  
Clears serial line interrupt request  
Clears corrected read error interrupt request  
Clears performance counter interrupt requests  
Clears Dstream machine check interrupt request  
CR  
[31]  
PC[1:0]  
MCHK_D  
Reserved  
FBTP  
[30:29]  
[28]  
[27]  
[26]  
W1S  
Forces the next Bcache hit that fills the Icache to generate bad  
Icache fill parity  
Reserved  
[25:0]  
5.2.13 Exception Summary Register – EXC_SUM  
The exception summary register (EXC_SUM) is a read-only register that contains  
information about instructions that have triggered traps. The register is updated at trap  
delivery time. Its contents are valid only if it is read (by way of a HW_MFPR) in the  
first fetch block of the exception handler. There are three types of traps for which this  
register captures related information:  
Arithmetic traps: The instruction generated an exceptional condition that should be  
reported to the operating system, and/or the FPCR status bit associated with this  
condition is clear and should be set by PALcode. Additionally, the REG field con-  
tains the register number of the destination specifier for the instruction that trig-  
gered the trap.  
Istream ACV: The BAD_IVA bit of this register indicates whether the offending  
Istream virtual address is latched into the EXC_ADDR register or the VA register.  
Dstream exceptions: The REG field contains the register number of either the  
source specifier (for stores) or the destination specifier (for loads) of the instruction  
that triggered the trap.  
Figure 5–20 shows the exception summary register.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–13  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Ibox IPRs  
Figure 5–20 Exception Summary Register  
63  
48 47 46 45 44 43 42 41 40  
14 13 12  
8
7
6
5
4
3
2
1
0
SEXT(SET_IOV)  
SET_IOV  
SET_INE  
SET_UNF  
SET_OVF  
SET_DZE  
SET_INV  
PC_OVFL  
BAD_IVA  
REG[4:0]  
INT  
IOV  
INE  
UNF  
FOV  
DZE  
INV  
SWC  
LK99-0026A  
Table 5–9 describes the exception summary register fields.  
Table 5–9 Exception Summary Register Fields Description  
Name  
Extent  
Type  
Description  
SEXT(SET_IOV)  
SET_IOV  
SET_INE  
SET_UNF  
SET_OVF  
SET_DZE  
SET_INV  
PC_OVFL  
[63:48] RO, 0  
Sign-extended value of bit 47, SET_IOV.  
PALcode should set FPCR[IOV].  
PALcode should set FPCR[INE].  
PALcode should set FPCR[UNF].  
PALcode should set FPCR[OVF].  
PALcode should set FPCR[DZE].  
PALcode should set FPCR[INV].  
[47]  
[46]  
[45]  
[44]  
[43]  
[42]  
[41]  
RO  
RO  
RO  
RO  
RO  
RO  
RO  
Indicates that EXC_ADDR was improperly sign extended for 48-  
bit mode over/underflow IACV.  
Reserved  
[40:14] RO, 0  
[13] RO  
Reserved for COMPAQ.  
Bad Istream VA.  
BAD_IVA  
This bit should be used by the IACV PALcode routine to deter-  
mine whether the offending I-stream virtual address is latched in  
the EXC_ADDR register or the VA register. If BAD_IVA is clear,  
EXC_ADDR contains the address; if BAD_IVA is set, VA con-  
tains the address.  
5–14 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Ibox IPRs  
Table 5–9 Exception Summary Register Fields Description (Continued)  
Name  
Extent  
Type  
Description  
REG[4:0]  
[12:8]  
RO  
Destination register of load or operate instruction that triggered  
the trap OR source register of store that triggered the trap. These  
bits may contain the Rc field of an operate instruction or the Ra  
field of a load or store instruction. The value is UNPREDICTABLE  
if the trap was triggered by an ITB miss, interrupt, OPCDEC, or  
other non load/st/operate.  
INT  
IOV  
[7]  
[6]  
RO  
RO  
Set to indicate Ebox integer overflow trap, clear to indicate Fbox  
trap condition.  
Indicates Fbox convert-to-integer overflow or Ebox integer over-  
flow trap.  
INE  
[5]  
[4]  
[3]  
[2]  
[1]  
[0]  
RO  
RO  
RO  
RO  
RO  
RO  
Indicates floating-point inexact error trap.  
Indicates floating-point underflow trap.  
Indicates floating-point overflow trap.  
Indicates divide by zero trap.  
UNF  
FOV  
DZE  
INV  
SWC  
Indicates invalid operation trap.  
Indicates software completion possible. This bit is set if the  
instruction that triggered the trap contained the /S modifier.  
5.2.14 PAL Base Register – PAL_BASE  
The PAL base register (PAL_BASE) is a read-write register that contains the base phys-  
ical address for PALcode. Its contents are cleared by chip reset but are not cleared after  
waking up from sleep mode or from fault reset. Figure 5–21 shows the PAL base regis-  
ter.  
Figure 5–21 PAL Base Register  
63  
44 43  
15 14  
0
LK99-0027A  
PAL_BASE[43:15]  
Table 5–10 describes the PAL base register fields.  
Table 5–10 PAL Base Register Fields Description  
Name  
Extent  
[63:44]  
[43:15]  
[14:0]  
Type  
RO, 0  
RW  
Description  
Reserved  
Reserved for COMPAQ.  
Base physical address for PALcode.  
Reserved for COMPAQ.  
PAL_BASE[43:15]  
Reserved  
RO, 0  
5.2.15 Ibox Control Register – I_CTL  
The Ibox control register (I_CTL) is a read-write register that controls various Ibox  
functions. Its contents are cleared by chip reset. Figure 5–22 shows the Ibox control  
register.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–15  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Ibox IPRs  
Figure 5–22 Ibox Control Register  
63  
48 47  
30 29  
24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  
9
8
7
6
5
3
2
1
0
SEXT(VPTB[47])  
VPTB[47:30]  
CHIP_ID[5:0]  
BIST_FAIL  
TB_MB_EN  
MCHK_EN  
ST_WAIT_64K  
PCT1_EN  
PCT0_EN  
SINGLE_ISSUE_H  
VA_FORM_32  
VA_48  
SL_RCV  
SL_XMIT  
HWE  
BP_MODE[1:0]  
SBE[1:0]  
SDE[1:0]  
SPE[2:0]  
IC_EN[1:0]  
SPCE  
LK99-0029A  
Table 5–11 describes the Ibox control register fields.  
Table 5–11 Ibox Control Register Fields Description  
Name  
Extent  
Type  
Description  
Sign extended VPTB[47].  
Virtual Page Table Base. See Section 5.1.5 for details.  
SEXT(VPTB[47])  
VPTB[47:30]  
CHIP_ID[5:0]  
[63:48] RW,0  
[47:30] RW,0  
[29:24] RO  
This is a read-only field that supplies the revision ID number  
for the 21264/EV68A part.  
21264/EV68A pass 2.1 ID is 010010 .  
2
BIST_FAIL  
TB_MB_EN  
[23]  
[22]  
RO,0  
RW,0  
Indicates the status of BiST (clear = pass, set = fail),  
described in Section 11.5.1.  
When set, the hardware ensures that the virtual-mode loads  
in DTB and ITB fill flows that access the page table and the  
subsequent virtual mode load or store that is being retried are  
‘ordered’ relative to another processor’s stores. This must be  
set for multiprocessor systems in which no MB instruction is  
present in the TB fill flow, unless there are other mecha-  
nisms present that ensure coherency.  
MCHK_EN  
[21]  
RW,0  
Machine check enable — set to enable machine checks.  
5–16 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Ibox IPRs  
Table 5–11 Ibox Control Register Fields Description (Continued)  
Name  
Extent  
Type  
Description  
ST_WAIT_64K  
[20]  
RW,0  
The stWait table is used to reduce load/store order traps.  
When set, the stWait table is cleared after 64K cycles. When  
clear, the stWait table is cleared after 16K cycles. See Sec-  
PCT1_EN  
PCT0_EN  
[19]  
[18]  
RW,0  
RW,0  
Enable performance counter #1. If this bit is one, the perfor-  
mance counter will count if either the system (SPCE) or pro-  
cess (PPCE) performance counter enable is asserted.  
Enable performance counter #0. If this bit is one, the perfor-  
mance counter will count if EITHER the system (SPCE) or  
process (PPCE) performance counter enable is set.  
SINGLE_ISSUE_H  
VA_FORM_32  
VA_48  
[17]  
[16]  
[15]  
RW,0  
RW,0  
RW,0  
When set, this bit forces instructions to issue only from the  
bottom-most entries of the IQ and FQ.  
This bit controls address formatting on a read of the  
IVA_FORM register.  
This bit controls the format applied to effective virtual  
addresses by the IVA_FORM register and the Ibox virtual  
address sign extension checkers. When VA_48 is clear, 43-  
bit virtual address format is used, and when VA_48 is set,  
48-bit virtual address format is used. The effect of this bit on  
the IVA_FORM register is identical to the effect of  
VA_CTL[VA_48] on the VA_FORM register. See Section  
When VA_48 is set, the sign extension checkers generate an  
ACV if va[63:0] SEXT(va[47:0]). When VA_48 is clear,  
the sign extension checkers generate an ACV if va[63:0] ≠  
SEXT(va[42:0]).  
This bit also affects DTB_DOUBLE traps. If set, the DTB  
double miss traps vector to the DTB_DOUBLE_4 entry  
point.  
DTB_DOUBLE PALcode flow selection is not affected by  
VA_CTL[VA_48].  
SL_RCV  
SL_XMIT  
HWE  
[14]  
[13]  
[12]  
RO  
WO  
RW,0  
When set, drives a value on SromClk_H. See Section 11.2.  
If set, allow PALRES intructions to be executed in kernel  
mode. Note that modification of the ITB while in kernel  
mode/native mode may cause UNPREDICTABLE behavior.  
BP_MODE[1:0]  
[11:10] RW,0  
Branch Prediction Mode Selection.  
BP_MODE[1], if set, forces all branches to be predicted to  
fall through. If clear, the dynamic branch predictor is chosen.  
BP_MODE[0]. If set, the dynamic branch predictor chooses  
local history prediction. If clear, the dynamic branch predic-  
tor chooses local or global prediction based on the state of  
the chooser.  
SBE[1:0]  
[9:8]  
RW,0  
Stream Buffer Enable.  
The value in this bit field specifies the number of Istream  
buffer prefetches (besides the demand-fill) that are launched  
after an Icache miss. If the value is zero, only demand  
requests are launched.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–17  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Ibox IPRs  
Table 5–11 Ibox Control Register Fields Description (Continued)  
Name  
Extent  
Type  
Description  
SDE[1:0]  
[7:6]  
RW,0  
PALshadow Register Enable.  
Enables access to the PALshadow registers. If SDE[1] is set,  
R4-R7 and R20-R23 are used as PALshadow registers.  
SDE[0] does not affect 21264/EV68A operation.  
SPE[2:0]  
[5:3]  
[2:1]  
RW,0  
RW,3  
Super Page Mode Enable.  
Identical to the SPE bits in the Mbox M_CTL SPE[2:0]. See  
IC_EN[1:0]  
Icache Set Enable.  
At least one set must be enabled. The entire cache may be  
enabled by setting both bits. Zero, one, or two Icache sets  
can be enabled.  
This bit does not clear the Icache, but only disables fills to  
the affected set.  
SPCE  
[0]  
RW,0  
System Performance Counting Enable.  
Enables performance counting for the entire system if indi-  
vidual counters (PCTR0 or PCTR1) are enabled by setting  
PCT0_EN or PCT1_EN, respectively.  
Performance counting for individual processes can be  
enabled by setting PCTX[PPCE]. See Section 5.2.21 for  
more information.  
See Section 6.10 for information about performance count-  
ing.  
5.2.16 Ibox Status Register – I_STAT  
The Ibox status register (I_STAT) is a read/write-1-to-clear register that contains Ibox  
status information.  
Usage of I_STAT in performance monitoring is described in Section 6.10.  
Figure 5–23 shows the Ibox status register.  
5–18 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Ibox IPRs  
Figure 5–23 Ibox Status Register  
63  
41 40 39 38 37  
34 33 32 30 29 28 27  
0
Reserved  
MIS  
TRP  
LS0  
TRAP TYPE[3:0]  
ICM  
OVR[2:0]  
PAR  
LAM  
Reserved  
LK99-0031A  
Table 5–12 describes the Ibox status register fields.  
Table 5–12 Ibox Status Register Fields Description  
Name  
Reserved  
MIS  
Extent  
[63:41]  
[40]  
Type  
RO  
Description  
Reserved for COMPAQ.  
ProfileMe Mispredict Trap.  
RO  
If the I_STAT[TRP] bit is set, this bit indicates that the profiled instruc-  
tion caused a mispredict trap. JSR/JMP/RET/COR or HW_JSR/  
HW_JMP/HW_RET/HW_COR mispredicts do not set this bit but can be  
recognized by the presence of one of these instructions at the PMPC loca-  
tion with the I_STAT[TRP] bit set. This identification is exact in all cases  
except error condition traps. Hardware corrected Icache parity or Dcache  
ECC errors, and machine check traps can occur on any instruction in the  
pipeline.  
TRP  
LS0  
[39]  
[38]  
RO  
RO  
ProfileMe Trap.  
This bit indicates that the profiled instruction caused a trap. The trap type  
field, PMPC register, and instruction at the PMPC location are needed to  
distinguish all trap types.  
ProfileMe Load-Store Order Trap.  
If the profiled instruction caused a replay trap, this bit indicates that the  
precise trap cause was an Mbox load-store order replay trap.  
If clear, this bit indicates that the replay trap was any one of the follow-  
ing:  
Mbox load-load order  
Mbox load queue full  
Mbox store queue full  
Mbox wrong size trap (such as, STL LDQ)  
Mbox Bcache alias (2 physical addresses map to same Bcache line)  
Mbox Dcache alias (2 physical addresses map to same Dcache line)  
Icache parity error  
Dcache ECC error  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–19  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Ibox IPRs  
Table 5–12 Ibox Status Register Fields Description (Continued)  
Name  
Extent  
Type  
Description  
TRAP  
[37:34]  
RO  
ProfileMe Trap Types.  
TYPE[3:0]  
If the profiled instruction caused a trap (indicated by I_STAT[TRP]), this  
field indicates the trap type as listed here:  
Value  
0
Trap Type  
Replay  
1
Invalid (unused)  
2
3
4
5
DTB Double miss (3 level page tables)  
DTB Double miss (4 level page tables)  
Floating point disabled  
Unaligned Load/Store  
DTB Single miss  
6
7
Dstream Fault  
8
OPCDEC  
9
Invalid (use PMPC, described below)  
Machine Check  
Invalid (use PMPC, described below)  
Arithmetic  
Invalid (use PMPC, described below)  
MT_FPCR  
Reset  
10  
11  
12  
13  
14  
15  
Traps due to ITB miss, Istream access violation, or interrupts are not  
reported in the trap type field because they do not cause pipeline aborts.  
Instead, these traps cause pipeline redirection and can be distinguished by  
examining the PMPC value for the presence of the corresponding PAL-  
code entry offset addresses indicated below. In these cases, the ProfileMe  
interrupt will normally be delivered when exiting the trap PALcode flow  
and the EXC_ADDR register will contain the original PC that encoun-  
tered the redirect trap.  
PMPC[14:0]  
0581  
Trap  
ITB miss  
0481  
0681  
Istream Access Violation  
Interrupt  
ICM  
[33]  
RO  
RO  
ProfileMe Icache Miss.  
This bit indicates that the profiled instruction was contained in an aligned  
4-instruction Icache fetch block that requested a new Icache fill stream.  
OVR[2:0]  
[32:30]  
ProfileMe Counter 0 Overcount.  
This bit indicates a value (0-7) that must be subtracted from the counter 0  
result to obtain an accurate count of the number of instructions retired in  
the interval beginning three cycles after the profiled instruction reaches  
pipeline stage 2 and ending four cycles after the profiled instruction is  
retired.  
PAR  
[29]  
W1C  
Icache Parity Error.  
This bit indicates that the Icache encountered a parity error on instruction  
fetch. When a parity error is detected, the Icache is flushed, a replay trap  
back to the address of the error instruction is generated, and a correctable  
read interrupt is requested. See also I_STAT[LAM].  
LAM  
[28]  
RO  
RO  
When set, indicates that an error in the line predictor set I_STAT[PAR].  
I_STAT[LAM] is set only when I_STAT[PAR] is set; is subsequently  
cleared when I_STAT[PAR] is cleared (when a 1 is written to  
I_STAT[PAR] ).  
Reserved  
[28:0]  
Reserved for COMPAQ.  
5–20 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Ibox IPRs  
5.2.17 Icache Flush Register – IC_FLUSH  
The Icache flush register (IC_FLUSH) is a pseudo register. Writing to this register  
invalidates all Icache blocks. The cache is flushed when the next HW_RET/STALL  
instruction is retired. See Section D.20 for more information.  
5.2.18 Icache Flush ASM Register – IC_FLUSH_ASM  
The Icache flush ASM register (IC_FLUSH_ASM) is a pseudo register. Writing to this  
register invalidates all Icache blocks with their ASM bit clear.  
5.2.19 Clear Virtual-to-Physical Map Register – CLR_MAP  
The clear virtual-to-physical map register (CLR_MAP) is a pseudo register that, when  
written, results in the clearing of the current map of virtual to physical registers. This  
register must only be written after there are no register-borne dependencies present and  
there are no unretired instructions. See an example in the PALcode restrictions.  
5.2.20 Sleep Mode Register – SLEEP  
The sleep mode register (SLEEP) is a pseudo register that, when written, results in the  
PLL speed being reduced and the chip entering a low-power mode. This register must  
only be written after a sequence of code has been run which saves all necessary state to  
DRAM, flushes the caches, and unmasks certain interrupts so the chip can be woken up.  
See Section 7.3 for details.  
5.2.21 Process Context Register – PCTX  
The process context register (PCTX) contains information associated with the context  
of a process. Any combination of the bit fields within this register may be written with  
a single HW_MTPR instruction. When bits [7:6] of the IPR index field of a  
HW_MTPR instruction contain the value 012, this register is selected. Bits [4:0] of the  
IPR index indicate which bit fields are to be written. Usage of PCTX in performance  
monitoring is described in Section 6.10.  
Table 5–13 lists the correspondence between IPR index bits and register fields.  
Table 5–13 IPR Index Bits and Register Fields  
IPR Index Bit  
Register Field  
ASN  
0
1
2
3
4
ASTER  
ASTRR  
PPCE  
FPE  
A HW_MFPR from this register returns the values in all of its component bit fields.  
Figure 5–24 shows the process context register.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–21  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                     
Ibox IPRs  
Figure 5–24 Process Context Register  
63  
47 46  
39 38  
13 12  
9
8
5
4
3
2
1
0
ASN[7:0]  
ASTRR[3:0]  
ASTER[3:0]  
FPE  
PPCE  
LK99-0032A  
Table 5–14 describes the process context register fields.  
Table 5–14 Process Context Register Fields Description  
Name  
Extent  
Type  
Description  
Reserved  
ASN[7:0]  
Reserved  
ASTRR[3:0]  
[63:47]  
[46:39] RW  
Address space number.  
[38:13]  
[12:9]  
RW  
AST request register—used to request AST interrupts in  
each of the four processor modes.  
To generate a particular AST interrupt, its corresponding  
bits in ASTRR and ASTER must be set, along with the  
ASTE bit in IER.  
Further, the value of the current mode bits in the PS register  
must be equal to or higher than the value of the mode associ-  
ated with the AST request.  
The bit order with this field is:  
User Mode  
12  
11  
10  
9
Supervisor Mode  
Executive Mode  
Kernel Mode  
ASTER[3:0]  
[8:5]  
[4:3]  
RW  
AST enable register—used to individually enable each of  
the four AST interrupt requests.  
The bit order with this field is:  
User Mode  
8
7
6
5
Supervisor Mode  
Executive Mode  
Kernel Mode  
Reserved  
5–22 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Ibox IPRs  
Table 5–14 Process Context Register Fields Description (Continued)  
Name  
Extent  
Type  
Description  
FPE  
[2]  
RW,1  
Floating-point enable—if clear, floating-point instructions  
generate FEN exceptions. This bit is set by hardware on  
reset.  
PPCE  
[1]  
RW  
Process performance counting enable.  
Enables performance counting for an individual process  
with counters PCTR0 or PCTR1, which are enabled by set-  
ting PCT0_EN or PCT1_EN, respectively.  
Performance counting for the entire system can be enabled  
by setting I_CTL[SPCE]. See Section 5.2.15 for more infor-  
mation.  
See Section 6.10 for information about performance count-  
ing.  
Reserved  
[0]  
5.2.22 Performance Counter Control Register – PCTR_CTL  
The performance counter control register (PCTR_CTL) is a read-write register that  
controls the function of the performance counters for either aggregate counting or Pro-  
fileMe sampling counting.  
Usage of PCTR_CTL in performance monitoring is described in Section 6.10.  
Figure 5–25 shows the performance counter control register.  
Figure 5–25 Performance Counter Control Register  
63  
48 47  
28 27 26 25  
6
5
4
3
2
1
0
SEXT(PCTR0_CTL[47])  
PCTR0[19:0]  
PM_STALLED  
PM_KILLED_BM  
PCTR1[19:0]  
SL0  
SL1[1:0]  
VAL  
TAK  
LK99-0034A  
Table 5–15 describes the performance counter control register fields.  
Table 5–15 Performance Counter Control Register Fields Description  
Name  
Extent Type Description  
SEXT(PCTR0_CTL[47]) [63:48] RO When read, this field is sign extended from PCTR_CTL[47]. Writes  
to this field are ignored.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–23  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Ibox IPRs  
Table 5–15 Performance Counter Control Register Fields Description (Continued)  
Name  
Extent Type Description  
PCTR0[19:0]  
[47:28] RW Performance counter 0.  
PCTR0 is enabled by I_CTL[PCT0_EN] and either I_CTL[SPCE] or  
PCTX[PPCE].  
In Aggregate mode:  
When enabled, PCTR0 is incremented at each cycle by the selected  
input. (See Section 6.10.2 for more information.)  
On overflow, if enabled by IER_CM[PCEN0],  
ISUM[PC0] is set and an interrupt is triggered.  
In ProfileMe mode:  
On overflow, a count window is opened and PCTR0 is incremented  
as described in Section 6.10.3. When the count window overflows, if  
enabled by IER_CM[PCEN0], ISUM[PC0] is set and an interrupt is  
triggered.  
See Table 5–16 for counter modes.  
PM_STALLED  
PM_KILLED_BM  
PCTR1[19:0]  
[27]  
[26]  
RO The profiled instruction stalled for at least one cycle between the  
fetch and map stages of the pipeline.  
RO The profiled instruction was killed during or before the cycle in  
which it was mapped.  
[25:6] RW Performance counter 1.  
PCTR1 is enabled by I_CTL[PCT1_EN] and either I_CTL[SPCE] or  
PCTX[PPCE].  
In Aggregate mode:  
When enabled, PCTR1 is incremented at each cycle by the selected  
input. (See Section 6.10.2 for more information.)  
On overflow, if enabled by IER_CM[PCEN1], ISUM[PC1] is set and  
an interrupt is triggered.  
In ProfileMe mode, how PCTR1 is incremented is described in Sec-  
In either case, PCTR1 is incremented no more than 1 per cycle.  
See Table 5–16 for counter modes.  
Reserved  
SL0  
[5]  
[4]  
RO Reads to this field return zero. Writes to this field are ignored.  
RW Selector 0.  
0 = Aggregate counting mode  
1 = ProfileMe mode  
See Table 5–16 for more information.  
SL1[1:0]  
[3:2]  
RW Selector 1.  
Selects counter PCTR0 and PCTR1 modes. See Table 5–16 for more  
information.  
5–24 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Mbox IPRs  
Table 5–15 Performance Counter Control Register Fields Description (Continued)  
Name  
Extent Type Description  
VAL  
[1]  
RO Profiled instruction valid.  
When set, indicates a nontrapping profiled instruction retired valid.  
When clear, indicates that a nontrapping profiled instruction was  
killed after the cycle in which it was mapped. Valid retire/abort status  
for a trapping profiled instruction is determined by the trap type (see  
I_STAT[TRAP_TYPE]).  
TAK  
[0]  
RO ProfileMe conditional branch taken.  
Indicates program branch direction, if the profiled instruction is a  
conditional branch.  
Table 5–16 Performance Counter Control Register Input Select Fields  
SL0[4] SL1[3:2]  
Mode  
PCTR0  
Retired instructions Cycle counting  
Cycle counting Not defined  
Retired instructions Bcache miss or long latency probes  
Cycle counting Mbox replay traps  
Retired instructions Cycle counting  
Cycle counting Inum retire delay  
Retired instructions Bcache miss or long latency probes  
Cycle counting Mbox replay traps  
PCTR1  
0
0
0
0
1
1
1
1
00  
01  
10  
11  
00  
01  
10  
11  
Aggregate  
Aggregate  
Aggregate  
Aggregate  
ProfileMe  
ProfileMe  
ProfileMe  
ProfileMe  
5.3 Mbox IPRs  
This section describes the internal processor registers that control Mbox functions.  
5.3.1 DTB Tag Array Write Registers 0 and 1 – DTB_TAG0, DTB_TAG1  
The DTB tag array write registers 0 and 1 (DTB_TAG0 and DTB_TAG1) are write-  
only registers through which the two memory pipe DTB tag arrays are written. Write  
transactions to DTB_TAG0 and DTB_TAG1 write data to registers outside the DTB  
arrays. When write transactions to the corresponding DTB_PTE registers are retired,  
the contents of both the DTB_TAG and DTB_PTE registers are written into their  
respective DTB arrays, at locations determined by the round-robin allocation algorithm.  
Figure 5–26 shows the DTB tag array write registers 0 and 1.  
Figure 5–26 DTB Tag Array Write Registers 0 and 1  
63  
48 47  
13 12  
0
VA[47:13]  
LK99-0035A  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–25  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Mbox IPRs  
5.3.2 DTB PTE Array Write Registers 0 and 1 – DTB_PTE0, DTB_PTE1  
The DTB PTE array write registers 0 and 1 (DTB_PTE0 and DTB_PTE1) are registers  
through which the DTB PTE arrays are written. The entries to be written are chosen by  
a round-robin allocation scheme. Write transactions to the DTB_PTE registers, when  
retired, result in both the DTB_TAG and DTB_PTE arrays being written. Figure 5–27  
shows the DTB PTE array write registers 0 and 1.  
Figure 5–27 DTB PTE Array Write Registers 0 and 1  
63 62  
32 31  
16 15 14 13 12 11 10  
9
8
7
6
5
4
3
2
1
0
PA[43:13]  
UWE  
SWE  
EWE  
KWE  
URE  
SRE  
ERE  
KRE  
GH[1:0]  
ASM  
FOW  
FOR  
LK99-0036A  
5.3.3 DTB Alternate Processor Mode Register – DTB_ALTMODE  
The DTB alternate processor mode register (DTB_ALTMODE) is a write-only register  
whose contents specify the alternate processor mode used by some HW_LD and  
HW_ST instructions. Figure 5–28 shows the DTB alternate processor mode register.  
Figure 5–28 DTB Alternate Processor Mode Register  
63  
2
1
0
ALT_MODE[1:0]  
LK99-0037A  
Table 5–17 describes the DTB_ALTMODE register fields.  
Table 5–17 DTB Alternate Processor Mode Register Fields Description  
Name  
Extent  
Type  
Description  
Reserved  
[63:2]  
5–26 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Mbox IPRs  
Table 5–17 DTB Alternate Processor Mode Register Fields Description (Continued)  
Name  
Extent  
Type  
Description  
ALT_MODE[1:0]  
[1:0]  
WO  
Alt_Mode:  
ALT_MODE[1:0] Mode  
00  
01  
10  
11  
Kernel  
Executive  
Supervisor  
User  
5.3.4 Dstream TB Invalidate All Process (ASM=0) Register – DTB_IAP  
The Dstream translation buffer invalidate all process (ASM=0) register (DTB_IAP) is a  
write-only pseudo register. Write transactions to this register invalidate all DTB entries  
in which the address space match (ASM) bit is clear.  
5.3.5 Dstream TB Invalidate All Register – DTB_IA  
The Dstream translation buffer invalidate all register (DTB_IA) is a write-only pseudo  
register. Write transactions to this register invalidate all DTB entries and reset the DTB  
not-last-used pointer to its initial state.  
5.3.6 Dstream TB Invalidate Single Registers 0 and 1 – DTB_IS0,1  
The Dstream translation buffer invalidate single registers (DTB_IS0 and DTB_IS1) are  
write-only pseudo registers through which software may invalidate a single entry in the  
DTB arrays. Writing a virtual page number to one of these registers invalidates any  
DTB entry in the corresponding memory pipeline which meets one of the following cri-  
teria:  
The DTB entry’s virtual page number matches DTB_IS[47:13] and its ASN field  
matches DTB_ASN[63:56].  
The DTB entry’s virtual page number matches DTB_IS[47:13] and its ASM bit is  
set.  
Figure 5–29 shows the Dstream translation buffer invalidate single registers.  
Figure 5–29 Dstream Translation Buffer Invalidate Single Registers  
63  
48 47  
13 12  
0
VA[47:13]  
LK99-0015A  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–27  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Mbox IPRs  
5.3.7 Dstream TB Address Space Number Registers 0 and 1 – DTB_ASN0,1  
The Dstream translation buffer address space number registers (DTB_ASN0 and  
DTB_ASN1) are write-only registers that should be written with the address space  
number (ASN) of the current process. Figure 5–30 shows the Dstream translation buffer  
address space number registers 0 and 1.  
Figure 5–30 Dstream Translation Buffer Address Space Number Registers 0 and 1  
63  
56 55  
0
ASN[7:0]  
LK99-0038A  
5.3.8 Memory Management Status Register – MM_STAT  
The memory management status register (MM_STAT) is a read-only register.  
When a Dstream TB miss or fault occurs, information about the error is latched in  
MM_STAT. MM_STAT is not updated when a LD_VPTE gets a DTB miss instruction.  
Figure 5–31 shows the memory management status register.  
Figure 5–31 Memory Management Status Register  
63  
11 10  
9
4
3
2
1
0
DC_TAG_PERR  
OPCODE[5:0]  
FOW  
FOR  
ACV  
WR  
LK99-0039A  
Table 5–18 describes the memory management status register fields.  
Table 5–18 Memory Management Status Register Fields Description  
Name  
Extent  
[63:11]  
[10]  
Type  
Description  
Reserved  
DC_TAG_PERR  
RO  
This bit is set when a Dcache tag parity error occurred during the  
initial tag probe of a load or store instruction. The error created a  
synchronous fault to the D_FAULT PALcode entry point and is  
correctable. The virtual address associated with the error is avail-  
able in the VA register.  
OPCODE[5:0]  
FOW  
[9:4]  
[3]  
RO  
RO  
Opcode of the instruction that caused the error.  
HW_LD is displayed as 3 and HW_ST is displayed as 7.  
This bit is set when a fault-on-write error occurs during a write  
transaction and PTE[FOW] was set.  
5–28 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Mbox IPRs  
Table 5–18 Memory Management Status Register Fields Description (Continued)  
Name  
Extent  
Type  
Description  
FOR  
[2]  
RO  
This bit is set when a fault-on-read error occurs during a read  
transaction and PTE[FOR] was set.  
ACV  
WR  
[1]  
[0]  
RO  
RO  
This bit is set when an access violation occurs during a transac-  
tion. Access violations include a bad virtual address.  
This bit is set when an error occurs during a write transaction.  
Note:  
The Ra field of the instruction that triggered the error can be obtained from  
the Ibox EXC_SUM register.  
5.3.9 Mbox Control Register – M_CTL  
The Mbox control register (M_CTL) is a write-only register. Its contents are cleared by  
chip reset. Figure 5–32 shows the Mbox control register.  
Figure 5–32 Mbox Control Register  
63  
6
5
4
3
1
0
SMC[1:0]  
SPE[2:0]  
LK99-0040A  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–29  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Mbox IPRs  
Table 5–19 describes the Mbox control register fields.  
Table 5–19 Mbox Control Register Fields Description  
Name  
Extent  
Type  
Description  
Reserved  
[63:6]  
SMC[1:0] [5:4]  
WO,0  
Speculative miss control (see Section 4.6.4).  
Bits Meaning When Set  
00  
01  
Allow full-time speculation.  
Force full-time conservative mode. Make retries wait until retire,  
force all new stores that do not hit dirty to retry, and cause prefetches  
with modify intent (see Section 2.6.2) to behave like normal  
prefetches.  
10  
11  
Place 21264/EV68A in periodic conservative mode by using an 8-bit  
counter to add by 4 each time a branch mispredict happens and sub-  
tract by one each time a conditional branch retires. Enter conserva-  
tive mode if the MSB of the counter is set.  
Place 21264/EV68A in periodic conservative mode by using an 8-bit  
countner to add by 8 each time a branch mispredict happens and sub-  
tract by one each time a conditional branch retires. Enter conserva-  
tive mode if the MSB of the counter is set.  
SPE[2:0]  
[3:1]  
WO,0  
Superpage mode enables.  
SPE[2], when set, enables superpage mapping when VA[47:46] = 2. In this  
mode, VA[43:13] are mapped directly to PA[43:13] and VA[45:44] are  
ignored.  
SPE[1], when set, enables superpage mapping when VA[47:41] = 7E . In  
16  
this mode, VA[40:13] are mapped directly to PA[40:13] and PA[43:41] are  
copies of PA[40] (sign extension).  
SPE[0], when set, enables superpage mapping when VA[47:30] = 3FFFE  
.
16  
In this mode, VA[29:13] are mapped directly to PA[29:13] and PA[43:30] are  
cleared.  
Reserved  
[0]  
Note:  
Superpage accesses are only allowed in kernel mode. Non-kernel mode ref-  
erences to superpages result in access violations.  
5.3.10 Dcache Control Register – DC_CTL  
The Dcache control register (DC_CTL) is a write-only register that controls Dcache  
activity. The contents of DC_CTL are initialized by chip reset as indicated. Figure 5–33  
shows the Dcache control register.  
5–30 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Mbox IPRs  
Figure 5–33 Dcache Control Register  
63  
8
7
6
5
4
3
2
1
0
DCDAT_ERR_EN  
DCTAG_PAR_EN  
F_BAD_DECC  
F_BAD_TPAR  
F_HIT  
SET_EN[1:0]  
LK99-0041A  
Table 5–20 describes the Dcache control register fields.  
Table 5–20 Dcache Control Register Fields Description  
Name  
Extent  
[63:8]  
[7]  
Type  
Description  
Reserved  
DCDAT_ERR_EN  
DCTAG_PAR_EN  
F_BAD_DECC  
WO,0  
WO,0  
WO,0  
Dcache data ECC and parity error enable.  
Dcache tag parity enable.  
[6]  
[5]  
Force Bad Data ECC. When set, ECC data is not written into  
the cache along with the block that is loaded by a fill or store.  
Writing data that is different from that already in the block will  
cause bad ECC to be present. Since the old ECC value will  
remain, the ECC will be bad.  
F_BAD_TPAR  
[4]  
WO,0  
Force Bad Tag Parity. When set, this bit causes bad tag parity to  
be put into the Dcache tag array during Dcache fill operations.  
Reserved  
F_HIT  
[3]  
[2]  
WO,0  
Force Hit. When set, this bit causes all memory space load and  
store instructions to hit in the Dcache, independent of the  
Dcache tag address compare. F_HIT does not force the status of  
the block to register as DIRTY (the tag status bits are still con-  
sulted), so stores may still generate offchip activity.  
In this mode, only one of the two sets may be enabled, and tag  
parity checking must be disabled (set DCTAG_PER_EN to  
zero).  
SET_EN[1:0]  
[1:0]  
WO,3  
Dcache Set Enable. At least one set must be enabled.  
5.3.11 Dcache Status Register – DC_STAT  
The Dcache status register (DC_STAT) is a read-write register. If a Dcache tag parity  
error or data ECC error occurs, information about the error is latched in this register.  
Figure 5–34 shows the Dcache status register.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–31  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Cbox CSRs and IPRs  
Figure 5–34 Dcache Status Register  
63  
5
4
3
2
1
0
SEO  
ECC_ERR_LD  
ECC_ERR_ST  
TPERR_P1  
TPERR_P0  
LK99-0042A  
Table 5–21 describes the Dcache status register fields.  
Table 5–21 Dcache Status Register Fields Description  
Name  
Reserved  
SEO  
Extent  
[63:5]  
[4]  
Type  
Description  
W1C  
Second error occurred. When set, this bit indicates that a second  
Dcache store ECC error occurred within 6 cycles of the previous  
Dcache store ECC error.  
ECC_ERR_LD [3]  
ECC_ERR_ST [2]  
W1C  
W1C  
W1C  
ECC error on load. When set, this bit indicates that a single-bit ECC  
error occurred while processing a load from the Dcache or any fill.  
ECC error on store. When set, this bit indicates that an ECC error  
occurred while processing a store.  
TPERR_P1  
TPERR_P0  
[1]  
[0]  
Tag parity error — pipe 1. When set, this bit indicates that a Dcache  
tag probe from pipe 1 resulted in a tag parity error. The error is uncor-  
rectable and results in a machine check.  
W1C  
Tag parity error — pipe 0. When set, this bit indicates that a Dcache  
tag probe from pipe 0 resulted in a tag parity error. The error is uncor-  
rectable and results in a machine check.  
5.4 Cbox CSRs and IPRs  
This section describes the Cbox CSRs and IPRs.  
The Cbox configuration registers are split into three shift register chains:  
The hardware allocates 367 bits for the WRITE_ONCE chain, of which the 21264/  
EV68A uses 304 bits. During hardware reset (after BiST), 367 bits are always  
shifted into the WRITE_ONCE chain from the SROM, MSB first, so that the  
unused bits are shifted out the end of the WRITE_ONCE chain.  
A 36-bit WRITE_MANY chain that is loaded using MTPR instructions to the Cbox  
data register. Six bits of information are shifted into the WRITE_MANY chain dur-  
ing each write transaction to the Cbox data register.  
A 60-bit Cbox ERROR_REG chain that is read by using MFFR instructions from  
the Cbox data register in combination with MTPR instructions to the Cbox shift  
register. Each write transaction to the Cbox shift register destructively shifts six bits  
of information out of the Cbox error register.  
5–32 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Cbox CSRs and IPRs  
5.4.1 Cbox Data Register – C_DATA  
Figure 5–35 shows the Cbox data register.  
Figure 5–35 Cbox Data Register  
63  
6
5
0
C_DATA[5:0]  
LK99-0043A  
Table 5–22 describes the Cbox data register fields.  
Table 5–22 Cbox Data Register Fields Description  
Name  
Extent  
[63:6]  
[5:0]  
Type  
Description  
Reserved  
C_DATA[5:0]  
RW  
Cbox data register. A HW_MTPR instruction to this register causes six  
bits of data to be placed into a serial shift register. When the  
HW_MTPR instruction is retired, the data is shifted into the Cbox. After  
the Cbox shift register has been accessed, performing a HW_MFPR  
instruction to this register will return six bits of data.  
5.4.2 Cbox Shift Register – C_SHFT  
Figure 5–36 shows the Cbox shift register.  
Figure 5–36 Cbox Shift Register  
63  
1
0
C_SHIFT  
LK99-0044A  
Table 5–23 describes the Cbox shift register fields.  
Table 5–23 Cbox Shift Register Fields Description  
Name  
Extent  
[63:1]  
[0]  
Type  
Description  
Reserved  
C_SHIFT  
W1  
Writing a 1 to this register bit causes six bits of Cbox IPR data to shift into  
the Cbox data register. Software can then use a HW_MFPR read operation  
to the Cbox data register to read the six bits of data.  
5.4.3 Cbox WRITE_ONCE Chain Description  
The WRITE_ONCE chain order is contained in Table 5–24. In the table:  
Many CSRs are duplicated for ease of hardware implementation. These CSRs are  
indicated in italics. They must be written with values that are identical to the values  
written to the original CSRs.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–33  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                     
Cbox CSRs and IPRs  
Only a brief description of each CSR is given. The functional description of these  
CSRs is contained in Chapter 4.  
The order of multibit vectors is [MSB:LSB], so the LSB is first bit in the Cbox  
chain.  
Table 5–24 describes the Cbox WRITE_ONCE chain order from LSB to MSB.  
Table 5–24 Cbox WRITE_ONCE Chain Order  
Cbox WRITE_ONCE Chain  
32_BYTE_IO[0]  
Description  
Enable 32_BYTE I/O mode.  
Asserted when Bcache is at 1.5x ratio.  
BC_CLK_RATIO[1]  
SKEWED_FILL_MODE[0]  
Must be asserted for Bcache 1.5x ratio; for maximum performance,  
can also be asserted for 3.0x and 3.5x ratios.  
DCVIC_THRESHOLD[7:0]  
Threshold of the number of Dcache victims that will accumulate  
before streamed write transactions to the Bcache are initiated. The  
Cbox can accumulate up to six victims for streamed Dcache pro-  
cessing. This register is programmed with the decoded value of the  
threshold count.  
BC_CLEAN_VICTIM[0]  
SYS_BUS_SIZE[1:0]  
Enable clean victims to the system interface.  
Size of SysAddOut and SysAddOut buses.  
Indicates system bus format.  
SYS_BUS_FORMAT[0]  
SYS_CLK_RATIO[4:1]  
Speed of system bus.  
Code  
0001  
0010  
0100  
1000  
Multiplier  
1.5X  
2.0X  
2.5X  
3.0X  
DUP_TAG_ENABLE[0]  
PRB_TAG_ONLY[0]  
Enable duplicate tag mode in the 21264/EV68A.  
Enable probe-tag only mode in the 21264/EV68A.  
When asserted, disables fast data movement mode.  
Enables RdVictim mode on the pins.  
FAST_MODE_DISABLE[0]  
BC_RDVICTIM[0]  
BC_CLEAN_VICTIM[0]  
RDVIC_ACK_INHIBIT  
Duplicate CSR.  
Enable inhibition of incrementing acknowledge counter for RdVic  
commands.  
SYSBUS_MB_ENABLE  
SYSBUS_ACK_LIMIT[0:4]  
SYSBUS_VIC_LIMIT[0:2]  
BC_CLEAN_VICTIM[0]  
BC_WR_WR_BUBBLE[0]  
BC_RD_WR_BUBBLES[0:5]  
BC_RD_RD_BUBBLE[0]  
BC_SJ_BANK_ENABLE  
Enable MB commands offchip.  
Sysbus acknowledge limit CSR.  
Limit for victims.  
Duplicate CSR.  
Write to write GCLK bubble.  
Read to write GCLK bubbles for the Bcache interface.  
Read to read GCLK bubble for banked Bcaches.  
Enable bank mode for Bcache.  
5–34 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                       
Cbox CSRs and IPRs  
Table 5–24 Cbox WRITE_ONCE Chain Order (Continued)  
Cbox WRITE_ONCE Chain  
BC_WR_RD_BUBBLES[0:3]  
DUP_TAG_ENABLE  
Description  
Write to read GCLK bubbles.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
SKEWED_FILL_MODE  
BC_RDVICTIM  
SKEWED_FILL_MODE  
BC_RDVICTIM  
BC_CLEAN_VICTIM  
DUP_TAG_MODE  
SKEWED_FILL_MODE  
ENABLE_PROBE_CHECK  
SPEC_READ_ENABLE[0]  
SKEWED_FILL_MODE  
SKEWED_FILL_MODE  
MBOX_BC_PRB_STALL  
Enable error checking during probe processing.  
Enable speculative references to the system port.  
Duplicate CSR.  
Duplicate CSR.  
Must be asserted when BC_RATIO = 4.0X, 5.0X, 6.0X, 7.0X, or  
8.0X.  
BC_LAT_DATA_PATTERN[0:31]  
BC_LAT_TAG_PATTERN[0:23]  
BC_RDVICTIM  
Bcache data latency pattern.  
Bcache tag latency pattern.  
Duplicate CSR.  
ENABLE_STC_COMMAND[0]  
BC_LATE_WRITE_NUM[0:2]  
Enable STx_C instructions to the pins.  
Number of Bcache clocks to delay the data for Bcache write com-  
mands.  
BC_CPU_LATE_WRITE_NUM[0:1]  
Number of GCLK cycles to delay the Bcache clock/data from  
index.  
BC_BURST_MODE_ENABLE[0]  
BC_PENTIUM_MODE[0]  
BC_CLK_RATIO[1]  
Burst mode enable signal.  
Enable Pentium mode RAM behavior.  
Duplicate CSR.  
BC_FRM_CLK[0]  
Force all Bcache transactions to start on rising edges of the A phase  
of a GCLK.  
BC_CLK_DELAY[0:1]  
BC_DDMR_ENABLE[0]  
Delay of Bcache clock for 0,0,1,2 GCLK phases.  
Enables the rising edge of the Bcache forwarded clock (always  
enabled).  
BC_DDMF_ENABLE[0]  
Enable the falling edge of the Bcache forwarded clock. (always  
enabled).  
BC_LATE_WRITE_UPPER[0]  
Asserted when (BC_LATE_WRITE_NUM > 3) or  
((BC_LATE_WRITE_NUM = 3) and  
(BC_CPU_LATE_WRITE_NUM > 1)).  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–35  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                               
Cbox CSRs and IPRs  
Table 5–24 Cbox WRITE_ONCE Chain Order (Continued)  
Cbox WRITE_ONCE Chain  
Description  
BC_TAG_DDM_FALL_EN[0]  
Enables the update of the 21264/EV68A Bcache tag outputs based  
on the falling edge of the forwarded clock.  
BC_TAG_DDM_RISE_EN[0]  
Enables the update of the 21264/EV68A Bcache tag outputs based  
on the rising edge of the forwarded clock.  
BC_CLKFWD_ENABLE[0]  
BC_RCV_MUX_CNT_PRESET[0:1]  
BC_LATE_WRITE_UPPER[0]  
SYS_DDM_FALL_EN[0]  
Enable clock forwarding on the Bcache interface.  
Initial value for the Bcache clock forwarding unload pointer FIFO.  
Duplicate CSR.  
Enables the update of the 21264/EV68A system outputs based on  
the falling edge of the system forwarded clock.  
SYS_DDM_RISE_EN[0]  
Enables the update of the 21264/EV68A system outputs based on  
the rising edge of the system forwarded clock.  
SYS_CLKFWD_ENABLE[0]  
SYS_RCV_MUX_CNT_PRESET[0:1]  
SYS_CLK_DELAY[0:1]  
Enables clock forwarding on the system interface.  
Initial value for the system clock forwarding unload pointer FIFO.  
Delay of 0 to 2 phases between the forwarded clock out and  
address/data.  
SYS_DDMR_ENABLE[0]  
SYS_DDMF_ENABLE[0]  
BC_DDM_FALL_EN[0]  
BC_DDM_RISE_EN[0]  
Enables the rising edge of the system forwarded clock (always  
enabled).  
Enables the falling edge of the system forwarded clock (always  
enabled).  
Enables update of data/address on the rising edge of the system for-  
warded clock.  
Enables the update of data/address on the falling edge of the system  
forwarded clock.  
BC_CLKFWD_ENABLE  
BC_RCV_MUX_CNT_PRESET[0:1]  
BC_CLK_DELAY[0:1]  
BC_DDMR_ENABLE  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
BC_DDMF_ENABLE  
SYS_DDM_FALL_EN  
SYS_DDM_RISE_EN  
SYS_CLKFWD_ENABLE  
SYS_RCV_MUX_CNT_PRESET[0:1]  
SYS_CLK_DELAY[0:1]  
SYS_DDMR_ENABLE  
SYS_DDMF_ENABLE  
BC_DDM_FALL_EN  
BC_DDM_RISE_EN  
5–36 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                         
Cbox CSRs and IPRs  
Table 5–24 Cbox WRITE_ONCE Chain Order (Continued)  
Cbox WRITE_ONCE Chain  
BC_CLKFWD_ENABLE  
BC_RCV_MUX_CNT_PRESET[0:1]  
SYS_DDM_FALL_EN  
Description  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
Duplicate CSR.  
SYS_DDM_RISE_EN  
SYS_CLKFWD_ENABLE  
SYS_RCV_MUX_CNT_PRESET[0:1]  
SYS_CLK_DELAY[0:1]  
SYS_DDMR_ENABLE  
SYS_DDMF_ENABLE  
BC_DDM_FALL_EN  
BC_DDM_RISE_EN  
BC_CLKFWD_ENABLE  
BC_RCV_MUX_CNT_PRESET[0:1]  
BC_CLK_DELAY[0:1]  
BC_DDMR_ENABLE  
BC_DDMF_ENABLE  
SYS_DDM_FALL_EN  
SYS_DDM_RISE_EN  
SYS_CLKFWD_ENABLE  
SYS_RCV_MUX_CNT_PRESET[0:1]  
SYS_CLK_DELAY[1:0]  
SYS_DDMR_ENABLE  
SYS_DDMF_ENABLE  
BC_DDM_FALL_EN  
BC_DDM_RISE_EN  
BC_CLKFWD_ENABLE  
BC_RCV_MUX_CNT_PRESET[1:0]  
SYS_CLK_DELAY[0:1]  
SYS_DDMR_ENABLE  
SYS_DDMF_ENABLE  
SYS_DDM_FALL_EN  
SYS_DDM_RISE_EN  
SYS_CLKFWD_ENABLE  
SYS_RCV_MUX_CNT_PRESET[0:1]  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–37  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Cbox CSRs and IPRs  
Table 5–24 Cbox WRITE_ONCE Chain Order (Continued)  
Cbox WRITE_ONCE Chain  
CFR_GCLK_DELAY[0:3]  
CFR_EV6CLK_DELAY[0:2]  
CFR_FRMCLK_DELAY[0:1]  
BC_LATE_WRITE_NUM[0:2]  
BC_CPU_LATE_WRITE_NUM[1:0]  
JITTER_CMD[0]  
Description  
Number of GCLK cycles to delay internal ClkFwdRst.  
Number of EV6Clk_x cycles to delay internal ClkFwdRst.  
Number of FrameClk_x cycles to delay internal ClkFwdRst.  
Duplicate CSR.  
Duplicate CSR.  
Add one GCLK cycle to the SYSDC write path.  
Duplicate CSR.  
FAST_MODE_DISABLE[0]  
SYSDC_DELAY[3:0]  
Number of GCLK cycles to delay SysDc fill commands before  
action by the Cbox.  
DATA_VALID_DLY[1:0]  
Number of Bcache clock cycles to delay signal SysDataInValid  
before sample by the Cbox.  
BC_DDM_FALL_EN  
BC_DDM_RISE_EN  
Duplicate CSR.  
Duplicate CSR.  
BC_CPU_CLK_DELAY[0:1]  
BC_FDBK_EN[0:7]  
Delay of Bcache clock for 0, 1, 2, 3 GCLK cycles.  
CSR to program the Bcache forwarded clock shift register feedback  
points.  
BC_CLK_LD_VECTOR[0:15]  
CSR to program the Bcache forwarded clock shift register load val-  
ues.  
BC_BPHASE_LD_VECTOR[0:3]  
SYS_DDM_FALL_EN  
CSR to program the Bcache forwarded clock b-phase enables.  
Duplicate CSR.  
Duplicate CSR.  
SYS_DDM_RISE_EN  
SYS_CPU_CLK_DELAY[0:1]  
Delay of 0..3 GCLK cycles between the forwarded clock out and  
address/data.  
SYS_FDBK_EN[0:7]  
CSR to program the system forwarded clock shift register feedback  
points.  
SYS_CLK_LD_VECTOR[0:15]  
CSR to program the system forwarded clock shift register load val-  
ues.  
SYS_BPHASE_LD_VECTOR[0:3]  
SYS_FRAME_LD_VECTOR[0:4]  
CSR to program the system forwarded clock b-phase enables.  
CSR to program the ratio between frame clock and system for-  
warded clock.  
SYSDC_DELAY[4]  
Fifth SYSDC_DELAY bit.  
5.4.4 Cbox WRITE_MANY Chain Description  
The WRITE_MANY chain order is contained in Table 5–25. Note the following:  
Many CSRs are duplicated for ease of hardware implementation. These CSR names  
are indicated in italics and have two leading asterisks.  
Only a brief description of each CSR is given. The functional description of these  
CSRs is contained in Chapter 3.  
5–38 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                   
Cbox CSRs and IPRs  
The order of multibit vectors is [MSB:LSB], so the LSB is first bit in the Cbox  
chain.  
Table 5–25 describes the Cbox WRITE_MANY chain order from LSB to MSB.  
Table 5–25 Cbox WRITE_MANY Chain Order  
Cbox WRITE_MANY Chain  
BC_VALID_MODE  
INIT_MODE[0]  
BC_SIZE[3:0]  
Description  
For Information:  
Control Bcache block parity calculation  
Enable initialize mode  
Bcache size  
BC_ENABLE[0]  
BC_ENABLE  
Enable the Bcache  
Duplicate CSR  
BC_SIZE[0:3]  
Duplicate CSR  
1
BC_ENABLE  
Duplicate CSR  
BC_ENABLE  
Duplicate CSR  
BC_ENABLE  
Duplicate CSR  
INVAL_TO_DIRTY_ENABLE[1]  
ENABLE_EVICT  
WH64 acknowledges  
Enable issue evict  
Duplicate CSR  
BC_ENABLE  
INVAL_TO_DIRTY_ENABLE[0]  
BC_ENABLE  
WH64 acknowledges  
Duplicate CSR  
BC_ENABLE  
Duplicate CSR  
BC_ENABLE  
Duplicate CSR  
SET_DIRTY_ENABLE[0]  
INVAL_TO_DIRTY_ENABLE[0]  
SET_DIRTY_ENABLE[2:1]  
BC_BANK_ENABLE[0]  
BC_SIZE[0:3]  
SetDirty acknowledge programming  
Duplicate CSR  
SetDirty acknowledge programming  
Enable bank mode for Bcache  
Duplicate CSR  
INIT_MODE  
Duplicate CSR  
BC_WRT_STS[0:3]  
Write status for Bcache in initialize-mode  
(Valid, Dirty, Shared, Parity)  
1
MBZ during initialization mode; see Section 7.6 for information.  
Figure 5–37 shows an example of PALcode used to write to the WRITE_MANY chain.  
Figure 5–37 WRITE_MANY Chain Write Transaction Example  
;
; Initialize the Bcache configuration in the Cbox  
;
;
;
BC_VALID_MODE = 1  
BC_ENABLE = 1  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–39  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                       
Cbox CSRs and IPRs  
;
;
;
;
;
;
;
;
INIT_MODE = 0  
BC_SIZE = 0xF  
INVALID_TO_DIRTY_ENABLE = 3  
ENABLE_EVICT = 1  
SET_DIRTY_ENABLE = 6  
BC_BANK_ENABLE = 1  
BC_WRT_STS = 0  
; The value for the write_many chain is based on Table 5–25.  
;
; The value is sampled from MSB, 6 bits at a time, as it is written  
; to EV6__DATA. Therefore, before the value can be shifted in, it must be  
; inverted on a by 6 basis. The code then writes out 6 bits at a time,  
; shifting right by 6 after each write.  
;
; So the following transformation is done on the write_many value:  
;
;
;
;
;
;
;
[35:30]|[29:24]|[23:18]|[17:12]|[11:06]|[05:00] =>  
[05:00]|[11:06]|[17:12]|[23:18]|[29:24]|[35:30]  
WRITE_MANY chain = 0x07FBFFFFD  
value to be shifted in = 0xF7FFEFFC1  
; Before the chain can be written, I_CTL[SBE] must be disabled,  
; and the code must be forced into the Icache.  
;
ALIGN_CACHE_BLOCK <^x47FF041F>; align with nops  
mb  
lda  
hw_mtpr  
br  
; wait for MEM-OP’s to complete  
; load I_CTL.....  
; .....SDE=2, IC_EN=3, SBE=0  
; create dest address  
r0, ^x0086(r31)  
r0, EV6__I_CTL  
r0, .  
addq  
r0, #17, r0  
; finish computing dest address  
; flush the Icache  
hw_mtpr r31, EV6__IC_FLUSH  
bne r31, .  
; separate retires  
hw_jmp_stall (r0)  
; force flush  
ALIGN_CACHE_BLOCK <^x47FF041F>  
; align with nops  
bc_config:  
mb  
; pull this block in Icache  
; data[15:00] = 0xFFC1  
; data[31:16] = 0x7FFE  
; clear out bits [31:16]  
lda  
ldah  
zap  
r1, ^xFFC1(r31)  
r0, ^x7FFE(r31)  
r1, #^x0c, r1  
bis  
addq  
r1, r0, r1  
r31, #6, r0  
; or in bits [31:16]  
; shift in 6 x 6 bits  
bc_config_shift_in:  
hw_mtpr r1, EV6__DATA  
; shift in 6 bits  
; decrement R0  
subq  
r0, #1, r0  
beq  
srl  
br  
r0, bc_config_done  
r1, #6, r1  
r31, bc_config_shift_in  
; done if R0 is zero  
; align next 6 bits  
; continue shifting  
bc_config_done:  
hw_mtpr r31, <EV6__MM_STAT ! 64>  
; wait until last shift  
beq  
br  
bis  
bis  
r31, bc_config_end  
r31, .-4  
r31, r31, r31  
r31, r31, r31  
; predicts fall thru  
; predict infinite loop  
; nop  
; nop  
bc_config_end:  
5–40 Internal Processor Registers  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Cbox CSRs and IPRs  
5.4.5 Cbox Read Register (IPR) Description  
The Cbox read register is read 6 bits at a time. Table 5–26 shows the ordering from LSB  
to MSB.  
Table 5–26 Cbox Read IPR Fields Description  
Name  
Description  
C_SYNDROME_1[7:0]  
C_SYNDROME_0[7:0]  
Syndrome for upper QW in OW of victim that was scrubbed.  
Syndrome for lower QW in OW of victim that was scrubbed.  
21264/EV68A Hardware Reference Manual  
Internal Processor Registers 5–41  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Table 5–26 Cbox Read IPR Fields Description (Continued)  
Name  
Description  
C_STAT[4:0]  
As follows:  
Bits  
Error Status  
0 0 0 0 0 Either no error, or error on a speculative load, or a Bcache victim read  
due to a Dcache/Bcache miss  
0 0 0 0 1 BC_PERR (Bcache tag parity error)  
0 0 0 1 0 DC_PERR (duplicate tag parity error)  
0 0 0 1 1 DSTREAM_MEM_ERR  
0 0 1 0 0 DSTREAM_BC_ERR  
0 0 1 0 1 DSTREAM_DC_ERR  
0 0 1 1 X PROBE_BC_ERR  
0 1 0 0 0 Reserved  
0 1 0 0 1 Reserved  
0 1 0 1 0 Reserved  
0 1 0 1 1 ISTREAM_MEM_ERR  
0 1 1 0 0 ISTREAM_BC_ERR  
0 1 1 0 1 Reserved  
0 1 1 1 X Reserved  
1
1 0 0 1 1 DSTREAM_MEM_DBL  
1 0 1 0 0 DSTREAM_BC_DBL  
1 1 0 1 1 ISTREAM_MEM_DBL  
1 1 1 0 0 ISTREAM_BC_DBL  
1
When the Cbox WRITE_ONCE chain bit  
SKEWED_FILL_MODE[0] is clear, the error status is as specified;  
when set, the error status applies only to first octaword of the fill, and  
error status for the rest of the fill is generic DOUBLE_BIT_ERROR  
(1XXXX).  
C_STS[3:0]  
If C_STAT equals xxx_MEM_ERR or xxx_BC_ERR, then C_STS contains the  
status of the block as follows; otherwise, the value of C_STS is X:  
Bit Value Status of Block  
7:4  
3
Reserved  
Parity  
2
Valid  
1
Dirty  
0
Shared  
C_ADDR[6:42]  
Address of last reported ECC or parity error. If C_STAT value is  
DSTREAM_DC_ERR, only bits 6:19 are valid.  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                               
6
Privileged Architecture Library Code  
This chapter describes the 21264/EV68A privileged architecture library code (PAL-  
code). The chapter is organized as follows:  
PALcode description  
PALmode environment  
Required PALcode function codes  
Opcodes reserved for PALcode  
Internal processor register access mechanisms  
PALshadow registers  
PALcode emulation of FPCR  
PALcode entry points  
Translation buffer fill flows  
Performance counter support  
6.1 PALcode Description  
PALcode is macrocode that provides an architecturally-defined, operating-system-spe-  
cific programming interface that is common across all Alpha microprocessors. The  
actual implementation of PALcode differs for each operating system. PALcode runs  
with privileges enabled, instruction stream (Istream) mapping disabled, and interrupts  
disabled. PALcode has privilege to use five special opcodes that allow functions such as  
physical data stream (Dstream) references and internal processor register (IPR) manip-  
ulation.  
PALcode can be invoked by the following events:  
Reset  
System hardware exceptions (MCHK, ARITH)  
Memory-management exceptions  
Interrupts  
CALL_PAL instructions  
PALcode has characteristics that make it appear to be a combination of microcode,  
ROM BIOS, and system service routines, though the analogy to any of these other  
items is not exact. PALcode exists for several major reasons:  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code  
6–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
PALmode Environment  
There are some necessary support functions that are too complex to implement  
directly in a processor chip’s hardware, but that cannot be handled by a normal  
operating system software routine. Routines to fill the translation buffer (TB),  
acknowledge interrupts, and dispatch exceptions are some examples. In some archi-  
tectures, these functions are handled by microcode, but the Alpha architecture is  
careful not to mandate the use of microcode so as to allow reasonable chip imple-  
mentations.  
There are functions that must run atomically, yet involve long sequences of instruc-  
tions that may need complete access to all of the underlying computer hardware.  
An example of this is the sequence that returns from an exception or interrupt.  
There are some instructions that are necessary for backward compatibility or ease  
of programming; however, these are not used often enough to dedicate them to  
hardware, or are so complex that they would jeopardize the overall performance of  
the computer. For example, an instruction that does a VAX style interlocked mem-  
ory access might be familiar to someone used to programming on a CISC machine,  
but is not included in the Alpha architecture. Another example is the emulation of  
an instruction that has no direct hardware support in a particular chip implementa-  
tion.  
In each of these cases, PALcode routines are used to provide the function. The routines  
are nothing more than programs invoked at specified times, and read in as Istream code  
in the same way that all other Alpha code is read. Once invoked, however, PALcode  
runs in a special mode called PALmode.  
6.2 PALmode Environment  
PALcode runs in a special environment called PALmode, defined as follows:  
Istream memory mapping is disabled. Because the PALcode is used to implement  
translation buffer fill routines, Istream mapping clearly cannot be enabled. Dstream  
mapping is still enabled.  
The program has privileged access to all of the computer hardware. Most of the  
functions handled by PALcode are privileged and need control of the lowest  
levels of the system.  
Interrupts are disabled. If a long sequence of instructions need to be executed  
atomically, interrupts cannot be allowed.  
An important aspect of PALcode is that it uses normal Alpha instructions for most of its  
operations; that is, the same instruction set that nonprivileged Alpha programmers use.  
There are a few extra instructions that are only available in PALmode, and will cause a  
dispatch to the OPCDEC PALcode entry point if attempted while not in PALmode. The  
Alpha architecture allows some flexibility in what these special PALmode instructions do.  
In the 21264/EV68A, the special PALmode-only instructions perform the following func-  
tions:  
Read or write internal processor registers (HW_MFPR, HW_MTPR)  
Perform memory load or store operations without invoking the normal memory-  
management routines (HW_LD, HW_ST)  
Return from an exception or interrupt (HW_RET)  
6–2  
Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Required PALcode Function Codes  
When executing in PALmode, there are certain restrictions for using the privileged  
instructions because PALmode gives the programmer complete access to many of the  
internal details of the 21264/EV68A. Refer to Section 6.4 for information on these spe-  
cial PALmode instructions.  
Caution: It is possible to cause unintended side effects by writing what appears to be  
perfectly acceptable PALcode. As such, PALcode is not something that  
many users will want to change. Before writing PALcode, at least become  
familiar with the information in Appendix D.  
6.3 Required PALcode Function Codes  
Table 6–1 lists opcodes required for all Alpha implementations. The notation used is  
oo.ffff, where oo is the hexadecimal 6-bit opcode and ffff is the hexadecimal 26-bit  
function code.  
Table 6–1 Required PALcode Function Codes  
Mnemonic  
DRAINA  
HALT  
Type  
Function Code  
00.0002  
Privileged  
Privileged  
Unprivileged  
00.0000  
IMB  
00.0086  
6.4 Opcodes Reserved for PALcode  
Table 6–2 lists the opcodes reserved by the Alpha architecture for implementation-spe-  
cific use. These opcodes are privileged and are only available in PALmode.  
Table 6–2 Opcodes Reserved for PALcode  
Architecture  
Mnemonic  
Opcode  
Mnemonic  
PAL1B  
PAL1F  
Function  
HW_LD  
1B  
Dstream load instruction  
HW_ST  
1F  
Dstream store instruction  
HW_RET  
HW_MFPR  
HW_MTPR  
1E  
PAL1E  
PAL19  
Return from PALcode routine  
Copies the value of an IPR into an integer GPR  
Writes the value of an integer GPR into an IPR  
19  
1D  
PAL1D  
These instructions generally produce an OPCDEC exception if executed while the pro-  
cessor is not in PALmode. If I_CTL[HWE] is set, these instructions can also be exe-  
cuted in kernel mode. Software that uses these instructions must adhere to the PALcode  
restrictions listed in this section.  
6.4.1 HW_LD Instruction  
PALcode uses the HW_LD instruction to access memory outside the realm of normal  
Alpha memory management and to perform special Dstream load transactions. Data  
alignment traps are disabled for the HW_LD instruction.  
Figure 6–1 shows the HW_LD instruction format.  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code  
6–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Opcodes Reserved for PALcode  
Figure 6–1 HW_LD Instruction Format  
31  
26 25  
21 20  
16 15  
13 12 11  
0
OPCODE  
RA  
RB  
DISP  
TYPE  
LEN  
FM-05654.AI4  
Table 6–3 describes the HW_LD instruction fields.  
Table 6–3 HW_LD Instruction Fields Descriptions  
Extent  
[31:26]  
[25:21]  
[20:16]  
[15:13]  
Mnemonic Value  
OPCODE 1B  
Description  
The opcode value.  
16  
RA  
Destination register number.  
Base register for memory address.  
RB  
TYPE  
000  
001  
Physical — The effective address for the HW_LD instruction is physical.  
2
2
Physical/Lock — The effective address for the HW_LD instruction is  
physical. It is the load lock version of the HW_LD instruction.  
010  
Virtual/VPTE — Flags a virtual PTE fetch (LD_VPTE). Used by trap logic  
to distinguish a single TB miss from a double TB miss. Kernel mode access  
checks are performed.  
2
100  
101  
Virtual — The effective address for the HW_LD instruction is virtual.  
2
2
Virtual/WrChk — The effective address for the HW_LD instruction is  
virtual. Access checks for fault-on-read (FOR), fault-on-write (FOW), read  
and write protection.  
110  
Virtual/Alt — The effective address for the HW_LD instruction is virtual.  
Access checks use DTB_ALT_MODE IPR.  
2
111  
Virtual/WrChk/Alt — The effective address for the HW_LD instruction is  
virtual. Access checks for FOR, FOW, read and write protection. Access  
checks use DTB_ ALT_MODE IPR.  
2
[12]  
LEN  
0
1
Access length is longword.  
Access length is quadword.  
[11:0]  
DISP  
Holds a 12-bit signed byte displacement.  
6.4.2 HW_ST Instruction  
PALcode uses the HW_ST instruction to access memory outside the realm of normal  
Alpha memory management and to do special forms of Dstream store instructions. Data  
alignment traps are inhibited for HW_ST instructions. Figure 6–2 shows the HW_ST  
instruction format.  
Figure 6–2 HW_ST Instruction Format  
31  
26 25  
21 20  
16 15  
13 12 11  
0
OPCODE  
RA  
RB  
DISP  
TYPE  
LEN  
FM-05654.AI4  
6–4  
Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Opcodes Reserved for PALcode  
Table 6–4 describes the HW_ST instruction fields.  
Table 6–4 HW_ST Instruction Fields Descriptions  
Extent  
[31:26]  
[25:21]  
[20:16]  
[15:13]  
Mnemonic Value  
OPCODE 1F  
Description  
The opcode value.  
16  
RA  
Write data register number.  
RB  
Base register for memory address.  
Physical — The effective address for the HW_ST instruction is physical.  
TYPE  
000  
001  
2
2
Physical/Cond — The effective address for the HW_ST instruction is  
physical. Store conditional version of the HW_ST instruction. The lock  
flag is returned in RA. Refer to PALcode restrictions for correct use of this  
function.  
010  
110  
Virtual — The effective address for the HW_ST instruction is virtual.  
2
Virtual/Alt — The effective address for the HW_ST instruction is virtual.  
Access checks use DTB_ ALT_MODE IPR.  
2
All others Unused.  
[12]  
LEN  
0
1
Access length is longword.  
Access length is quadword.  
[11:0]  
DISP  
Holds a 12-bit signed byte displacement.  
6.4.3 HW_RET Instruction  
The HW_RET instruction is used to return instruction flow to a specified PC. The RB  
field of the HW_RET instruction specifies an integer GPR, which holds the new value  
of the PC. Bit [0] of this register provides the new value of PALmode after the  
HW_RET instruction is executed. Bits [15:14] of the instruction determine the stack  
action.  
Normally the HW_RET instruction succeeds a CALL_PAL instruction, or a trap han-  
dler that pushed its PC onto the prediction stack. In this mode, the HINT should be set  
to ‘10’ to pop the PC and generate a predicted target address for the HW_RET instruc-  
tion.  
In some conditions, the HW_RET instruction is used in the middle of a PALcode flow  
to cause a group of instructions to retire. In these cases, if the HW_RET instruction  
does not have a corresponding instruction that pushed a PC onto the stack, the HINT  
field should be set to ‘00’ to keep the stack from being modified.  
In the rare circumstance that the HW_RET instruction might be used like a JSR or  
JSR_COROUTINE, the stack can be managed by setting the HINT bits accordingly.  
See Section D.25 for more information about the HW_RET instruction.  
Figure 6–3 shows the HW_RET instruction format.  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code  
6–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Opcodes Reserved for PALcode  
Figure 6–3 HW_RET Instruction Format  
31  
26 25  
21 20  
16 15 14 13 12  
0
OPCODE  
RA  
RB  
DISP  
HINT  
STALL  
FM-05656.AI4  
Table 6–5 describes the HW_RET instruction fields.  
Table 6–5 HW_RET Instruction Fields Descriptions  
Extent  
[31:26]  
[25:21]  
[20:16]  
Mnemonic Value  
OPCODE 1E  
Description  
The opcode value.  
16  
RA  
RB  
Register number. It should be R31.  
Target PC of the HW_RET instruction. Bit [0] of the register’s contents  
determines the new value of PALmode.  
[15:14]  
HINT  
00  
01  
HW_JMP — The PC is not pushed onto the prediction stack. The predicted  
target is PC + (4*DISP[12:0]).  
HW_JSR — The PC is pushed onto the prediction stack. The predicted  
target is PC + (4*DISP[12:0]).  
10  
11  
HW_RET — The prediction is popped off the stack and used as the target.  
HW_COROUTINE — The prediction is popped off the stack and used as  
the target. The PC is pushed onto the stack.  
[13]  
STALL  
If set, the fetcher is stalled until the HW_RET instruction is retired or  
aborted. The 21264/EV68A will:  
Force a mispredict  
Kill instructions that were fetched beyond the HW_RET instruction  
Refetch the target of the HW_RET instruction  
Stall until the HW_RET instruction is retired or aborted  
If instructions beyond the HW_RET have been issued out of order, they  
will be killed and refetched.  
[12:0]  
DISP  
Holds a 13-bit signed longword displacement.  
6.4.4 HW_MFPR and HW_MTPR Instructions  
The HW_MFPR and HW_MTPR instructions are used to access internal processor reg-  
isters. The HW_MFPR instruction reads the value from the specified IPR into the inte-  
ger register specified by the RA field of the instruction. The HW_MTPR instruction  
writes the value from the integer GPR, specified by the RB field of the instruction, into  
the specified IPR. Figure 6–4 shows the HW_MFPR and HW_MTPR instructions for-  
mat.  
Figure 6–4 HW_MFPR and HW_MTPR Instructions Format  
31  
26 25  
21 20  
16 15  
8
7
0
OPCODE  
RA  
RB  
INDEX  
SCBD_MASK  
FM-05657.AI4  
6–6  
Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Internal Processor Register Access Mechanisms  
Table 6–6 describes the HW_MFPR and HW_MTPR instructions fields.  
Table 6–6 HW_MFPR and HW_MTPR Instructions Fields Descriptions  
Extent  
Mnemonic  
Value  
19  
Description  
[31:26]  
OPCODE  
The opcode value for the HW_MFPR instruction.  
The opcode value for the HW_MTPR instruction.  
16  
1D  
16  
[25:21]  
[20:16]  
RA  
Destination register for the HW_MFPR instruction. It should be R31  
for the HW_MTPR instruction.  
RB  
Source register for the HW_MTPR instruction. It should be R31 for the  
HW_MFPR instruction.  
[15:8]  
[7:0]  
INDEX  
IPR index.  
SCBD_MASK —  
Specifies which IPR scoreboard bits in the IQ are to be applied to this  
instruction. If a mask bit is set, it indicates that the corresponding IPR  
scoreboard bit should be applied to this instruction.  
6.5 Internal Processor Register Access Mechanisms  
This section describes the hardware and software access mechanisms that are used for  
the 21264s IPRs.  
Because the Ibox reorders and executes instructions speculatively, extra hardware is  
required to provide software with the correct view of the architecturally-defined state.  
The Alpha architecture defines two classes of state: general-purpose registers and  
memory. Register renaming is used to provide architecturally-correct register file  
behavior. The Ibox and Mbox each have dedicated hardware that provides correct mem-  
ory behavior to the programmer. Because the internal processor registers are implemen-  
tation-specific, and their state is not defined by the Alpha architecture, access  
mechanisms for these registers may be defined that impose restrictions and limitations  
on the software that uses them.  
For every IPR, each instruction type can be classified by how it affects and is affected  
by the value held by that IPR.  
Explicit readers are HW_MFPR instructions that explicitly read the value of the  
IPR.  
Implicit readers are instructions whose behavior is affected by the value of the IPR.  
For example, each load instruction is an implicit reader of the DTB.  
Explicit writers are HW_MTPR instructions that explicitly write a value into the  
IPR.  
Implicit writers are instructions that may write a value into the IPR as a side effect  
of execution. For example, a load instruction that generates an access violation is  
an implicit writer of the VA, MM_STAT, and EXC_ADDR IPRs. In the 21264/  
EV68A, only instructions that generate an exception will act as implicit IPR writ-  
ers.  
Only certain IPRs, such as those with write-one-to-clear bits, are both implicitly and  
explicitly written. The read-write semantics of these IPRs is controlled by software.  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code  
6–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Internal Processor Register Access Mechanisms  
6.5.1 IPR Scoreboard Bits  
In previous Alpha implementations, IPR registers were not scoreboarded in hardware.  
Software was required to schedule HW_MTPR and HW_MFPR instructions for each  
machine’s pipeline organization in order to ensure correct behavior. This software  
scheduling task is more difficult in the 21264/EV68A because the Ibox performs  
dynamic scheduling. Hence, eight extra scoreboard bits are used within the IQ to help  
maintain correct IPR access order. The HW_MTPR and HW_MFPR instruction for-  
mats contain an 8-bit field that is used as an IPR scoreboard bit mask to specify which  
of the eight IPR scoreboard bits are to be applied to the instruction.  
If any of the unmasked scoreboard bits are set when an instruction is about to enter the  
IQ, then the instruction, and those behind it, are stalled outside the IQ until all the  
unmasked scoreboard bits are clear and the queue does not contain any implicit or  
explicit readers that were dependent on those bits when they entered the queue. When  
all the unmasked scoreboard bits are clear, and the queue does not contain any of those  
readers, the instruction enters the IQ and the unmasked scoreboard bits are set.  
HW_MFPR instructions are stalled in the IQ until all their unmasked IPR scoreboard  
bits are clear.  
When scoreboard bits [3:0] and [7:4] are set, their effect on other instructions is differ-  
ent, and they are cleared in a different manner.  
If any of scoreboard bits [3:0] are set when a load or store instruction enters the IQ, that  
load or store instruction will not be issued from the IQ until those scoreboard bits are  
clear.  
Scoreboard bits [3:0] are cleared when the HW_MTPR instructions that set them are  
issued (or are aborted). Bits [7:4] are cleared when the HW_MTPR instructions that set  
them are retired (or are aborted).  
Bits [3:0] are used for the DTB_TAG and DTB_PTE register pairs within the DTB fill  
flows. These bits can be used to order writes to the DTB for load and store instructions.  
See Sections 5.3.1 and 6.9.1.  
Bit [0] is used in both DTB and ITB fill flows to trigger, in hardware, a lightweight  
memory barrier (TB-MB) to be inserted between a LD_VPTE and the corresponding  
virtual-mode load instruction that missed in the TB.  
6.5.2 Hardware Structure of Explicitly Written IPRs  
IPRs that are written by software are physically implemented as two registers. When  
the HW_MTPR instruction that writes the IPR executes, it writes its value to the first  
register. When the HW_MTPR instruction is retired, the contents of the first register are  
written into the second register. Instructions that either implicitly or explicitly read the  
value of the IPR access the second register. Read-after-write and write-after-write  
dependencies are managed using the IPR scoreboard bits. To avoid write-after-read  
conflicts, the second register is not written until the writer is retired. The writer will not  
be retired until the previous reader is retired, and the reader is retired after it has read its  
value from the second register.  
Some groups of IPRs are built using a single shared first register. To prevent write-  
after-write conflicts, IPRs that share a first register also share scoreboard bits.  
6–8  
Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Internal Processor Register Access Mechanisms  
6.5.3 Hardware Structure of Implicitly Written IPRs  
Implicitly written IPRs are physically built using only a single level of register, how-  
ever the IPR has two hardware states associated with it:  
1. Default State—The contents of the register may be written when an instruction gen-  
erates an exception. If an exception occurs, write a new value into the IPR and go to  
state 2.  
2. Locked State—The contents of the register may only be overwritten by an except-  
ing instruction that is older than the instruction associated with the contents of the  
IPR. If such an exception occurs, overwrite the value of the IPR. When the trigger-  
ing instruction, or instruction that is older than the triggering instruction, is killed  
by the Ibox, go to state 1.  
6.5.4 IPR Access Ordering  
IPR access mechanisms must allow values to be passed through each IPR from a pro-  
ducer to its intended consumers.  
Table 6–7 lists all of the paired instruction orderings between instructions of the four  
IPR access types. It specifies whether access order must be maintained, and if so, the  
mechanisms used to ensure correct ordering.  
Table 6–7 Paired Instruction Fetch Order  
Second  
Instruction  
First Instruction  
Explicit Reader  
No IPRs in this class. Read transactions can A variety of mechanisms are  
Implicit Reader Implicit Writer  
Explicit Writer  
Implicit  
Reader  
Read transac-  
tions can be  
reordered.  
be reordered.  
used to ensure order:  
scoreboard bits to stall issue of  
reader; HW_RET_STALL to  
stall reader; double write plus  
buffer blocks to force retire and  
allow for propagation delay.  
Implicit  
Writer  
No IPRs in this The hardware struc- IPR-specific PALcode No IPRs in this class.  
class.  
ture of implicitly  
restrictions are  
written IPRs handles required for this case.  
this case.  
An interlock mecha-  
nism must be placed  
between the explicit  
reader and the implicit  
writer (a read transac-  
tion).  
Explicit  
Reader  
Read transac-  
tions can be  
reordered.  
If the reader is in the Read transactions can Scoreboard bits stall issue of  
PALcode routine  
invoked by the  
exception associated  
with the writer, then  
ordering is guaran-  
teed.  
be reordered.  
reader until writer is retired.  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code  
6–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Internal Processor Register Access Mechanisms  
Table 6–7 Paired Instruction Fetch Order (Continued)  
Second  
Instruction  
First Instruction  
Explicit  
Writer  
Reader reads  
Write-one-to-clear Reader reads second  
Scoreboard bits stall second  
second register. bits, or performance register. Writer cannot writer in map stage until first  
Writer cannot  
write second  
counter special case. write second register  
For example, perfor- until it is retired.  
writer is retired.  
register until it mance counter incre-  
is retired.  
ments are typically  
not scoreboarded  
against read transac-  
tions.  
For convenience of implementation, there is no IPR scoreboard bit checking within the  
same fetch block (octaword-aligned octaword).  
Within one fetch block, there can be only one explicit writer (HW_MTPR) to an  
IPR in a particular scoreboard group.  
Within one fetch block, an explicit writer (HW_MTPR) to an IPR in a particular  
scoreboard group cannot be followed by an explicit reader (HW_MFPR) to an IPR  
in that same scoreboard group.  
Within one fetch block, an explicit writer (HW_MTPR) to an IPR in a particular  
scoreboard group cannot be followed by an implicit reader to an IPR in that score-  
board group. This case covers writes to DTB_PTE or DTB_TAG followed by a  
LD, ST, or any memory operation, including HW_RETs without the ‘stall’ bit set.  
6.5.5 Correct Ordering of Explicit Writers Followed by Implicit Readers  
Across fetch blocks, the correct ordering of the explicit write of the DTB_PTE or  
DTB_TAG followed by an implicit reader (memory operation) is guaranteed using the  
IPR scoreboard bits.  
However, there are cases where correct ordering of explicit writers followed by implicit  
readers cannot be guaranteed using the IPR scoreboard mechanism. If the instruction  
that implicitly reads the IPR does so before the issue stage of the pipeline, the score-  
board mechanism is not sufficient.  
For example, modification of the ITB affects instructions before the issue state of the  
pipeline. In this case, PALcode must contain a HW_RET instruction, with its stall bit  
set, before any instruction that implicitly reads the IPR(s) in question. This prevents  
instructions that are newer than the HW_RET instruction from being successfully  
fetched, issued, and retired until after the HW_RET instruction is retired (or aborted).  
There are also cases when the HW_RET with the STALL bit mechanism is not suffi-  
cient. There may be additional propagation delay past the retirement of the HW_RET  
instruction. In these cases, instead of using a HW_RET, a suggested method of ensur-  
ing the ordering is coding a group of 5 fetch blocks, where the first contains the  
HW_MTPR to the IPR, the second contains a HW_MTPR to the same IPR or one in the  
same scoreboard group, and where the following 3 fetch blocks each contain at least  
one non-NOP instruction. See Appendix D for a listing of cases where this method is  
recommended.  
6–10 Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
PALshadow Registers  
6.5.6 Correct Ordering of Explicit Readers Followed by Implicit Writers  
Certain IPRs that are updated as a result of faulting memory operations require PAL-  
code assistance to maintain ordering against newer instructions. Consider the following  
code sequence:  
HW_MFPR IPR_MM_STAT  
LDQ rx,(ry)  
It is typically the case that these instructions would issue in-order:  
The MFPR is data-ready and both instructions use a lower subcluster. However, the  
HW_MFPRs (and HW_MTPRs) respond to certain resource-busy indications and  
do not issue when the MBOX informs the IBOX that a certain set of resources  
(store bubbles) are busy.  
The LDs respond to a different set of resource-busy indications (load-bubbles) and  
could issue around the HW_MFPR in the presence of the former. PALcode assis-  
tance is required to enforce the issue order.  
One totally reliable method is to insert an MB (memory barrier) instruction before the  
first load that occurs after the HW_MFPR MM_STAT. Another method would be to  
force a register dependency between the HW_MFPR and the LD.  
6.6 PALshadow Registers  
The 21264/EV68A contains eight extra virtual integer registers, called shadow regis-  
ters, which are available to PALcode for use as scratch space and storage for com-  
monly used values. These registers are made available under the control of the SDE[1]  
field of the I_CTL IPR. These shadow registers overlay R4 through R7 and R20  
through R23, when the CPU is in PALmode and SDE[1] is set.  
PALcode generally runs with shadow mode enabled. Any PALcode that supports  
CALL_PAL instructions must run in that mode because the hardware writes a  
PALshadow register with the return address of CALL_PAL instructions.  
PALcode may occasionally be required to toggle shadow mode to obtain access to the  
overlayed registers. See the PALcode restriction, Updating I_CTL[SDE], in Section  
6.7 PALcode Emulation of the FPCR  
The FPCR register contains status and control bits. They are accessed by way of the  
MT_FPCR and MF_FPCR instructions. The register is physically implemented like an  
explicitly written IPR. It may be written with a value from the floating-point register  
file by way of the MT_FPCR instruction. Architecturally-compliant FPCR behavior  
requires PALcode assistance. The FPCR register must operate as listed here:  
1. Correct operation of the status bits, which must be set when a floating-point  
instruction encounters an exceptional condition, independent of whether a trap for  
the condition is enabled.  
2. Correct values must be returned when the FPCR is read by way of a MF_FPCR  
instruction.  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code 6–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
PALcode Entry Points  
3. Correct actions must occur when the FPCR is written by way of a MT_FPCR  
instruction.  
6.7.1 Status Flags  
The FPCR status bits in the 21264/EV68A are set with PALcode assistance. Floating-  
point exceptions, for which the associated FPCR status bit is clear or for which the  
associated trap is enabled, result in a hardware trap to the ARITH PALcode routine. The  
EXC_SUM register contains information to allow this routine to update the FPCR  
appropriately, and to decide whether to report the exception to the operating system.  
6.7.2 MF_FPCR  
The MF_FPCR is issued from the floating-point queue and executed by the Fbox. No  
PALcode assistance is required.  
6.7.3 MT_FPCR  
The MT_FPCR instruction is issued from the floating-point queue. This instruction is  
implemented as an explicit IPR write operation. The value is written into the first latch,  
and when the instruction is retired, the value is written into the second latch. There is no  
IPR scoreboarding mechanism in the floating-point queue, so PALcode assistance is  
required to ensure that subsequent readers of the FPCR get the updated value.  
After writing the first latch, the MT_FPCR instruction invokes a synchronous trap to  
the MT_FPCR PALcode entry point. The PALcode can return using a HW_RET  
instruction with its STALL bit set. This sequence ensures that the MT_FPCR instruc-  
tion will be correctly ordered for subsequent readers of the FPCR.  
6.8 PALcode Entry Points  
PALcode is invoked at specific entry points, of which there are two classes:  
CALL_PAL and exceptions.  
6.8.1 CALL_PAL Entry Points  
CALL_PAL entry points are used whenever the Ibox encounters a CALL_PAL instruc-  
tion in the Istream. To speed the processing of CALL_PAL instructions, CALL_PAL  
instructions do not invoke pipeline aborts but are processed as normal jumps to the off-  
set from the contents of the PAL_BASE register, which is specified by the CALL_PAL  
instruction’s function field.  
The Ibox fetches a CALL_PAL instruction, bubbles one cycle, and then fetches the  
instructions at the CALL_PAL entry point. For convenience of implementation, returns  
from CALL_PAL are aided by a linkage register (much like JSRs). PALshadow regis-  
ter R23 is used as the linkage register. The Ibox loads the PC of the instruction after the  
CALL_PAL instruction, into the linkage register. Bit [0] of the linkage register is set if  
the CALL_PAL instruction was executed while the processor was in PALmode.  
The Ibox pushes the value of the return PC onto the return prediction stack.  
CALL_PAL instructions start at the following offsets:  
Privileged CALL_PAL instructions start at offset 2000 .  
16  
Nonprivileged CALL_PAL instructions start at offset 3000 .  
16  
6–12 Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
PALcode Entry Points  
Each CALL_PAL instruction includes a function field that is used to calculate the PC of  
its associated PALcode entry point. The PALcode OPCDEC exception flow will be  
invoked if the CALL_PAL function field satisfies any of the following requirements:  
Is in the range of 40 to 7F inclusive  
16 16  
Is greater than BF  
16  
Is between 00 and 3F inclusive, and IER_CM[CM] is not equal to the kernel  
16  
16  
mode value 0  
If none of the conditions above are met, the PALcode entry point PC is as follows:  
PC[63:15] = PAL_BASE[63:15]  
PC[14] = 0  
PC[13] = 1  
PC[12] = CALL_PAL function field [7]  
PC[11:6] = CALL_PAL function field [5:0]  
PC[5:1] = 0  
PC[0] = 1 (PALmode)  
6.8.2 PALcode Exception Entry Points  
When hardware encounters an exception, Ibox execution jumps to a PALcode entry  
point at a PC determined by the type of exception. The return PC of the instruction that  
triggered the exception is placed in the EXC_ADDR register and onto the return predic-  
tion stack.  
Table 6–8 shows the PALcode exception entry locations and their offset from the  
PAL_BASE IPR.  
Table 6–8 PALcode Exception Entry Locations  
Entry Name  
Type  
Offset  
Description  
16  
DTBM_DOUBLE_3  
Fault  
100  
Dstream TB miss on virtual page table entry fetch. Use three-  
level flow.  
DTBM_DOUBLE_4  
Fault  
180  
Dstream TB miss on virtual page table entry fetch. Use four-  
level flow.  
FEN  
Fault  
Fault  
Fault  
Fault  
Fault  
200  
280  
300  
380  
400  
Floating point disabled.  
UNALIGN  
DTBM_SINGLE  
DFAULT  
Unaligned Dstream reference.  
Dstream TB miss.  
Dstream fault or virtual address sign check error.  
OPCDEC  
Illegal opcode or function field:  
• Opcode 1, 2, 3, 4, 5, 6 or 7  
• Opcode 19 , 1B , 1D , 1E or 1F , not PALmode or  
16  
16  
16  
16  
16  
not I_CTL[HWE]  
• Extended precision IEEE format  
• Unimplemented function field of opcodes 14 or 1C  
16  
16  
IACV  
Fault  
480  
Istream access violation or virtual address sign check error.  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code 6–13  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                     
Translation Buffer (TB) Fill Flows  
Table 6–8 PALcode Exception Entry Locations (Continued)  
Entry Name  
MCHK  
Type  
Offset  
500  
Description  
16  
Interrupt  
Fault  
Machine check.  
ITB_MISS  
ARITH  
580  
Istream TB miss.  
Synch. Trap 600  
Interrupt 680  
Synch. Trap 700  
Interrupt 780  
Arithmetic exception or update to FPCR.  
Interrupts: hardware, software, and AST.  
Invoked when a MT_FPCR instruction is issued.  
Chip reset or wake-up from sleep mode.  
INTERRUPT  
MT_FPCR  
RESET/WAKEUP  
6.9 Translation Buffer (TB) Fill Flows  
This section shows the expected PALcode flows for DTB miss and ITB miss. Familiar-  
ity with 21264/EV68A IPRs is assumed.  
6.9.1 DTB Fill  
Figure 6–5 shows single-miss DTB instructions flow.  
Figure 6–5 Single-Miss DTB Instructions Flow Example  
hw_mfprp23, EV6__EXC_ADDR  
hw_mfprp4, EV6__VA_FORM  
hw_mfprp5, EV6__MM_STAT  
hw_mfpr p7, EV6__EXC_SUM  
; (0L) get exception address  
; (4-7,1L) get vpte address  
; (0L) get miss info  
; (0L) get exc_sum for ra  
hw_mfpr p6, EV6__VA  
bic p7, #1, p7  
; (4-7,1L) get original va  
; clear double miss flag  
; interlock p4 and p6  
; restore p4  
xor p4, p6, p4  
xor p4, p6, p4  
trap__dtbm_single_vpte:  
hw_ldq/v p4, (p4)  
; (1L) get vpte  
blt p_misc, trap__d1to1  
blbcp4, trap__invalid_dpte  
; (xU) <63>=1 => 1-to-1  
; (xU) invalid => branch  
and p4, #^x80, p7  
xor p7, #^x80, p7  
; isolate mb bit  
; flip mb bit  
ALIGN_FETCH_BLOCK <^x47FF041F>  
PVC_VIOLATE <2>  
; ignore scoreboard violation  
; (2&6,0L) write tag0  
hw_mtprp6, EV6__DTB_TAG0  
hw_mtpr p6, EV6__DTB_TAG1  
hw_mtprp4, <EV6__DTB_PTE0 ! ^x44>  
hw_mtprp4, <EV6__DTB_PTE1 ! ^x22>  
; (1&5,1L) write tag1  
; (0,4,2,6) (0L) write pte0  
; (3,7,1,5) (1L) write pte1  
6–14 Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                       
Translation Buffer (TB) Fill Flows  
ASSUME <tb_mb_en + pte_eco> ne 2  
.if ne pte_eco  
bne p7, trap__dtbm_single_mb  
hw_ret (p23)  
; branch for mb  
; return  
trap__dtbm_single_mb:  
mb  
hw_ret(p23)  
; return  
.iff  
hw_ret(p23)  
; return  
; (assumes tb_mb_en on multi-processors)  
.endc  
The following list presents information about the single-miss DTB code example:  
In Figure 6–5, where (x,y) or (y) appear in the comments, x specifies the scoreboard  
bits and y specifies the Ebox subcluster.  
r4 r7 and r20 r23 are PALshadow registers.  
PALshadow r22 contains a flag that indicates whether the native code is running  
“1to1”, that is, running in a mode where the physical address should be mapped  
1to1 to the virtual address, rather than being taken from a page table.  
IPR scoreboard bits [3:0] are used to order the restarted load or store instructions  
for the DTB write transactions.  
MM_STAT and VA will not be overwritten if the LD_VPTE instruction misses the  
DTB. There is no issue order constraint.  
The code is written to prevent a later execution of the DTB fill instruction from  
being issued before a previous execution and corrupting the previous write to the  
TB registers. The correct sequence of executions is accomplished by placing code  
dependencies on scoreboard bits [7:4] in the path of the successive writers. This  
prevents the successive writers from being issued before the previous writers are  
retired.  
When I_CTL[TB_MB_EN] = 1, the issue of MTPR DTB_PTE0 triggers, in hard-  
ware, a lightweight memory barrier (TB-MB). The lightweight memory barrier  
enforces read-ordering of store instructions from another processor (I) to this pro-  
cessor’s (J) page table and this processor’s virtual memory area such that if this  
processor sees the write to the PTE from (I) it will see the new data.  
Processor I  
Processor J  
LD/ST  
Wr Data  
MB  
<tb miss>  
Wr PTE  
LD-PTE, write TB  
LD/ST  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code 6–15  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Translation Buffer (TB) Fill Flows  
The conditional branch is placed in the code so that all of the MTPR instructions  
are issued and retired or none of them are issued and retired. This allows the TB fill  
hardware to update the TB whenever it sees the retiring of PTE1 and to ignore  
writes to TAG0/TAG1/PTE0/PTE1 in the interim between the issuing of those  
writes and a retire of PTE1.  
As an alternative to using I_CTL[TB_MB_EN] = 1 to enforce read ordering,  
I_CTL[TB_MB_EN] can be set to 0 and the PALcode may use a bit in the PTE to  
indicate whether to do an explicit MB.  
The flow example in Figure 6–5 shows the code using pte_eco and the code not  
using pte_eco. It assumes the following:  
In a multi-processor configuration, if pte_eco is not enabled, it is necessary to  
enable tb_mb_en.  
In a uni-processor configuration, if pte_eco is not enabled, it is not necessary to  
enable tb_mb_en.  
At no time should pte_eco and tb_mb_en both be enabled.  
The value in DTB_PTEx[GH] determines whether the scoreboard mechanism alone  
is sufficient to guarantee all subsequent load/store instructions (implicit readers of  
the DTB) are ordered relative to the creation of a new DTB entry; whether all sub-  
sequent loads and stores to the loaded address will hit in the DTB.  
If DTB_PTEx[GH] is zero, the scoreboard mechanism alone is sufficient.  
If DTB_PTEx[GH] is not zero, the scoreboard mechanism alone is not suffi-  
cient (although this is not a problem). In this case, the new DTB entry is not  
visible to subsequent load/store instructions until after the MTPR DTB_PTE1  
retires.  
Issuing a HW_RET_STALL instead of a HW_RET would guarantee ordering,  
but is not necessary. Code executes correctly without the stall although execu-  
tion might result in two passes through the DTB miss flow, rather than one,  
because the re-execution of the memory operation after the first DTB miss  
might miss again.  
This behavior is functionally correct because DTB loads that tag-match an  
existing DTB entry are ignored by the 21264/EV68A and the second DTB miss  
execution will load exactly the same entry as the first.  
6.9.2 ITB Fill  
Figure 6–6 shows the ITB miss instructions flow.  
Figure 6–6 ITB Miss Instructions Flow Example  
hw_mfpr  
hw_mfpr  
lda  
r4,  
r23, EV6__EXC_ADDR  
r6, ^x0FFF(r31)  
r31, r31, r31  
EV6__IVA_FORM  
; (0L) get vpte address  
; (0L) get exception address  
; (xU) create mask for prot  
; (xU) fill out fetch block  
bis  
trap__itb_miss_vpte:  
hw_ldq/v r4,  
(r4)  
; (xL) get vpte  
and  
r4, r6, r5  
; (xL) get prot bits  
; (xU) 1-to-1 => branch  
blt p_misc, trap__i1to1  
6–16 Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Performance Counter Support  
srl  
r4,  
r6,  
r4,  
r4,  
r7,  
#OSF_PTE__PFN__S, r6  
#EV6__ITB_PTE__PFN__S, r6  
#<1@OSF_PTE__FOE__S>, r7  
trap__invalid_ipte  
trap__foe  
; (xU) shift PFN to <0>  
; (xU) shift PFN into place  
; (xL) get FOE bit  
sll  
and  
blbc  
bne  
; (xU) invalid => branch  
; (xU) FOE => branch  
; check for mb bit  
srl  
r4, #7, r7  
bis  
r5, r6, r6  
; (xL) PTE in ITB format  
; (6,0L) write tag  
hw_mtpr  
hw_mtpr  
r23, EV6__ITB_TAG  
r6,  
EV6__ITB_PTE  
; (0&4,0L) write PTE  
ASSUME <tb_mb_en + pte_eco> ne 2  
.if ne pte_eco  
blbc  
r7, trap__itb_miss_mb  
; branch for mb  
hw_ret_stall (r23); (0L)  
trap__itb_miss_mb:  
mb  
.endc  
hw_ret_stall (r23)  
; (0L)  
The following list presents information about the ITB miss flow code example:  
In Figure 6–6, where (x,y) or (y) appear in the comments, x specifies the scoreboard  
bits and y specifies the Ebox subcluster.  
The ITB is only accessed on Icache misses.  
r4 r7 and r20 r23 are PALshadow registers.  
PALshadow r22 contains a flag that indicates whether the native code is running  
“1to1”, that is, running in a mode where the physical address should be mapped  
1to1 to the virtual address, rather than being taken from a page table.  
The HW_RET instruction should have its STALL bit set to ensure that the restarted  
Istream does not read the ITB until the ITB is written.  
As an alternative to using I_CTL[TB_MB_EN] = 1 to enforce read ordering,  
I_CTL[TB_MB_EN] can be set to 0 and the PALcode may use a bit in the PTE to  
indicate whether to do an explicit MB. The flow example in Figure 6–6 assumes  
this alternative.  
6.10 Performance Counter Support  
The 21264/EV68A provides hardware support for two methods of obtaining program  
performance feedback information. The two methods do not require program modifica-  
tion. Instead, performance monitoring utilities make calls to the PALcode to set up the  
counters and contain interrupt handlers that call PALcode to retrieve the collected data.  
The first method, Aggregate mode, offers capabilities that are similar to earlier micro-  
processor performance counters. This mode counts events when enabled, until it over-  
flows, causing an interrupt that can retrieve the collected data. The second method,  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code 6–17  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Performance Counter Support  
ProfileMe mode, supports a new way of statistically sampling individual instructions  
during program execution. This mode counts events triggered by a targeted inflight  
instruction.  
Counter support uses the hardware registers listed in Table 6–9.  
Table 6–9 IPRs Used for Performance Counter Support  
Register Name  
Mnemonic Relevant Fields  
Described in Section  
ProfileMe PC  
PMPC  
All fields  
5.2.6  
5.2.9  
Interrupt enable and current proces- IER_CM  
sor mode  
PCEN[1:0]  
Interrupt summary  
Ibox control  
ISUM  
PC[1:0]  
5.2.11  
5.2.15  
5.2.16  
I_CTL  
I_STAT  
SPCE, PCT0_EN, PCT1_EN  
Ibox status  
OVR, ICM, TRAP-TYPE,  
LSO, TRP, MIS  
Ibox process context  
PCTX  
PPCE  
5.2.21  
5.2.22  
Performance counter support  
PCTR_CTL All fields  
6.10.1 General Precautions  
Initialize both counters, (PCTR_CTL[PCTR0 and PCTR1]), to zero in reset PALcode  
to avoid spurious interrupts when exiting initial PALcode. Counters must be written  
twice during initialization to ensure that the overflow latch has been cleared (see the  
PALcode restrictions in Sections D.28 and D.34).  
The counters should never be left within one cycle of overflow when disabled because  
that can cause some interrupts to be blocked in anticipation of an overflow interrupt  
(see PALcode restriction 32).  
If a counter is at the overflow threshold and a value is written to that counter, the  
counter signals an overflow interrupt upon leaving PALmode, even if that counter is  
disabled. To avoid that interrupt, the PALcode should clear the interrupt by writing to  
HW_INT_CLR.  
Interrupts are disabled in PALmode.  
As a quirk of the implementation, while counting is disabled, a read of PCTR_CTL can  
yield value+some increment, where value is the actual value in PCTR_CTL, and incre-  
ment for PCTR0 is in the range 0..4 (retired instructions in that cycle), and increment  
for PCTR1 is dependent on SL1.  
6.10.2 Aggregate Mode Programming Guidelines  
Use the following information to program counters in Aggregate mode.  
6.10.2.1 Aggregate Mode Precautions  
Counters continue to count after overflow.  
Only the counters return useful data. See Table 6–11 for counting modes.  
Counters can be read by a PALcode instruction at any time to get the aggregate count.  
6–18 Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Performance Counter Support  
The legal range for PCTR0 when writing the IPR is 0:(2**20-16).  
The legal range for PCTR1 when writing the IPR is 0:(2**20-4).  
6.10.2.2 Operation  
1. Setup  
The following IPRs need to be set up by PALcode instructions.  
IPR Name  
IER_CM  
PCTX  
Relevant Fields Meaning  
PCEN[1:0]  
PPCE  
SL0  
Enable Interrupts.  
Enable Process Performance Counting or use I_CTL[SPCE].  
PCTR_CTL  
Selects Aggregate or ProfileMe mode; set to 0 for Aggregate mode.  
SL1  
Selects PCTR0 and PCTR1 counting modes. See Table 6–11 for more infor-  
mation.  
PCTR0[19:0]  
PCTR1[19:0]  
Set counter 0 starting value [0:(2**20-16)]. See Section 6.10.1 for setup  
precautions.  
Set counter 1 starting value [0:(2**20-4)]. See Section 6.10.1 for setup pre-  
cautions.  
I_CTL  
SPCE  
Enable System Performance Counting or use PCTX[PPCE].  
Enable performance counter 0.  
PCT0_EN  
PCT1_EN  
Enable performance counter 1.  
2. Count  
If PCTR0 and PCTR1 are enabled, will increment according to modes selected by  
SL0 and SL1.  
3. Overflow  
If PCEN[1:0] is enabled, PC[1:0] is set when PCTR0 or PCTR1 overflows.  
4. Hardware interrrupt  
When PC[1:0] is set, the PALcode interrupt routine is entered. Interrupt is acknowl-  
edged and PALcode generates an interrupt to the operating system performance  
monitoring utility.  
5. Operating system interrupt handler  
The handler should read the IPR PCTR_CTL, as shown in Table 6–10, to note  
which counter overflowed in the handler's data structures. The handler may read the  
counter to see how many events have happened since the overflow.  
The handler may also choose to write the counters to control the frequency of inter-  
rupts.  
Table 6–10 Aggregate Mode Returned IPR Contents  
IPR  
Field  
Contents  
PCTR_CTL  
PCTR0[19:0]  
PCTR1[19:0]  
Counter #0 value  
Counter #1 value  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code 6–19  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Performance Counter Support  
6.10.2.3 Aggregate Counting Mode Description  
6.10.2.3.1 Cycle counting  
Counts cycles.  
PCTR0 is incremented by the number of cycles counted, that is, 1.  
6.10.2.3.2 Retired instructions cycles  
PCTR0 is incremented by up to 8 retired instructions per cycle when enabled via  
I_CTL[PCT0_EN] and either I_CTL[SPCE] or PCTX[PPCE]. On overflow, an inter-  
rupt is triggered as ISUM[PC0] if enabled via IER_CM[PCEN0].  
The 21264/EV68A can retire up to 11 instructions per cycle, which exceeds PCTR0's  
maximum increment of 8 per cycle. However, no retires go uncounted because the  
21264/EV68A cannot sustain 11 retires per cycle, and the 21264/EV68A corrects  
PCTR0 in subsequent cycles.  
A squashed instruction does not count as a retire.  
6.10.2.3.3 Bcache miss or long latency probes cycles  
This input counts the number of times the Bcache result was a miss.  
Essentially, a long latency probe is a data request from other processes that cause  
Bcache misses in a system.  
This count is phase shifted three cycles early and thus includes events that occurred  
three cycles before the start and before the end of the ProfileMe window.  
6.10.2.3.4 Mbox replay traps cycles  
This input counts Mbox replay traps.  
6.10.2.4 Counter Modes for Aggregate Mode  
Table 6–11 shows the counter modes that are used with Aggregate mode.  
Table 6–11 Aggregate Mode Performance Counter IPR Input Select Fields  
SL0[4]  
SL1[3:2] PCTR0  
PCTR1  
0
0
0
0
00  
01  
10  
11  
Retired instructions  
Cycle counting  
Cycle counting  
Retired instructions  
Cycle counting  
Not defined  
Bcache miss or long latency probes  
Mbox replay traps  
6.10.3 ProfileMe Mode Programming Guidelines  
Use the following information to program counters in ProfileMe mode.  
6.10.3.1 ProfileMe Mode Precautions  
Squashed NOPs count as valid fetched instructions.  
Counter 1 must be explicitly cleared in the trap handler before each data collection.  
6–20 Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                     
Performance Counter Support  
The CMOV instruction is decomposed into two valid fetched instructions that, in the  
absence of stalls, are fetched in consecutive cycles. See Table 6–12 for more informa-  
tion.  
Table 6–12 CMOV Decomposed  
Instruction  
New Instructions  
CMOV Ra, Rb--> Rc  
CMOV1 Ra, oldRc −−> newRc1  
CMOV2 newRc1, Rb −−> newRc2  
6.10.3.2 Operation  
1. Setup  
The following IPRs need to be set up by using PALcode instructions.  
IPR Name  
IER_CM  
PCTX  
Relevant Fields Meaning  
PCEN[1:0]  
PPCE  
Enable Interrupts.  
Enable Process Performance Counting or use I_CTL[SPCE].  
Selects Aggregate or ProfileMe mode; set to 1 for ProfileMe mode.  
PCTR_CTL SL0  
SL1  
Selects PCTR0 and PCTR1 counting modes. See Table 6–14 for more infor-  
mation.  
PCTR0[19:0]  
Set counter 0 value (2**20-N). This selects approximately the Nth valid  
fetched instruction as the profiled instruction. Because writes to PCTR0 are  
incremented by 0..4, the profiled instruction is one of the (N-4)th to Nth valid  
fetched instructions. See Section 6.10.1 for more setup precautions.  
PCTR1[19:0]  
SPCE  
Set counter 1 value = 0. See Section 6.10.1 for more setup precautions.  
Enable System Performance Counting or use PCTX[PPCE].  
Enable performance counter 0.  
I_CTL  
PCT0_EN  
PCT1_EN  
Enable performance counter 1.  
2. Open window  
PCTR0 accumulates up to 4 valid fetched instructions per cycle when enabled via  
I_CTL[PCT0_EN] and either I_CTL[SPCE] or PCTX[PPCE].  
The valid fetched instruction that causes PCTR0 to overflow opens the window and  
becomes the profiled instruction and covers a period of time near to when the  
instruction was in flight. The first cycle of the window is the 5th cycle after the  
instruction was fetched. A residual count of up to 7 valid fetched instructions is  
accumulated in PCTR0 in the two cycles between overflow and the start of the Pro-  
fileMe window. This residual count is returned in I_STAT[overcount(2,0)].  
3. Count  
If PCTR0 and PCTR1 are enabled, they increment according to modes selected by  
SL0 & SL1.  
4. End window  
The last cycle of the window depends on whether the instruction traps, retires,  
aborts, and/or is squashed by the fetcher.  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code 6–21  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Performance Counter Support  
For instructions that cause a trap, the last cycle in the window is the 2nd cycle after  
the trap. Mispredicted branches are included in this category.  
For nontrapping instructions that retire, the last cycle in the window is the 2nd  
cycle after the instruction retires.  
For instructions that abort, the last cycle in the window is the 2nd cycle after the  
trap that caused the abort.  
For instructions that are squashed (such as TRAPB), the last cycle in the window is  
approximately the 2nd cycle after the squashed instruction would have aborted or  
retired.  
Every non-squashed valid fetched instruction either aborts or retires, but not both.  
In either case, the instruction may also trap.  
PCTR0 is disabled from counting until PCTR_CTL is next written.  
5. Interrupt PALcode  
When ISUM field PC[1:0] is set, execution of PCTR0's or PCTR1's interrupt PAL-  
code is performed.  
6. Operating system interrupt handler  
The handler should first read the IPRs in Table 6–13 and then write PCTR_CTL to set  
up the next interrupt.  
Table 6–13 ProfileMe Mode Returned IPR Contents  
IPR Name  
PMPC[63:0] All  
I_STAT ICM  
Relevant Fields  
Meaning  
Profiled PC.  
Instruction was in a new Icache fill stream.  
TRP  
Instruction caused a trap and was not in the shadow of  
a younger trapping instruction.  
MIS  
Conditional branch mispredict.  
Exception type code.  
TRAP TYPE  
LSO  
Load-store order replay trap.  
Counter 0 overcount.  
OVR  
PCTR_CTL VAL  
TAK  
Instruction retired valid.  
Branch direction if instruction is a conditional branch.  
PM_STALLED  
Instruction stalled for at least one cycle between fetch  
and map stages of pipeline.  
PM_KILLED_BM  
Instruction killed during or before cycle in which it  
was mapped.  
PCTR0[19:0]  
PCTR1[19:0]  
Counter 0 value.  
Counter 1 value.  
6–22 Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Performance Counter Support  
6.10.3.3 ProfileMe Counting Mode Description  
6.10.3.3.1 Cycle counting  
In ProfileMe mode, either counter counts cycles during the window of the profiled  
instruction.  
6.10.3.3.2 Inum retire delay cycles  
This input is used to measure a lower bound on the inum retire delay of the profiled  
instruction. The maximum final value of PCTR1 is the length of the ProfileMe window  
minus 2.  
Counts cycles that a profiled instruction delayed the retire pointer advance during the  
ProfileMe window. The 21264/EV68A tracks instructions in the pipeline by allocating  
them "inums" near the front of the pipeline. All inums are retired in the order in which  
they were allocated at the end of the pipeline.  
Inums are allocated in batches of four, so there may be more inums allocated than there  
are program instructions in flight. Every inum is retired in order, including those for  
aborted instructions.  
The "retire pointer" points to the next inum to be retired. An inum retires in the cycle  
that the retire pointer advances past the inum.  
Let X and Y be consecutive inums in the allocation order. The "inum retire delay" of Y  
is [(cycle in which Y retired) – (cycle in which X retired)]. A large inum retire delay  
indicates a possible performance bottleneck (for example, an instruction stalled on a  
data cache miss).  
6.10.3.3.3 Retired instructions cycles  
When counting retired instructions in ProfileMe mode, the final count in PCTR0 may  
include instructions that retired before the ProfileMe window and may exclude instruc-  
tions that retired near the end of the ProfileMe window. These discrepancies are caused  
by a variable delay between the time that an instruction retires and the time that PCTR0  
is incremented for that retire. This discrepancy is in the range of plus or minus 4 retired  
instructions.  
6.10.3.3.4 Bcache miss or long latency probes cycles  
This input counts the number of times the Bcache result was a miss.  
Essentially, a long latency probe is a data request from other processes that cause  
Bcache misses in a system.  
This count is phase shifted three cycles early and thus includes events that occurred  
three cycles before the start and before the end of the ProfileMe window.  
6.10.3.3.5 Mbox replay traps cycles  
This input counts Mbox replay traps.  
PCTR1 is enabled to count Mbox replay traps that occur during a window that is the  
ProfileMe window phase-shifted one cycle later. The first replay trap counted would be  
the 7th cycle after the instruction is fetched.  
21264/EV68A Hardware Reference Manual  
Privileged Architecture Library Code 6–23  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Performance Counter Support  
6.10.3.4 Counter Modes for ProfileMe Mode  
Table 6–14 shows the counter modes that are used with ProfileMe mode.  
Table 6–14 ProfileMe Mode PCTR_CTL Input Select Fields  
SL0[4] SL1[3:2] PCTR0  
PCTR1  
1
1
1
1
00  
01  
10  
11  
Retired instructions  
Cycle counting  
Cycle counting  
Inum retire delay  
Bcache miss or long latency probes  
Mbox replay traps  
Retired instructions  
Cycle counting  
6–24 Privileged Architecture Library Code  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
7
Initialization and Configuration  
This chapter provides information on 21264/EV68A-specific microprocessor system  
initialization and configuration. It is organized as follows:  
Power-up reset flow  
Fault reset flow  
Energy star certification and sleep mode flow  
Warm reset flow  
Array initialization  
Initialization mode processing  
External interface initialization  
Internal processor register (IPR) reset state  
IEEE 1149.1 test port reset  
Reset state machine state transitions  
Phase-locked loop (PLL) functional description  
Initialization is controlled by the reset state machine, which is responsible for four  
major operations. Table 7–1 describes the four major operations.  
Table 7–1 21264/EV68A Reset State Machine Major Operations  
Operation  
Function  
Ramp up  
Sequence the PLL input and output dividers (X and Z ) to gradually raise the internal  
div  
div  
GCLK frequency and generate time intervals for the PLL to re-establish lock.  
BiST/SROM  
Receive a synchronous transfer on the ClkFwdRst_H pin in order to start built-in self-test and  
SROM load at a predictable GCLK cycle.  
Clock forward Receive a synchronous transfer on the ClkFwdRst_H pin in order to initialize the clock for-  
interface  
warding interface.  
Ramp down  
Sequence the PLL input and output dividers (X and Z ) to gradually lower the internal  
div  
div  
GCLK frequency during sleep mode.  
7.1 Power-Up Reset Flow and the Reset_L and DCOK_H Pins  
The 21264/EV68A reset sequence is triggered using the two input signals Reset_L and  
DCOK_H in a sequence that is described in Section 7.1.1. After Reset_L is deasserted,  
the following sequence of operations takes place:  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration  
7–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Power-Up Reset Flow and the Reset_L and DCOK_H Pins  
1. The clock forwarding and system clock ratio configuration information is loaded  
onto the 21264/EV68A. See Section 7.1.2.  
2. The internal PLL is ramped up to operating frequency.  
3. The internal arrays built-in self-test (BiST) is run, followed by Icache initialization  
using an external serial ROM (SROM) interface.  
The 21264/EV68A systems, unlike the Alpha 21064 and 21164 microprocessor  
systems, are required to have an SROM. The SROM provides the only means to  
configure the system port, and the SROM pins can be used as a software-controlled  
UART.  
The Icache must contain PALcode that starts at location 0x780. This code is used to  
configure the 21264/EV68A IPRs as necessary before causing any offchip read or  
write commands. This allows the 21264/EV68A to be configured to match the  
external system implementation.  
4. After configuring the 21264/EV68A, control can be transferred to code anywhere  
in memory, including the noncacheable regions. The Icache can be flushed by a  
write operation to the ITB invalidate-all register after control is transferred. This  
transfer of control should be to addresses not loaded in the Icache by the SROM  
interface or the Icache may provide unexpected instructions.  
5. Typically, any state required by the PALcode is initialized and then the console is  
started (switching out of PALmode and into native mode). The console code initial-  
izes and configures the system and boots an operating system from an I/O device  
such as a disk or the network.  
Figure 7–1 shows the sequence of events at power-up, or cold reset. In Figure 7–1, note  
the following symbols for constraints and information:  
Constraints:  
A
B
Setup (A0) and hold (A1) for IRQ’s to be latched by DCOK (2 ns for each).  
Enough time for Reset_L to propagate through 5 stages of RESET synchronizer (clocked by the inter-  
nal framing clock, which is driven by EV6Clk_x). Worst case for the 21264/EV68A would be 5x8x15  
= 600 GCLK cycles.  
C
Min = 1 FrameClk cycle.  
Information:  
a
b
c
d
e
f
8 GCLK cycles from DCOK assertion to first “real” EV6Clk_x cycle.  
Approximately 525 GCLK cycles for external framing clock to be sampled and captured.  
1 FrameClk_x cycle.  
3 FrameClk_x cycles.  
Approximately 264 GCLK cycles to prevent first command from appearing too early.  
Approximately 700,000 GCLK cycles for BiST + approximately 100,000 GCLK cycles fixed time +  
approximately 50,000 GCLK cycles per line of Icache for SROM load.  
g
16 GCLK cycles.  
7–2  
Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Power-Up Reset Flow and the Reset_L and DCOK_H Pins  
Figure 7–1 Power-Up Timing Sequence  
A0 A1  
valid  
IRQ_H  
DCOK_H  
a
Reset_L  
state  
B
f
WAIT_SETTLE  
WAIT_NORMAL  
RAMP1  
RAMP2  
WAIT_ClkFwdRst0  
b
WAIT_BiST  
WAIT_ClkFwdRst1  
RUN  
c
e
SromOE_L  
ClkFwdRst_H  
no min  
no min  
C
d
internal ClkFwdRst  
TestStat_H  
g
external Clks  
End of BiST  
BiST Fails  
BiST Passes  
FM-06486B.FH8  
7.1.1 Power Sequencing and Reset State for Signal Pins  
Power sequencing and avoiding potential failure mechanisms is described in Section  
The reset state for the signal pins is listed in Table 7–2.  
Table 7–2 Signal Pin Reset State  
Signal  
Reset State  
Signal  
Reset State  
Bcache  
BcAdd_H[23:4]  
BcCheck_H[15:0]  
BcData_H[127:0]  
BcDataInClk_H[7:0]  
BcDataOE_L  
BcDataOutClk_x[3:0]  
BcDataWr_L  
BcLoad_L  
Tristated  
Tristated  
Tristated  
NA (input)  
Tristated  
Tristated  
Tristated  
Tristated  
Tristated  
BcTagInClk_H  
BcTagOE_L  
NA (input)  
Tristated  
Tristated  
Tristated  
Tristated  
Tristated  
Tristated  
BcTagOutClk_x  
BcTagParity_H  
BcTagShared_H  
BcTagValid_H  
BcTagWr_L  
BcTag_H[42:20]  
BcVref  
NA  
(I_DC_REF)  
BcTagDirty_H  
System Interface  
IRQ_H[5:0]  
Tristated  
NA (input)  
NA (input)  
NA (input)  
SysDataInClk_H[7:0]  
SysDataInValid_L  
NA (input)  
NA (input)  
SysAddIn_L[14:0]  
SysAddInClk_L  
SysDataOutClk_L[7:0] Tristated  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration  
7–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Power-Up Reset Flow and the Reset_L and DCOK_H Pins  
Table 7–2 Signal Pin Reset State (Continued)  
Signal  
Reset State  
Signal  
Reset State  
SysAddOut_L[14:0]  
Initially, during power-up reset, state SysDataOutValid_L  
is not defined. If not during power-  
up, preserves previous state. Then,  
after the clock forward reset period  
(as the external clocks start), signal  
driven to NZNOP until the reset  
NA (input)  
state machine enters RUN, when it  
is driven to NOP.  
SysAddOutClk_L  
SysCheck_L[7:0]  
Tristated  
Tristated  
SysFillValid_L  
SysVref  
NA (input)  
NA  
(I_DC_REF)  
SysData_L[63:0]  
Clocks  
Tristated  
ClkFwdRst_H  
NA (input)  
NA (input)  
FrameClk_x  
NA (input)  
ClkIn_H  
ClkIn_L  
NA  
(I_DC_REF)  
PLL_VDD  
EV6Clk_H  
EV6Clk_L  
NA (input)  
Miscellaneous  
DCOK_H  
Must be deasserted until dc voltage Tck_H  
NA (input)  
reaches proper operating level.  
PllBypass_H  
Reset_L  
NA (input)  
NA (input)  
Tristated  
Tdi_H  
NA (input)  
Unspecified  
Tristated  
Tdo_H  
SromClk_H  
SromData_H  
SromOE_L  
TestStat_H  
Tms_H  
Trst_L  
NA (input)  
Tristated  
NA (input)  
NA (input)  
In addition, as power is being ramped, Reset_L must be asserted — this allows the  
21264/EV68A to reset internal state. Once the target voltage levels are attained, sys-  
tems should assert DCOK_H. This indicates to the 21264/EV68A that internal logic  
functions can be evaluated correctly and that the power-up sequence should be contin-  
ued. Prior to DCOK_H being asserted, the logic internal to the 21264/EV68A is being  
reset and the internal clock network is running (either clocked by the PLL VCO, which  
is at a nominal speed, or by ClkIn_H, if the PLL is bypassed).  
The reset state machine is in state WAIT_SETTLE.  
7.1.2 Clock Forwarding and System Clock Ratio Configuration  
When DCOK_H is asserted, the 21264/EV68A samples several pins and latches in  
some initialization state, including the value of the PLL Y divisor, which specifies  
div  
the ratio of the system clock to the internal clock (see Section 7.11.2.3), and enables the  
charge pump on the phase-locked loop.  
7–4  
Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Power-Up Reset Flow and the Reset_L and DCOK_H Pins  
Table 7–3 summarizes the pins and the suggested/required initialization state. Most of  
this information is supplied by placing (switch-selectable or hardwired) weak pull-ups  
or pull-downs on the IRQ_H pins. The IRQ_H pins are sampled on the rising edge of  
DCOK_H, during which time the 21264/EV68A is in reset and is not generating any  
system activity. During normal operation, the IRQ_H pins supply interrupt requests to  
the 21264/EV68A.  
It is possible to disable the 21264/EV68A PLL and source GCLK directly from  
ClkIn_x. This mode is selected via PllBypass_H. The 21264/EV68A still produces a  
divided-down clock on EV6Clk_x; this output clock, which tracks GCLK, can be used  
in a feedback loop to generate a locked input clock via an external PLL. The input  
clock can be locked against a slower speed system reference clock.  
Table 7–3 Pin Signal Names and Initialization State  
Signal Name  
Sample Time  
Function  
Value  
1
PllBypass_H  
Continuous input Select ClkIn_x onto GCLK instead of internal 0 Bypass  
PLL.  
1 Use PLL  
ClkFwdRst_H Sampling method  
according to  
IRQ_H[4]  
Reset_L  
Continuous input  
IRQ_H[5]  
Rising edge of  
DCOK_H  
Select 1:1 FrameClk mode.  
Internal FrameClk can be generated two ways:  
0 Sample with  
FrameClk_H  
1 Use a copy of  
EV6Clk_H  
1
By sampling FrameClk_H. Used if  
FrameClk_H is slower than ClkIn_H.  
2
As a direct copy of EV6Clk_H. Used if  
FrameClk_H is the same frequency as  
ClkIn_H or is DC.  
IRQ_H[4]  
Rising edge of  
DCOK_H  
Select method of sampling ClkFwdRst_H to  
produce internal ClkFwdRst — either with  
external or internal copy of FrameClk_x.  
0 Sample with Exter-  
nal FrameClk_x  
1 Sample with Inter-  
nal Frameclk  
IRQ_H[3:0]  
Rising edge of  
DCOK_H  
Select Y divisor value. This is the divide-  
down factor between GCLK and EV6Clk_x.  
IRQ_H[3:0] Divisor  
div  
0011  
0100  
0101  
0110  
0111  
0000  
1000  
1001  
1010  
1011  
1100  
1101  
1110  
1111  
3
4
When the PLL is in use and the 21264/EV68A is  
ramped-up to full speed, the VCO adjusts in  
order to phase-align (and rate-match) EV6Clk_x  
to ClkIn_x. When the PLL is not in use, and  
ClkIn_x is bypassed onto GCLK, EV6Clk_x is  
5
6
7
8
9
10  
11  
12  
13  
14  
15  
16  
slower than ClkIn_x by the divisor Y  
.
div  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration  
7–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Power-Up Reset Flow and the Reset_L and DCOK_H Pins  
Table 7–3 Pin Signal Names and Initialization State (Continued)  
Signal Name  
Sample Time  
Function  
Value  
DCOK_H  
Continuous input When deasserted, initializes the internal 21264/  
EV68A reset state machine and keeps the PLL  
internal oscillator running at a nominal speed.  
Assertion, which implies power to the 21264/  
EV68A is good, causes configuration informa-  
tion to be sampled.  
1
The maximum permissible instantaneous change in ClkIn_x frequency is 333 MHz (to prevent cur-  
rent spikes).  
7.1.3 PLL Ramp Up  
After the configuration is loaded through the IRQ_H pins, the next phase in the power  
up flow is the internal PLL ramp up sequence. Ramping up of the PLL is required to  
guarantee that the dynamic change in frequency will not cause the supply on the 21264/  
EV68A to fall due to the supply loop inductance. Clock control circuitry steps GCLK  
th  
from power-up/reset clocking to 1/16 operating frequency, to ½ operating frequency,  
and finally normal operating frequency.  
After the assertion of DCOK_H, the 21264/EV68A waits for the deassertion of  
Reset_L from the system while the PLL attempts to achieve a lock. The PLL internal  
ramp dividers are set to divide down the input clock by 16 and the PLL attempts to  
achieve lock against an effective input frequency of ClkIn_x/16. Once lock is  
achieved, the actual internal frequency (GCLK) is ClkIn_x*(Y divisor value)/16.  
div  
There should be a minimum delay of 100 ms between the assertion of DCOK_H and  
the deassertion of Reset_L to allow for this locking The reset state machine is in the  
WAIT_NOMINAL state.  
After the deassertion of Reset_L, the reset state machine goes into the RAMP1 state.  
The 21264/EV68A ramps the internal frequency, by changing the effective input fre-  
quency of the PLL to ClkIn_x/2 for a sufficient lock interval (at most, 20 µs at 400  
MHz). The state machine then goes into the RAMP2 state, changing the effective input  
frequency to ClkIn/1 for an additional lock interval (about 20 µs). The lock periods are  
generated by the internal duration counter, which is driven by GCLK. The counter  
counts 4108 GCLK cycles during the ClkIn_x/2 lock interval. Note that GCLK is pro-  
duced by the output of the PLL, which is locking to an input clock which is 1/2 of the  
operating frequency — therefore, the 4108 cycle interval constitutes a 6-20 µs interval  
when the operating frequency is 400–1250 MHz. Then, the counter counts 8205 GCLK  
cycles during the ClkIn_x/1 lock interval.  
7.1.4 BiST and SROM Load and the TestStat_H Pin  
The 21264/EV68A uses the deassertion of ClkFwdRst_H (which must be deasserted  
for a minimum of one FrameClk_H cycle and then reasserted) to begin built-in self-  
test (BiST). The reset state machine goes into the WAIT_BiST state. Details on BiST  
are given in Chapter 11. The power-up BiST lasts approximately 700,000 cycles. The  
result of the self-test is made available on the TestStat_H pin. The pin is forced low by  
the system reset. It is then forced high during BiST.  
7–6  
Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Power-Up Reset Flow and the Reset_L and DCOK_H Pins  
As BiST completes, the TestStat_H pin is held low for 16 GCLK cycles. Then, if BiST  
succeeds, the pin remains low. Otherwise, it is asserted. After successfully completing  
BiST, the 21264/EV68A then performs the SROM load sequence (described in Chapter  
11). After the SROM load sequence is finished, the 21264/EV68A deasserts  
SromOE_L.  
7.1.5 Clock Forward Reset and System Interface Initialization  
After the deassertion of SromOE_L, the reset state machine enters the  
WAIT_ClkFwdRst1 state, where the 21264/EV68A waits for the system to deassert  
ClkFwdReset_H. The 21264/EV68A samples the deasserting edge of  
ClkFwdReset_H to take synchronous actions. It uses this synchronous event to reset  
the clock forwarding interface, start the outgoing clocks, and deassert internal reset.  
The chip then waits 264 cycles before issuing commands. The reset state machine is  
then in RUN and the 21264/EV68A begins fetching code at address 0x780.  
Table 7–4 lists signals relevant to the power-up flow, provides a short description of  
each, and any relevant constraints.  
Table 7–4 Power-Up Flow Signals and Their Constraints  
Signal Name  
Description  
Constraint  
ClkIn_x  
Differential clocks that are Clocks must be running before DCOK_H is  
inputs to PLL or are  
bypassed onto GCLK  
directly  
asserted.  
VDD supply to PLL  
PLL_VDD must lead VDD.  
PLL_VDD  
VDD  
VDD supply to the 21264/  
EV68A chip logic (except  
PLL)  
DCOK_H  
Reset_L  
Logic signal to the 21264/  
EV68A that the VDD sup-  
ply is good  
RESET pin asserted by  
SYSTEM to the 21264/  
EV68A  
Reset_L must be asserted prior to DCOK_H and  
must remain asserted for at least 100 ms after  
DCOK_H is asserted. This allows for PLL settling  
time. Deassertion of Reset_L causes the 21264/  
EV68A to ramp divisors to their final value and  
begin BiST.  
ClkFwdRst_H  
Deassertion #1  
Signal asserted by SYS-  
TEM to synchronously  
ClkFwdRst_H must be deasserted after PLL has  
achieved its lock in its final divisor value (about 20  
commence built-in self-test µs). The deassertion causes built-in self-test to  
and SROM load  
begin on an internal clock cycle that corresponds to  
one framing clock cycle after ClkFwdRst_H is  
deasserted. ClkFwdRst_H can be asserted after  
one frame clock cycle. See Figure 7–1.  
ClkFwdRst_H  
Signal asserted by SYS-  
ClkFwdRst_H must be deasserted when the Cbox  
Deassertion #2  
TEM to initialize and reset has loaded configuration information. This occurs  
clock forwarding interfaces as the first part of the serial ROM load, after BiST  
is run. Once ClkFwdRst_H is deasserted, the  
interface is initialized and can receive probe  
requests from the 21264/EV68A.  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration  
7–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Fault Reset Flow  
7.2 Fault Reset Flow  
The fault reset sequence of operation is triggered by the assertion of the ClkFwdRst_H  
signal line. Figure 7–2 shows the fault reset sequence of operation. The reset state  
machine is initially in RUN state. ClkFwdRst_H is asserted by the system, which  
causes the state machine to transition to the WAIT_FAULT_RESET state.  
The 21264/EV68A internally resets a minimum amount of internal state. Note the  
effects of that reset on the IPRs in Table 7–5.  
Table 7–5 Effect on IPRs After Fault Reset  
IPR  
After Reset  
PAL_BASE  
I_CTL  
Maintained (not reset)  
Bit value = 3 (both Icaches are enabled)  
Set  
PCTX[FPE]  
WRITE_MANY Cleared (That is, the WRITE_MANY chain is initialized and the Bcache is turned off.)  
EXC_ADDR  
Set to an address that is close to the PC  
The 21264/EV68A then waits for ClkFwdRst_H to deassert twice:  
One deassert to transition directly to the WAIT_ClkFwdRst1 state without perform-  
ing any BiST  
One deassert to initialize the clock forwarding interface  
The 21264/EV68A then begins fetching code at PAL_BASE + 0x780.  
Figure 7–2 shows the fault reset sequence of operation. In Figure 7–2, note the follow-  
ing symbols for constraints and information:  
Constraints:  
A
Min = 1 FrameClk_x cycle  
Information:  
a
b
c
e
f
Approximately 264 GCLK cycles  
Approximately 525 GCLK cycles for external framing clock to be sampled and captured  
1 FrameClk_x cycle plus 2 GCLK cycles  
Next FrameClk_x rising edge  
3 FrameClk_x cycles  
g
Approximately 264 GCLK cycles to prevent first command from appearing too early  
7–8  
Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Energy Star Certification and Sleep Mode Flow  
Figure 7–2 Fault Reset Sequence of Operation  
internal clks aligned  
e
RUN  
state  
WAIT_ClkFwdRst0  
WAIT_ClkFwdRst1  
g
RUN  
WAIT_FAULT_RESET  
a
b
c
SromOE_L  
ClkFwdRst_H  
no min  
no min  
A
f
internal ClkFwdRst  
external Clks  
FM-06488B.AI4  
7.3 Energy Star Certification and Sleep Mode Flow  
The 21264/EV68A is Energy Star compliant. Energy Star is a program administered by  
the Environmental Protection Agency to reduce energy consumption. For compliance,  
a computer must automatically enter a low power sleep mode using 30 watts or less  
after a specified period of inactivity. When the system is awakened, the user shall be  
returned automatically to the same situation that existed prior to entering sleep mode.  
During normal operation, the 21264/EV68A encounters inactive periods and enters a  
mode that saves the entire active processor state to memory.  
The PALcode is responsible for saving all necessary state to DRAM and flushing the  
caches.  
The sleep mode sequence of operations is triggered by the PALcode twice performing a  
HW_MTPR to the Ibox SLEEP IPR. The first write prevents the assertion of  
ClkFwdRst_H from fault-resetting the chip.  
The PALcode then informs the system, in an implementation-dependent way, that it  
may assert ClkFwdRst_H.  
On the second HW_MTPR to the SLEEP IPR, the PLL begins to ramp down and the  
21264/EV68A can then respond to the ClkFwdRst_H that was asserted by the system,  
causing the outgoing clocks from the 21264/EV68A to stop.  
The PLL ramp-down sequence takes exactly the same amount of time as the ramp up  
sequence described in Section 7.1.3. The same internal duration counter is used and the  
reset state machine transitions through the DOWN1, DOWN2, and DOWN3 states  
which have similar PLL divisor ratios and clock speeds to the RAMP2, RAMP1, and  
WAIT_NOMINAL states.  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration  
7–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Energy Star Certification and Sleep Mode Flow  
After the PLL has finished ramping down, the reset state machine enters the  
WAIT_INTERRUPT state. Note the effects of the entry into that state on the IPRs  
listed in Table 7–6.  
Table 7–6 Effect on IPRs After Transition Through Sleep Mode  
IPR  
Effects After Transition Through Sleep Mode  
Maintained (not reset)  
PAL_BASE  
I_CTL  
Bit value = 3 (both Icaches are enabled)  
Set  
PCTX[FPE]  
WRITE_MANY Cleared (That is, the WRITE_MANY chain is initialized and the Bcache is  
turned off.)  
Note that Interrupt enables are maintained during sleep mode, enabling the 21264/  
EV68A to wake up. The 21264/EV68A waits for either an unmasked clock interrupt or  
an unmasked device interrupt from the system.  
When an enabled interrupt occurs, the PLL ramps back to full frequency. Subsequent to  
that, the 21264/EV68A performs a built-in self-initialization (BiSI), a shortened built-in  
self-test, which initializes the internal arrayed structures. The SROM is not reloaded.  
Instead, the 21264/EV68A begins fetching code from the system at address  
PAL_BASE + 0x780.  
Figure 7–3 shows the sleep mode sequence of operations. In Figure 7–3, note the fol-  
lowing constraint and informational symbols:  
Constraints:  
A
Min = 1 FrameClk_x cycle  
Informational symbols:  
a
b
c
d
e
f
Approximately 525 GCLK cycles for external framing clock to be sampled and captured  
Next FrameClk_x rising edge  
1 FrameClk_x cycle  
3 FrameClk_x cycles  
Approximately 264 GCLK cycles to prevent first command from appearing too early  
Approximately 8192 GCLK cycles for BiSI  
16 GCLK cycles  
g
7–10 Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Warm Reset Flow  
Figure 7–3 Sleep Mode Sequence of Operation  
internal clks  
b
f
state  
RUN  
DOWN1  
DOWN2  
DOWN3  
WAIT_INTR  
RAMP1  
RAMP2  
WAIT_ClkFwdRst0  
a
WAIT_BiSI  
WAIT_ClkFwdRst1  
RUN  
c
e
SLEEP IPR  
Wake-up interrupt  
SromOE_L  
ClkFwdRst_H  
no min  
no min  
A
d
TestStat_H  
internal ClkFwdRst  
external Clks  
FM-06487A.AI4  
Table 7–7 describes each signal and constraint for the sleep mode sequence.  
Table 7–7 Signals and Constraints for the Sleep Mode Sequence  
Signal Name  
Description  
Constraint  
ClkFwdRst_H  
Signal asserted by the system to  
ClkFwdRst_H must be asserted by the system  
initialize and reset clock forwarding when entering sleep mode. The system deasserts  
interfaces  
ClkFwdRst_H no sooner than one FrameClk_H  
cycle after sourcing an interrupt to the 21264/  
EV68A.  
Forwarded clocks  
System interrupt  
Bit clocks forwarded to/from the  
21264/EV68A  
Clocks stop running under ClkFwdRst_H.  
Asynchronous interrupt which  
causes the 21264/EV68A to exit  
sleep mode  
7.4 Warm Reset Flow  
The warm reset sequence of operation is triggered by the assertion of the Reset_L sig-  
nal line. The reset state machine is initially in RUN state. The 21264/EV68A then, by  
default, ramps down the PLL (similar to the sleep flow sequence) and the reset state  
machine ends up in the WAIT_RESET state.  
Note the effects of entry into that state on the IPRs listed in Table 7–8.  
Table 7–8 Effect on IPRs After Warm Reset  
IPR  
Effects After Warm Reset  
PAL_BASE  
I_CTL  
Cleared  
Cleared  
Set  
PCTX[FPE]  
WRITE_MANY Cleared (That is, the WRITE_MANY chain is initialized and the Bcache is  
turned off.)  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration 7–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                   
Array Initialization  
The 21264/EV68A waits until Reset_L is deasserted before transitioning from the  
WAIT_RESET state. The 21264/EV68A ramps up the PLL until the state machine  
enters the WAIT_ClkFwdRst0 state. Note that the system must assert ClkFwdRst_H  
before the state machine enters the WAIT_ClkFwdRst0 state. Then, similarly to the  
other flows, SromOE_L is asserted and the system waits for the deassertion of  
ClkFwdRst_H.  
On the deassertion of ClkFwdRst_H, the 21264/EV68A performs BiST and the SROM  
loading procedure.  
After BiST and SROM loading have completed, SromOE_L deasserts and the 21264/  
EV68A waits for ClkFwdRst_H to deassert before starting the external clocks and, like  
the other flows, waits for 264 cycles before starting instructions.  
7.5 Array Initialization  
The following arrays are initialized by BiST:  
Icache and Icache tag  
Dcache, Dcache tag, and Duplicate Dcache tag  
Branch history table  
The external second-level cache (Bcache) is disabled by Reset_L.  
The Bcache must be initialized by PALcode before it is enabled.  
7.6 Initialization Mode Processing  
The initialization mode allows the 21264/EV68A to generate and manipulate cache  
blocks before the system interface has been initialized. Within the 21264/EV68A, the  
Cbox configuration registers are divided into the WRITE_ONCE and the  
WRITE_MANY shift register chains (see Sections 5.4.3 and 5.4.4). The  
WRITE_ONCE chain is loaded from the SROM during reset processing, and contains  
information such as the clock forwarding setup values. The WRITE_MANY chain can  
be written many times using MTPR instructions.  
The WRITE_MANY chain contains the following CSRs that are important to initializa-  
tion mode, which must be set to the values in Table 7–9 to initialize the Bcache.  
Table 7–9 WRITE_MANY Chain CSR Values for Bcache Initialization  
WRITE_MANY Chain CSRs  
Required Value at Initialization Mode  
BC_ENABLE  
1
The duplicate bits for BC_ENABLE in [14:12] must  
be 0 during initialization mode.  
BC_SIZE[3:0]  
The exact size or maximum size of the Bcache.  
INVAL_TO_DIRTY_ENABLE[1:0]  
SET_DIRTY_ENABLE[2:0]  
INIT_MODE  
1
0
1
7–12 Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                           
Initialization Mode Processing  
Table 7–9 WRITE_MANY Chain CSR Values for Bcache Initialization (Continued)  
WRITE_MANY Chain CSRs  
EVICT_ENABLE  
Required Value at Initialization Mode  
0
0
0
BC_WRT_STS[3:0]  
BC_BANK_ENABLE  
Except for INIT_MODE, all the CSR registers have been described in earlier sections.  
When asserted, INIT_MODE has the following behavior:  
Cache block updates to the Dcache set the block to the Clean state.  
Updates to the Bcache use the BC_WRT_STS[3:0] bits.  
WrVictimBlk command generation to the system interface are squashed.  
Using the INVAL_TO_DIRTY_ENABLE and INIT_MODE registers, initialization  
code loaded from the SROM can generate and delete blocks inside the 21264/EV68A  
without system interaction. This behavior is very useful for initialization and startup  
processing, when the system interfaces are not fully functional. Figure 7–4 shows a  
code example for initializing Bcache.  
Figure 7–4 Example for Initializing Bcache  
Reset chip and load Icache with this code  
set init_mode  
;now all WrVictims are ignored  
;bc_enable_a  
1
;zeroblk_enable_a  
;set_dirty_enable_a  
;init_mode_a  
1
0
1
;enable_evict_a  
;bc_wrt_sts_a  
0
0
;bc_bank_enable_a  
;bc_size_a  
0
15  
;now all writes to Bcache actually invalidate  
;the Bcache. (if space was needed for scratch  
;pad, the status bits could just as  
;well be Valid)  
for 2 X bc_size  
;This loop generates legal ECC data, and  
;invalidate tags which are written to the  
;Bcache for all but the final 64KB of address.  
{ WH64 address }  
turn_off_bcache:  
;bc_enable_a  
0
0
0
1
0
0
0
0
;init_mode_a  
;bc_size_a  
;zeroblk_enable_a  
;enable_evict_a  
;set_dirty_enable_a  
;bc_bank_enable_a  
;bc_wrt_sts_a  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration 7–13  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
External Interface Initialization  
SweepMemory:  
;Write good parity/ecc to memory by  
; writing a all memory locations. This is  
;done by WH64 of memory addresses  
turn_on_bcache:  
;bc_enable_a  
0
;bc_size_a  
Actual Bcache size  
;zeroblk_enable_a  
;set_dirty_enable_a  
;init_mode_a  
3
6
0
0
0
0
;enable_evict_a  
;bc_wrt_sts_a  
;bc_bank_enable_a  
for 2 X bc_size  
{ WH64 address }  
;This loop generates legal ECC data, and  
;invalidate tags which are written to the  
;Bcache for all but the final 64KB of address.  
for 2 X dcache size  
{ ECB address  
(done)  
}
;and cleans up the Dcache also.  
In addition to initialization, the dynamic programming ability of the WRITE_MANY  
chain provides the basic tools to build various other software flows such as dynamically  
changing the Bcache enable/size parameters for performance testing.  
7.7 External Interface Initialization  
After reset, the system interface is in the default configuration dictated by the reset state  
of the IPR bits that select the configuration options.  
The response to system interface commands and internally generated memory accesses  
is determined by this default configuration. System environments that are not compati-  
ble with the default configuration must use the SROM Icache load feature to initially  
load and execute a PALcode program to configure the external system interface unit  
IPRs as needed.  
7.8 Internal Processor Register Power-Up Reset State  
Many IPR bits are not initialized by reset. They are located in error-reporting registers  
and other IPR states. They must be initialized by initialization PALcode. Tables 7–5,  
7–6, and 7–8, list the effects on IPRs by fault reset, transition through sleep mode, and  
warm reset, respectively. Table 7–10 lists the state of all internal processor registers  
(IPRs) immediately following power-up reset. The table also specifies which registers  
need to be initialized by power-up PALcode.  
Table 7–10 Internal Processor Registers at Power-Up Reset State  
Mnemonic  
Ibox IPRs  
ITB_TAG  
ITB_PTE  
Register Name  
Reset State Comments  
ITB tag array write  
ITB PTE array write  
X
X
7–14 Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Internal Processor Register Power-Up Reset State  
Table 7–10 Internal Processor Registers at Power-Up Reset State (Continued)  
Mnemonic  
ITB_IAP  
ITB_IA  
Register Name  
Reset State Comments  
ITB invalidate-all (ASM=0)  
ITB invalidate all  
X
X
Must be written to in PALcode.  
ITB_IS  
ITB invalidate single  
ProfileMePC  
X
PMPC  
X
EXC_ADDR  
IVA_FORM  
IER_CM  
SIRR  
Exception address  
Instruction VA format  
Interrupt enable current mode  
Software interrupt request  
Interrupt summary  
Hardware interrupt clear  
Exception summary  
PAL base address  
X
X
X
Must be written to in PALcode.  
X
ISUM  
X
X
HW_INT_CLR  
EXC_SUM  
PAL_BASE  
I_CTL  
Must be cleared in PALcode.  
X
Cleared  
Ibox control  
IC_EN = 3 All other bits are cleared on reset.  
I_STAT  
Ibox status  
X
X
X
X
Must be cleared in PALcode.  
IC_FLUSH  
CLR_MAP  
SLEEP  
Icache flush  
Clear virtual-to-physical map  
Sleep mode  
PCTX  
Ibox process context  
Performance counter control  
PCTX[FPE] is set. All other bits are X.  
PCTR_CTL  
Ebox IPRs  
CC  
X
Must be cleared in PALcode.  
Cycle counter  
X
X
X
X
X
Must be cleared in PALcode.  
CC_CTL  
VA  
Cycle counter control  
Virtual address  
Must be cleared in PALcode.  
VA_FORM  
VA_CTL  
Mbox IPRs  
DTB_TAG0  
DTB_TAG1  
DTB_PTE0  
DTB_PTE1  
DTB_ALTMODE  
DTB_IAP  
Virtual address format  
Virtual address control  
Must be cleared in PALcode.  
DTB tag array write 0  
Cleared  
Cleared  
Cleared  
Cleared  
X
DTB tag array write 1  
DTB PTE array write 0  
DTB PTE array write 1  
DTB alternate processor mode  
PALcode must initialize.  
DTB invalidate all process  
ASM = 0  
X
DTB_IA  
DTB invalidate all process  
X
Must be written to in PALcode.  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration 7–15  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                                           
IEEE 1149.1 Test Port Reset  
Table 7–10 Internal Processor Registers at Power-Up Reset State (Continued)  
Mnemonic  
DTB_IS0  
DTB_IS1  
DTB_ASN0  
DTB_ASN1  
MM_STAT  
M_CTL  
Register Name  
Reset State Comments  
DTB invalidate single (array 0)  
DTB invalidate single (array 1)  
DTB address space number 0  
DTB address space number 1  
Memory management status  
Mbox control  
X
X
Cleared  
Cleared  
X
Cleared  
DC_CTL  
Dcache control  
DC_CTL[7:2] are cleared at reset.  
DC_CTL[1:0] are set at power up.  
DC_STAT  
Cbox IPRs  
C_DATA  
C_SHFT  
Dcache status  
X
Must be cleared in PALcode.  
Cbox data  
X
X
Must be read in PALcode.  
Cbox shift control  
7.9 IEEE 1149.1 Test Port Reset  
Signal Trst_L must be asserted when powering up the 21264/EV68A. Trst_L must not  
be deasserted prior to assertion of DCOK_H. Trst_L can remain asserted during nor-  
mal operation of the 21264/EV68A.  
7.10 Reset State Machine  
The state diagram in Figure 7–5 summarizes how the 21264/EV68A transitions into  
running code. Each state is described in Table 7–11. Table 7–11 describes outputs and  
approximate state transition equations. Note that there are implicit transitions from  
each state to an appropriate down-ramp state when Reset_L is asserted.  
7–16 Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                         
Reset State Machine  
Figure 7–5 21264/EV68A Reset State Machine State Diagram  
PLL Ramp Up  
Reset_L  
deasserted  
RAMP1  
[2,4]  
Counter  
finished  
WAIT_  
NOMINAL  
[16,32]  
RAMP2  
[1,2]  
DCOK_H  
asserted  
Counter  
finished  
WAIT_  
SETTLE  
[16,32]  
WAIT_ClkFwd  
Rst0  
Counter  
finished  
ClkFwdRst_H  
deasserted  
Out of  
Sleep  
Mode  
Reset_L  
deasserted  
Reset_L  
asserted  
Out of  
FAULT_  
RESET*  
Enabled  
Interrupt  
COLD  
WAIT_  
BiST  
WAIT_  
BiSI  
FAULT_  
RESET  
BiST  
finished  
BiSI  
finished  
Reset_L  
asserted  
WAIT_  
RESET  
WAIT_  
INTERRUPT  
ClkFwdRst_H  
asserted  
WAIT_ClkFwd  
Rst1  
Numbers in "[,]" are  
Xdiv and Zdiv divisors,  
respectively  
ClkFwdRst_H  
deasserted  
Counter  
finished &  
Sleep Mode  
Counter  
*No BiST/BiSI  
on recovery from Fault  
finished &  
Reset  
not Sleep Mode  
PLL Ramp Down  
RUN  
Counter  
finished  
Counter  
finished  
DOWN3  
[16,32]  
DOWN1  
[1,2]  
DOWN2  
[2,4]  
Sleep Mode  
or Reset_L  
asserted  
LKG-10982A-98WF  
Table 7–11 21264/EV68A Reset State Machine State Descriptions  
State Name  
Description  
COLD  
Chip cold. Transitioned to WAIT_SETTLE with assertion of Reset_L, PLL_VDD, and  
VDD.  
WAIT_SETTLE  
PLL_VDD asserted; PLL at minimum frequency.  
WAIT_NOMINAL  
Triggered by assertion of DCOK_H. PLL achieves a lock at X and Z divisors equal  
div div  
16 and 32, respectively.  
RAMP1  
Triggered by Reset_L deassertion; X and Z divisors are changed to 2 and 4, respec-  
div  
div  
tively, increasing the internal GCLK frequency. An internal duration counter is initial-  
ized to count 4108 GCLK cycles.  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration 7–17  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Reset State Machine  
Table 7–11 21264/EV68A Reset State Machine State Descriptions (Continued)  
State Name  
Description  
RAMP2  
Triggered by the duration counter reaching 4108 cycles, the X and Z divisors are  
div  
div  
changed to 1 and 2, respectively, and the frequency is increased. The duration counter is  
reloaded to count 8205-cycles.  
WAIT_ClkFwdRst0  
Triggered by the duration counter reaching 8205 cycles (or by the deassertion of  
Reset_L while in the WAIT_RESET state). 21264/EV68A asserts SromOE_L and  
waits for SYSTEM to deassert ClkFwdReset_H. The deassertion must be synchronous  
to a falling edge of FrameClk_H. 21264/EV68A uses this deassertion to begin BiST  
and SROM load at a predictable time. 21264/EV68A samples and generates an internal,  
aligned copy of FrameClk_H, and, in turn, uses this clock to sample ClkFwdReset_H.  
WAIT_BiST  
WAIT_BiSI  
BiST and SROM load is started. The SROM first loads the Write-once chain and then  
reads the number of bits of Icache data to load.  
This state is entered when 'waking up' from sleep mode. 21264/EV68A receives an  
external interrupt, ramps the PLL, synchronously samples a transition on  
ClkFwdReset_H, and runs built-in self-initialization to clear the internal caches. Built-  
in self-test is not performed and the SROM is not loaded.  
WAIT_ClkFwdRst1  
Entered when the appropriate amount of BiST and SROM loading has been completed.  
21264/EV68A deasserts SromOE_L and waits for SYSTEM to deassert  
ClkFwdReset_H. The deassertion must be synchronous to a rising edge of  
FrameClk_H. 21264/EV68A uses this synchronous event to reset the clock forwarding  
interface and deassert internal reset. 21264/EV68A subsequently begins running code  
(either preloaded in the SROM or located in memory) and begins system transactions.  
RUN  
Chip is running software, interface is reset, and system transactions can be processed.  
From power-up, the Icache sets are enabled and contain bootstrap code loaded from the  
SROM; 21264/EV68A executes code from Icache. From wake-up, the Icache sets are  
disabled and 21264/EV68A fetches and executes code from DRAM.  
WAIT_RESET  
FAULT_RESET  
DOWN1  
Triggered by duration counter reaching 264 cycles, or when Reset_L is asserted when in  
WAIT_INTERRUPT state. 21264/EV68A waits in this state until Reset_L is deasserted,  
at which point, the PLL starts to ramp up again.  
ClkFwdReset is asserted while the 21264/EV68A is running. The 21264/EV68A inter-  
nally resets a minimum amount of internal state, waits for clock forward reset deasser-  
tion, and begins fetching code at PAL_BASE + 0x780.  
21264/EV68A was in a state in which GCLK was at its highest speed and Reset_L was  
asserted. Internal chip functions are reset and the internal duration counter is set to 8205  
cycles. The purpose of this sequence is to down-ramp the clocks in anticipation of power  
being removed. If power is not removed (that is, reset is being toggled), 21264/EV68A  
ramps the clocks back to the original speed.  
This state is also entered when software writes the I_CTL internal processor register to  
sleep mode.  
7–18 Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
Phase-Lock Loop (PLL) Functional Description  
Table 7–11 21264/EV68A Reset State Machine State Descriptions (Continued)  
State Name  
Description  
DOWN2  
Triggered by duration counter reaching 8205 cycles, the PLL ramps GCLK frequency  
down by the first divider ratio (X and Z equal 2 and 4, respectively). This has the  
div  
div  
effect of halving the GCLK frequency. The duration counter is set to 4108 cycles.  
DOWN3  
Triggered by duration counter reaching 4108 cycles, the PLL ramps frequency down by  
the second divider ratio (X and Z equal 16 and 32, respectively). This has the  
div  
div  
effect of reducing the frequency by a factor of 16 (of the original frequency). The inter-  
nal counter is set to 264 cycles.  
WAIT_INTERRUPT Triggered by duration counter reaching 264 cycles, the 21264/EV68A waits for either an  
unmasked clock interrupt or unmasked device interrupt from system. The interrupts are  
wired to the interrupt request and enable internal registers. When an enabled interrupt  
occurs, the PLL ramps back to full frequency. Subsequent to that, the built-in self-init  
(BiSI) initializes arrayed structures. The SROM is not reloaded; instead, the 21264/  
EV68A begins fetching code from the SYSTEM.  
7.11 Phase-Lock Loop (PLL) Functional Description  
The PLL multiplies the clock frequency of a differential input reference clock and  
aligns the phase of its output to that differential input clock. Thus, the 21264/EV68A  
can communicate synchronously on clock boundaries with clock periods that are  
defined by the system.  
7.11.1 Differential Reference Clocks  
A skew-controlled, ac-coupled differential clock is provided to the PLL by way of  
ClkIn_x . ClkIn_x are input signals to a differential amplifier. The frequency of  
ClkIn_x can range from 80 MHz to 200 MHz. ClkIn_x can be sourced by a variety of  
components that include PECL fanout parts or system PLLs. ClkIn_x are also the pri-  
mary clock source for the 21264/EV68A when in PLL bypass mode.  
7.11.2 PLL Output Clocks  
The following sections summarize the PLL output clocks.  
7.11.2.1 GCLK  
The PLL provides an output clock, GCLK, with a frequency that can range from 400  
MHz to 1.25 GHz under full-speed conditions. GCLK is the nominal onchip clock that  
is distributed to the entire 21264/EV68A chip.  
7.11.2.2 Differential 21264/EV68A Clocks  
The EV6Clk_x output pads provide an external test point to measure the PLL phase  
alignment. They do not provide a clock source. EV6Clk_x are square-wave signals  
that drive rail-to-rail continually from 0 to VDD.  
7.11.2.3 Nominal Operating Frequency  
Under normal operating conditions, the frequency of the PLL output clock, GCLK, is a  
simple function of the Y divider value.  
div  
21264/EV68A Hardware Reference Manual  
Initialization and Configuration 7–19  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                               
Phase-Lock Loop (PLL) Functional Description  
Table 7–12 shows the allowable ClkIn_x frequencies for a given operating frequency  
of the 21264/EV68A and the Y divider. For example, to set the 21264/EV68A GCLK  
div  
frequency to 500 MHz with a ClkIn_x frequency of 166.7 MHz, the system must select  
a Y divider of 3 by placing the value 0011 on pins IRQ_H[3:0].  
div  
2
Table 7–12 Differential Reference Clock Frequencies in Full-Speed Lock  
1
GCLK  
Reference Clock Frequency (MHz) for Y  
Dividers  
div  
2
Period Frequency  
3
4
5
6
7
8
9
10  
11  
12  
13  
14  
15  
16  
(ns)  
2.5  
2.4  
2.3  
2.2  
2.1  
2.0  
1.9  
1.8  
1.7  
1.6  
1.5  
1.4  
1.3  
1.2  
1.1  
1.0  
0.9  
0.8  
(MHz)  
400  
133.3 100  
80  
83.3  
416.7  
434.8  
454.5  
476.2  
500  
138.9 104.2 83.3  
144.9 108.7 87.0  
151.2 113.6 90.9  
158.7 119.0 95.2  
166.7 125.0 100  
83.3  
526.3  
555.6  
588.2  
625  
175.4 131.6 105.3 87.7  
185.2 138.9 111.1 92.6  
196.1 147.1 117.6 98.0 84.0  
156.3 125.0 104.2 89.3  
666.7  
714.3  
769.2  
833.3  
909.1  
1000  
1111.1  
1250  
166.7 133.3 111.1 95.2 83.3  
178.6 142.9 119.1 102.0 89.3  
192.3 153.8 128.2 109.9 96.2 85.5  
166.7 138.9 119.0 104.2 92.6 83.3  
181.8 151.5 129.9 113.6 101 90.9  
111,1 100  
185.2 158.7 138.9 123.5 111.1 101.0 92.6 85.5  
178.6 156.3 138.9 125 113.6 104.2 96.2 89.3  
200  
166.7 142.9 125  
90.9 83.3  
1
2
Divider 16 is out of range for the 21264/EV68A and reserved for future use. Valid reference clock  
(ClkIn_x) frequencies for the 21264/EV68A are specified in the range from 80 to 200. Divider values  
that are out of that range are displayed as a dash “—”.  
Dividers of 1 and 2 are to be used only in a PLL test mode.  
7.11.2.4 Power-Up/Reset Clocking  
During the power-up/reset sequence, when not in PLL bypass mode, there may be a  
period of time when ClkIn_x is not yet running, but there is a voltage on PLL_VDD.  
The signal DCOK_H is deasserted until power is good throughout the system. The  
10% to 90% rise time of DCOK_H should be less than 2 ns. The deasserted state of  
DCOK_H and the presence of PLL_VDD causes the PLL to generate a global clock  
that is distributed throughout the 21264/EV68A with a frequency range of 1 MHz to  
500 MHz. The presence of the global clock during this period avoids permanent dam-  
age to the 21264/EV68A.  
7–20 Initialization and Configuration  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
8
Error Detection and Error Handling  
This chapter gives an overview of the 21264/EV68A error detection and error handling  
mechanisms, and is organized as follows:  
Data error correction code  
Icache data or tag parity error  
Dcache tag parity error  
Dcache data correctable ECC error  
Dcache store second error  
Dcache duplicate tag parity error  
Bcache tag parity error  
Bcache data correctable ECC error  
Memory/system port data correctable ECC error  
Bcache data correctable ECC error on a probe  
Double-bit fill errors  
Error case summary  
Table 8–1 summarizes the 21264/EV68A error detection.  
Table 8–1 21264/EV68A Error Detection Mechanisms  
Component  
Error Detection Mechanism  
Parity protected.  
Bcache tag  
Bcache data array  
Dcache tag array  
Dcache duplicate tag array  
Dcache data array  
Quadword-ECC protected.  
Parity protected.  
Parity protected.  
Quadword-ECC protected, however this mode of operation is  
only supported in systems that have ECC enabled on both the  
system and Bcache ports.  
Icache tag array  
Parity protected.  
Icache data array  
System port data bus  
Parity protected.  
Quadword-ECC protected.  
21264/EV68A Hardware Reference Manual  
Error Detection and Error Handling  
8–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Data Error Correction Code  
8.1 Data Error Correction Code  
The 21264/EV68A supports a quadword error correction code (ECC) for the system data  
bus. ECC is generated by the 21264/EV68A for all memory write transactions  
(WrVictimBlk) emitted from the 21264/EV68A and for all probe data. ECC is also  
checked on every memory read transaction for single-bit correction and double-bit error  
detection. Bcache data is checked for fills to the Dcache and Icache, and for Bcache-to-  
system transfers that are initiated by a probe (if enabled by the CSR  
ENABLE_PROBE_CHECK).  
The 21264/EV68A ECC implementation corrects single-bit errors in hardware.  
I/O write transaction data will not have a valid ECC (the ECC bits must be ignored by  
the system). Also, ECC checking is not performed on I/O read data.  
Error detection and correction can be enabled/disabled by way of Mbox IPR  
DC_CTL[DCDAT_ERR_EN].  
Table 8–2 shows the ECC code.  
Table 8–2 64-Bit Data and Check Bit ECC Code  
11 1111 1111 2222 2222 2233 3333 3333 4444 4444 4455 5555 5555 6666 CCCC CCCC  
0123 4567 8901 2345 6789 0123 4567 8901 2345 6789 0123 4567 8901 2345 6789 0123 0123 4567  
CB0  
CB1  
CB2  
CB3  
CB4  
CB5  
CB6  
CB7  
0111 0100 1101 0010 0111 0100 1101 0010 1000 1011 0010 1101 1000 1011 0010 1101 1000 0000  
1110 1010 1010 1000 1110 1010 1010 1000 1110 1010 1010 1000 1110 1010 1010 1000 0100 0000  
1001 1001 0110 0101 1001 1001 0110 0101 1001 1001 0110 0101 1001 1001 0110 0101 0010 0000  
1100 0111 0001 1100 1100 0111 0001 1100 1100 0111 0001 1100 1100 0111 0001 1100 0001 0000  
0011 1111 0000 0011 0011 1111 0000 0011 0011 1111 0000 0011 0011 1111 0000 0011 0000 1000  
0000 0000 1111 1111 0000 0000 1111 1111 0000 0000 1111 1111 0000 0000 1111 1111 0000 0100  
1111 1111 0000 0000 0000 0000 1111 1111 1111 1111 0000 0000 0000 0000 1111 1111 0000 0010  
1111 1111 0000 0000 0000 0000 1111 1111 0000 0000 1111 1111 1111 1111 0000 0000 0000 0001  
8.2 Icache Data or Tag Parity Error  
The following actions are performed when an Icache data or tag parity error occurs.  
1. When the hardware detects an error during an Icache read transaction, it traps and  
replays the instructions that were fetched during the error, then flushes the entire  
Icache so the re-fetched instructions do not come directly from the Icache.  
2. I_STAT[PAR] is set.  
3. A corrected read data (CRD) interrupt is posted, when enabled. (Pass 3 only)  
8.3 Dcache Tag Parity Error  
The primary copies of the Dcache tags are used only when servicing 21264/EV68A-gen-  
erated load and store instructions. There are correctable and uncorrectable forms of this  
error. If an issued load or store instruction detects a Dcache tag parity error, the following  
actions are performed:  
1. MM_STAT[DC_TAG_PERR] is set.  
2. A Dstream fault (DFAULT) is taken.  
8–2  
Error Detection and Error Handling  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                     
Dcache Data Single-Bit Correctable ECC Error  
3. The virtual address associated with the error is available in the VA register.  
4. The PALcode flushes the error block by temporarily disabling  
DC_CTL[DCTAG_PAR_EN] and evicting the block using two HW_LD instruc-  
tions. The onchip duplicate tag provides the correct victim address and cache  
coherence state.  
If a retried load instruction detects the Dcache tag parity error, the memory reference  
may have already been retired, so the EXC_ADDR is not available. In this case, the  
error is uncorrectable and the Mbox performs the following actions:  
Either DC_STAT[TPERR_P0] or DC_STAT[TPERR_P1] is set, indicating the  
source of the error.  
When enabled, a machine check (MCHK) is posted. The MCHK is taken when not  
in PALmode.  
8.4 Dcache Data Single-Bit Correctable ECC Error  
The following operations may cause Dcache data ECC errors:  
Load instructions  
Stores of less than quadword length  
Dcache victim read transactions  
The hardware flow used for Dcache data ECC errors depends on the event that  
caused the error.  
8.4.1 Load Instruction  
Loads that read data from the Dcache may do so either in the same cycle as the Dcache  
tag probe (typical case) or in some subsequent cycle (load-queue retry). The hardware  
functional flows for these two error cases differ slightly.  
When a load instruction reads the Dcache data array in the same cycle as the tag array,  
if an ECC error occurs on the LSD ECC error detectors, then the Ibox stops retiring  
instructions and does not resume retiring until after hardware recovers from the error.  
If an ECC error occurs on the LSD ECC error detectors, when a load instruction reads  
the Dcache tag array before it reads the Dcache data array, then the load instruction may  
have already been retired. In either case:  
The incorrect data is written into the load instruction’s destination register;  
however, the load queue retains the state associated with the load instruction.  
A consumer of the load instruction’s data may be issued before the error is  
recognized; however, the Ibox will invoke a replay trap at an instruction that is  
older than (or equal to) any instruction that consumes the load instruction’s data,  
and then stalls the replayed Istream in the map stage of the pipeline until the error is  
corrected.  
Given a READ_ERR read-type from the Mbox for the error load instruction, the  
Cbox scrubs the block in the Dcache by evicting the block into the victim buffer  
(thereby scrubbing it) and writing it back into the Dcache as follows:  
C_STAT[DSTREAM_DC_ERR] is set.  
21264/EV68A Hardware Reference Manual  
Error Detection and Error Handling  
8–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Dcache Store Second Error  
C_ADDR contains bits [19:6] of the Dcache address of the block that contains  
the error (bits [42:20] of the physical address are not updated).  
DC_STAT[ECC_ERR_LD] is set.  
The load queue retries the load and rewrites the register.  
A corrected read data (CRD) error interrupt is posted, when enabled.  
Note:  
Errors in speculative load instructions cause a CRD error interrupt  
to be posted but the data is not scrubbed by hardware. The PALcode  
cannot perform a scrub because C_STAT is zero and C_ADDR does not  
contain the address of the error.  
8.4.2 Store Instruction (Quadword or Smaller)  
A store instruction that is a quadword or smaller could invoke a Dcache ECC error,  
since the original quadword must be read to calculate the new check bits.  
The Mbox scrubs the original quadword and replays the write transaction.  
DC_STAT[ECC_ERR_ST] is set.  
A corrected read data (CRD) error interrupt is posted, when enabled.  
8.4.3 Dcache Victim Extracts  
Dcache victims with an ECC error are scrubbed as they are written into the  
victim data buffer.  
No status is logged.  
No exception is posted.  
8.5 Dcache Store Second Error  
A second store instruction error is logged when it occurs close behind the first.  
Neither error is corrected.  
DC_STAT[ECC_ERR_ST] is set.  
DC_STAT[SEO] is set.  
When enabled, a machine check (MCHK) is posted. The MCHK is taken when not  
in PALmode.  
8.6 Dcache Duplicate Tag Parity Error  
The Dcache duplicate tag has the correct version of the Dcache coherence state for the  
21264/EV68A, allowing it to be used for correct tag/status data when the Dcache tags  
generate a parity error. These tags are parity protected also; however, the Dcache dupli-  
cate tag cell is designed to be much more tolerant of soft errors. The parity generators  
for the duplicate tags are enabled whenever the Cbox performs a physically-indexed  
read transaction of eight locations in the tag array. If an error is generated, the following  
actions are taken:  
Dcache duplicate tag parity errors are not recoverable.  
8–4  
Error Detection and Error Handling  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Bcache Tag Parity Error  
C_STAT[DC_PERR] is set.  
C_ADDR contains bits [42:6] of the Dcache duplicate tag address of the block that  
contains the error.  
When enabled, a machine check (MCHK) is posted. The MCHK is taken when not  
in PALmode.  
8.7 Bcache Tag Parity Error  
The Bcache tag parity is checked on all Bcache tag references, including references  
invoked by system probes. If an error is detected, the following actions are taken:  
Bcache tag parity errors are not recoverable.  
C_STAT[BC_PERR] is set.  
C_ADDR contains bits [42:6] of the Bcache address of the block that contains the  
error.  
When enabled, a machine check (MCHK) is posted. The MCHK is taken when not  
in PALmode.  
8.8 Controlling Bcache Block Parity Calculation  
Parity is calculated for either valid Bcache blocks or all Bcache blocks. The calculation  
is controlled by the value in Cbox CSR BC_VALID_MODE in the WRITE_MANY  
chain, as follows:  
If the MSB of BcTag_H is less than the value of Maximum PA in Table 4–13, then  
BC_VALID_MODE=1 and parity is calculated for only valid Bcache blocks.  
If the MSB of BcTag_H is greater than or equal to the value of Maximum PA in  
Table 4–13, then BC_VALID_MODE=0 and parity is calculated for all Bcache  
blocks.  
For example, if BcTag_H[38:20] and Maximum PA is 36, then 38 is greater than or  
equal to 36 and BC_VALID_MODE=0 and parity is calculated for all Bcache blocks.  
8.9 Bcache Data Single-Bit Correctable ECC Error  
The following actions may trigger Bcache data ECC errors:  
Icache fill, data possibly used by Icache  
Dcache fill, data possibly used by load instruction  
Bcache victim during an ECB instruction or during a Dcache/Bcache miss  
The recovery mechanism depends on the action that triggered the error.  
8.9.1 Icache Fill from Bcache  
For an Icache fill, the LSD ECC checkers detect the error, and bad Icache data parity is  
generated for the octaword that contains the quadword in error. If an error is detected,  
the following actions are taken:  
The hardware flushes the Icache.  
C_STAT[ISTREAM_BC_ERR] is set.  
21264/EV68A Hardware Reference Manual  
Error Detection and Error Handling  
8–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Bcache Data Single-Bit Correctable ECC Error  
C_ADDR contains bits [42:6] of the Bcache fill address of the block that contains  
the error.  
C_SYNDROME_0[7:0] and C_SYNDROME_1[7:0] contain the syndrome of  
quadword 0 and 1, respectively, of the octaword subblock that contains the error.  
A machine check (MCHK) is posted and taken immediately. The PALcode machine  
check handler performs a scrubbing operation as described in Section D.36 to  
ensure that the origination point of the error is corrected.  
Note:  
A corrected read data (CRD) error interrupt is also posted in case this error  
is in a speculative path and the MCHK is removed. The CRD PALcode  
reads the status, to detect this condition, and scrubs the block. In the normal  
MCHK flow, the PALcode clears the pending CRD error.  
8.9.2 Dcache Fill from Bcache  
If the quadword in error is not used to satisfy a load instruction, a hardware recovery  
flow is not invoked. The quadword in error, and its associated check bits, are written  
into the Dcache. However, status is logged as shown in the bulleted list below, and a  
corrected read data (CRD) error interrupt is posted, when enabled. PALcode may elect  
to correct the error by scrubbing the block. If the error is not corrected by PALcode  
when it occurs, the error will be detected and corrected by a later load/victim operation.  
If the quadword in error is used to satisfy a load instruction, then the flow is very simi-  
lar to that used for a Dcache ECC error. The LSD ECC checker detects the error and the  
21264/EV68A performs the following actions:  
The load instruction’s destination register is written with incorrect data; however,  
the load queue will retain the state associated with the load instruction.  
A consumer of the load instruction’s data may be issued before the error is  
recognized. The Ibox will invoke a replay trap at an instruction that is older than (or  
equal to) any instruction that consumes the load instruction’s data. The 21264/  
EV68A then stalls the replayed Istream in the map stage of the pipeline, until the  
error is corrected.  
With a READ_ERR read type from the Mbox for the load instruction in error, the  
Cbox scrubs the block in the Dcache by evicting the block into the victim buffer  
and writing it back into the Dcache.  
C_STAT[DSTREAM_BC_ERR] is set.  
C_ADDR contains bits [42:6] of the Bcache fill address of the block that contains  
the error.  
C_SYNDROME_0[7:0] and C_SYNDROME_1[7:0] contain the syndrome of  
quadword 0 and 1, respectively, of the octaword subblock that contains the error.  
The load queue retries the load instruction and rewrites the register.  
DC_STAT[ECC_ERR_LD] is set.  
A corrected read data (CRD) error interrupt is posted, when enabled.  
8–6  
Error Detection and Error Handling  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Memory/System Port Single-Bit Data Correctable ECC Error  
Note:  
Errors in speculative load instructions cause a CRD error to be posted but  
the data is not scrubbed by hardware. The PALcode cannot perform a scrub  
operation because C_STAT is zero and C_ADDR does not contain the  
address of the block in error.  
8.9.3 Bcache Victim Read  
A victim from the Bcache is written directly to the system port, without correction. The  
ECC parity checker on the LSD detects the error and posts a corrected read data (CRD)  
error interrupt. The Cbox error register is not updated.  
8.9.3.1 Bcache Victim Read During a Dcache/Bcache Miss  
While the Bcache is servicing a Dcache miss and that Bcache access is also a miss, and  
an error occurs during that Bcache data access, the Cbox does not latch the error infor-  
mation. However, the Mbox correction state machine is activated and it invokes a CRD  
error despite the fact that no correction is performed.  
The Bcache access error is written out to memory and is subsequently detected and cor-  
rected by the next consumer of the data.  
No correction is made.  
No status is logged (C_STAT = 0).  
A CRD error interrupt is posted, when enabled.  
8.9.3.2 Bcache Victim Read During an ECB Instruction  
A victim from the Bcache that occurs while an ECB instruction is being executed is  
written directly to the system port without correction. No Cbox registers are set and no  
exception is taken.  
8.10 Memory/System Port Single-Bit Data Correctable ECC Error  
The following actions may cause memory/system port data ECC errors:  
Icache fill–data possibly used by Icache  
Dcache fill–data possibly used by a load instruction  
The recovery mechanism depends on the event that caused the error.  
8.10.1 Icache Fill from Memory  
For an Icache fill the LSD ECC generators detect the error, and bad Icache data  
parity is generated for the octaword that contains the quadword in error.  
The hardware flushes the Icache.  
C_STAT[ISTREAM_MEM_ERR] is set.  
C_ADDR contains bits [42:6] of the system memory fill address of the block that  
contains the error.  
C_SYNDROME_0[7:0] and C_SYNDROME_1[7:0] contain the syndrome of  
quadword 0 and 1, respectively, of the octaword subblock that contains the error.  
21264/EV68A Hardware Reference Manual  
Error Detection and Error Handling  
8–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                   
Memory/System Port Single-Bit Data Correctable ECC Error  
A machine check (MCHK) is posted and taken immediately. The PALcode machine  
check handler performs a scrubbing operation as described in Section D.36 to  
ensure that the origination point of the error is corrected.  
Note:  
Also, a corrected read data (CRD) error is posted, when enabled, in case  
this error is in a speculative path and the MCHK is removed. The CRD  
error PALcode reads the status to detect this condition and scrubs the block.  
In the normal MCHK flow, the PALcode clears the pending CRD error.  
8.10.2 Dcache Fill from Memory  
If the quadword in error is not used to satisfy a load instruction, no hardware  
recovery flow is invoked. The quadword in error, and its associated check bits, are writ-  
ten into the Dcache. However, status is logged as shown in the bulleted list below and a  
corrected read data (CRD) error interrupt is posted, when enabled. PALcode may  
choose to correct the error by scrubbing the block. If the error is not corrected by PAL-  
code at the time, the error will be detected and corrected by a load/victim operation.  
If the quadword in error is used to satisfy a load instruction, then the flow is very simi-  
lar to that used for a Dcache ECC error:  
The load instruction’s destination register is written with incorrect data; however,  
the load queue will retain the state associated with the load instruction.  
A consumer of the load instruction’s data may be issued before the error is  
recognized; however, the Ibox will invoke a replay trap at an instruction that is  
older than (or equal to) any instruction that consumes the load instruction’s data.  
The Ibox stalls the replayed Istream in the map stage of the pipeline until the error  
is corrected.  
With a READ_ERR read type from the Mbox for the load instruction in error, the  
Cbox scrubs the block in the Dcache by evicting the block into the victim buffer  
and writing it back into the Dcache.  
C_STAT[DSTREAM_MEM_ERR] is set.  
C_ADDR contains bits [42:6] of the system memory fill address of the block that  
contains the error.  
C_SYNDROME_0[7:0] and C_SYNDROME_1[7:0] contain the syndrome of  
quadword 0 and 1, respectively, of the octaword subblock that contains the error.  
The load queue retries the load instruction and rewrites the register.  
DC_STAT[ECC_ERR_LD] is set.  
A corrected read data (CRD) error interrupt is posted, when enabled.  
Note:  
Errors in speculative load instructions cause a CRD error to be posted but  
the data is not scrubbed by hardware. The PALcode cannot scrub the data  
because C_STAT is zero, and C_ADDR does not have the address of the  
block with the error.  
8–8  
Error Detection and Error Handling  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Bcache Data Single-Bit Correctable ECC Error on a Probe  
8.11 Bcache Data Single-Bit Correctable ECC Error on a Probe  
The probed processor extracts the block from its Bcache, signaling a corrected read  
data (CRD) error and latching error information. The single-bit ECC detected error data  
is not corrected by the probed processor, but is forwarded to the requesting processor.  
The requesting processor then detects a related system fill error as a result of this sys-  
tem probe transaction.  
No hardware correction is performed.  
C_STAT[PROBE_BC_ERR] is set.  
C_ADDR contains bit [42:6] of the Bcache address of the block that contains the  
error.  
C_SYNDROME_0[7:0] and C_SYNDROME_1[7:0] contain the syndrome of  
quadword 0 and 1, respectively, of the octaword subblock that contains the error.  
A CRD error interrupt is posted, when enabled.  
The PALcode on the probed processor may choose to scrub the error, though it will  
probably be scrubbed by the requesting processor.  
8.12 Double-Bit Fill Errors  
Double-bit errors for fills are detected, but not corrected, in the 21264/EV68A. The fol-  
lowing events may cause a double-bit fill error:  
Icache fill from Bcache  
Dcache fill from Bcache  
Icache fill from memory  
Dcache fill from memory  
If an error is detected, the following actions are taken:  
C_STAT is set to one of the following:  
ISTREAM_BC_DBL (Icache fill from Bcache)  
DSTEAM_BC_DBL (Dcache fill from Bcache)  
ISTREAM_MEM_DBL (Icache fill from memory)  
DSTREAM_MEM_DBL (Dcache fill from memory)  
C_ADDR contains bits [42:6] of the system memory fill address of the block that  
contains the error.  
When enabled, a machine check (MCHK) is posted. The MCHK is taken when not  
in PALmode.  
A double-bit fill error from memory, marked by the data's corresponding ECC,  
when written to cache, also writes the corresponding ECC to cache. Any consumer  
of that error (such as another CPU) also consumes the corresponding ECC value.  
Note:  
C_ADDR may be inaccurate in heavy traffic conditions. C_STAT is accu-  
rate.  
21264/EV68A Hardware Reference Manual  
Error Detection and Error Handling  
8–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Error Case Summary  
8.13 Error Case Summary  
Table 8–3 summarizes the various error cases and their ramifications.  
Table 8–3 Error Case Summary  
Hardware  
Action  
Error  
Exception Status  
CRD ISTAT[PAR]  
PALcode Action  
Icache data or tag  
parity error  
Icache flushed Log as CRD  
Dcache tag parity  
error (on issue)  
DFAULT MM_STAT[DC_TAG_PERR]  
VA[address]  
Evict with two  
HW_LDs and log as  
CRD  
1
Dcache tag parity  
error (on retry)  
MCHK  
CRD  
DC_STAT[TPERR_P0] or  
DC_STAT[TPERR_P1]  
Log as MCHK  
Dcache single-bit  
ECC error on load  
DC_STAT[ECC_ERR_LD]  
Corrected and Log as CRD  
C_STAT[DSTREAM_DC_ERR] scrubbed  
C_ADDR[bits [19:6] of the error  
address. [42:20] not updated.]  
Dcache single-bit  
ECC error on  
speculative load  
CRD  
CRD  
None  
DC_STAT[ECC_ERR_LD]  
C_STAT contains zero  
None  
Log as CRD  
Dcache single-bit  
ECC error on small  
store  
DC_STAT[ECC_ERR_ST]  
None  
Corrected and Log as CRD  
scrubbed  
Dcache single-bit  
ECC error on victim  
read  
Corrected and None  
scrubbed  
Dcache second error MCHK  
on store  
DC_STAT[SEO]  
No correction Log as MCHK  
on either store  
Dcache duplicate tag MCHK  
parity error  
C_STAT[DC_PERR]  
C_ADDR[error address]  
Uncorrectable Log as MCHK  
Bcache tag parity  
error  
MCHK  
C_STAT[BC_PERR]  
C_ADDR[error address]  
Uncorrectable Log as MCHK  
Bcache single-bit  
error on Icache fill  
MCHK  
C_STAT[ISTREAM_BC_ERR]  
Icache flushed Scrub error as described  
2
and CRD C_ADDR[error address]  
C_SYNDROME_0  
C_SYNDROME_1  
Bcache single-bit  
error on Dcache fill  
CRD  
DC_STAT[ECC_ERR_LD]  
C_STAT[DSTREAM_BC_ERR]  
C_ADDR[error address]  
C_SYNDROME_0  
Corrected and Scrub error as described  
scrubbed in  
3
Dcache  
C_SYNDROME_1  
Bcache victim read CRD  
on Dcache/Bcache  
miss  
DC_STAT[ECC_ERR_LD]  
C_STAT contains 0  
None  
None  
Log as CRD  
None  
Bcache victim read None  
on ECB  
None  
8–10 Error Detection and Error Handling  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Error Case Summary  
Table 8–3 Error Case Summary (Continued)  
Hardware  
Action  
Error  
Exception Status  
MCHK  
and CRD C_ADDR[error address]  
C_SYNDROME_0  
PALcode Action  
Memory single-bit  
error on Icache fill  
C_STAT[ISTREAM_MEM_ERR] Icache flushed Scrub error as described  
C_SYNDROME_1  
Memory single-bit  
error on Dcache fill  
CRD  
CRD  
DC_STAT[ECC_ERR_LD]  
C_STAT[DSTREAM_MEM_ERR] scrubbed in  
C_ADDR[error address]  
C_SYNDROME_0  
C_SYNDROME_1  
Corrected and Scrub error as described  
Dcache  
Bcache single-bit  
error on a probe hit  
C_STAT[PROBE_BC_ERR]  
C_ADDR[error address]  
C_SYNDROME_0  
C_SYNDROME_1  
None  
May scrub error as  
described in Section  
D.36.  
4
Log as CRD  
Bcache double-bit  
error on Icache fill  
MCHK  
MCHK  
C_STAT[ISTREAM_BC_DBL]  
C_ADDR[error address]  
None  
None  
None  
None  
Log as MCHK  
Log as MCHK  
Log as MCHK  
Log as MCHK  
Bcache double-bit  
error on Dcache fill  
C_STAT[DSTREAM_BC_DBL]  
C_ADDR[error address]  
Memory double-bit MCHK  
error on Icache fill  
C_STAT[ISTREAM_MEM_DBL]  
C_ADDR[error address]  
Memory double-bit MCHK  
error on Dcache fill  
C_STAT[DSTREAM_MEM_DBL]  
C_ADDR[error address]  
1
Machine check taken in native mode. It is deferred while in PALmode.  
CRD error posted in case the machine check is down a speculative path.  
2
3
For a single-bit error on a non-target quadword, the error is not corrected in hardware,  
but is corrected by PALcode during the scrub operation.  
4
The contents of C_ADDR may not be accurate when there is heavy cache fill traffic.  
21264/EV68A Hardware Reference Manual  
Error Detection and Error Handling 8–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Download from Www.Somanuals.com. All Manuals Search And Download.  
9
Electrical Data  
This chapter describes the electrical characteristics of the 21264/EV68A and its inter-  
face pins. The chapter contains both ac and dc electrical characteristics and power sup-  
ply considerations, and is organized as follows:  
Electrical characteristics  
DC characteristics  
Power supply sequencing  
AC characteristics  
9.1 Electrical Characteristics  
Table 9–1 lists the maximum electrical ratings for the 21264/EV68A.  
Table 9–1 Maximum Electrical Ratings  
Characteristics  
Ratings  
Storage temperature  
–55° C to +125° C (–67° F to 257° F)  
Junction temperature  
0° C to 100° C (32° F to 212° F)  
Open-drain driver interface  
Push-pull driver interface  
Minimum dc voltage on signal pins  
2.0 V  
1.7 V  
VSS – 400 mV  
Maximum power @ indicated VDD  
for the following frequencies:  
Frequency  
750 MHz  
833 MHz  
875 MHz  
940 MHz  
Peak Power  
60 W @ 1.7 V  
67 W @ 1.7 V  
70 W @ 1.7 V  
75 W @ 1.7 V  
Notes:  
Stresses above those listed under the given maximum electrical ratings may  
cause permanent device failure. Functionality at or above these  
limits is not implied. Exposure to these limits for extended periods of time  
may affect device reliability.  
Power data is preliminary and based on measurements from a limited set of  
material.  
21264/EV68A Hardware Reference Manual  
Electrical Data  
9–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
DC Characteristics  
9.2 DC Characteristics  
This section contains the dc characteristics for the 21264/EV68A. The 21264/EV68A  
pins can be divided into 10 distinct electrical signal types. The mapping between these  
signal types and the package pins is shown in Chapter 3. Table 9–2 shows the signal  
types.  
Table 9–2 Signal Types  
Signal Type  
I_DC_POWER  
I_DC_REF  
I_DA  
Description  
Supply voltage pins (VDD/PLL_VDD)  
Input dc reference pin  
Input differential amplifier receiver  
Input differential amplifier clock receiver  
Open-drain output driver  
I_DA_CLK  
O_OD  
O_OD_TP  
O_PP  
Open-drain driver for test pins  
Push-pull output driver  
O_PP_CLK  
B_DA_OD  
B_DA_PP  
Push-pull output clock driver  
Bidirectional differential amplifier receiver — open-drain  
Bidirectional differential amplifier receiver — push-pull  
Tables 9–3 through 9–12 show the dc switching characteristics of each signal type.  
Also, the following notes apply to Tables 9–3 through 9–12.  
1. The differential voltage, Vdiff, is the absolute difference between the differential  
input pins.  
2. Delta V  
is defined as the open-circuit differential voltage on the appropriate  
BIAS  
differential pairs. Test condition for these inputs are to let the input network self  
bias and measure the open circuit voltage. The test load must be 1M ohm. In nor-  
mal operation, these inputs are coupled with a 680-pF capacitor.  
3. Functional operation of the 21264/EV68A with less than all VDD and VSS pins  
connected is not implied.  
4. The test load is a 50-ohm resistor to VDD/2. The resistor can be connected to the  
21264/EV68A pin by a 50-ohm transmission line of any length.  
5. DC test conditions set the minimum swing required. These dc limits set the trip  
point precision.  
6. Input pin capacitance values include 2.0 pF added for package capacitance.  
9–2  
Electrical Data  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                                 
DC Characteristics  
Note:  
Current out of a 21264/EV68A pin is represented by a – symbol while a +  
symbol indicates current flowing into a 21264/EV68A pin.  
Table 9–3 VDD (I_DC_POWER)  
Parameter Symbol  
VDD  
Description  
Test Conditions Minimum Maximum  
Processor core supply voltage  
Processor power required (sleep)  
1.6 V  
1.7 V  
1
Power (sleep)  
@ VDD = 1.7 V  
Note 3  
12 W  
PLL_VDD  
PLL supply voltage  
2.45 V  
2.55 V  
25 mA  
PLL_IDD  
PLL supply current (running)  
Freq = 940 MHz  
1
Power measured at 37.5 MHz while running the “Ebox aliveness test.”  
Table 9–4 Input DC Reference Pin (I_DC_REF)  
Parameter  
Symbol  
Description  
Test Conditions Minimum  
Maximum  
VREF  
DC input reference voltage  
Input current  
600 mV  
VDD – 100 mV  
| I |  
VSS V VDD  
150 µA  
I
Table 9–5 Input Differential Amplifier Receiver (I_DA)  
Parameter  
Symbol  
Description  
Test Conditions Minimum  
Maximum  
V
V
Low-level input voltage  
High-level input voltage  
Input current  
Note 5  
VREF – 200 mV  
IL  
VREF + 200 mV —  
IH  
| I |  
VSS V VDD —  
Freq =10 MHz  
150 µA  
I
C
Input-pin capacitance  
5.7 pF  
Note 6  
IN  
Table 9–6 Input Differential Amplifier Clock Receiver (I_DA_CLK)  
Parameter  
Symbol  
Description  
Test Conditions Minimum  
Maximum  
V
Differential input voltage  
Open-circuit differential  
200 mv Note 1  
diff  
| V  
|
I ≤ ± 1 µA  
50 mV  
BIAS  
Note 2  
| I |  
Input current  
VSS V VDD  
150 µA  
I
C
Input-pin capacitance  
Freq =10 MHz  
5.0 pF  
Note 6  
IN  
21264/EV68A Hardware Reference Manual  
Electrical Data  
9–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                       
DC Characteristics  
Table 9–7 Pin Type: Open-Drain Output Driver (O_OD)  
Parameter  
Symbol  
Test  
Conditions  
Description  
Minimum  
Maximum  
400 mV  
150 µA  
V
|I  
Low-level output voltage  
High impedance output current  
Open-drain pin capacitance  
I
= 50 mA  
OL  
OZ  
OD  
OL  
|
0 < V < VDD  
C
Freq = 10 MHz  
5.7 pF  
Note 6  
Table 9–8 Bidirectional, Differential Amplifier Receiver, Open-Drain Output Driver (B_DA_OD)  
Parameter  
Symbol  
Description  
Test Conditions Minimum  
Maximum  
VREF –200 mv  
V
V
V
Low-level input voltage  
High-level input voltage  
Low-level output voltage  
Input current  
Note 5  
IL  
VREF + 200 mV  
IH  
OL  
I
= 50 mA  
400 mV  
OL  
1
| I |  
VSS V VDD  
150 µA  
I
C
Input-pin capacitance  
Freq =10 MHz  
5.7 pF  
Note 6  
IN  
1
Measurement taken with output driver disabled.  
Table 9–9 Pin Type: Open-Drain Driver for Test Pins (O_OD_TP)  
Parameter  
Symbol  
Test  
Conditions  
Description  
Minimum  
Maximum  
400 mV  
V
Low-level output voltage  
High-impedance output current  
Pin capacitance  
I
= 15 mA  
OL  
OZ  
OD  
OL  
| I  
|
0 < V < VDD  
150 µA  
C
_TP  
Freq = 10 MHz  
5.2 pF  
Note 6  
Table 9–10 Bidirectional, Differential Amplifier Receiver, Push-Pull Output Driver (B_DA_PP)  
Parameter  
Symbol  
Description  
Test Conditions Minimum  
Maximum  
V
V
V
V
Low-level input voltage  
High-level input voltage  
Low-level output voltage  
High-level output voltage  
Input current  
VREF – 200 mV  
IL  
VREF + 200 mV —  
IH  
I
I
= 6 mA  
400 mV  
OL  
OH  
OL  
OH  
= –6 mA  
VDD – 400 mV  
1
| I |  
VSS V VDD  
150 µA  
I
C
Input-pin capacitance  
Freq =10 MHz  
6.0 pF  
Note 6  
IN  
1
Measurement taken with output driver disabled.  
9–4  
Electrical Data  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
               
Power Supply Sequencing and Avoiding Potential Failure Mechanisms  
Table 9–11 Push-Pull Output Driver (O_PP)  
Parameter  
Symbol  
Test  
Conditions  
Description  
Minimum  
Maximum  
500 mV  
V
V
| I  
Low-level output voltage  
High-level output voltage  
High-impedance output current  
Open-drain pin capacitance  
I
I
= 40 mA  
OL  
OH  
OZ  
OD  
OL  
OL  
= –40 mA  
VDD – 500 mV  
|
0 < V < VDD  
150 µA  
C
Freq = 10 MHz  
6.0 pF  
Note 6  
Table 9–12 Push-Pull Output Clock Driver (O_PP_CLK)  
Parameter  
Symbol  
Test  
Conditions  
Description  
Minimum  
Maximum  
VDD/2 – 325 mV  
V
V
Low-level output voltage  
High-level output voltage  
Note 4  
OL  
OH  
Note 4  
VDD/2 + 325 mV  
1
| I  
|
High-impedance output  
current  
0 < V < VDD  
40 mA  
OZ  
1
Measured value includes current from onchip termination structures.  
9.3 Power Supply Sequencing and Avoiding Potential Failure Mech-  
anisms  
Before the power-on sequencing can occur, systems should ensure that DCOK_H is  
deasserted and Reset_L is asserted. Then, systems ramp power to the 21264/EV68A  
PLL_VDD @ 2.5 V and the 21264/EV68A power planes (VDD @ 1.65 V, not to  
exceed 1.70 V under any circumstances), with PLL_VDD leading VDD. Systems  
should supply differential clocks to the 21264/EV68A on ClkIn_H and ClkIn_L. The  
clocks should be running as power is supplied.  
When enabling the power supply inputs in a system, three failure mechanisms must be  
avoided:  
1. Bidirectional signal buses must not conflict during power-up. A conflict on these  
buses can generate high current conditions, which can compromise the reliability of  
the associated chips.  
2. Similarly, input receivers should not see intermediate voltage levels that can also  
generate high current conditions, which can compromise the reliability of the  
receiving chip.  
3. Finally, no CMOS chip should see an input voltage that is higher than its internal  
VDD. In such a condition, a reasonable level of charge can be injected into the bulk  
of the die. This condition can expose the chip to a positive-feedback latchup  
condition.  
The 21264/EV68A addresses those three failure mechanisms by disabling all of its  
outputs and bidirectional pins (with three exceptions) until the assertion of DCOK_H.  
The three exceptions are Tdo_H, EV6Clk_L, and EV6Clk_H. Tdo_H is used only in  
21264/EV68A Hardware Reference Manual  
Electrical Data  
9–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
AC Characteristics  
the tester environment and does not need to be disabled. EV6Clk_L and EV6Clk_H  
are outputs that are both generated and consumed by the 21264/EV68A; thus, VDD  
tracks for both the producer and consumer.  
On the push-pull interfaces:  
Disabling all output drivers leaves the output signal at the DC bias point of the ter-  
mination network.  
Disabling the bidirectional drivers leaves the other consumers of the bus as the bus  
master.  
On the open-drain interfaces:  
Disabling all output drivers leaves the output signal at the voltage of the open-drain  
pull-up.  
Disabling all bidirectional drivers leaves the other consumers of the bus as the bus  
master.  
To avoid failure mechanism number two, systems must sequence and control external  
signal flow in such a way as to avoid zero differential into the 21264/EV68A input  
receivers (I_DA, I_DA_CLK, B_DA_OD, B_DA_PP, and B_DA_PP). Finally, to  
avoid failure mechanism number three, systems must sequence input and bidirectional  
pins (I_DA, I_DA_CLK, B_DA_OD, B_DA_PP, and I_DC_REF) such that the 21264/  
EV68A does not see a voltage above its VDD.  
In addition, as power is being ramped, Reset_L must be asserted — this allows the  
21264/EV68A to reset internal state. Once the target voltage levels are attained,  
systems should assert DCOK_H. This indicates to the 21264/EV68A that internal  
logic functions can be evaluated correctly and that the power-up sequence should be  
continued. Prior to DCOK_H being asserted, the logic internal to the 21264/EV68A is  
being reset and the internal clock network is running (either clocked by the VCO, which  
is at a nominal speed, or by ClkIn_H, if the PLL is bypassed).  
The reset state machine is in state WAIT_SETTLE.  
9.4 AC Characteristics  
Abbreviations:  
The following abbreviations apply to Table 9–13:  
TSU = Setup time  
Duty cycle = Minimum clock duty cycle  
TDH = Hold time  
Slew rate = referenced to signal edge  
AC Test Conditions:  
The following conditions apply to the measurements that are listed in Table 9–13:  
TSU, TDH, and TSkew values include the following onchip and package varia-  
tions: min/max metal capacitance, GCLK skew, transistor gate length variation  
(OCV), across-chip temperature and voltage variation, and package lead-length  
variation.  
9–6  
Electrical Data  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
AC Characteristics  
The AC specification values for TSU, TDH, and TSkew assume a ± 0.4 V input  
signal amplitude, relative to Vref, one volt per nanosecond slew rates, no mismatch  
in pin loading, and zero time-offset between forwarded clocks and data. Any devia-  
tion from those conditions can result in degraded chip performance.  
VDD is in the range between 1.55 V and 1.65 V.  
SysVref is 1.2 Volts.  
BcVref is 0.75 Volts.  
All output skew data is based on simulation into a 50-ohm transmission line that is  
terminated with 50 ohms to VDD/2 for Bcache timing, and with 50 ohms to VDD  
for all other timing.  
Timings are measured at the pins as follows:  
For open-drain outputs, timing is measured to (V + V  
)/2. Where V  
is  
term  
ol  
term  
the offchip termination voltage for system signals.  
For non-open-drain outputs, timing is measured to (V + V )/2.  
For all inputs other than type I_DA_CLK, timing is measured to the point  
where the input signal crosses VREF.  
ol oh  
For type I_DA_CLK inputs, timing is measured when the voltage on the com-  
plementary inputs is equal.  
Table 9–13 AC Specifications  
Signal Name  
Type  
Reference Signal  
TSU1 TDH2 TSkew  
Duty Cycle TSlew  
SysAddIn_L[14:0  
SysFillValid_L  
I_DA  
I_DA  
I_DA  
I_DA  
I_DA  
O_OD  
O_OD  
SysAddInClk_L  
SysAddInClk_L  
SysAddInClk_L  
SysAddInClk_L  
NA  
400 ps 400 ps NA  
400 ps 400 ps NA  
400 ps 400 ps NA  
400 ps 400 ps NA  
NA  
1.0 V/ns  
1.0 V/ns  
1.0 V/ns  
1.0 V/ns  
1.0 V/ns  
NA  
NA  
SysDataInValid_L  
SysDataOutValid_L  
SysAddInClk_L  
SysAddOut_L[14:0]  
SysAddOutClk_L  
SysData_L[63:0]  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
45–55%  
SysAddOutClk_L  
EV6Clk_x  
± 300 ps3 NA  
± 350 ps 45-55%  
NA  
± 300 ps3 NA  
400 ps 400 ps NA NA  
± 300 ps3 NA  
NA  
B_DA_OD SysDataInClk_H[7:0]  
SysDataOutClk_L[7:0]4  
400 ps 400 ps NA  
NA NA  
1.0 V/ns  
NA  
SysCheck_L[7:0]  
B_DA_OD SysDataInClk_H[7:0]  
SysDataOutClk_L[7:0]4  
1.0 V/ns  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
SysDataInClk_H[7:0]  
SysDataOutClk_L[7:0]  
BcAdd_H[23:4]  
BcDataOE_L  
I_DA  
NA  
NA  
45-55%  
45-55%  
1.0 V/ns  
NA  
O_OD  
O_PP  
EV6Clk_x  
± 350 ps  
BcTagOutClk_x  
BcDataOutClk_x[3:0]7  
± 300 ps5,6 NA  
45-55%  
O_PP  
BcLoad_L  
O_PP  
38-63%8  
40-60%9  
BcDataWr_L  
O_PP  
BcData_H[127:0]  
B_DA_PP  
BcDataOutClk_x[3:0]10  
NA  
NA  
± 300 ps6 45-55%  
38-63%8  
1.0 V/ns  
NA  
40-60%9  
NA  
BcDataInClk_H[7:0]  
400 ps 400 ps NA  
NA  
NA  
21264/EV68A Hardware Reference Manual  
Electrical Data  
9–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
AC Characteristics  
Table 9–13 AC Specifications (Continued)  
Signal Name  
Type  
Reference Signal  
TSU1 TDH2 TSkew  
Duty Cycle TSlew  
BcDataInClk_H[7:0]  
BcDataOutClk_H[3:0]  
BcDataOutClk_L[3:0]  
BcTag_H[42:20]  
I_DA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
45-55%  
O_PP  
EV6Clk_x  
± 350 ps  
± 350 ps  
O_PP  
EV6Clk_x  
B_DA_PP  
B_DA_PP  
BcTagInClk_H  
BcTagInClk_H  
BcTagOutClk_x  
BcTagInClk_H  
BcTagInClk_H  
BcTagOutClk_x  
BcTagInClk_H  
BcTagOutClk_x  
400 ps 400 ps NA  
400 ps 400 ps NA  
NA  
1.0 V/ns  
1.0 V/ns  
NA  
BcTagDirty_H  
NA  
38-63%8  
NA  
BcTagParity_H  
BcTagShared_H  
B_DA_PP  
B_DA_PP  
400 ps 400 ps NA  
400 ps 400 ps NA  
1.0 V/ns  
1.0 V/ns  
NA  
NA  
40-60%9  
NA  
BcTagValid_H  
B_DA_PP  
400 ps 400 ps NA  
1.0 V/ns  
NA  
NA  
NA  
± 300 ps6 45-55%  
BcTagOE_L  
BcTagWr_L  
BcTagInClk_H  
BcTagOutClk_x  
IRQ_H[5:0]  
Reset_L12  
O_PP  
O_PP  
I_DA  
NA  
NA  
NA  
NA  
NA  
NA  
45-55%  
O_PP  
I_DA  
EV6Clk_x  
DCOK_H  
± 350 ps  
10 ns11 10 ns11 NA  
NA  
100 mV/ns  
100 mV/ns  
100 mV/ns  
100 mV/ns  
1.0 V/ns  
1.0 V/ns  
NA  
I_DA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
NA  
DCOK_H13  
PllBypass_H14  
ClkIn_x15  
I_DA  
NA  
I_DA  
NA  
I_DA_CLK  
40–60%16  
NA  
FrameClk_x17  
EV6Clk_x18  
EV6Clk_x19  
ClkFwdRst_H  
SromData_H  
SromOE_L  
I_DA_CLK ClkIn_x  
O_PP_CLK ClkIn_x  
400 ps 400 ps NA  
NA NA ±1.0 ns  
YDiv±5%  
Cycle Compression Specification: See Note 19  
400 ps 400 ps NA  
I_DA  
I_DA  
O_OD  
FrameClk_x  
NA  
1.0 V/ns  
SromClk_H  
2.0 ns 2.0 ns NA  
100 mV/ns  
EV6Clk_x  
NA  
NA  
± 2.0 ns  
SromClk_H20  
O_OD  
EV6Clk_x  
NA  
NA  
± 7.0 ns  
Tms_H  
Trst_L21  
Tdi_H  
I_DA  
I_DA  
I_DA  
O_OD  
Tck_H  
Tck_H  
Tck_H  
Tck_H  
2.0 ns 2.0 ns NA  
NA NA NA  
2.0 ns 2.0 ns NA  
NA  
NA  
NA  
NA  
100 mV/ns  
100 mV/ns  
100 mV/ns  
NA  
Tdo_H  
NA  
NA  
± 7.0 ns  
Tck_H  
I_DA  
IEEE 1149.1 Port Freq. = 5.0  
MHz Max.  
NA  
NA  
NA  
NA  
NA  
45-55%  
NA  
100 mV/ns  
NA  
TestStat_H  
O_OD  
EV6Clk_x  
± 4.0 ns  
1
The TSU specified for all clock-forwarded signal groups is with respect to the associated clock.  
9–8  
Electrical Data  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
AC Characteristics  
2
3
The TDH specified for all clock-forwarded signal groups is with respect to the associated clock.  
The TSkew value applies only when the SYS_CLK_DELAY[0:1] entry in the Cbox WRITE_ONCE  
chain (Table 5–24) is set to zero phases of delay between forwarded clock out and address/data.  
4
5
The TSkew specified for SysData_L signals is only with respect to the associated clock.  
These signals should be referenced to BcTagOutClk_x when measuring TSkew, provided that  
BcTagOutClkl_x and BcDataOutClk_x have no programmed offset.  
6
The TSkew value applies only when the BC_CLK_DELAY[0:1] entry in the Cbox WRITE_ONCE  
chain (Table 5–24) is set to zero phases of delay for Bcache clock.  
7
The TSkew specified for BcAdd_H signals is only with respect to the associated clock.  
The duty cycle for 2.5X single data mode 2 GCLK phases high and 3 GCLK phases low.  
The duty cycle for 3.5X single data mode 3 GCLK phases high and 4 GCLK phases low.  
The TSkew specified for BcData_H signals is only with respect to the associated clock pair.  
8
9
10  
11  
IRQ_H[5:0] must have their TSU and TDH times referenced to DCOK_H during power-up to ensure  
the correct Y divider and resulting EV6Clk_x duty cycle. When the 21264/EV68A is executing  
instructions IRQ_H[5:0] act as normal asynchronous pins to handle interrupts.  
12  
13  
14  
Reset_L is an asynchronous pin. It may be asserted asynchronously.  
DCOK_H is an asynchronous pin. Note the minimum slew rate on the assertion edge.  
PllBypass_H may not switch when ClkIn_x is running. This pin must either be deasserted during  
power-up or the 21264/EV68A core power pin (VDD pins) indicating the 21264/EV68A’s internal  
PLL will be used. Note that it is illegal to use PllBypass_H asserted during power-up unless a  
ClkIn_x is present.  
15  
16  
See Section 7.11.2 for a discussion of ClkIn_x as it relates to operating the 21264/EV68A’s internal  
PLL versus running the 21264/EV68A in PLL bypass mode. ClkIn_x has specific input jitter require-  
ments to ensure optimum performance of the internal 21264/EV68A PLL.  
In PLL bypass mode, duty cycle deviation from 50%–50% directly degrades device operating fre-  
quency.  
17  
18  
The TSU and TDH of FrameClk_x are referenced to the deasserting edge of ClkIn_x.  
This signal is a feedback to the internal PLL and may be monitored for overall 21264/EV68A jitter. It  
can also be used as a feedback signal to an external PLL when in PLL bypass mode. Proper termina-  
tion of EV6Clk_x is imperative.  
19  
The cycle or phase cannot be more than 5% shorter than the nominal. Do not confuse this measure-  
ment with duty cycle.  
20  
21  
The period for SromClk_H is 256 GCLK cycles.  
When Trst_L is deasserted, Tms_H must not change state. Trst_L is asserted asynchronously but  
may be deasserted synchronously.  
21264/EV68A Hardware Reference Manual  
Electrical Data  
9–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Download from Www.Somanuals.com. All Manuals Search And Download.  
10  
Thermal Management  
This chapter describes the 21264/EV68A thermal management and thermal design  
considerations, and is organized as follows:  
Operating temperature  
Heat sink specifications  
Thermal design considerations  
10.1 Operating Temperature  
The 21264/EV68A is specified to operate when the temperature at the center of the heat  
sink (T is as shown in Table 10–1. Temperature T should be measured at the center of  
c)  
c
the heat sink, between the two package studs. The GRAFOIL pad is the interface mate-  
rial between the package and the heat sink.  
Table 10–1 Operating Temperature at Heat Sink Center (T )  
c
T
Frequency  
750 MHz  
833 MHz  
875 MHz  
940 MHz  
c
80.8° C  
78.6° C  
77.6° C  
76.0° C  
Note:  
Compaq recommends using the heat sink because it reduces overheating  
that causes thermal failure.  
21264/EV68A Hardware Reference Manual  
Thermal Management 10–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Operating Temperature  
Table 10–2 lists the values for the center of heat-sink-to-ambient (θ a) for the 21264/  
c
EV68A 587-pin PGA. Tables 10–3 through 10–6 show the allowable T (without  
a
exceeding T ) at various airflows.  
c
Table 10–2 θ at Various Airflows for 21264/EV68A  
ca  
Airflow (linear ft/min)  
200  
400  
600  
0.36  
800  
θ a with heat sink type 1 (°C/W)  
0.68  
0.62  
0.42  
0.35  
0.33  
0.25  
c
θ a with heat sink type 2 (°C/W)  
0.28  
c
1
θ a with heat sink type 3 (°C/W)  
— 0.32 —  
c
1
Heat sink type 3 has a 80 mm × 80 mm × 15 mm fan attached.  
Table 10–3 Maximum T for 21264/EV68A @ 750 MHz and @ 1.7 V with Various Airflows  
a
Airflow (linear ft/min)  
200  
400  
600  
59.2  
800  
Maximum T with heat sink type 1 (°C)  
40.0  
43.6  
55.6  
59.8  
61.0  
65.8  
a
Maximum T with heat sink type 2 (°C)  
64.0  
a
1
Maximum T with heat sink type 3 (°C)  
— 61.6 —  
a
1
Heat sink type 3 has a 80 mm × 80 mm × 15 mm fan attached.  
Table 10–4 Maximum T for 21264/EV68A @ 833 MHz and @ 1.7 V with Various Airflows  
a
Airflow (linear ft/min)  
200  
400  
600  
54.4  
800  
Maximum T with heat sink type 1 (°C)  
33.0  
37.0  
50.4  
55.1  
56.5  
61.8  
a
Maximum T with heat sink type 2 (°C)  
59.8  
a
1
Maximum T with heat sink type 3 (°C)  
— 57.1 —  
a
1
Heat sink type 3 has a 80 mm × 80 mm × 15 mm fan attached.  
Table 10–5 Maximum T for 21264/EV68A @ 875 MHz and @ 1.7 V with Various Airflows  
a
Airflow (linear ft/min)  
200  
400  
600  
52.4  
800  
Maximum T with heat sink type 1 (°C)  
30.0  
34.2  
48.2  
53.1  
54.5  
60.1  
a
Maximum T with heat sink type 2 (°C)  
58.0  
a
1
Maximum T with heat sink type 3 (°C)  
— 55.2 —  
a
1
Heat sink type 3 has a 80 mm × 80 mm × 15 mm fan attached.  
Table 10–6 Maximum T for 21264/EV68A @ 940 MHz and @ 1.7 V with Various Airflows  
a
Airflow (linear ft/min)  
200  
400  
600  
49.0  
800  
Maximum T with heat sink type 1 (°C)  
25.0  
29.5  
44.5  
49.8  
51.3  
57.3  
a
Maximum T with heat sink type 2 (°C)  
55.0  
a
1
Maximum T with heat sink type 3 (°C)  
— 52.0 —  
a
1
Heat sink type 3 has a 80 mm × 80 mm × 15 mm fan attached.  
10–2 Thermal Management  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Heat Sink Specifications  
10.2 Heat Sink Specifications  
Three heat sink types are specified. The mounting holes for all three are in line with the  
cooling fins.  
Figure 10–1 shows the heat sink type 1, along with its approximate dimensions.  
Figure 10–1 Type 1 Heat Sink  
4.10 in  
1.55 in  
2.05 in  
1.00 in  
4.10 in  
0.10 in  
0.05 in  
1.50 in  
0.30 in  
LK99-0210a  
21264/EV68A Hardware Reference Manual  
Thermal Management 10–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Heat Sink Specifications  
Figure 10–2 shows the heat sink type 2, along with its approximate dimensions.  
Figure 10–2 Type 2 Heat Sink  
3.75 in  
1.875 in  
1.375 in  
1.875 in  
1.00 in  
3.75 in  
1.015 in  
0.055 in  
0.075 in  
0.04 in  
2.35 in  
0.25 in  
LK99-0211a  
10–4 Thermal Management  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Heat Sink Specifications  
Figure 10–3 shows heat sink type 3, along with its approximate dimensions.  
The cooling fins of heat sink type 3 are cross-cut. Also, an 80 mm × 80 mm × 15 mm  
fan is attached to heat sink type 3.  
Figure 10–3 Type 3 Heat Sink  
3.20 in  
1.600 in  
3.20 in  
1.00 in  
1.600 in  
1.100 in  
0.200 in  
0.575 in  
0.20 in  
0.100 in  
0.05 in  
0.060 in  
0.080 in  
0.040 in  
0.250 in  
0.55 in  
0.090 in  
2.00 in  
0.885 in  
0.30 in  
3.200 in  
2.605 in  
LK99-0212a  
21264/EV68A Hardware Reference Manual  
Thermal Management 10–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Thermal Design Considerations  
10.3 Thermal Design Considerations  
Follow these guidelines for printed circuit board (PCB) component placement:  
Orient the 21264/EV68A on the PCB with the heat sink fins aligned with the air-  
flow direction.  
Avoid preheating ambient air. Place the 21264/EV68A on the PCB so that inlet air  
is not preheated by any other PCB components.  
Do not place other high power devices in the vicinity of the 21264/EV68A.  
Do not restrict the airflow across the 21264/EV68A heat sink. Placement of other  
devices must allow for maximum system airflow in order to maximize the performance  
of the heat sink.  
10–6 Thermal Management  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
11  
Testability and Diagnostics  
This chapter describes the 21264/EV68A user-oriented testability and diagnostic fea-  
tures. These features include automatic power-up self-test, Icache initialization from  
external serial ROMs, and the serial diagnostic terminal port.  
The boundary-scan register, which is another testability and diagnostic feature, is listed  
in Appendix B. The boundary-scan register is compatible with IEEE Standard 1149.1.  
This chapter is organized as follows:  
Test pins  
SROM/serial diagnostic terminal port  
IEEE 1149.1 port  
TestStat_H pin  
Power-up self-test and initialization  
Notes on IEEE 1149.1 operation and compliance  
The 21264/EV68A has several manufacturing test features that are used only by the  
factory, and they are beyond the scope of this chapter.  
11.1 Test Pins  
The 21264/EV68A test access ports include the IEEE 1149.1 test access port, a dual-  
purpose SROM/Serial diagnostic terminal port, and a test status output pin. Table 11–1  
lists the test access port pins.  
Table 11–1 Dedicated Test Port Pins  
Pin Name  
Tms_H  
Type  
Input  
Input  
Input  
Input  
Output  
Input  
Function  
IEEE 1149.1 test mode select  
IEEE 1149.1 test data in  
IEEE 1149.1 test logic reset  
IEEE 1149.1 test clock  
Tdi_H  
Trst_L  
Tck_H  
Tdo_H  
IEEE 1149.1 test data output  
SROM data/Diagnostic terminal data input  
SromData_H  
21264/EV68A Hardware Reference Manual  
Testability and Diagnostics 11–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
SROM/Serial Diagnostic Terminal Port  
Table 11–1 Dedicated Test Port Pins (Continued)  
Pin Name  
Type  
Function  
SromClk_H  
SromOE_L  
TestStat_H  
Output  
Output  
Output  
SROM clock/Diagnostic terminal data output  
SROM enable/Diagnostic terminal enable  
BiST status/timeout output  
11.2 SROM/Serial Diagnostic Terminal Port  
This port supports two functions. During power-up, it supports automatic initialization  
of the Cbox configuration registers and the Icache from the system serial ROMs. After  
power-up, it supports a serial diagnostic terminal.  
11.2.1 SROM Load Operation  
The following actions are performed while the SROM is loaded:  
The SromOE_L pin supplies the output enable as well as the reset to the serial  
ROM. (Refer to the serial ROM specifications for details.) The 21264/EV68A  
asserts this signal low for the duration of the Icache load from the serial ROM.  
When the load has been completed, the signal remains deasserted.  
The SromClk_H pin supplies the clock to the SROM that causes it to advance to  
the next bit. Simultaneously, it causes the existing data on the SromData_H pin to  
be shifted into an internal shift register. The cycle time of this clock is 256 times the  
CPU clock rate. (If the FASTROM flag is set, the rate is 16 times the CPU clock  
rate.) The hold time on SromData_H is 2* CPU cycle time with respect to  
SromClk_H.  
The SromData_H pin reads data from the SROM.  
Every data and tag bit in Icache is loaded by that sequence.  
11.2.2 Serial Terminal Port  
After the SROM data is loaded into the Icache, the three SROM interface signals can be  
used as a software UART and the pins become parallel I/O pins that can drive a system  
debug or diagnostic terminal by using an interface such as RS422.  
The serial line interface is automatically enabled if the SromOE_L pin is wired to the  
following pins:  
An active high enable RS422 (or 26LS32) driver, driving to SromData_H  
An active high enable RS422 (or 26LS31) receiver, driven from SromClk_H  
After reset, SromClk_H is driven from the Ibox I_CTL[SL_XMIT]. This register is  
cleared during reset, so it starts driving as a 0, but it can be written by software. The  
data becomes available at the pin after the HW_MTPR instruction that wrote  
I_CTL[SL_XMIT] is retired.  
11–2 Testability and Diagnostics  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                   
IEEE 1149.1 Port  
On the receive side, while in native mode, any transition on the Ibox I_CTL  
[SL_RCV], driven from the SromData_H pin, results in a trap to the PALcode inter-  
rupt handler. When in PALmode, all interrupts are blocked. The interrupt routine then  
begins sampling I_CTL [SL_RCV] under a software timing loop to input as much data  
as needed, using the chosen serial line protocol.  
11.3 IEEE 1149.1 Port  
The IEEE 1149.1 Test Access Port consists of the Tdi_H, Tdo_H, Tms_H, Tck_H,  
and Trst_L pins. These pins access the IEEE 1149.1 mandated public test features as  
well as several private chip manufacturing test features.  
The port meets all requirements of the standard except that there are no pull-ups on the  
Tdi_H, Tms_H, and Trst_L pins, as required by the present standard.  
The scope of 1149.1 compliant features on the 21264/EV68A is limited to the board  
level assembly verification test. The systems that do not intend to drive this port must  
terminate the port pins as follows: pull-ups on Tdi_H and Tms_H, pull-downs on  
Tck_H and Trst_L.  
The port logic consists of the usual standard compliant components, namely, the TAP  
Controller State Machine, the Instruction Register, and the Bypass Register.  
The Bypass Register provides a short shift path through the chip’s IEEE 1149.1 logic. It  
is generally useful at the board level testing. It consists of a 1-bit shift register.  
The Instruction Register holds test instructions. On the 21264/EV68A, this register is 5  
bits wide. Table 11–2 describes the supported instructions. The instruction set supports  
several public and private instructions. The public instructions operate and produce  
behavior compliant with the standard. The private instructions are used for chip manu-  
facturing test and must not be used outside of chip manufacturing.  
Table 11–2 IEEE 1149.1 Instructions and Opcodes  
Opcode  
Instruction  
Operation/Function  
00xxx  
01xxx  
10xxx  
Private  
These instructions are for factory test use only. The user must  
not load them as they may have a harmful effect on the  
21264/EV68A.  
11000  
11001  
11010  
11011  
SAMPLE  
HIGHZ  
CLAMP  
EXTEST  
Private  
IEEE 1149.1 SAMPLE instruction.  
IEEE 1149.1 HIGHZ instruction.  
IEEE 1149.1 CLAMP instruction.  
IEEE 1149.1 EXTEST instruction.  
11100  
11101  
11110  
These instructions are for factory test use only. The user must  
not load them as they may have a harmful effect on the  
21264/EV68A.  
11111  
BYPASS  
IEEE 1149.1 BYPASS instruction.  
Figure 11–3 shows the TAP controller state machine state diagram. The signal Tms_H  
controls the state transitions that occur with the rising clock edge. TAP state machine  
states are decoded and used for initiating various actions for testing.  
21264/EV68A Hardware Reference Manual  
Testability and Diagnostics 11–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
TestStat_H Pin  
Table 11–3 TAP Controller State Machine  
Test Logic  
Reset  
1
0
1
1
1
Run-Test/Idle  
Select-DR-Scan  
Select-IR-Scan  
0
0
0
Capture-DR  
0
1
1
Capture-IR  
0
Shift-IR  
1
0
0
Shift-DR  
1
0
1
1
Exit1-DR  
0
Exit1-IR  
0
Values  
shown  
are for  
TMS.  
Pause-DR  
1
Pause-IR  
1
0
0
0
Exit2-DR  
1
Exit2-IR  
1
Update-DR  
Update-IR  
1
0
1
0
Scan Sequence  
Scan Sequence  
MK145508.AI4  
11.4 TestStat_H Pin  
The TestStat_H pin serves two purposes. During power-up, it indicates BiST pass/fail  
status. After power-up, it indicates the 21264/EV68A timeout event.  
The system reset forces TestStat_H to low. Tbox forces it high during the internal BiST  
and array initialization operations. During result extraction (DoResult state), the Tbox  
drives it low for 16 cycles. After that, the pin remains low if the BiST has passes, other-  
wise, it is asserted high and remains high until chip is reset again. Figure 11–1 pictori-  
ally shows the behavior of the pin during the power-up operations.  
Note:  
A system designer may sample the TestStat_H pin on the first rising edge  
of the SromClk_H pin to determine BiST results. After the power-up dur-  
ing the normal chip operation, whenever the 21264/EV68A does not retire  
an instruction for 2K CPU cycles, the pin is asserted high for 3 CPU cycles.  
11–4 Testability and Diagnostics  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Power-Up Self-Test and Initialization  
Figure 11–1 TestStat_H Pin Timing During Power-Up Built-In Self-Test (BiST)  
ClkFwdReset_L  
Tbox_Reset_A_L  
TBox Reset Engine  
TestStatus_H  
Idle  
DoBist  
DoResult  
DoSROM  
Idle  
BiSTResult  
BiSTResult OR T  
LKG-10950A-98WF  
Figure 11–2 TestStat_H Pin Timing During Built-In Self-Initialization (BiSI)  
Tbox_Rst_A_L1  
TBox Reset Engine1  
Idle  
DoMfgSelfinit  
Idle  
TestStatus_H1  
ClkFwdRst_L1  
TimeOut  
LKG-10951A-98WF  
11.5 Power-Up Self-Test and Initialization  
Upon powering up, the 21264/EV68A automatically performs the self-test of all major  
embedded RAM arrays and then loads the Cbox configuration registers and the instruc-  
tion cache from the system SROM. The chip’s internal logic is held in reset during  
these operations. See Chapter 9 for sequencing of power-up operations.  
11.5.1 Built-in Self-Test  
The power-up self-test is performed on the instruction cache and tag arrays, the data  
cache and tag arrays, the triplicate tag arrays, and the various RAM arrays located in the  
branch history table logic. The power-up self-test lasts for approximately 700,000 CPU  
cycles. The result of self-test is made available as Pass/Fail status on the TestStat_H  
pin (see Section 11.4).  
The result of self-test is also available in an IPR bit. Software can read this status  
through IPR I_CTL(23) (0 = pass, 1 = fail). See Section 5.2.15.  
The power-up BiST leaves all bits in all arrays initialized to zeroes. The instruction  
cache and the tag are reinitialized as part of the SROM initialization step. This is  
detailed in Section 11.5.2.  
11.5.2 SROM Initialization  
Power-up initialization on the 21264/EV68A is different from previous generation  
Alpha systems in two aspects. First, in the 21264/EV68A systems, the presence of  
serial ROMs is mandatory as initialization of several Cbox configuration registers  
depends on them. Second, it is possible to skip or partially fill Icache from serial  
ROMs. Figure 11–3 shows the map of the data in serial ROMs.  
21264/EV68A Hardware Reference Manual  
Testability and Diagnostics 11–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                 
Power-Up Self-Test and Initialization  
In the SROM represented in Figure 11–3, the length for fields Cbox Config  
Data(0,n) plus MBZ(m,0) must equal 367 bits. (If Cbox Config Data(0,n) is  
(0,366), MBZ would be zero.)  
For the 21264/EV68A, Cbox Config Data is 304 bits; the value for n is 303.  
Therefore, the value MBZ field for Pass 3 is:  
MBZ(m,0) = 367 minus 304 = 63 = (62,0)  
Tables 11–4 and 5–24 describe the details of the Icache and Cbox bit fields, respec-  
tively. Note that fetch_count(1,0) must be 3, which guarantees that the SROM never  
partially loads an Icache block.  
Figure 11–3 SROM Content Map  
fetch [0](0,192) fetch[j-1](0,192) fetch[j](0,192) fetch_count(11,0) Cbox Config Data(0, n) MBZ(m,0)  
(first block) (last block)  
11.5.2.1 Serial Instruction Cache Load Operation  
All Icache bits, including each block’s tag, address space number (ASN), address space  
match (ASM), and valid and branch history bits are loaded serially from offchip serial  
ROMs. Once the serial load has been invoked by the chip reset sequence, the cache is  
loaded from the lower to the higher addresses.  
The serial Icache fill invoked by the chip reset sequence operates internally at a fre-  
GCLK  
256  
---------------  
quency of  
.
Table 11–4 lists the Icache bit fields in an SROM line. Fetch bits are listed in the order  
of shift direction (to down and to right). In Table 11–4:  
Bit Type  
Meaning  
c
Disp_add carry  
Instruction  
i
iq  
Iqueue predecodes  
Trouble bits  
Destination valid  
Ea_src  
tr  
dv  
ea  
par-MBZ  
Must be zero  
The load occurs at the rate of 1 bit per 256 CPU cycles. The chip outputs a 50% duty  
cycle clock on the SromClk_H pin.  
The serial ROMs can contain enough Alpha code to complete the configuration of the  
external interface (for example, set the timing on the external cache RAMs, and diag-  
nose the path between the CPU chip and the real ROM).  
11–6 Testability and Diagnostics  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Notes on IEEE 1149.1 Operation and Compliance  
The instruction cache lines are loaded in the reverse order. If the fetch_count(9,0) is  
zero, then, no instruction cache lines are loaded. Since the valid bits are already cleared  
by the BiST operation, the first instruction fetch is missed in the instruction cache and  
the chip seeks instructions from the offchip memory.  
Table 11–4 Icache Bit Fields in an SROM Line  
Fetch Bit Icache Data Fetch Bit Icache Data  
Fetch Bit Icache Data  
172 lp_train  
173:175 lp_src(2:0)  
0
par-MBZ  
c[3]  
86  
87  
par-MBZ  
c[0]  
1
2:27  
i[3](25,20,24,19,23,18,22,17 88:113  
,21,16:0)  
i[0](25,20,24,19,3,18,22,17, 176:181 lp_idx(14:9)  
21,16:0)  
28  
c[2]  
114  
c[1]  
182:186 lp_idx(8:4)  
187 lp_idx(15)  
29:42  
i[2](25,20,24,19,  
115:128  
i[1] (25,20,24,19,  
23,18,22,17,21,16:12)  
23,18,22,17,21,16:12)  
43  
parity  
129  
parity  
188:192 lp_ssp[4:0]  
44:55  
56  
i[2](11:0)  
dv[3]  
130:141  
142  
i[1](11:0)  
dv[0]  
57:59  
60:65  
66,68  
69  
iq[3](2:0)  
i[3](26:31)  
ea[3](2:0)  
dv[2]  
143:145  
146:151  
152:154  
155  
iq[0](2:0)  
i[0](26:31)  
ea[0](2:0)  
dv[1]  
70,72  
73:78  
79:81  
82:85  
iq[2](2:0)  
i[2](26:31)  
ea[2](2:0)  
tr(7:4)  
156:158  
159:164  
165:167  
168:171  
iq[1](2:0)  
i[1](26:31)  
ea[1](2:0)  
tr(0:3)  
Refer to the Alpha Motherboards Software Developer’s Kit (SDK) for example C code  
that calculates the predecode values of a serial Icache load.  
11.6 Notes on IEEE 1149.1 Operation and Compliance  
1. IEEE 1149.1 port pins on the 21264/EV68A are not pulled up or pulled down on  
the chip. The necessary pull-up or pull-down function must be implemented on the  
board.  
2. Tms_H should not change when Trst_L is being deasserted.  
References  
IEEE Std. 1149.1-1993 A Test Access Port and Boundary Scan Architecture.  
See Appendix B for a listing of the Boundary-Scan Register.  
21264/EV68A Hardware Reference Manual  
Testability and Diagnostics 11–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Download from Www.Somanuals.com. All Manuals Search And Download.  
A
Alpha Instruction Set  
This appendix provides a summary of the Alpha instruction set and describes the  
21264/EV68A IEEE floating-point conformance. It is organized as follows:  
Alpha instruction summary  
Reserved opcodes  
IEEE floating-point instructions  
VAX floating-point instructions  
Independent floating-point instructions  
Opcode summary  
Required PALcode function codes  
IEEE floating-point conformance  
A.1 Alpha Instruction Summary  
This section contains a summary of all Alpha architecture instructions. All values are in  
hexadecimal radix. Table A–1 describes the contents of the Format and Opcode col-  
umns that are in Table A–2.  
Table A–1 Instruction Format and Opcode Notation  
Format Opcode  
Instruction Format  
Symbol Notation  
Meaning  
Branch  
Bra  
F-P  
oo  
oo is the 6-bit opcode field.  
Floating-point  
oo.fff  
oo is the 6-bit opcode field.  
fff is the 11-bit function code field.  
Memory  
Mem  
oo  
oo is the 6-bit opcode field.  
Memory/function code Mfc  
oo.ffff  
oo is the 6-bit opcode field.  
ffff is the 16-bit function code in the displacement  
field.  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Alpha Instruction Summary  
Table A–1 Instruction Format and Opcode Notation (Continued)  
Format  
Symbol  
Opcode  
Notation  
Instruction Format  
Meaning  
Memory/ branch  
Mbr  
Opr  
Pcd  
oo.h  
oo.ff  
oo  
oo is the 6-bit opcode field.  
h is the high-order 2 bits of the displacement field.  
Operate  
oo is the 6-bit opcode field.  
ff is the 7-bit function code field.  
PALcode  
oo is the 6-bit opcode field; the particular PAL-  
code instruction is specified in the 26-bit function  
code field.  
Qualifiers for operate instructions are shown in Table A–2. Qualifiers for IEEE and  
VAX floating-point instructions are shown in Tables A–5 and A–6, respectively.  
Table A–2 Architecture Instructions  
Mnemonic  
ADDF  
ADDG  
ADDL  
ADDL/V  
ADDQ  
ADDQ/V  
ADDS  
ADDT  
AMASK  
AND  
Format  
F-P  
F-P  
Opr  
Opr  
Opr  
Opr  
F-P  
F-P  
Opr  
Opr  
Bra  
Bra  
Bra  
Opr  
Opr  
Bra  
Bra  
Bra  
Bra  
Bra  
Bra  
Opcode Description  
15.080 Add F_floating  
15.0A0 Add G_floating  
10.00  
10.40  
10.20  
10.60  
Add longword  
Add longword with integer overflow enable  
Add quadword  
Add quadword with integer overflow enable  
16.080 Add S_floating  
16.0A0 Add T_floating  
11.61  
11.00  
39  
Architecture mask  
Logical product  
Branch if = zero  
Branch if zero  
Branch if > zero  
Bit clear  
BEQ  
BGE  
3E  
BGT  
3F  
BIC  
11.08  
11.20  
38  
BIS  
Logical sum  
BLBC  
BLBS  
BLE  
Branch if low bit clear  
Branch if low bit set  
Branch if zero  
Branch if < zero  
3C  
3B  
BLT  
3A  
BNE  
3D  
Branch if zero  
BR  
30  
Unconditional branch  
A–2 Alpha Instruction Set  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Alpha Instruction Summary  
Table A–2 Architecture Instructions (Continued)  
Mnemonic  
BSR  
Format  
Mbr  
Pcd  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
F-P  
F-P  
F-P  
Opr  
Opr  
F-P  
F-P  
F-P  
F-P  
Opr  
Opr  
F-P  
F-P  
F-P  
Opr  
Opr  
Opr  
F-P  
F-P  
F-P  
Opcode Description  
34  
Branch to subroutine  
CALL_PAL  
CMOVEQ  
CMOVGE  
CMOVGT  
CMOVLBC  
CMOVLBS  
CMOVLE  
CMOVLT  
CMOVNE  
CMPBGE  
CMPEQ  
CMPGEQ  
CMPGLE  
CMPGLT  
CMPLE  
00  
Trap to PALcode  
CMOVE if = zero  
CMOVE if zero  
CMOVE if > zero  
11.24  
11.46  
11.66  
11.16  
11.14  
11.64  
11.44  
11.26  
10.0F  
10.2D  
CMOVE if low bit clear  
CMOVE if low bit set  
CMOVE if zero  
CMOVE if < zero  
CMOVE if zero  
Compare byte  
Compare signed quadword equal  
15.0A5 Compare G_floating equal  
15.0A7 Compare G_floating less than or equal  
15.0A6 Compare G_floating less than  
10.6D  
10.4D  
Compare signed quadword less than or equal  
Compare signed quadword less than  
CMPLT  
CMPTEQ  
CMPTLE  
CMPTLT  
CMPTUN  
CMPULE  
CMPULT  
CPYS  
16.0A5 Compare T_floating equal  
16.0A7 Compare T_floating less than or equal  
16.0A6 Compare T_floating less than  
16.0A4 Compare T_floating unordered  
10.3D  
10.1D  
Compare unsigned quadword less than or equal  
Compare unsigned quadword less than  
17.020 Copy sign  
CPYSE  
17.022 Copy sign and exponent  
17.021 Copy sign negate  
CPYSN  
CTLZ  
1C.32  
1C.30  
1C.33  
Count leading zero  
Count population  
Count trailing zero  
CTPOP  
CTTZ  
CVTDG  
CVTGD  
CVTGF  
15.09E Convert D_floating to G_floating  
15.0AD Convert G_floating to D_floating  
15.0AC Convert G_floating to F_floating  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Alpha Instruction Summary  
Table A–2 Architecture Instructions (Continued)  
Mnemonic  
CVTGQ  
CVTLQ  
CVTQF  
CVTQG  
CVTQL  
CVTQS  
CVTQT  
CVTST  
CVTTQ  
CVTTS  
DIVF  
Format  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
F-P  
Mfc  
Opr  
Mfc  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Bra  
Bra  
Bra  
Bra  
Bra  
Bra  
F-P  
F-P  
Opcode Description  
15.0AF Convert G_floating to quadword  
17.010 Convert longword to quadword  
15.0BC Convert quadword to F_floating  
15.0BE Convert quadword to G_floating  
17.030 Convert quadword to longword  
16.0BC Convert quadword to S_floating  
16.0BE Convert quadword to T_floating  
16.2AC Convert S_floating to T_floating  
16.0AF Convert T_floating to quadword  
16.0AC Convert T_floating to S_floating  
15.083 Divide F_floating  
DIVG  
15.0A3 Divide G_floating  
DIVS  
16.083 Divide S_floating  
DIVT  
16.0A3 Divide T_floating  
ECB  
18.E800 Evict cache block  
EQV  
11.48  
Logical equivalence  
EXCB  
18.0400 Exception barrier  
EXTBL  
EXTLH  
EXTLL  
EXTQH  
EXTQL  
EXTWH  
EXTWL  
FBEQ  
12.06  
12.6A  
12.26  
12.7A  
12.36  
12.5A  
12.16  
31  
Extract byte low  
Extract longword high  
Extract longword low  
Extract quadword high  
Extract quadword low  
Extract word high  
Extract word low  
Floating branch if = zero  
Floating branch if zero  
Floating branch if > zero  
Floating branch if zero  
Floating branch if < zero  
FBGE  
36  
FBGT  
37  
FBLE  
33  
FBLT  
32  
FBNE  
35  
Floating branch if zero  
FCMOVEQ  
FCMOVGE  
17.02A FCMOVE if = zero  
17.02D FCMOVE if zero  
A–4 Alpha Instruction Set  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Alpha Instruction Summary  
Table A–2 Architecture Instructions (Continued)  
Mnemonic  
FCMOVGT  
FCMOVLE  
FCMOVLT  
FCMOVNE  
FETCH  
FETCH_M  
FTOIS  
Format  
F-P  
Opcode Description  
17.02F FCMOVE if > zero  
17.02E FCMOVE if zero  
17.02C FCMOVE if < zero  
F-P  
F-P  
F-P  
17.02B  
18.8000 Prefetch data  
18.A000 Prefetch data, modify intent  
FCMOVE if zero  
Mfc  
Mfc  
F-P  
1C.78  
1C.70  
11.6C  
12.0B  
12.67  
12.2B  
12.77  
12.3B  
12.57  
12.1B  
Floating to integer move, S_floating  
Floating to integer move, T_floating  
Implementation version  
Insert byte low  
FTOIT  
F-P  
IMPLVER  
INSBL  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
F-P  
INSLH  
Insert longword high  
Insert longword low  
INSLL  
INSQH  
Insert quadword high  
Insert quadword low  
Insert word high  
INSQL  
INSWH  
INSWL  
ITOFF  
Insert word low  
14.014 Integer to floating move, F_floating  
14.004 Integer to floating move, S_floating  
14.024 Integer to floating move, T_floating  
ITOFS  
F-P  
ITOFT  
F-P  
JMP  
Mbr  
Mbr  
1A.0  
1A.1  
1A.3  
08  
Jump  
JSR  
Jump to subroutine  
Jump to subroutine return  
Load address  
JSR_COROUTINE Mbr  
LDA  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
LDAH  
LDBU  
LDF  
09  
Load address high  
0A  
20  
Load zero-extended byte  
Load F_floating  
LDG  
21  
Load G_floating  
LDL  
28  
Load sign-extended longword  
Load sign-extended longword locked  
Load quadword  
LDL_L  
LDQ  
2A  
29  
LDQ_L  
LDQ_U  
2B  
Load quadword locked  
Load unaligned quadword  
0B  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Alpha Instruction Summary  
Table A–2 Architecture Instructions (Continued)  
Mnemonic  
LDS  
Format  
Mem  
Mem  
Mem  
Opr  
Opr  
Opr  
Opr  
Mfc  
F-P  
Opcode Description  
22  
Load S_floating  
LDT  
23  
Load T_floating  
LDWU  
0C  
Load zero-extended word  
Vector signed byte maximum  
Vector signed word maximum  
Vector unsigned byte maximum  
MAXSB8  
MAXSW4  
MAXUB8  
MAXUW4  
MB  
1C.3E  
1C.3F  
1C.3C  
1C.3D Vector unsigned word maximum  
18.4000 Memory barrier  
MF_FPCR  
MINSB8  
MINSW4  
MINUB8  
MINUW4  
MSKBL  
MSKLH  
MSKLL  
MSKQH  
MSKQL  
MSKWH  
MSKWL  
MT_FPCR  
MULF  
17.025 Move from FPCR  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
F-P  
1C.38  
1C.39  
Vector signed byte minimum  
Vector signed word minimum  
1C.3A Vector unsigned byte minimum  
1C.3B  
12.02  
12.62  
12.22  
12.72  
12.32  
12.52  
12.12  
Vector unsigned word minimum  
Mask byte low  
Mask longword high  
Mask longword low  
Mask quadword high  
Mask quadword low  
Mask word high  
Mask word low  
17.024 Move to FPCR  
F-P  
15.082 Multiply F_floating  
15.0A2 Multiply G_floating  
MULG  
F-P  
MULL  
Opr  
Opr  
Opr  
Opr  
F-P  
13.00  
13.40  
13.20  
13.60  
Multiply longword  
MULL/V  
MULQ  
Multiply longword with integer overflow enable  
Multiply quadword  
MULQ/V  
MULS  
Multiply quadword with integer overflow enable  
16.082 Multiply S_floating  
16.0A2 Multiply T_floating  
MULT  
F-P  
ORNOT  
PERR  
Opr  
Opr  
Opr  
11.28  
1C.31  
1C.37  
Logical sum with complement  
Pixel error  
PKLB  
Pack longwords to bytes  
A–6 Alpha Instruction Set  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Alpha Instruction Summary  
Table A–2 Architecture Instructions (Continued)  
Mnemonic  
PKWB  
RC  
Format  
Opr  
Opcode Description  
1C.36  
Pack words to bytes  
Mfc  
Mbr  
Mfc  
Mfc  
Opr  
18.E000 Read and clear  
RET  
1A.2  
Return from subroutine  
RPCC  
RS  
18.C000 Read process cycle counter  
18.F000 Read and set  
S4ADDL  
S4ADDQ  
S4SUBL  
S4SUBQ  
S8ADDL  
S8ADDQ  
S8SUBL  
S8SUBQ  
SEXTB  
SEXTW  
SLL  
10.02  
10.22  
10.0B  
10.2B  
10.12  
10.32  
10.1B  
10.3B  
1C.00  
1C.01  
12.39  
Scaled add longword by 4  
Scaled add quadword by 4  
Scaled subtract longword by 4  
Scaled subtract quadword by 4  
Scaled add longword by 8  
Scaled add quadword by 8  
Scaled subtract longword by 8  
Scaled subtract quadword by 8  
Sign extend byte  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Opr  
Sign extend word  
Opr  
Shift left logical  
SQRTF  
SQRTG  
SQRTS  
SQRTT  
SRA  
F-P  
14.08A Square root F_floating  
14.0AA Square root G_floating  
14.08B Square root S_floating  
14.0AB Square root T_floating  
F-P  
F-P  
F-P  
Opr  
12.3C  
12.34  
0E  
Shift right arithmetic  
Shift right logical  
Store byte  
SRL  
Opr  
STB  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
Mem  
STF  
24  
Store F_floating  
STG  
25  
Store G_floating  
STL  
2C  
Store longword  
STL_C  
STQ  
2E  
Store longword conditional  
Store quadword  
2D  
2F  
STQ_C  
STQ_U  
STS  
Store quadword conditional  
Store unaligned quadword  
Store S_floating  
0F  
26  
STT  
27  
Store T_floating  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Reserved Opcodes  
Table A–2 Architecture Instructions (Continued)  
Mnemonic  
STW  
Format  
Mem  
F-P  
Opcode Description  
0D  
Store word  
SUBF  
15.081 Subtract F_floating  
15.0A1 Subtract G_floating  
SUBG  
F-P  
SUBL  
Opr  
Opr  
Opr  
Opr  
F-P  
10.09  
10.49  
10.29  
10.69  
Subtract longword  
SUBL/V  
SUBQ  
Subtract longword with integer overflow enable  
Subtract quadword  
SUBQ/V  
SUBS  
Subtract quadword with integer overflow enable  
16.081 Subtract S_floating  
16.0A1 Subtract T_floating  
18.0000 Trap barrier  
SUBT  
F-P  
TRAPB  
UMULH  
UNPKBL  
UNPKBW  
WH64  
Mfc  
Opr  
Opr  
Opr  
Mfc  
Mfc  
Opr  
Opr  
Opr  
13.30  
1C.35  
1C.34  
Unsigned multiply quadword high  
Unpack bytes to longwords  
Unpack bytes to words  
18.F800 Write hint — 64 bytes  
18.4400 Write memory barrier  
WMB  
XOR  
11.40  
12.30  
12.31  
Logical difference  
Zero bytes  
ZAP  
ZAPNOT  
Zero bytes not  
A.2 Reserved Opcodes  
This section describes the opcodes that are reserved in the Alpha architecture. They can  
be reserved for Compaq or for PALcode.  
A.2.1 Opcodes Reserved for Compaq  
Table A–3 lists opcodes reserved for Compaq.  
Table A–3 Opcodes Reserved for Compaq  
Mnemonic  
OPC01  
Opcode  
01  
Mnemonic  
OPC05  
OPC06  
OPC07  
Opcode  
05  
OPC02  
02  
06  
OPC03  
03  
07  
OPC04  
04  
A–8 Alpha Instruction Set  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
IEEE Floating-Point Instructions  
A.2.2 Opcodes Reserved for PALcode  
Table A–4 lists the 21264/EV68A-specific instructions. See Chapter 2 for more  
information.  
Table A–4 Opcodes Reserved for PALcode  
21264/EV68A  
Mnemonic  
Architecture  
Mnemonic  
Opcode  
1B  
Function  
HW_LD  
HW_ST  
HW_REI  
PAL1B  
PAL1F  
PAL1E  
Performs Dstream load instructions.  
Performs Dstream store instructions.  
1F  
1E  
Returns instruction flow to the program counter (PC) pointed  
to by EXC_ADDR internal processor register (IPR).  
HW_MFPR  
HW_MTPR  
19  
PAL19  
PAL1D  
Accesses the Ibox, Mbox, and Dcache IPRs.  
Accesses the Ibox, Mbox, and Dcache IPRs.  
1D  
A.3 IEEE Floating-Point Instructions  
Table A–5 lists the hexadecimal value of the 11-bit function code field for the IEEE  
floating-point instructions, with and without qualifiers. The opcode for these  
instructions is 16 .  
16  
Table A–5 IEEE Floating-Point Instruction Function Codes  
Mnemonic  
ADDS  
None  
080  
/C  
/M  
/D  
/U  
/UC  
100  
120  
/UM  
140  
160  
/UD  
1C0  
1E0  
000  
020  
040  
060  
0C0  
0E0  
180  
1A0  
ADDT  
0A0  
0A5  
0A6  
0A7  
0A4  
0BC  
0BE  
CMPTEQ  
CMPTLT  
CMPTLE  
CMPTUN  
CVTQS  
03C  
03E  
07C  
07E  
0FC  
0FE  
CVTQT  
See  
CVTST  
below  
See  
CVTTQ  
CVTTS  
DIVS  
below  
0AC  
083  
02C  
003  
023  
002  
022  
06C  
043  
063  
042  
062  
0EC  
0C3  
0E3  
0C2  
0E2  
1AC  
183  
12C  
103  
123  
102  
122  
16C  
143  
163  
142  
162  
1EC  
1C3  
1E3  
1C2  
1E2  
DIVT  
0A3  
082  
1A3  
182  
MULS  
MULT  
0A2  
1A2  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                     
IEEE Floating-Point Instructions  
Table A–5 IEEE Floating-Point Instruction Function Codes (Continued)  
SQRTS  
SQRTT  
SUBS  
08B  
0AB  
081  
00B  
02B  
001  
021  
04B  
06B  
041  
061  
0CB  
0EB  
0C1  
0E1  
18B  
1AB  
181  
10B  
12B  
101  
121  
14B  
16B  
141  
161  
1CB  
1EB  
1C1  
1E1  
SUBT  
0A1  
1A1  
Mnemonic  
ADDS  
/SU  
/SUC  
500  
520  
/SUM  
540  
/SUD  
5C0  
5E0  
/SUI  
780  
7A0  
/SUIC  
700  
/SUIM  
740  
/SUID  
7C0  
7E0  
580  
5A0  
5A5  
5A6  
5A7  
5A4  
ADDT  
560  
720  
760  
CMPTEQ  
CMPTLT  
CMPTLE  
CMPTUN  
CVTQS  
CVTQT  
CVTTS  
DIVS  
7BC  
7BE  
7AC  
783  
73C  
73E  
72C  
703  
723  
702  
722  
70B  
72B  
701  
721  
77C  
77E  
76C  
743  
763  
742  
762  
74B  
76B  
741  
761  
7FC  
7FE  
7EC  
7C3  
7E3  
7C2  
7E2  
7CB  
7EB  
7C1  
7E1  
5AC  
583  
52C  
503  
523  
502  
522  
50B  
52B  
501  
521  
56C  
543  
563  
542  
562  
54B  
56B  
541  
561  
5EC  
5C3  
5E3  
5C2  
5E2  
5CB  
5EB  
5C1  
5E1  
DIVT  
5A3  
582  
7A3  
782  
MULS  
MULT  
5A2  
58B  
5AB  
581  
7A2  
78B  
7AB  
781  
SQRTS  
SQRTT  
SUBS  
SUBT  
5A1  
7A1  
Mnemonic  
CVTST  
None  
/S  
2AC  
6AC  
Mnemonic  
CVTTQ  
None  
/C  
/V  
/VC  
/SV  
/SVC  
/SVI  
/SVIC  
0AF  
02F  
1AF  
12F  
5AF  
52F  
7AF  
72F  
Mnemonic  
CVTTQ  
D
/VD  
/SVD  
/SVID  
/M  
/VM  
/SVM  
/SVIM  
0EF  
1EF  
5EF  
7EF  
06F  
16F  
56F  
76F  
A–10 Alpha Instruction Set  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
VAX Floating-Point Instructions  
Programming Note:  
In order to use CMPTxx with software completion trap handling, it is necessary to  
specify the /SU IEEE trap mode, even though an underflow trap is not possible. In order  
to use CVTQS or CVTQT with software completion trap handling, it is necessary to  
specify the /SUI IEEE trap mode, even though an underflow trap is not possible.  
A.4 VAX Floating-Point Instructions  
Table A–6 lists the hexadecimal value of the 11-bit function code field for the VAX  
floating-point instructions. The opcode for these instructions is 15 .  
16  
Table A–6 VAX Floating-Point Instruction Function Codes  
Mnemonic  
ADDF  
None  
080  
/C  
/U  
/UC  
100  
120  
/S  
/SC  
400  
420  
/SU  
580  
5A0  
/SUC  
500  
000  
020  
180  
1A0  
480  
4A0  
4A5  
4A7  
4A6  
49E  
4AD  
4AC  
ADDG  
0A0  
520  
CMPGEQ  
CMPGLE  
CMPGLT  
CVTDG  
CVTGD  
CVTGF  
CVTGQ  
CVTQF  
CVTQG  
DIVF  
0A5  
0A7  
0A6  
09E  
01E  
02D  
02C  
19E  
11E  
12D  
12C  
41E  
42D  
42C  
59E  
51E  
52D  
52C  
0AD  
0AC  
See below  
0BC  
0BE  
083  
1AD  
1AC  
5AD  
5AC  
03C  
03E  
003  
023  
002  
022  
00A  
02A  
001  
021  
/C  
183  
1A3  
182  
1A2  
18A  
1AA  
181  
1A1  
/V  
103  
123  
102  
122  
10A  
12A  
101  
121  
/VC  
483  
4A3  
482  
4A2  
48A  
4AA  
481  
4A1  
/S  
403  
423  
402  
422  
40A  
42A  
401  
421  
/SC  
583  
5A3  
582  
503  
523  
502  
522  
50A  
52A  
501  
521  
/SVC  
52F  
DIVG  
0A3  
082  
MULF  
MULG  
0A2  
08A  
0AA  
081  
5A2  
58A  
5AA  
581  
SQRTF  
SQRTG  
SUBF  
SUBG  
0A1  
None  
0AF  
5A1  
/SV  
Mnemonic  
CVTGQ  
02F  
1AF  
12F  
4AF  
42F  
5AF  
A.5 Independent Floating-Point Instructions  
Table A–7 lists the hexadecimal value of the 11-bit function code field for the floating-  
point instructions that are not directly tied to IEEE or VAX floating point. The opcode  
for the following instructions is 17 .  
16  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Opcode Summary  
Table A–7 Independent Floating-Point Instruction Function Codes  
Mnemonic  
None  
/V  
/SV  
CPYS  
020  
022  
CPYSE  
021  
010  
030  
02A  
02D  
02F  
02E  
02C  
025  
024  
CPYSN  
CVTLQ  
130  
530  
CVTQL  
FCMOVEQ  
FCMOVGE  
FCMOVGT  
FCMOVLE  
FCMOVLT  
MF_FPCR  
MT_FPCR  
A.6 Opcode Summary  
Table A–8 lists all Alpha opcodes from 00 (CALL_PAL) through 3F (BGT). In the  
table, the column headings that appear over the instructions have a granularity of 8 .  
16  
The rows beneath the Offset column supply the individual hexadecimal number to  
resolve that granularity.  
If an instruction column has a 0 in the right (low) hexadecimal digit, replace that 0 with  
the number to the left of the backslash (\) in the Offset column on the instruction’s row.  
If an instruction column has an 8 in the right (low) hexadecimal digit, replace that 8  
with the number to the right of the backslash in the Offset column.  
For example, the third row (2/A) under the 10 column contains the symbol INTS*,  
16  
representing the all-integer shift instructions. The opcode for those instructions would  
then be 12 because the 0 in 10 is replaced by the 2 in the Offset column. Likewise, the  
16  
third row under the 18 column contains the symbol JSR*, representing all jump  
16  
instructions. The opcode for those instructions is 1A because the 8 in the heading is  
replaced by the number to the right of the backslash in the Offset column. The  
instruction format is listed under the instruction symbol.  
Table A–8 Opcode Summary  
Offset  
0/8  
00  
08  
10  
18  
20  
28  
30  
38  
PAL* LDA  
(pal)  
INTA* MISC* LDF  
(mem) (mem)  
LDL  
(mem)  
BR  
(br)  
BLBC  
(br)  
(mem) (op)  
1/9  
Res  
LDAH INTL* PAL  
LDG  
(mem)  
LDQ  
(mem)  
FBEQ  
(br)  
BEQ  
(br)  
\
\
(mem)  
(op)  
2/A  
LDBU Res  
INTS* JSR*  
LDS  
LDL_L  
(mem)  
FBLT  
(br)  
BLT  
(br)  
(op)  
(mem) (mem)  
A–12 Alpha Instruction Set  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Required PALcode Function Codes  
Table A–8 Opcode Summary (Continued)  
Offset  
3/B  
00  
08  
10  
18  
20  
28  
30  
38  
Res  
LDQ_U INTM* PAL  
\
(mem) (op)  
LDT  
(mem)  
LDQ_L  
(mem)  
FBLE  
(br)  
BLE  
(br)  
\
4/C  
5/D  
6/E  
7/F  
LDWU Res  
ITFP* FPTI*  
STF  
(mem)  
STL  
(mem)  
BSR  
(br)  
BLBS  
(br)  
Res  
Res  
Res  
STW  
STB  
FLTV* PAL  
\
STG  
(mem)  
STQ  
(mem)  
FBNE  
(br)  
BNE  
(br)  
\
\
\
(op)  
FLTI*  
(op)  
PAL  
STS  
(mem)  
STL_C  
(mem)  
FBGE  
(br)  
BGE  
(br)  
\
STQ_U FLTL* PAL  
\
STT  
(mem)  
STQ_C  
(mem)  
FBGT  
(br)  
BGT  
(br)  
(mem) (op)  
Table A–9 explains the symbols used in Table A–8.  
Table A–9 Key to Opcode Summary Used in Table A–8  
Symbol  
Meaning  
FLTI*  
FLTL*  
FLTV*  
FPTI*  
INTA*  
INTL*  
INTM*  
INTS*  
ITFP*  
JSR*  
IEEE floating-point instruction opcodes  
Floating-point operate instruction opcodes  
VAX floating-point instruction opcodes  
Floating-point to integer register move opcodes  
Integer arithmetic instruction opcodes  
Integer logical instruction opcodes  
Integer multiply instruction opcodes  
Integer shift instruction opcodes  
Integer to floating-point register move opcodes  
Jump instruction opcodes  
MISC*  
PAL*  
Miscellaneous instruction opcodes  
PALcode instruction (CALL_PAL) opcodes  
Reserved for PALcode  
PAL  
\
\
Res  
Reserved for Compaq  
A.7 Required PALcode Function Codes  
Table A–10 lists opcodes required for all Alpha implementations. The notation used is  
oo.ffff, where oo is the hexadecimal 6-bit opcode and ffff is the hexadecimal 26-bit  
function code.  
Table A–10 Required PALcode Function Codes  
Mnemonic  
DRAINA  
HALT  
Type  
Function Code  
00.0002  
Privileged  
Privileged  
Unprivileged  
00.0000  
IMB  
00.0086  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–13  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
IEEE Floating-Point Conformance  
A.8 IEEE Floating-Point Conformance  
The 21264/EV68A supports the IEEE floating-point operations defined in the Alpha  
System Reference Manual, Revision 8 and therefore also from the Alpha Architecture  
Reference Manual, Fourth Edition. Support for a complete implementation of the IEEE  
Standard for Binary Floating-Point Arithmetic (ANSI/IEEE Standard 754 1985) is pro-  
vided by a combination of hardware and software. The 21264/EV68A provides several  
hardware features to facilitate complete support of the IEEE standard.  
The 21264/EV68A provides the following hardware features to facilitate complete sup-  
port of the IEEE standard:  
The 21264/EV68A implements precise exception handling in hardware, as denoted  
by the AMASK instruction returning bit 9 set. TRAPB instructions are treated as  
NOPs and are not issued.  
The 21264/EV68A accepts both Signaling and Quiet NaNs as input operands and  
propagates them as specified by the Alpha architecture. In addition, the 21264/  
EV68A delivers a canonical Quiet NaN when an operation is required to produce a  
NaN value and none of its inputs are NaNs. Encodings for Signaling NaN and  
Quiet NaN are defined by the Alpha Architecture Reference Manual, Fourth Edi-  
tion.  
The 21264/EV68A accepts infinity operands and implements infinity arithmetic as  
defined by the IEEE standard and the Alpha Architecture Reference Manual,  
Fourth Edition.  
The 21264/EV68A implements SQRT for single (SQRTS) and double (SQRTT)  
precision in hardware.  
Note:  
In addition, the 21264/EV68A also implements the VAX SQRTF and  
SQRTG instructions.  
The 21264/EV68A implements the FPCR[DNZ] bit. When FPCR[DNZ] is set,  
denormal input operand traps can be avoided for arithmetic operations that include  
the /S qualifier. When FPCR[DNZ] is clear, denormal input operands for arithmetic  
operations produce an unmaskable denormal trap. CPYSE/CPYSN, FCMOVxx,  
and MF_FPCR/MT_FPCR are not arithmetic operations, and pass denormal values  
without initiating arithmetic traps.  
The 21264/EV68A implements the following disable bits in the floating-point con-  
trol register (FPCR):  
Underflow disable (UNFD)  
Overflow disable (OVFD)  
Inexact result disable (INED)  
Division by zero disable (DZED)  
Invalid operation disable (INVD)  
A–14 Alpha Instruction Set  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
IEEE Floating-Point Conformance  
If one of these bits is set, and an instruction with the /S qualifier set generates the  
associated exception, the 21264/EV68A produces the IEEE nontrapping result and  
suppresses the trap. These nontrapping responses include correctly signed  
infinity, largest finite number, and Quiet NaNs as specified by the IEEE  
standard.  
The 21264/EV68A does not produce a denormal result for the underflow exception.  
Instead, a true zero (+0) is written to the destination register. In the 21264/EV68A,  
the FPCR underflow to zero (UNDZ) bit must be set if the underflow disable  
(UNFD) bit is set. If desired, trapping on underflow can be enabled by the instruc-  
tion and the FPCR, and software may compute the denormal value as defined in the  
IEEE standard.  
The 21264/EV68A records floating-point exception information in two places:  
The FPCR status bits record the occurrence of all exceptions that are detected,  
whether or not the corresponding trap is enabled. The status bits are cleared only  
through an explicit clear command (MT_FPCR); hence, the exception information  
they record is a summary of all exceptions that have occurred since the last time  
they were cleared.  
If an exception is detected and the corresponding trap is enabled by the instruction,  
and is not disabled by the FPCR control bits, the 21264/EV68A will record the  
condition in the EXC_SUM register and initiate an arithmetic trap.  
The following items apply to Table A–11:  
The 21264/EV68A traps on a denormal input operand for all arithmetic operations  
unless FPCR[DNZ] = 1.  
Input operand traps take precedence over arithmetic result traps.  
The following abbreviations are used:  
Inf: Infinity  
QNaN: Quiet NaN  
SNaN: Signalling NaN  
CQNaN: Canonical Quiet NaN  
For IEEE instructions with /S, Table A–11 lists all exceptional input and output  
conditions recognized by the 21264/EV68A, along with the result and exception  
generated for each condition.  
Table A–11 Exceptional Input and Output Conditions  
21264/EV68A Hardware  
Alpha Instructions  
ADDx SUBx INPUT  
Inf operand  
Supplied Result  
Exception  
±Inf  
(none)  
QNaN operand  
QNaN  
QNaN  
CQNaN  
(none)  
SNaN operand  
Invalid Op  
Invalid Op  
Effective subtract of two Inf operands  
ADDx SUBx OUTPUT  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–15  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
IEEE Floating-Point Conformance  
Table A–11 Exceptional Input and Output Conditions (Continued)  
21264/EV68A Hardware  
Supplied Result  
Alpha Instructions  
Exponent overflow  
Exponent underflow  
Inexact result  
Exception  
Overflow  
Underflow  
Inexact  
±Inf or ±MAX  
+0  
Result  
A–16 Alpha Instruction Set  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
IEEE Floating-Point Conformance  
Table A–11 Exceptional Input and Output Conditions (Continued)  
21264/EV68A Hardware  
Supplied Result  
Alpha Instructions  
MULx INPUT  
Inf operand  
Exception  
±Inf  
(none)  
QNaN operand  
SNaN operand  
0 * Inf  
QNaN  
QNaN  
CQNaN  
(none)  
Invalid Op  
Invalid Op  
MULx OUTPUT (same as ADDx)  
DIVx INPUT  
QNaN operand  
SNaN operand  
0/0 or Inf/Inf  
QNaN  
QNaN  
CQNaN  
±Inf  
(none)  
Invalid Op  
Invalid Op  
Div Zero  
(none)  
A/0 (A not 0)  
A/Inf  
±0  
Inf/A  
±Inf  
(none)  
DIVx OUTPUT (same as ADDx)  
SQRTx INPUT  
+Inf operand  
+Inf  
(none)  
QNaN operand  
SNaN operand  
-A (A not 0)  
QNaN  
QNaN  
CQNaN  
-0  
(none)  
Invalid Op  
Invalid Op  
(none)  
-0  
SQRTx OUTPUT  
Inexact result  
root  
Inexact  
CMPTEQ CMPTUN INPUT  
Inf operand  
True or False  
(none)  
QNaN operand  
SNaN operand  
CMPTLT CMPTLE INPUT  
Inf operand  
False for EQ, True for UN  
False for EQ,True for UN  
(none)  
Invalid Op  
True or False  
False  
(none)  
QNaN operand  
SNaN operand  
CVTfi INPUT  
Inf operand  
Invalid Op  
Invalid Op  
False  
0
0
Invalid Op  
Invalid Op  
QNaN operand  
21264/EV68A Hardware Reference Manual  
Alpha Instruction Set A–17  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Table A–11 Exceptional Input and Output Conditions (Continued)  
21264/EV68A Hardware  
Alpha Instructions  
Supplied Result  
Exception  
SNaN operand  
0
Invalid Op  
CVTfi OUTPUT  
Inexact result  
Result  
Inexact  
Integer overflow  
CVTif OUTPUT  
Inexact result  
Truncated result  
Invalid Op  
Result  
Inexact  
CVTff INPUT  
Inf operand  
±Inf  
(none)  
QNaN operand  
QNaN  
QNaN  
(none)  
SNaN operand  
Invalid Op  
CVTff OUTPUT (same as ADDx)  
FBEQ FBNE FBLT FBLE FBGT FBGE  
LDS LDT  
STS STT  
CPYS CPYSN  
FCMOVx  
See Section 2.14 for information about the floating-point control register (FPCR).  
Download from Www.Somanuals.com. All Manuals Search And Download.  
B
21264/EV68A Boundary-Scan Register  
This appendix contains the BSDL description of the 21264/EV68A boundary-scan reg-  
ister.  
B.1 Boundary-Scan Register  
The Boundary-Scan Register (BSR) on the 21264/EV68A is 367 bits long. It is  
accessed by the three public (SAMPLE, EXTEST, CLAMP) instructions. The register  
operation for the public instructions is compliant with the IEEE 1149.1 standard.  
The boundary-scan register covers all input, output, and bidirectional pins with the  
exception of the compliance enable pins and pins that are power-supply-type or analog  
in nature. The BSDL for the boundary-scan register is given in Section B.1.1.  
B.1.1 BSDL Description of the Alpha 21264/EV68A Boundary-Scan Register  
-------------------------------------------------------------------------------  
-- alpha21264b.bsdl  
--The BSDL Description for EV6’s IEEE 1149.1 Circuits  
-------------------------------------------------------------------------------  
--  
Revision History  
--Rev Date  
Description  
-- 1.0 Feb 99  
First external release  
-------------------------------------------------------------------------------  
entity Alpha_21264b is-- (ref B.8)  
generic (PHYSICAL_PIN_MAP :string := "PGA_EV6");-- (ref B.8.2)  
port (-- (ref B.8.3)  
TestStat_H  
SromOE_L  
SromClk_H  
SromData_H  
Reset_L  
:out  
:out  
:out  
:in  
bit  
bit  
bit  
bit  
;
;
;
;
:in  
bit  
;
IRQ_H  
:in  
bit_vector (0 to 5)  
;
DcOk_H  
:linkage bit  
:linkage bit  
:linkage bit  
:linkage bit  
:linkage bit  
:linkage bit  
; -- Compliance enable input  
NoConnect_0  
NoConnect_1  
PllBypass_H  
FrameClk_H  
FrameClk_L  
ClkFwdRst_H  
BcCheck_H  
BcData_H  
SysData_L  
SysCheck_L  
BcDataInClk_H  
SysDataOutClk_L  
Spare_7  
; -- n/c  
; -- n/c  
;
;
;
;
:in  
bit  
:inout  
:inout  
:inout  
:inout  
:in  
bit_vector (0 to 15);  
bit_vector (0 to 127);  
bit_vector (0 to 63) ;  
bit_vector (0 to 7)  
bit_vector (0 to 7)  
bit_vector (0 to 7)  
;
;
;
;
:out  
:linkage bit_vector (0 to 7)  
21264/EV68A Hardware Reference Manual  
21264/EV68A Boundary-Scan Register B–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                   
Boundary-Scan Register  
SysDataInClk_H  
BcDataOutClk_L  
BcDataOutClk_H  
ClkIn_H  
ClkIn_L  
PLL_VDD  
EV6Clk_H  
EV6Clk_L  
Spare_4  
Spare_5  
:in  
:out  
:out  
:linkage bit  
:linkage bit  
:linkage bit  
:linkage bit  
:linkage bit  
:linkage bit  
:linkage bit  
bit_vector (0 to 7)  
bit_vector (0 to 3) ; -- JWB corrected  
bit_vector (0 to 3) ; -- JWB corrected  
;
; -- Oscillator  
; -- Oscillator  
;
;
;
;
;
BcTag_H  
:inout  
bit_vector (20 to 42);  
BcVref  
:linkage bit  
;
BcTagInClk_H  
BcTagClkIn_H  
BcTagParity_H  
BcTagShared_H  
BcTagDirty_H  
BcTagValid_H  
BcTagOutClk_L  
BcTagOutClk_H  
BcTagOE_L  
:in  
bit  
; -- Name in model:  
:inout  
:inout  
:inout  
:inout  
:out  
:out  
:out  
:out  
:out  
:out  
:out  
:out  
:out  
:in  
bit  
bit  
bit  
bit  
bit  
bit  
bit  
bit  
bit  
bit  
bit  
;
;
;
;
;
;
;
;
;
;
;
BcTagWr_L  
BcDataWr_L  
BcLoad_L  
BcDataOE_L  
BcAdd_H  
bit_vector (4 to 23) ;  
bit_vector (0 to 14) ;  
bit_vector (0 to 14) ;  
bit  
bit  
SysAddOut_L  
SysAddIn_L  
SysAddInClk_L  
SysAddOutClk_L  
SysVref  
:in  
:out  
:linkage bit  
;
; --JWB added  
; --JWB added  
SysFillValid_L  
SysDataInValid_L  
SysDataOutValid_L  
Spare_0  
MiscVref  
Spare_2  
:in  
:in  
:in  
bit  
bit  
bit  
;
;
;
:linkage bit  
:linkage bit  
:linkage bit  
; -- n/c  
; --  
; -- n/c  
Tdi_H  
Tdo_H  
Trst_L  
Tck_H  
:in  
:out  
:in  
:in  
:in  
bit  
bit  
bit  
bit  
bit  
;
;
;
;
;
Tms_H  
VSS:linkage bit_vector (0 to 103);  
VDD  
:linkage bit_vector (0 to 93) );  
use STD_1149_1_1994.all ;-- (ref B.8.4)  
attribute COMPONENT_CONFORMANCE of Alpha_21264: entity is "STD_1149_1_1993";  
attribute PIN_MAP of Alpha_21264 : entity is PHYSICAL_PIN_MAP ;  
constant PGA_EV6 : PIN_MAP_STRING :=  
"
" &  
"SysAddIn_L  
: (BD30, BC29, AY28, BE29, AW27, BA27, BD28, BE27, "&  
"
AY26, BC25, BB24, AV24, BD24, BE23, AW23),  
"&  
"&  
"&  
"&  
"SysAddInClk_L  
"SysVref  
"SysFillValid_L  
:
:
:
BB26,  
BA25,  
BC23,  
"SysAddOut_L  
: (AW33, BE39, BD36, BC35, BA33, AY32, BE35, AV30, "&  
"
BB32, BA31, BE33, AW29, BC31, AV28, BB30),  
BD34,  
"&  
"&  
"SysAddOutClk_L  
:
"SysData_L  
: (F14 , G13 , F12 , H12 , H10 , G7 , F6 , K8 , "&  
M6 , N7 , P6 , T8 , V8 , V6 , W7 , Y6 , "&  
AB8 , AC7 , AD8 , AE5 , AH6 , AH8 , AJ7 , AL5 , "&  
AP8 , AR7 , AT8 , AV6 , AV10, AW11, AV12, AW13, "&  
F32 , F34 , H34 , G35 , F40 , G39 , K38 , J41 , "&  
M40 , N39 , P40 , T38 , V40 , W41 , W39 , Y40 , "&  
"
"
"
"
"
B–2 21264/EV68A Boundary-Scan Register  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
"
"
AB38, AC39, AD38, AF40, AH38, AJ39, AL41, AK38, "&  
AN39, AP38, AR39, AT38, AY38, AV36, AW35, AV34),"&  
: (L7 , AA5 , AK8 , BA13, L39 , AA41, AM40, AY34),"&  
: (D8 , P4 , AF6 , AY6 , E37 , R43 , AG41, AV40),"&  
"SysCheck_L  
"SysDataInClk_H  
:
"SysDataOutClk_L  
: (G11 , U7 , AG7 , AY8 , H36 , R41 , AH40, AW39),"&  
"SysDataInValid_L  
"SysDataOutValid_L  
:
:
BD22,  
BB22,  
"&  
"&  
"BcAdd_H  
"
: (B28 , E27 , A29 , G27 , C29 , F28 , B30 , D30 ,"&  
C31 , H28 , G29 , A33 , E31 , D32 , B34 , A35 ,"&  
"
B36 , H30 , C35 , E33 ),  
"&  
"&  
"&  
"BcDataOE_L  
"BcLoad_L  
"BcDataWr_L  
:
:
:
A27 ,  
F26 ,  
D26 ,  
"&  
"BcData_H  
"
: (B10 , D10 , A5 , C5 , C3 , E3 , H6 , E1  
J3 , K2 , L3 , M2 , T2 , U1 , V2 , Y4  
,"&  
,"&  
"
"
"
"
"
"
AC1 , AD2 , AE3 , AG1 , AK2 , AL3 , AR1 , AP2 ,"&  
AY2 , BB2 , AW5 , BB4 , BB8 , BE5 , BB10, BE7 ,"&  
G33 , C37 , B40 , C41 , C43 , E43 , G41 , F44 ,"&  
K44 , N41 , M44 , P42 , U43 , V44 , Y42 , AB44 ,"&  
AD42, AE43, AF42, AJ45, AK42, AN45, AP44, AN41 ,"&  
AW45, AU41, AY44, BA43, BC43, BD42, BB38, BE41 ,"&  
"
"
C11 , A7 , C9 , B6 , B4 , D4 , G5 , D2  
H4 , G1 , N5 , L1 , N1 , U3 , W5 , W1  
,"&  
,"&  
"
"
"
"
"
"
AB2 , AC3 , AD4 , AF4 , AJ3 , AK4 , AN1 , AM4 ,"&  
AU5 , BA1 , BA3 , BC3 , BD6 , BA9 , BC9 , AY12 ,"&  
A39 , D36 , A41 , B42 , D42 , D44 , H40 , H42 ,"&  
G45 , L43 , L45 , N45 , T44 , U45 , W45 , AA43 ,"&  
AC43, AD44, AE41, AG45, AK44, AL43, AM42, AR45 ,"&  
AP40, BA45, AV42, BB44, BB42, BC41, BA37, BD40),"&  
: (F2 , AB4 , AT2 , BC11, M38 , AB42, AU43, BC37 ,"&  
M8 , AA3 , AW1 , BD10, E45 , AC45, AT44, BB36),"&  
: (E7 , R3 , AH2 , BC5 , F38 , U39 , AH44, AY40),"&  
: (F8 , T4 , AJ1 , BD4 , E39 , V38 , AJ43, BA39),"&  
"BcCheck_H  
"
"BcDataInClk_H  
"Spare_7  
"BcDataOutClk_L  
"BcDataOutClk_H  
"BcTag_H  
"
: (K4 , AV4 , K42 , AT42),  
: (J5 , AU3 , J43 , AR43),  
"&  
"&  
: (E13 , H16 , A11 , B12 , D14 , E15 , A13 , G17 ,"&  
C15 , H18 , D16 , B16 , C17 , A17 , E19 , B18 ,"&  
"
A19 , F20 , D20 , E21 , C21 , D22 , H22 ),  
"&  
"BcTagValid_H  
"BcTagDirty_H  
"BcTagShared_H  
"BcTagParity_H  
"BcTagOE_L  
"BcTagWr_L  
"BcTagInClk_H  
"BcVref  
"BcTagOutClk_L  
"BcTagOutClk_H  
"IRQ_H  
:
:
:
:
:
:
:
:
:
:
B24 ,  
C23 ,  
G23 ,  
B22 ,  
H24 ,  
E25 ,  
G19 ,  
F18 ,  
D24 ,  
C25 ,  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
: (BA15, BE13, AW17, AV18, BC15, BB16),  
"&  
"Reset_L  
"SromData_H  
"SromCLK_H  
"SromOE_L  
"Tms_H  
"Tck_H  
"Trst_L  
"Tdi_H  
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
BD16,  
BC17,  
AW19,  
BE17,  
BD18,  
BE19,  
AY20,  
BA21,  
BB20,  
BA19,  
AM8 ,  
AN7 ,  
AV16,  
AW15,  
BD12,  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"Tdo_H  
"TestStat_H  
"ClkIn_H  
"ClkIn_L  
"FrameClk_H  
"FrameClk_L  
"PllBypass_H  
21264/EV68A Hardware Reference Manual  
21264/EV68A Boundary-Scan Register B–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
"NoConnect_0  
"NoConnect_1  
"ClkFwdRst_H  
"EV6Clk_H  
"EV6Clk_L  
"Spare_4  
"Spare_5  
"PLL_VDD  
"Spare_0  
"MiscVref  
"Spare_2  
"DCOK_H  
:
:
:
:
:
:
:
:
:
:
:
:
BB14,  
BD2 ,  
BE11,  
AM6 ,  
AL7 ,  
AT4 ,  
AR3 ,  
AV8 ,  
BC21,  
AV22,  
BE9 ,  
AY18,  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"VSS:  
"
"
"
"
"
"
"
"
"
"
"
"
(C1 , W3 , AR5 , G9 , E17 , G25 , C33 , AA39, "&  
BA41, R45 , J1 , AG3 , BA5 , AW9 , BA17, AW25, "&  
BC33, AE39, A43 , AA45, R1 , AN3 , C7 , C19 , "&  
BE25, E35 , AL39, G43 , AE45, AA1 , AW3 , J7 , "&  
E11 , BC19, C27 , BA35, AU39, N43 , AL45, AE1 , "&  
BE3 , R7 , BA11, A21 , BC27, A37 , BC39, W43 , "&  
AU45, AL1 , E5 , AA7 , C13 , G21 , E29 , G37 , "&  
E41 , AG43, BC45, AU1 , L5 , AE7 , BC13, AW21, "&  
BA29, AW37, L41 , AN43, BC1 , U5 , AU7 , A15 , "&  
BE21, A31 , BE37, U41 , AW43, A3 , AC5 , AW7 , "&  
G15 , E23 , G31 , C39 , AC41, BE43, G3 , AJ5 , "&  
BC7 , AY14, BA23, AW31, J39 , AJ41, C45 , N3 , "&  
AN5 , A9 , BE15, A25 , BE31, R39 , AR41, J45 , "&  
E9 , R5 , AG5 , BA7 , D38 , T42 , AG39, AW41) ,"&  
"VDD  
"
"
"
"
"
"
"
"
: (B2 , V4 , AP6 , D12 , B20 , H26 , BD32, AM38, "&  
BB40, Y44 , H2 , AH4 , AT6 , BB12, H20 , AV26, "&  
D34 , AV38, F42 , AF44, P2 , AP4 , BB6 , B14 , "&  
AV20, BD26, BB34, BD38, M42 , AM44, Y2 , AY4 , "&  
B8 , H14 , BD20, D28 , F36 , D40 , V42 , AV44, "&  
AF2 , D6 , P8 , AV14, F22 , BB28, AY36, K40 , "&  
AH42, BD44, AM2 , K6 , Y8 , BD14, AY22, F30 , "&  
B38 , T40 , AP42, AV2 , T6 , AF8 , F16 , A23 , "&  
AY30, H38 , AB40, AY42, AB6 , BD8 , AY16, F24 , "&  
B32 , P38 , AD40, B44 , F4 , AD6 , F10 , D18 , "&  
AY24, H32 , Y38 , AK40, H44 , M4 , AK6 , AY10, "&  
"
"
"
BB18, B26 , AV32, AF38, AT40, P44 )  
";  
constant numeric_EV6 : PIN_MAP_STRING :=  
"
" &  
"SysAddIn_L  
: (559 , 536 , 468 , 580 , 445 , 490 , 558 , 579 , "&  
"
467 , 534 , 511 , 421 , 556 , 577 , 443 ),  
"&  
"&  
"&  
"SysAddInClk_L  
"SysVref  
"SysFillValid_L  
:
:
512 ,  
489 ,  
533 ,  
:
"&  
"SysAddOut_L  
: (448 , 585 , 562 , 539 , 493 , 470 , 583 , 424 , "&  
"
515 , 492 , 582 , 446 , 537 , 423 , 514 ),  
561 ,  
"&  
"&  
"SysAddOutClk_L  
:
"SysData_L  
"
"
"
"
"
"
"
"SysCheck_L  
"SysDataInClk_H  
"SysDataOutClk_L  
"SysDataInValid_L  
"SysDataOutValid_L  
"BcAdd_H  
: (118 , 140 , 117 , 161 , 160 , 137 , 114 , 189 , "&  
204 , 213 , 220 , 237 , 253 , 252 , 261 , 268 , "&  
285 , 293 , 301 , 308 , 332 , 333 , 341 , 356 , "&  
381 , 389 , 397 , 412 , 414 , 437 , 415 , 438 , "&  
127 , 128 , 172 , 151 , 131 , 153 , 190 , 183 , "&  
207 , 214 , 223 , 238 , 255 , 263 , 262 , 271 , "&  
286 , 294 , 302 , 319 , 334 , 342 , 359 , 350 , "&  
374 , 382 , 390 , 398 , 473 , 427 , 449 , 426 ),"&  
: (197 , 276 , 349 , 483 , 198 , 279 , 367 , 471 ),"&  
: (70 , 219 , 316 , 457 , 107 , 232 , 327 , 429 ),"&  
: (139 , 245 , 325 , 458 , 173 , 231 , 335 , 451 ),"&  
:
:
555 ,  
510 ,  
"&  
"&  
: (35 , 102 , 14 , 147 , 58 , 125 , 36 , 81 , "&  
59 , 169 , 148 , 16 , 104 , 82 , 38 , 17 , "&  
"
"
39 , 170 , 61 , 105 ),  
13  
"&  
"&  
"BcDataOE_L  
:
,
B–4 21264/EV68A Boundary-Scan Register  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
"BcLoad_L  
"BcDataWr_L  
"BcData_H  
"
"
"
"
"
"
"
:
:
124 ,  
79  
"&  
"&  
,
: (26 , 71 , 2  
, 46 , 45 , 90 , 159 , 89 , "&  
179 , 186 , 195 , 202 , 234 , 242 , 250 , 267 , "&  
290 , 298 , 307 , 322 , 346 , 355 , 386 , 378 , "&  
455 , 500 , 434 , 501 , 503 , 568 , 504 , 569 , "&  
150 , 62 , 41 , 64 , 65 , 110 , 154 , 133 , "&  
193 , 215 , 209 , 224 , 248 , 257 , 272 , 289 , "&  
304 , 312 , 320 , 345 , 352 , 377 , 385 , 375 , "&  
454 , 407 , 476 , 498 , 543 , 565 , 518 , 586 , "&  
"
49 , 3  
, 48 , 24 , 23 , 68 , 136 , 67 , "&  
"
"
"
"
"
"
"
158 , 134 , 212 , 194 , 210 , 243 , 260 , 258 , "&  
282 , 291 , 299 , 315 , 339 , 347 , 370 , 363 , "&  
404 , 477 , 478 , 523 , 547 , 481 , 526 , 460 , "&  
19 , 84 , 20 , 42 , 87 , 88 , 175 , 176 , "&  
156 , 200 , 201 , 217 , 241 , 249 , 265 , 280 , "&  
296 , 305 , 311 , 329 , 353 , 360 , 368 , 393 , "&  
383 , 499 , 430 , 521 , 520 , 542 , 495 , 564 ),"&  
: (112 , 283 , 394 , 527 , 206 , 288 , 408 , 540 , "&  
205 , 275 , 432 , 549 , 111 , 297 , 401 , 517 ),"&  
: (92 , 227 , 330 , 524 , 130 , 246 , 337 , 474 ),"&  
: (115 , 235 , 338 , 546 , 108 , 254 , 344 , 496 ),"&  
"BcCheck_H  
"
"BcDataInClk_H  
"Spare_7  
"BcDataOutClk_L  
"BcDataOutClk_H  
"BcTag_H  
"
: (187 , 411 , 192 , 400 ),  
: (180 , 403 , 184 , 392 ),  
"&  
"&  
: (95 , 163 , 5  
, 27 , 73 , 96 , 6  
, 142 , "&  
51 , 164 , 74 , 29 , 52 , 8  
, 98 , 30 , "&  
"
9
, 121 , 76 , 99 , 54 , 77 , 166 ),  
"&  
"BcTagValid_H  
"BcTagDirty_H  
"BcTagShared_H  
"BcTagParity_H  
"BcTagOE_L  
"BcTagWr_L  
"BcTagInClk_H  
"BcVref  
"BcTagOutClk_L  
"BcTagOutClk_H  
"IRQ_H  
:
:
:
:
:
:
:
:
:
:
33  
55  
145 ,  
32  
167 ,  
101 ,  
143 ,  
120 ,  
78  
56  
,
,
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
,
,
,
: (484 , 572 , 440 , 418 , 529 , 507 ),  
"&  
"Reset_L  
"SromData_H  
"SromCLK_H  
"SromOE_L  
"Tms_H  
"Tck_H  
"Trst_L  
"Tdi_H  
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
552 ,  
530 ,  
441 ,  
574 ,  
553 ,  
575 ,  
464 ,  
487 ,  
509 ,  
486 ,  
365 ,  
373 ,  
417 ,  
439 ,  
550 ,  
506 ,  
545 ,  
571 ,  
364 ,  
357 ,  
395 ,  
387 ,  
413 ,  
532 ,  
420 ,  
570 ,  
463 ,  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"&  
"Tdo_H  
"TestStat_H  
"ClkIn_H  
"ClkIn_L  
"FrameClk_H  
"FrameClk_L  
"PllBypass_H  
"NoConnect_0  
"NoConnect_1  
"ClkFwdRst_H  
"EV6Clk_H  
"EV6Clk_L  
"Spare_4  
"Spare_5  
"PLL_VDD  
"Spare_0  
"MiscVref  
"Spare_2  
"DCOK_H  
21264/EV68A Hardware Reference Manual  
21264/EV68A Boundary-Scan Register B–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
"VSS  
"
"
"
"
"
"
"
: (44 , 259 , 388 , 138 , 97 , 146 , 60 , 278 , "&  
497 , 233 , 178 , 323 , 479 , 436 , 485 , 444 , "&  
538 , 310 , 21 , 281 , 226 , 371 , 47 , 53 , "&  
578 , 106 , 358 , 155 , 313 , 274 , 433 , 181 , "&  
94 , 531 , 57 , 494 , 406 , 216 , 361 , 306 , "&  
567 , 229 , 482 , 10 , 535 , 18 , 541 , 264 , "&  
409 , 354 , 91 , 277 , 50 , 144 , 103 , 152 , "&  
109 , 328 , 544 , 402 , 196 , 309 , 528 , 442 , "&  
"
491 , 450 , 199 , 376 , 522 , 244 , 405 , 7  
, "&  
"
576 , 15 , 584 , 247 , 453 , 1 , 292 , 435 , "&  
"
"
141 , 100 , 149 , 63 , 295 , 587 , 135 , 340 , "&  
525 , 461 , 488 , 447 , 182 , 343 , 66 , 211 , "&  
"
372 , 4  
, 573 , 12 , 581 , 230 , 391 , 185 ,"&  
93 , 228 , 324 , 480 , 85 , 240 , 326 , 452 ),"&  
: (22 , 251 , 380 , 72 , 31 , 168 , 560 , 366 , "&  
519 , 273 , 157 , 331 , 396 , 505 , 165 , 422 , "&  
83 , 428 , 132 , 321 , 218 , 379 , 502 , 28 , "&  
419 , 557 , 516 , 563 , 208 , 369 , 266 , 456 , "&  
25 , 162 , 554 , 80 , 129 , 86 , 256 , 431 , "&  
314 , 69 , 221 , 416 , 122 , 513 , 472 , 191 , "&  
336 , 566 , 362 , 188 , 269 , 551 , 465 , 126 , "&  
40 , 239 , 384 , 410 , 236 , 317 , 119 , 11 , "&  
469 , 174 , 287 , 475 , 284 , 548 , 462 , 123 , "&  
37 , 222 , 303 , 43 , 113 , 300 , 116 , 75 , "&  
466 , 171 , 270 , 351 , 177 , 203 , 348 , 459 , "&  
"VDD  
"
"
"
"
"
"
"
"
"
"
"
508 , 34 , 425 , 318 , 399 , 225 )  
";  
attribute PORT_GROUPING of Alpha_21264b : entity is-- (Ref B.8.8. See Note 4.  
"Differential_Voltage ( (CLKIN_H), (CLKIN_L) )" ;  
attribute TAP_SCAN_CLOCK of Tck_H : signal is (5.0e6, LOW);  
attribute TAP_SCAN_IN  
attribute TAP_SCAN_OUT  
of Tdi_H : signal is TRUE;  
of Tdo_H : signal is TRUE;  
attribute TAP_SCAN_MODE of Tms_H : signal is TRUE;  
attribute TAP_SCAN_RESET of Trst_L : signal is TRUE;  
attribute COMPLIANCE_PATTERNS of Alpha_21264b : entity is -- (Ref B.8.10). See Note 4.  
"(DcOk_H), (1)" ;  
attribute INSTRUCTION_LENGTH of Alpha_21264b : entity is 5 ;  
attribute INSTRUCTION_OPCODE of Alpha_21264b : entity is  
"EXTEST  
"SAMPLE  
"CLAMP  
"HIGHZ  
"DIE_ID  
"BYPASS  
(11011),"&-- No longer mandated to be (00000)!  
(11000),"&-- JWB changed "PRELOAD" to "SAMPLE"  
(11010),"&  
(11001),"&  
(11110),"&  
(11111)";  
attribute INSTRUCTION_CAPTURE of Alpha_21264b : entity is "00001"  
;
attribute INSTRUCTION_PRIVATE of Alpha_21264b : entity is "Private"; -- See Note 4.  
attribute REGISTER_ACCESS of Alpha_21264b : entity is-- (ref B.8.13)  
"BOUNDARY (EXTEST, SAMPLE)," &-- Redundant. Added for completeness  
"BYPASS  
(BYPASS, HIGHZ, CLAMP)," &-- ditto  
"DIE_ID[32] (DIE_ID)";  
attribute BOUNDARY_LENGTH  
of Alpha_21264b : entity is 367 ;  
attribute BOUNDARY_REGISTER of Alpha_21264b : entity is  
-----------------------------------------------------------------------------  
-- scan cell  
-- cell type port  
safe  
function |  
cntrl disable disable  
cell value state  
----|-----|-----|-------------------|--------|---|----|------|--------------  
" 366 ( BC_2, TestStat_H,  
" 365 ( BC_2, SromOE_L,  
" 364 ( BC_2, SromClk_H,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
x
X
x
), "& --  
), "& --  
), "& --  
B–6 21264/EV68A Boundary-Scan Register  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
" 363 ( BC_2, SromData_H,  
" 362 ( BC_3, reset_L,  
" 361 ( BC_3, IRQ_H(5),  
" 360 ( BC_3, IRQ_H(4),  
" 359 ( BC_3, IRQ_H(3),  
" 358 ( BC_3, IRQ_H(2),  
" 357 ( BC_3, IRQ_H(1),  
" 356 ( BC_3, IRQ_H(0),  
" 355 ( BC_3, ClkFwdRst_H,  
" 354 ( BC_2, BcCheck_H(3),  
" 353 ( BC_2, BcCheck_H(11),  
" 352 ( BC_2, SysCheck_L(3),  
" 351 ( BC_2, BcData_H(31),  
" 350 ( BC_2, BcData_H(95),  
" 349 ( BC_2, SysData_L(31),  
" 348 ( BC_2, BcData_H(30),  
" 347 ( BC_2, BcData_H(94),  
" 346 ( BC_2, SysData_L(30),  
" 345 ( BC_2, BcData_H(29),  
" 344 ( BC_2, BcData_H(93),  
" 343 ( BC_2, SysData_L(29),  
" 342 ( BC_2, BcData_H(28),  
" 341 ( BC_2, BcData_H(92),  
" 340 ( BC_2, SysData_L(28),  
" 339 ( BC_3, *,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
CONTROL,  
INPUT,  
x
x
x
x
x
x
x
x
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
x
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
0
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& -- bccell0  
Z
Z
Z
Z
Z
Z
" 338 ( BC_3, BcDataInClk_H(3),  
x
), "& --  
" 337 ( BC_2, SysDataOutClk_L(3), OUTPUT2,  
" 336 ( BC_3, *, CONTROL,  
" 335 ( BC_3, SysDataInClk_H(3), INPUT,  
x
0
x
), "& --  
), "& -- sccell0  
), "& --  
" 334 ( BC_2, BcData_H(27),  
" 333 ( BC_2, BcData_H(91),  
" 332 ( BC_2, SysData_L(27),  
" 331 ( BC_2, BcData_H(26),  
" 330 ( BC_2, BcData_H(90),  
" 329 ( BC_2, SysData_L(26),  
" 328 ( BC_2, BcData_H(25),  
" 327 ( BC_2, BcData_H(89),  
" 326 ( BC_2, SysData_L(25),  
" 325 ( BC_2, BcData_H(24),  
" 324 ( BC_2, BcData_H(88),  
" 323 ( BC_2, SysData_L(24),  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
x, 339, 0,  
x, 339, 0,  
x, 336, 0,  
x
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
" 322 ( BC_2, BcDataOutClk_L(1), OUTPUT2,  
" 321 ( BC_2, BcDataOutClk_H(1), OUTPUT2,  
x
), "& --  
" 320 ( BC_2, BcCheck_H(2),  
" 319 ( BC_2, BcCheck_H(10),  
" 318 ( BC_2, SysCheck_L(2),  
" 317 ( BC_2, BcData_H(23),  
" 316 ( BC_2, BcData_H(87),  
" 315 ( BC_2, SysData_L(23),  
" 314 ( BC_2, BcData_H(22),  
" 313 ( BC_2, BcData_H(86),  
" 312 ( BC_2, SysData_L(22),  
" 311 ( BC_2, BcData_H(21),  
" 310 ( BC_2, BcData_H(85),  
" 309 ( BC_2, SysData_L(21),  
" 308 ( BC_2, BcData_H(20),  
" 307 ( BC_2, BcData_H(84),  
" 306 ( BC_2, SysData_L(20),  
" 305 ( BC_3, *,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
CONTROL,  
INPUT,  
x, 305, 0,  
x, 305, 0,  
x, 302, 0,  
x, 305, 0,  
x, 305, 0,  
x, 302, 0,  
x, 305, 0,  
x, 305, 0,  
x, 302, 0,  
x, 305, 0,  
x, 305, 0,  
x, 302, 0,  
x, 305, 0,  
x, 305, 0,  
x, 302, 0,  
0
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& -- bccell1  
" 304 ( BC_3, BcDataInClk_H(2),  
x
), "& --  
" 303 ( BC_2, SysDataOutClk_L(2), OUTPUT2,  
" 302 ( BC_3, *, CONTROL,  
" 301 ( BC_3, SysDataInClk_H(2), INPUT,  
x
0
x
), "& --  
), "& -- sccell1  
), "& --  
" 300 ( BC_2, BcData_H(19),  
" 299 ( BC_2, BcData_H(83),  
BIDIR,  
BIDIR,  
x, 305, 0,  
x, 305, 0,  
Z
Z
), "& --  
), "& --  
21264/EV68A Hardware Reference Manual  
21264/EV68A Boundary-Scan Register B–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
" 298 ( BC_2, SysData_L(19),  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
CONTROL,  
INPUT,  
x, 302, 0,  
x, 305, 0,  
x, 305, 0,  
x, 302, 0,  
x, 305, 0,  
x, 305, 0,  
x, 302, 0,  
x, 305, 0,  
x, 305, 0,  
x, 302, 0,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
0
WEAK1 ), "& --  
" 297 ( BC_2, BcData_H(18),  
" 296 ( BC_2, BcData_H(82),  
" 295 ( BC_2, SysData_L(18),  
" 294 ( BC_2, BcData_H(17),  
" 293 ( BC_2, BcData_H(81),  
" 292 ( BC_2, SysData_L(17),  
" 291 ( BC_2, BcData_H(16),  
" 290 ( BC_2, BcData_H(80),  
" 289 ( BC_2, SysData_L(16),  
" 288 ( BC_2, BcCheck_H(1),  
" 287 ( BC_2, BcCheck_H(9),  
" 286 ( BC_2, SysCheck_L(1),  
" 285 ( BC_2, BcData_H(15),  
" 284 ( BC_2, BcData_H(79),  
" 283 ( BC_2, SysData_L(15),  
" 282 ( BC_2, BcData_H(14),  
" 281 ( BC_2, BcData_H(78),  
" 280 ( BC_2, SysData_L(14),  
" 279 ( BC_2, BcData_H(13),  
" 278 ( BC_2, BcData_H(77),  
" 277 ( BC_2, SysData_L(13),  
" 276 ( BC_2, BcData_H(12),  
" 275 ( BC_2, BcData_H(76),  
" 274 ( BC_2, SysData_L(12),  
" 273 ( BC_3, *,  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
), "& -- bccell2  
" 272 ( BC_3, BcDataInClk_H(1),  
x
), "& --  
" 271 ( BC_2, SysDataOutClk_L(1), OUTPUT2,  
" 270 ( BC_3, *, CONTROL,  
" 269 ( BC_3, SysDataInClk_H(1), INPUT,  
x
0
x
), "& --  
), "& -- sccell2  
), "& --  
" 268 ( BC_2, BcData_H(11),  
" 267 ( BC_2, BcData_H(75),  
" 266 ( BC_2, SysData_L(11),  
" 265 ( BC_2, BcData_H(10),  
" 264 ( BC_2, BcData_H(74),  
" 263 ( BC_2, SysData_L(10),  
" 262 ( BC_2, BcData_H(9) ,  
" 261 ( BC_2, BcData_H(73),  
" 260 ( BC_2, SysData_L(9),  
" 259 ( BC_2, BcData_H(8) ,  
" 258 ( BC_2, BcData_H(72),  
" 257 ( BC_2, SysData_L(8),  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
x, 273, 0,  
x, 273, 0,  
x, 270, 0,  
x
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
" 256 ( BC_2, BcDataOutClk_L(0), OUTPUT2,  
" 255 ( BC_2, BcDataOutClk_H(0), OUTPUT2,  
x
), "& --  
" 254 ( BC_2, BcCheck_H(0),  
" 253 ( BC_2, BcCheck_H(8),  
" 252 ( BC_2, SysCheck_L(0),  
" 251 ( BC_2, BcData_H(7) ,  
" 250 ( BC_2, BcData_H(71),  
" 249 ( BC_2, SysData_L(7),  
" 248 ( BC_2, BcData_H(6) ,  
" 247 ( BC_2, BcData_H(70),  
" 246 ( BC_2, SysData_L(6),  
" 245 ( BC_2, BcData_H(5) ,  
" 244 ( BC_2, BcData_H(69),  
" 243 ( BC_2, SysData_L(5),  
" 242 ( BC_2, BcData_H(4) ,  
" 241 ( BC_2, BcData_H(68),  
" 240 ( BC_2, SysData_L(4),  
" 239 ( BC_3, *,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
CONTROL,  
INPUT,  
x, 239, 0,  
x, 239, 0,  
x, 236, 0,  
x, 239, 0,  
x, 239, 0,  
x, 236, 0,  
x, 239, 0,  
x, 239, 0,  
x, 236, 0,  
x, 239, 0,  
x, 239, 0,  
x, 236, 0,  
x, 239, 0,  
x, 239, 0,  
x, 236, 0,  
0
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& -- bccell3  
" 238 ( BC_3, BcDataInClk_H(0),  
x
), "& --  
" 237 ( BC_2, SysDataOutClk_L(0), OUTPUT2,  
" 236 ( BC_3, *, CONTROL,  
" 235 ( BC_3, SysDataInClk_H(0), INPUT,  
x
0
x
), "& --  
), "& -- sccell3  
), "& --  
" 234 ( BC_2, BcData_H(3) ,  
BIDIR,  
x, 239, 0,  
Z
), "& --  
B–8 21264/EV68A Boundary-Scan Register  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
" 233 ( BC_2, BcData_H(67),  
" 232 ( BC_2, SysData_L(3),  
" 231 ( BC_2, BcData_H(2) ,  
" 230 ( BC_2, BcData_H(66),  
" 229 ( BC_2, SysData_L(2),  
" 228 ( BC_2, BcData_H(1) ,  
" 227 ( BC_2, BcData_H(65),  
" 226 ( BC_2, SysData_L(1),  
" 225 ( BC_2, BcData_H(0) ,  
" 224 ( BC_2, BcData_H(64),  
" 223 ( BC_2, SysData_L(0),  
" 222 ( BC_2, BcTag_H(20),  
" 221 ( BC_2, BcTag_H(21),  
" 220 ( BC_2, BcTag_H(22),  
" 219 ( BC_2, BcTag_H(23),  
" 218 ( BC_2, BcTag_H(24),  
" 217 ( BC_2, BcTag_H(25),  
" 216 ( BC_2, BcTag_H(26),  
" 215 ( BC_2, BcTag_H(27),  
" 214 ( BC_2, BcTag_H(28),  
" 213 ( BC_2, BcTag_H(29),  
" 212 ( BC_2, BcTag_H(30),  
" 211 ( BC_2, BcTag_H(31),  
" 210 ( BC_2, BcTag_H(32),  
" 209 ( BC_2, BcTag_H(33),  
" 208 ( BC_3, *,  
" 207 ( BC_3, BcTagInClk_H,  
" 206 ( BC_2, BcTag_H(34),  
" 205 ( BC_2, BcTag_H(35),  
" 204 ( BC_2, BcTag_H(36),  
" 203 ( BC_2, BcTag_H(37),  
" 202 ( BC_2, BcTag_H(38),  
" 201 ( BC_2, BcTag_H(39),  
" 200 ( BC_2, BcTag_H(40),  
" 199 ( BC_2, BcTag_H(41),  
" 198 ( BC_2, BcTag_H(42),  
" 197 ( BC_2, BcTagParity_H,  
" 196 ( BC_2, BcTagShared_H,  
" 195 ( BC_2, BcTagDirty_H,  
" 194 ( BC_2, BcTagValid_H,  
" 193 ( BC_2, BcTagOutClk_L,  
" 192 ( BC_2, BcTagOutClk_H,  
" 191 ( BC_2, BcTagOE_L,  
" 190 ( BC_2, BcTagWr_L,  
" 189 ( BC_2, BcDataWr_L,  
" 188 ( BC_2, BcLoad_L,  
" 187 ( BC_2, BcDataOE_L,  
" 186 ( BC_2, BcAdd_H(4),  
" 185 ( BC_2, BcAdd_H(5),  
" 184 ( BC_2, BcAdd_H(6),  
" 183 ( BC_2, BcAdd_H(7),  
" 182 ( BC_2, BcAdd_H(8),  
" 181 ( BC_2, BcAdd_H(9),  
" 180 ( BC_2, BcAdd_H(10),  
" 179 ( BC_2, BcAdd_H(11),  
" 178 ( BC_2, BcAdd_H(12),  
" 177 ( BC_2, BcAdd_H(13),  
" 176 ( BC_2, BcAdd_H(14),  
" 175 ( BC_2, BcAdd_H(15),  
" 174 ( BC_2, BcAdd_H(16),  
" 173 ( BC_2, BcAdd_H(17),  
" 172 ( BC_2, BcAdd_H(18),  
" 171 ( BC_2, BcAdd_H(19),  
" 170 ( BC_2, BcAdd_H(20),  
" 169 ( BC_2, BcAdd_H(21),  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
CONTROL,  
INPUT,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x, 239, 0,  
x, 236, 0,  
x, 239, 0,  
x, 239, 0,  
x, 236, 0,  
x, 239, 0,  
x, 239, 0,  
x, 236, 0,  
x, 239, 0,  
x, 239, 0,  
x, 236, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
0
x
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
Z
), "& --  
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& -- tccell0  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
x, 208, 0,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x, 208, 0,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
21264/EV68A Hardware Reference Manual  
21264/EV68A Boundary-Scan Register B–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
" 168 ( BC_2, BcAdd_H(22),  
OUTPUT2,  
OUTPUT2,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x
x
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
" 167 ( BC_2, BcAdd_H(23),  
" 166 ( BC_2, SysData_L(32),  
" 165 ( BC_2, BcData_H(96),  
" 164 ( BC_2, BcData_H(32),  
" 163 ( BC_2, SysData_L(33),  
" 162 ( BC_2, BcData_H(97),  
" 161 ( BC_2, BcData_H(33),  
" 160 ( BC_2, SysData_L(34),  
" 159 ( BC_2, BcData_H(98),  
" 158 ( BC_2, BcData_H(34),  
" 157 ( BC_2, SysData_L(35),  
" 156 ( BC_2, BcData_H(99),  
" 155 ( BC_2, BcData_H(35),  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x
Z
Z
Z
Z
Z
Z
Z
Z
), "& --  
), "& --  
), "& --  
" 154 ( BC_3, SysDataInClk_H(4), INPUT,  
" 153 ( BC_3, *,  
" 152 ( BC_2, SysDataOutClk_L(4), OUTPUT2,  
CONTROL,  
0
x
), "& -- sccell4  
), "& --  
" 151 ( BC_3, BcDataInClk_H(4),  
" 150 ( BC_3, *,  
INPUT,  
CONTROL,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x
0
), "& --  
), "& -- bccell4  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
" 149 ( BC_2, SysData_L(36),  
" 148 ( BC_2, BcData_H(100),  
" 147 ( BC_2, BcData_H(36),  
" 146 ( BC_2, SysData_L(37),  
" 145 ( BC_2, BcData_H(101),  
" 144 ( BC_2, BcData_H(37),  
" 143 ( BC_2, SysData_L(38),  
" 142 ( BC_2, BcData_H(102),  
" 141 ( BC_2, BcData_H(38),  
" 140 ( BC_2, SysData_L(39),  
" 139 ( BC_2, BcData_H(103),  
" 138 ( BC_2, BcData_H(39),  
" 137 ( BC_2, SysCheck_L(4),  
" 136 ( BC_2, BcCheck_H(12),  
" 135 ( BC_2, BcCheck_H(4),  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x, 150, 0,  
x, 153, 0,  
x, 153, 0,  
x
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
), "& --  
), "& --  
), "& --  
), "& --  
" 134 ( BC_2, BcDataOutClk_H(2), OUTPUT2,  
" 133 ( BC_2, BcDataOutClk_L(2), OUTPUT2,  
x
" 132 ( BC_2, SysData_L(40),  
" 131 ( BC_2, BcData_H(104),  
" 130 ( BC_2, BcData_H(40),  
" 129 ( BC_2, SysData_L(41),  
" 128 ( BC_2, BcData_H(105),  
" 127 ( BC_2, BcData_H(41),  
" 126 ( BC_2, SysData_L(42),  
" 125 ( BC_2, BcData_H(106),  
" 124 ( BC_2, BcData_H(42),  
" 123 ( BC_2, SysData_L(43),  
" 122 ( BC_2, BcData_H(107),  
" 121 ( BC_2, BcData_H(43),  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
x
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
" 120 ( BC_3, SysDataInClk_H(5), INPUT,  
), "& --  
" 119 ( BC_3, *,  
" 118 ( BC_2, SysDataOutClk_L(5), OUTPUT2,  
CONTROL,  
0
x
), "& -- sccell5  
), "& --  
" 117 ( BC_3, BcDataInClk_H(5),  
" 116 ( BC_3, *,  
INPUT,  
CONTROL,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x
0
), "& --  
), "& -- bccell5  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
" 115 ( BC_2, SysData_L(44),  
" 114 ( BC_2, BcData_H(108),  
" 113 ( BC_2, BcData_H(44),  
" 112 ( BC_2, SysData_L(45),  
" 111 ( BC_2, BcData_H(109),  
" 110 ( BC_2, BcData_H(45),  
" 109 ( BC_2, SysData_L(46),  
" 108 ( BC_2, BcData_H(110),  
" 107 ( BC_2, BcData_H(46),  
" 106 ( BC_2, SysData_L(47),  
" 105 ( BC_2, BcData_H(111),  
" 104 ( BC_2, BcData_H(47),  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
Z
Z
Z
Z
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
B–10 21264/EV68A Boundary-Scan Register  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
" 103 ( BC_2, SysCheck_L(5),  
" 102 ( BC_2, BcCheck_H(13),  
" 101 ( BC_2, BcCheck_H(5),  
" 100 ( BC_2, SysData_L(48),  
" 99 ( BC_2, BcData_H(112),  
" 98 ( BC_2, BcData_H(48),  
" 97 ( BC_2, SysData_L(49),  
" 96 ( BC_2, BcData_H(113),  
" 95 ( BC_2, BcData_H(49),  
" 94 ( BC_2, SysData_L(50),  
" 93 ( BC_2, BcData_H(114),  
" 92 ( BC_2, BcData_H(50),  
" 91 ( BC_2, SysData_L(51),  
" 90 ( BC_2, BcData_H(115),  
" 89 ( BC_2, BcData_H(51),  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x, 119, 0,  
x, 116, 0,  
x, 116, 0,  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
x, 87,  
x, 84,  
x, 84,  
x, 87,  
x, 84,  
x, 84,  
x, 87,  
x, 84,  
x, 84,  
x, 87,  
x, 84,  
x, 84,  
x
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
" 88 ( BC_3, SysDataInClk_H(6), INPUT,  
), "& --  
" 87 ( BC_3, *,  
" 86 ( BC_2, SysDataOutClk_L(6), OUTPUT2,  
CONTROL,  
0
x
), "& -- sccell6  
), "& --  
" 85 ( BC_3, BcDataInClk_H(6),  
" 84 ( BC_3, *,  
INPUT,  
CONTROL,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x
0
), "& --  
), "& -- bccell6  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
" 83 ( BC_2, SysData_L(52),  
" 82 ( BC_2, BcData_H(116),  
" 81 ( BC_2, BcData_H(52),  
" 80 ( BC_2, SysData_L(53),  
" 79 ( BC_2, BcData_H(117),  
" 78 ( BC_2, BcData_H(53),  
" 77 ( BC_2, SysData_L(54),  
" 76 ( BC_2, BcData_H(118),  
" 75 ( BC_2, BcData_H(54),  
" 74 ( BC_2, SysData_L(55),  
" 73 ( BC_2, BcData_H(119),  
" 72 ( BC_2, BcData_H(55),  
" 71 ( BC_2, SysCheck_L(6),  
" 70 ( BC_2, BcCheck_H(14),  
" 69 ( BC_2, BcCheck_H(6),  
x, 87,  
x, 84,  
x, 84,  
x, 87,  
x, 84,  
x, 84,  
x, 87,  
x, 84,  
x, 84,  
x, 87,  
x, 84,  
x, 84,  
x, 87,  
x, 84,  
x, 84,  
x
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
), "& --  
), "& --  
), "& --  
), "& --  
" 68 ( BC_2, BcDataOutClk_H(3), OUTPUT2,  
" 67 ( BC_2, BcDataOutClk_L(3), OUTPUT2,  
x
" 66 ( BC_2, SysData_L(56),  
" 65 ( BC_2, BcData_H(120),  
" 64 ( BC_2, BcData_H(56),  
" 63 ( BC_2, SysData_L(57),  
" 62 ( BC_2, BcData_H(121),  
" 61 ( BC_2, BcData_H(57),  
" 60 ( BC_2, SysData_L(58),  
" 59 ( BC_2, BcData_H(122),  
" 58 ( BC_2, BcData_H(58),  
" 57 ( BC_2, SysData_L(59),  
" 56 ( BC_2, BcData_H(123),  
" 55 ( BC_2, BcData_H(59),  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x, 53,  
x, 50,  
x, 50,  
x, 53,  
x, 50,  
x, 50,  
x, 53,  
x, 50,  
x, 50,  
x, 53,  
x, 50,  
x, 50,  
x
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
" 54 ( BC_3, SysDataInClk_H(7), INPUT,  
), "& --  
" 53 ( BC_3, *,  
" 52 ( BC_2, SysDataOutClk_L(7), OUTPUT2,  
CONTROL,  
0
x
), "& -- sccell7  
), "& --  
" 51 ( BC_3, BcDataInClk_H(7),  
" 50 ( BC_3, *,  
INPUT,  
CONTROL,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x
0
), "& --  
), "& -- bccell7  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
), "& --  
WEAK1 ), "& --  
), "& --  
" 49 ( BC_2, SysData_L(60),  
" 48 ( BC_2, BcData_H(124),  
" 47 ( BC_2, BcData_H(60),  
" 46 ( BC_2, SysData_L(61),  
" 45 ( BC_2, BcData_H(125),  
" 44 ( BC_2, BcData_H(61),  
" 43 ( BC_2, SysData_L(62),  
" 42 ( BC_2, BcData_H(126),  
" 41 ( BC_2, BcData_H(62),  
" 40 ( BC_2, SysData_L(63),  
" 39 ( BC_2, BcData_H(127),  
x, 53,  
x, 50,  
x, 50,  
x, 53,  
x, 50,  
x, 50,  
x, 53,  
x, 50,  
x, 50,  
x, 53,  
x, 50,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
0,  
Z
Z
Z
Z
Z
Z
Z
21264/EV68A Hardware Reference Manual  
21264/EV68A Boundary-Scan Register B–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Boundary-Scan Register  
" 38 ( BC_2, BcData_H(63),  
BIDIR,  
BIDIR,  
BIDIR,  
BIDIR,  
x, 50,  
x, 53,  
x, 50,  
x, 50,  
0,  
0,  
0,  
0,  
Z
), "& --  
" 37 ( BC_2, SysCheck_L(7),  
" 36 ( BC_2, BcCheck_H(15),  
" 35 ( BC_2, BcCheck_H(7),  
" 34 ( BC_2, SysAddOut_L(0),  
" 33 ( BC_2, SysAddOut_L(1),  
" 32 ( BC_2, SysAddOut_L(2),  
" 31 ( BC_2, SysAddOut_L(3),  
" 30 ( BC_2, SysAddOut_L(4),  
" 29 ( BC_2, SysAddOut_L(5),  
" 28 ( BC_2, SysAddOut_L(6),  
" 27 ( BC_2, SysAddOut_L(7),  
" 26 ( BC_2, SysAddOutClk_L,  
" 25 ( BC_2, SysAddOut_L(8),  
" 24 ( BC_2, SysAddOut_L(9),  
" 23 ( BC_2, SysAddOut_L(10),  
" 22 ( BC_2, SysAddOut_L(11),  
" 21 ( BC_2, SysAddOut_L(12),  
" 20 ( BC_2, SysAddOut_L(13),  
" 19 ( BC_2, SysAddOut_L(14),  
" 18 ( BC_3, SysAddIn_L(0),  
" 17 ( BC_3, SysAddIn_L(1),  
" 16 ( BC_3, SysAddIn_L(2),  
" 15 ( BC_3, SysAddIn_L(3),  
" 14 ( BC_3, SysAddIn_L(4),  
" 13 ( BC_3, SysAddIn_L(5),  
" 12 ( BC_3, SysAddIn_L(6),  
" 11 ( BC_3, SysAddIn_L(7),  
" 10 ( BC_3, SysAddIn_L(8),  
WEAK1 ), "& --  
Z
Z
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
), "& --  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
OUTPUT2,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
" 9  
" 8  
" 7  
" 6  
" 5  
" 4  
" 3  
" 2  
" 1  
" 0  
( BC_3, SysAddInClk_L,  
( BC_3, SysAddIn_L(9),  
( BC_3, SysAddIn_L(10),  
( BC_3, SysAddIn_L(11),  
( BC_3, SysAddIn_L(12),  
( BC_3, SysAddIn_L(13),  
( BC_3, SysAddIn_L(14),  
( BC_3, SysFillValid_L,  
( BC_3, SysDataInValid_L,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
INPUT,  
( BC_3, SysDataOutValid_L, INPUT,  
)
";  
attribute DESIGN_WARNING of Alpha_21264b: entity is  
"1. IEEE 1149.1 circuits on Alpha 21264b are designed primarily to support  
"&  
testing in off-line module manufacturing environment. The SAMPLE/PRELOAD"&  
instruction support is designed primarily for supporting interconnection"&  
"
"
"
verification test and not for at-speed samples of pin data.  
"&  
"&  
"&  
";  
"2. TDO is Open-Drain signal.  
"3. Add comment on port pin electrical characteristics:  
"4. Comment out if compiler does not support this statement.  
end Alpha_21264b;  
B–12 21264/EV68A Boundary-Scan Register  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
C
Serial Icache Load Predecode Values  
See the Alpha Motherboards Software Developer’s Kit (SDK) for information.  
21264/EV68A Hardware Reference Manual  
Serial Icache Load Predecode Values C–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Download from Www.Somanuals.com. All Manuals Search And Download.  
D
PALcode Restrictions and Guidelines  
D.1 Restriction 1 : Reset Sequence Required by Retire Logic and  
Mapper  
For convenience of implementation, the Ibox retire logic done status bits are not initial-  
ized during reset. Instead, as shown in the example below, the first batch of valid  
instructions sweeps through inum-space and initializes these bits. The 80 status bits  
(one for each inflight instruction) must be marked not done by the first 80 instructions  
mapped after reset, and later marked done when those instructions are retired. There-  
fore, the first 20 fetch blocks must contain four valid instructions each, and must not  
contain any retire logic NOP instructions.  
reset:  
** (1) Initialize 80 retirator "done" status bits and  
** the integer and floating mapper destinations.  
** (2) Do A MTPR ITB_IA, which turns on the mapper source  
** enables.  
** (3) Create a map stall to complete the ITB_IA.  
**  
** State after execution of this code:  
**  
**  
**  
**  
**  
retirator initialized  
destinations mapped  
source mapping enabled  
itb flushed  
** The PALcode need not assume the following since the SROM is not  
** required to do these:  
**  
**  
**  
**  
*/  
dtb  
flushed  
dtb_asn0  
dtb_asn1  
dtb_alt_mode  
0
0
0
/*  
** Initialize retirator and destination map, doing 80 retires.  
*/  
addq  
addq  
addt  
mult  
r31,r31,r0  
r31,r31,r1  
f31,f31,f0  
f31,f31,f1  
/* initialize Int. Reg. 0*/  
/* initialize Int. Reg. 1*/  
/* initialize F.P. Reg. 0*/  
/* initialize F.P. Reg. 1*/  
addq  
addq  
r31,r31,r2  
r31,r31,r3  
/* initialize Int. Reg. 2*/  
/* initialize Int. Reg. 3*/  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
           
Restriction 1 : Reset Sequence Required by Retire Logic and Mapper  
addt  
mult  
f31,f31,f2  
f31,f31,f3  
/* initialize F.P. Reg. 2*/  
/* initialize F.P. Reg. 3*/  
addq  
addq  
addt  
mult  
r31,r31,r4  
r31,r31,r5  
f31,f31,f4  
f31,f31,f5  
/* initialize Int. Reg. 4*/  
/* initialize Int. Reg. 5*/  
/* initialize F.P. Reg. 4*/  
/* initialize F.P. Reg. 5*/  
addq  
addq  
addt  
mult  
r31,r31,r6  
f31,r31,r7  
f31,f31,f6  
f31,f31,f7  
/* initialize Int. Reg. 6*/  
/* initialize Int. Reg. 7*/  
/* initialize F.P. Reg. 6*/  
/* initialize F.P. Reg. 7*/  
addq  
addq  
addt  
mult  
r31,r31,r8  
r31,r31,r9  
f31,f31,f8  
f31,f31,f9  
/* initialize Int. Reg. 8*/  
/* initialize Int. Reg. 9*/  
/* initialize F.P. Reg. 8*/  
/* initialize F.P. Reg. 9*/  
addq  
addq  
addt  
mult  
r31,r31,r10  
r31,r31,r11  
f31,f31,f10  
f31,f31,f11  
/* initialize Int. Reg. 10*/  
/* initialize Int. Reg. 11*/  
/* initialize F.P. Reg. 10*/  
/* initialize F.P. Reg. 11*/  
addq  
addq  
addt  
mult  
r31,r31,r12  
r31,r31,r13  
f31,f31,f12  
f31,f31,f13  
/* initialize Int. Reg. 12*/  
/* initialize Int. Reg. 13*/  
/* initialize F.P. Reg. 12*/  
/* initialize F.P. Reg. 13*/  
addq  
addq  
addt  
mult  
r31,r31,r14  
r31,r31,r15  
f31,f31,f14  
f31,f31,f15  
/* initialize Int. Reg. 14*/  
/* initialize Int. Reg. 15*/  
/* initialize F.P. Reg. 14*/  
/* initialize F.P. Reg. 15*/  
addq  
addq  
addt  
mult  
r31,r31,r16  
r31,r31,r17  
f31,f31,f16  
f31,f31,f17  
/* initialize Int. Reg. 16*/  
/* initialize Int. Reg. 17*/  
/* initialize F.P. Reg. 16*/  
/* initialize F.P. Reg. 17*/  
addq  
addq  
addt  
mult  
r31,r31,r18  
r31,r31,r19  
f31,f31,f18  
f31,f31,f19  
/* initialize Int. Reg. 18*/  
/* initialize Int. Reg. 19*/  
/* initialize F.P. Reg. 18*/  
/* initialize F.P. Reg. 19*/  
addq  
addq  
addt  
mult  
r31,r31,r20  
r31,r31,r21  
f31,f31,f20  
f31,f31,f21  
/* initialize Int. Reg. 20*/  
/* initialize Int. Reg. 21*/  
/* initialize F.P. Reg. 20*/  
/* initialize F.P. Reg. 21*/  
addq  
addq  
addt  
mult  
r31,r31,r22  
r31,r31,r23  
f31,f31,f22  
f31,f31,f23  
/* initialize Int. Reg. 22*/  
/* initialize Int. Reg. 23*/  
/* initialize F.P. Reg. 22*/  
/* initialize F.P. Reg. 23*/  
addq  
addq  
addt  
mult  
r31,r31,r24  
r31,r31,r25  
f31,f31,f24  
f31,f31,f25  
/* initialize Int. Reg. 24*/  
/* initialize Int. Reg. 25*/  
/* initialize F.P. Reg. 24*/  
/* initialize F.P. Reg. 25*/  
addq  
r31,r31,r26  
/* initialize Int. Reg. 26*/  
D–2 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 1 : Reset Sequence Required by Retire Logic and Mapper  
addq  
addt  
mult  
r31,r31,r27  
f31,f31,f26  
f31,f31,f27  
/* initialize Int. Reg. 27*/  
/* initialize F.P. Reg. 26*/  
/* initialize F.P. Reg. 27*/  
addq  
addq  
addt  
mult  
r31,r31,r28  
r31,r31,r29  
f31,f31,f28  
f31,f31,f29  
/* initialize Int. Reg. 28*/  
/* initialize Int. Reg. 29*/  
/* initialize F.P. Reg. 28*/  
/* initialize F.P. Reg. 29*/  
addq  
addt  
addq  
addq  
r31,r31,r30  
f31,f31,f30  
r31,r31,r0  
r31,r31,r0  
/* initialize Int. Reg. 30*/  
/* initialize F.P. Reg. 30*/  
/* initialize retirator 63*/  
/* initialize retirator 64*/  
addq  
addq  
addq  
addq  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
/* initialize retirator 65*/  
/* initialize retirator 66*/  
/* initialize retirator 67*/  
/* initialize retirator 68*/  
addq  
addq  
addq  
addq  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
/* initialize retirator 69*/  
/* initialize retirator 70*/  
/* initialize retirator 71*/  
/* initialize retirator 72*/  
addq  
addq  
addq  
addq  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
/* initialize retirator 73*/  
/* initialize retirator 74*/  
/* initialize retirator 75*/  
/* initialize retirator 76*/  
addq  
addq  
addq  
addq  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
/* initialize retirator 77*/  
/* initialize retirator 78*/  
/* initialize retirator 79*/  
/* initialize retirator 80*/  
/* stop deleting*/  
mtpr  
r31,EV6__ITB_IA  
/* flush the ITB (SCRBRD=4) *** this also  
turns on mapper source enables ****/  
/* flush the DTB (SCRBRD=7)*/  
/* clear VA_CTL (SCRBRD=5)*/  
/* clear M_CTL (SCRBRD=6)*/  
mtpr  
mtpr  
mtpr  
r31,EV6__DTB_IA  
r31,EV6__VA_CTL  
r31,EV6__M_CTL  
/*  
** Create a stall outside the IQ until the mtpr EV6__ITB_IA retires.  
** We can use DTB_ASNx even though we don’t seem to follow the restriction on  
** scoreboard bits (4-7).It’s okay because there are no real dstream  
** operations happening.  
*/  
mtpr r31,EV6__DTB_ASN0  
/* clear DTB_ASN0 (SCRBRD=4) creates a map-  
stall under the above mtpr to SCRBRD=4*/  
/* clear DTB_ASN1 (SCRBRD=7)*/  
mtpr r31,EV6__DTB_ASN1  
mtpr r31,EV6__CC_CTL  
/* clear CC_CTL (SCRBRD=5)*/  
mtpr r31,EV6__DTB_ALT_MODE/* clear DTB_ALT_MODE (SCRBRD=6)*/  
/*  
** MAP_SHADOW_REGISTERS  
**  
** The shadow registers are mapped. This code may be done by the SROM  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 1 : Reset Sequence Required by Retire Logic and Mapper  
** or the PALcode, but it must be done in the manner and order below.  
**  
** It assumes that the retirator has been initialized, that the  
** non-shadow registers are mapped, and that mapper source enables are on.  
**  
** Source enables are on. For fault-reset and wake from sleep, we need to  
** ensure we are in the icache so we don’t fetch junk that touches the  
** shadow sources before we write the destinations. For normal reset,  
** we are already in the icache. However, so this macro is useful for  
** all cases, force the code into the icache before doing the mapping.  
**  
** Assume for fault-reset, and wake from sleep case, the exc_addr is  
** stored in r1.  
*/  
addq  
addq  
addq  
br  
r31,r31,r0  
r31,r31,r0  
r31,r31,r0  
r31, tch0  
/* nop*/  
/* nop*/  
/* nop*/  
/* fetch in next block*/  
.align 3  
nxt0: lda r0,0x0086(r31)  
/* load I_CTL.....*/  
mtpr  
br  
tch0: br  
r0,EV6__I_CTL  
r31, nxt1  
r31, tch1  
/* .....SDE=2, IC_EN=3 (SCRBRD=4)*/  
/* continue executing in next block*/  
/* fetch in next block*/  
nxt1: mtpr r31,EV6__IER_CM  
/* clear IER_CM (SCRBRD=4) creates a map-stall  
under the above mtpr to SCRBRD=4*/  
/* nop*/  
/* continue executing in next block*/  
/* fetch in next block*/  
addq  
br  
r31,r31,r0  
r31, nxt2  
r31, tch2  
tch1: br  
nxt2: addq  
r31,r31,r0  
/* 1st buffer fetch block for above map-  
stall*/  
addq  
br  
tch2: br  
r31,r31,r0  
r31, nxt3  
r31, tch3  
/* nop*/  
/* continue executing in next block*/  
/* fetch in next block*/  
nxt3: addq  
addq  
br  
tch3: br  
r31,r31,r0  
r31,r31,r0  
r31, nxt4  
r31, tch4  
/* 2nd buffer fetch block for above map-stall*/  
/* nop*/  
/* continue executing in next block*/  
/* fetch in next block*/  
nxt4: addq  
r31,r31,r0  
/* need 3rd buffer fetch block to get correct  
SDE bit for next fetch block*/  
/* nop*/  
/* continue executing in next block*/  
/* fetch in next block*/  
addq  
br  
tch4: br  
r31,r31,r0  
r31, nxt5  
r31, tch5  
nxt5: addq  
addq  
br  
tch5: br  
r31,r31,r4  
r31,r31,r5  
r31, nxt6  
r31, tch6  
/* initialize Shadow Reg. 0*/  
/* initialize Shadow Reg. 1*/  
/* continue executing in next block*/  
/* fetch in next block*/  
nxt6: addq  
addq  
br  
tch6: br  
r31,r31,r6  
r31,r31,r7  
r31, nxt7  
r31, tch7  
/* initialize Shadow Reg. 2*/  
/* initialize Shadow Reg. 3*/  
/* continue executing in next block*/  
/* fetch in next block*/  
nxt7: addq  
r31,r31,r20  
/* initialize Shadow Reg. 4*/  
D–4 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 1 : Reset Sequence Required by Retire Logic and Mapper  
addq  
br  
tch7: br  
r31,r31,r21  
r31, nxt8  
r31, tch8  
/* initialize Shadow Reg. 5*/  
/* continue executing in next block*/  
/* fetch in next block*/  
nxt8: addq  
addq  
br  
tch8: br  
nxt9:  
r31,r31,r22  
r31,r31,r23  
r31, nxt9  
/* initialize Shadow Reg. 6*/  
/* initialize Shadow Reg. 7*/  
/* continue executing in next block*/  
/* go back to 1st block and start executing*/  
r31, nxt0  
/*  
** INIT_WRITE_MANY  
**  
** Write the cbox write many chain, initializing the bcache configuration.  
**  
** This code is on a cache block boundary,  
**  
** *** the bcache is initialized OFF for the burnin test ***  
*/  
/*  
** Because we aligned on and fit into a icache block, and because sbe=0,  
** and because we do an mb at the beginning (which blocks further progress  
** until the entire block has been fetched in), we don’t have to  
** fool with pulling this code in before executing it.  
*/  
#undef bc_enable_a  
#undef init_mode_a  
#undef bc_size_a  
#undef zeroblk_enable_a  
#undef enable_evict_a  
#undef set_dirty_enable_a  
#undef bc_bank_enable_a  
#undef bc_wrt_sts_a  
#define bc_enable_a  
#define init_mode_a  
#define bc_size_a  
#define zeroblk_enable_a  
#define enable_evict_a  
#define set_dirty_enable_a  
#define bc_bank_enable_a  
#define bc_wrt_sts_a  
0
0
0
1
0
0
0
0
loadwm:  
lda  
sll  
r1, WRITE_MANY_CHAIN_H(r31)  
r1, 32, r1 /* data<35:32> */  
/* data<31:00> */  
/* shift in 6x 6-bits*/  
LDLI(r1, WRITE_MANY_CHAIN_L, r1)  
addq  
mb  
r31,6,r0  
/* wait for all istream/dstream to complete*/  
br  
r31, bccshf  
.align 6  
bccshf:mtpr r1,EV6__DATA  
/* shift in 6 bits*/  
/* decrement R0*/  
/* done if R0 is zero*/  
/* align next 6 bits*/  
subq  
beq  
r0,1,r0  
r0,bccend  
r1,6,r1  
srl  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 1 : Reset Sequence Required by Retire Logic and Mapper  
br  
bccend:mtpr  
addq  
r31,bccshf  
/* continue shifting*/  
r31,EV6__EXC_ADDR + 16/* dummy IPR write - sets SCBD bit 4 */  
r31,r31,r0  
r31,r31,r1  
/* nop*/  
/* nop*/  
addq  
mtpr  
r31,EV6__EXC_ADDR + 16  
/* also a dummy IPR write -  
/* stalls until above write  
/* retires*/  
beq  
br  
addq  
r31, bccnxt  
r31, .-4  
r31,r31,r1  
/* predicts fall through in PALmode*/  
/* fools ibox predictor into infinite loop*/  
/* nop*/  
bccnxt:addq r31,4,r0  
mtpr  
/* load PCTX.....*/  
r0,EV6__PROCESS_CONTEXT  
/* ..... FPE=1 (SCRBRD=4)*/  
/* load DC_CTL.....*/  
lda  
r0,DC_CTL_INIT_K(r31)  
mtpr  
r0,EV6__DC_CTL  
/* .....ECC_EN=0, FHIT=0, SET_EN=3  
/* (SCRBRD=6)*/  
addq  
addq  
lda  
r31,r31,r0  
r31,r31,r1  
r0,0xff61(r31)  
r0,0xfc,r0  
/* nop*/  
/* nop*/  
/* R0 = ^xff61 (superpage) */  
/* PTE protection for DTB write in next  
block*/  
zap  
mtpr  
mtpr  
mtpr  
mtpr  
r31,EV6__DTB_TAG0 /* write DTB_TAG0 (SCRBRD=2,6)*/  
r31,EV6__DTB_TAG1 /* write DTB_TAG1 (SCRBRD=1,5)*/  
r0,EV6__DTB_PTE0 /* write DTB_PTE0 (SCRBRD=0,4)*/  
r0,EV6__DTB_PTE1 /* write DTB_PTE1 (SCRBRD=3,7)*/  
mtpr  
lda  
sll  
r31,EV6__SIRR  
r0,0x08FF(r31)  
r0,52,r0  
/* clear SIRR (SCRBRD=4)*/  
/* load FPCR.....*/  
/* .....initial FPCR value*/  
itoft r0, f0  
/* nop  
itoftr0,f0; value = 0x8FF0000000000000*/  
mt_fpcr f0  
/* nop  
mt_fpcrf0,f0,f0; do the load*/  
lda  
r0,0x2086(r31)  
/* load I_CTL.....*/  
ldah  
r0,0x0050(r0)  
/* .....TB_MB_EN=1, CALL_PAL_R23=1, SL_XMIT=1,  
/* SBE=0, SDE=2, IC_EN=3*/  
mtpr  
r0,EV6__I_CTL  
/* value = 0x0000000000502086 (SCRBRD=4)*/  
mtpr  
lda  
r31,EV6__CC  
r0,0x001F(r31)  
/* clear CC (SCRBRD=5)*/  
/* write-one-to-clear bits in HW_INT_CLR,  
/* I_STAT and DC_STAT*/  
sll  
r0,28,r0  
/* value = 0x00000001F0000000*/  
mtpr  
r0,EV6__HW_INT_CLR/* clear bits in HW_INT_CLR (SCRBRD=4)*/  
mtpr  
r0,EV6__I_STAT  
/* clear bits in I_STAT  
/*(SCRBRD=4) creates a map-stall  
/* under the above mtpr to SCRBRD=4*/  
/* value = 0x000000000000001F*/  
/* clear bits in DC_STAT (SCRBRD=6)*/  
/* nop*/  
lda  
mtpr  
addq  
r0,0x001F(r31)  
r0,EV6__DC_STAT  
r31,r31,r0  
mtpr  
r31,EV6__PCTR_CTL /* 1st buffer fetch block for above map-stall  
/* and 1st clear PCTR_CTL (SCRBRD=4)*/  
bis  
bis  
r31,1,r0  
r31,1,r0  
/* set up value for demon write*/  
/* set up value for demon write*/  
/* nop*/  
mulq/v r31,r31,r0  
D–6 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 1 : Reset Sequence Required by Retire Logic and Mapper  
mtpr  
r31,EV6__PCTR_CTL /* 2nd buffer fetch block for above map-stall  
/* and 2nd clear PCTR_CTL (SCRBRD=4)*/  
bis  
bis  
mulq  
r31,1,r0  
r31,1,r0  
r31,r31,r0  
/* set up value for demon write*/  
/* set up value for demon write*/  
/* nop*/  
lda  
mb  
r0,0x780(r31)  
/* this is new initialization stuff to  
prevent*/  
whint r0  
mb  
/* ld/st below from going off-chip */  
/* set up value for demon write*/  
bis  
r31,1,r0  
ldq_p r1,0x780(r31)  
ldq_p r0,0x788(r31)  
mb  
mb  
/* flush Pipe 0 LD logic*/  
/* flush Pipe 1 LD logic*/  
/* wait for LD’s to complete*/  
/* wait for LD’s to complete*/  
stq_p r1,0x780(r31)  
stq_p r0,0x788(r31)  
/* flush Pipe 0 ST logic*/  
/* flush Pipe 1 ST logic*/  
/* load loop count of 32*/  
bis  
r31, 32, r0  
jsr_init_loop:  
bsr  
r31,jsr_init_loop_nxt  
/* JSR to PC+4*/  
jsr_init_loop_nxt:  
stq_p r1,0x780(r31)  
/* flush Pipe 0 ST logic*/  
/* decrement loop count*/  
subq  
beq  
br  
r0,1,r0  
r0,jsr_init_done /* done?*/  
r31,jsr_init_loop /* continue loop*/  
jsr_init_done:  
lda  
sll  
itoft r0,f0  
r0,0x03FF(r31)  
r0,52,r0  
/* create FP one..... */  
/* .....value = 0x3FF0000000000000 */  
/* put it into F0 reg */  
addq  
r31,r31,r1  
/* nop (also clears R1) */  
mult  
addt  
divt  
f0,f0,f0  
f0,f0,f0  
f0,f0,f0  
/* flush mul-pipe */  
/* flush add-pipe */  
/* flush div-pipe */  
/* flush div-pipe */  
sqrtt f0,f0  
cvtqt f0,f0  
/* flush add-pipe (integer logic) */  
/* flush MVI logic */  
/* flush MVI logic */  
perr r31,r31,r0  
maxuw4 r31,r31,r0  
pkwb r31,r0  
/* flush MVI logic */  
rc  
r0  
/* clear interrupt flag*/  
/* nop (also clears R1)*/  
/* nop (also clears R1)*/  
/* nop (also clears R1)*/  
addq  
addq  
addq  
r31,r31,r1  
r31,r31,r1  
r31,r31,r1  
/*  
* This palbase init exists for the rare cases  
* when this code is loaded into upper memory.  
* That is the case when this code is loaded  
* and executed in memory on a system that has  
* already been initialized. This technique  
* can sometimes be used to debug snippets of  
* this code.  
*/  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 2 : No Multiple Writers to IPRs in Same Scoreboard Group  
br  
palbase_init:  
br  
r31,palbase_init  
r0, br60  
/* r0 <- current location */  
br60: lda  
mtpr  
r1, (EntryPoint-br60)(r0) /* r1 <- location of codebase */  
r1, EV6__PAL_BASE /* set up pal_base register */  
bis  
r31, 2, r0  
mtpr  
r0, EV6__VA_CTL  
bis  
r31, 8, r0  
mtpr  
r0, EV6__M_CTL  
br  
r0, jmp0  
jmp0: addq  
r0, (jmp1-jmp0+1), r0  
hw_rets/jmp(r0)  
jmp1:  
lda  
r1, 1(r31)  
/* r1 <- cc_ctl enable bit */  
sll  
r1, 32, r1  
mtpr  
r1, EV6__CC_CTL  
/* Enable/clear the cycle counter. */  
/*  
** Now initialize the dcache to allow the  
** minidebugger so save gpr’s  
*/  
D.2 Restriction 2 : No Multiple Writers to IPRs in Same Scoreboard  
Group  
For convenience of implementation, only one explicit writer (HW_MTPR) to IPRs that  
are in the same group can appear in the same fetch block (octaword-aligned octaword).  
Multiple explicit writers to IPRs that are not in the same scoreboard group can appear.  
If this restriction is violated, the IPR readers might not see the in-order state. Also, the  
IPR might ultimately end up with a bad value.  
D.3 Restriction 4 : No Writers and Readers to IPRs in Same Score-  
board Group  
This restriction is made for the convenience of microprocessor implementation.  
An explicit reader of an IPR in a particular scoreboard group cannot follow an explicit  
writer (HW_MTPR) to an IPR in that same scoreboard group within one fetch block  
(octaword-aligned octaword). Also within one fetch block, an implicit reader of an IPR  
in a particular scoreboard group cannot follow an explicit writer (HW_MTPR) to an  
IPR in that scoreboard group. This restriction covers writes to DTB_PTE or DTB_TAG  
followed by LD, ST, or any memory operation, including all types of JMP instructions  
and HW_RET instructions that do not have the STALL bit set.  
D–8 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Guideline 6 : Avoid Consecutive Read-Modify-Write-Read-Modify-Write  
D.4 Guideline 6 : Avoid Consecutive Read-Modify-Write-Read-  
Modify-Write  
Avoid consecutive read-modify-write-read-modify-write sequences to IPRs in the same  
scoreboard group.  
The latency between the first write and the second read is determined by the retire  
latency of the IPR. For convenience of implementation, the latency between the time  
when the read is issued and when the final write is issued depends on the run-time con-  
tents of the issue queue. It is somewhere between four and nine cycles, even if there is  
no data dependency between the read and write.  
D.5 Restriction 7 : Replay Trap, Interrupt Code Sequence, and STF/  
ITOF  
On an Mbox replay trap, the 21264/EV68A Ibox guarantees that the refetched load or  
store instruction that caused the trap is issued before any newer load or store instruc-  
tions. For load and integer store instructions, this is a consequence of the natural opera-  
tion of the issue queue. The refetched instruction enters the age-prioritized queue ahead  
of newer load and store instructions and does not have any dependencies on dirty regis-  
ters.  
Because there is no overhead time for checking these register dependencies (that is, it is  
known upon enqueueing that there are no dirty registers), the queue will issue the  
refetched instruction in priority order. For floating-point store instructions, there is nor-  
mally some overhead associated with checking the floating-point source register dirty  
status, so the store instruction would normally wait before being issued. This would  
have the undesired consequence of allowing newer load and store instructions to be  
issued out of order. A deadlock can occur if issuing the instructions out-of-order causes  
the floating-point store instruction to continually replay the trap. To avoid the deadlock  
on a floating-point store instruction replay trap, the source register dirty status is not  
checked (the source register is assumed to be clean because the store instruction was  
issued previously).  
The hardware mechanism that keeps track of replayed floating-point store instructions,  
and cancels the dirty register check, requires some software restrictions to guarantee  
that it is applied appropriately to the replayed instruction and not to other floating-point  
store instructions. The hardware mechanism marks the position in the fetch block (low  
two bits of the PC) where the replay trap occurred. This action cancels the dirty float-  
ing-point source register check of the next valid instruction enqueued to the integer  
queue (integer, all load and store, and ITOF instructions) that has the same position in  
the fetch block (normally the replayed STF). If the PC is somehow diverted to a PAL-  
code flow, this hardware might inadvertently cancel the register check of some other  
STF (or ITOF) instruction. Fortunately, there are a minimal number of reasons why the  
PC might be diverted during a replay trap. They are interrupts and ITB fills.  
The following PALcode example shows that an STF or ITOF instruction, in a given  
position in a fetch block, must be preceded by a valid instruction that is issued out of  
the integer queue in the same position in an earlier fetch block. Acceptable instruction  
classes include load, integer store, and integer operate instructions that do not have R31  
as a destination or branch.  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Restriction 9 : PALmode Istream Address Ranges  
Bad_interrupt_flow_entry:  
ADDQ R31,R31,R0  
STF Fa,(Rb) ; This STF might not undergo a dirty source register  
; check and might give wrong results  
ADDQ R31,R31,R0  
ADDQ R31,R31,R0  
................................  
Good_interrupt_flow_entry:  
ADDQ R31,R31,R0; Enables FP dirty source register  
; check for (PC[1:0] == 00)  
ADDQ R31,R31,R0; Enables FP dirty source register  
; check for (PC[1:0] == 01)  
ADDQ R31,R31,R0; Enables FP dirty source register  
; check for (PC[1:0] == 10)  
ADDQ R31,R31,R0; Enables FP dirty source register  
; check for (PC[1:0] == 11)  
ADDQ R31,R31,R0  
STF Fa,(Rb); This STF will successfully undergo  
; a dirty source register check  
ADDQ R31,R31,R0  
ADDQ R31,R31,R0  
D.6 Restriction 9 : PALmode Istream Address Ranges  
PALmode[physical] Istream addresses must ensure proper sign extension for the  
selected value of I_CTL[VA_48]. When I_CTL[VA_48] is clear, indicating 43-bit vir-  
tual address format, PALmode[physical] Istream addresses must sign-extend address  
bits above bit 42 although the physical address range is 44 bits. An illegal address can  
only be generated by a PALmode JSR-type instruction or a HW_RET instruction  
returning to a PALmode address.  
D.7 Restriction 10: Duplicate IPR Mode Bits  
The virtual address size is selectable by programming IPR bits I_CTL[VA_48]  
and VA_CTL[VA_48]. These bit values should usually be equal when operating in  
native (virtual) mode. The I_CTL[VA_48] bit determines the DTB double3/double4  
PALcode entry, the JSR mispredict comparison width, the VPC address generation  
width, the Istream ACV limits, and the IVA_FORM format selection. The  
VA_CTL[VA_48] bit determines the VA_FORM format selection and the Dstream  
ACV limits. IPR mode bits I_CTL[VA_FORM_32] and VA_CTL[VA_FORM_32]  
should be consistent when executing in native mode.  
D–10 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Restriction 11: Ibox IPR Update Synchronization  
D.8 Restriction 11: Ibox IPR Update Synchronization  
When updating any Ibox IPR, a return to native (virtual) mode should use the HW_RET  
instruction with the associated STALL bit set to ensure that the updated IPR value  
affects all instructions following the return path. The new IPR value takes effect only  
after the associated HW_MTPR instruction is retired.  
For update to some IPR fields with propagation delay, such as I_CTL[SDE] and  
PCTX[FPE], synchronization as described in Section D.32 is the preferred method of  
synchronization.  
D.9 Restriction 12: MFPR of Implicitly-Written IPRs EXC_ADDR,  
IVA_FORM, and EXC_SUM  
Implicitly written IPRs are non-renamed hardware registers that must be available for  
subsequent traps. After any trap to PALcode, hardware protects the values from a sec-  
ond implicit write by locking these registers and delaying subsequent traps for a safe  
(limited time). Their values can be read reliably by a HW_MFPR within the first four  
instructions of a PALcode flow and prior to any taken branch in that PALcode flow,  
whichever is earlier. These instructions should not include PALmode trapping instruc-  
tions. After the delimiting instruction defined above retires, these registers are unlocked  
and may change due to new exception conditions.  
If a second exception occurs before the registers are unlocked, it will be either delayed  
or forced to replay trap (a non-PALmode trap) until the register has been unlocked.  
After being unlocked, a subsequent new path exception condition will be allowed to  
reload the register and trap to PALcode. The 21264/EV68A may complete execution of  
the first PALcode flow, encountering the second exception condition before the delimit-  
ing instruction is retired, hence the need for the locking mechanism to ensure visibility  
of the initial register value.  
The VA_FORM, VA, and MM_STAT registers are not included in this list of protected  
IPRS. See Section D.24 for a description of how to protect these IPRs from subsequent  
implicit writers.  
D.10 Restriction 13 : DTB Fill Flow Collision  
Two DTB fill flows might collide such that the HW_MTPR’s in the second fill could be  
issued before all of the HW_MTPR’s in the first PALcode flow are retired. This can be  
prevented by putting appropriate software scoreboard barriers in the PALcode flow.  
D.11 Restriction 14 : HW_RET  
There can be no HW_RET in the first fetch block of a PALcode routine, other  
than CALL_PAL routines. With a HW_RET in the first fetch block of a PALcode rou-  
tine, the HW_RET will be mispredicted and the JSR/RETURN stack could lose its syn-  
chronization.  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Guideline 16 : JSR-BAD VA  
D.12 Guideline 16 : JSR-BAD VA  
A JSR memory format instruction that generates a bad VA (IACV) trap requires PAL-  
code assistance to determine the correct exception address. If the  
EXC_SUM[BAD_IVA] is set, bits [63,1] of the exception address are valid in the VA  
IPR and not the EXC_ADDR as usual. The PALmode bit, however, is always located in  
EXC_ADDR[0] and must be combined, if necessary, by PALcode to determine the full  
exception address.  
D.13 Restriction 17: MTPR to DTB_TAG0/DTB_PTE0/DTB_TAG1/  
DTB_PTE1  
These four write operations must be executed atomically, that is, either all four must be  
retired or none of them may be retired.  
D.14 Restriction 18: No FP Operates, FP Conditional Branches,  
FTOI, or STF in Same Fetch Block as HW_MTPR  
No FP operate instructions (including Mx_FPCR), FP conditional branches, FTOI reg-  
ister move instructions, or FP store instructions are allowed in the same fetch block as  
any HW_MTPR instructions. This includes ADDx/MULx/DIVx/SQRTx/FPCondition-  
alBranch/STx/FTOIx, where x is any applicable FP data type, but does not include  
LDx/ITOFx.  
D.15 Restriction 19: HW_RET/STALL After Updating the FPCR by  
way of MT_FPCR in PALmode  
FPCR updating occurs in hardware based on the retirement of a nontrapping version of  
MT_FPCR (in PALcode). Use a HW_RET/STALL after the nontrapping MT_FPCR to  
achieve minimum latency (four cycles) between the retiring of the MT_FPCR and the  
first FLOP that uses the updated FPCR.  
D.16 Guideline 20 : I_CTL[SBE] Stream Buffer Enable  
The I_CTL[SBE] bits should not be enabled when running with the Icache disabled to  
avoid potentially long fill delays. When the Icache is disabled, the only method of sup-  
plying instructions is by way of a stream hit. If the fill is returned in non-sequential  
wrap order, the stream will continue fetching through the entire page while waiting for  
a hit. Normally the data will be found in the cache.  
D.17 Restriction 21: HW_RET/STALL After HW_MTPR ASN0/ASN1  
There must be a scoreboard bit-to-register dependency chain to prevent HW_MTPR  
ASN0 or HW_MTPR ASN1 from being issued while any of scoreboard bits [7:4] are  
set. The following example contains a code sequence that creates the dependency chain.  
:Assume Ra holds value to write to ASN0/ASN1  
HW_MFPR R0, VA, SCBD<7,6,5,4>  
XOR R0, R0, R0  
BIS R0, R9, R9  
D–12 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Restriction 22: HW_RET/STALL After HW_MTPR IS0/IS1  
BIS R31, R31, R31  
HW_MTPR R9, ASN0, SCBD<4>  
HW_MTPR R9, ASN1, SCBD<7>  
This sequence guarantees, through the register dependency on R0, that neither  
HW_MTPR are issued before scoreboard bits [7:4] are cleared. In addition, there must  
be a HW_RET/STALL after a HW_MTPR ASN0/HW_MTPR ASN1 pair. Finally,  
these two writes must be executed atomically, that is, either both must be retired or nei-  
ther may be retired.  
D.18 Restriction 22: HW_RET/STALL After HW_MTPR IS0/IS1  
There must be a scoreboard bit-to-register dependency chain to prevent either  
HW_MTPR IS0 or HW_MTPR IS1 from issuing instructions while any of scoreboard  
bits [7:4] are set. The following example contains a code sequence that creates the  
dependency chain.  
HW_MFPR R0, VA, SCBD<7,6,5,4>,R0  
XOR R0, R0, R0  
BIS R0, R9, R9  
BIS R31 ,R31, R31  
HW_MTPR R9, IS0, SCBD<6>  
HW_MTPR R9, IS1, SCBD<7>  
This sequence guarantees, through the register dependency on R0, that neither  
HW_MTPR are issued before scoreboard bits [7:4] are cleared. There must be a  
HW_RET/STALL after an HW_MTPR IS0/HW_MTPR IS1 pair. Also, these two  
writes must be executed atomically, that is, either both must be retired or neither may be  
retired.  
D.19 Restriction 23: HW_ST/P/CONDITIONAL Does Not Clear the  
Lock Flag  
A HW_ST/P/CONDITIONAL will not clear the lock flag such that a successive store-  
conditional (either STx_C or HW_ST/C) might succeed even in the absence of a load-  
locked instruction. In the 21264/EV68A, a store-conditional is forced to fail if there is  
an intervening memory operation between the store-conditional and its address-match-  
ing LDxL. The following example shows the memory operations.  
LDL/Q/F/G/S/T  
STL/Q/F/G/S/T  
LDQ_U (not to R31)  
STQ_U  
Absent from this list are HW_LD (any type), HW_ST (any type), ECB, and WH64.  
Their absence implies that they will not force a subsequent store-conditional instruction  
to fail. PALcode must insert a memory operation from the above list after a HW_ST/  
CONDITIONAL in order to force a future store-conditional to fail if it was not pre-  
ceded by a load-locked operation:  
HW_LDxL  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–13  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Restriction 24: HW_RET/STALL After HW_MTPR IC_FLUSH, IC_FLUSH_ASM,  
xxx  
HW_ST/C -> R0  
Bxx R0, try_again  
STQ ; Force next ST/C to fail if no preceding LDxL  
HW_RET  
D.20 Restriction 24: HW_RET/STALL After HW_MTPR IC_FLUSH,  
IC_FLUSH_ASM, CLEAR_MAP  
There must be a HW_RET/STALL after a HW_MTPR IC_FLUSH, IC_FLUSH_ASM, or  
CLEAR_MAP. The Icache flush associated with these instructions will not occur until  
the HW_RET/STALL occurs and all outstanding Istream fetches have been completed.  
Also, there must be a guarantee that the HW_MTPR IC_FLUSH or HW_MTPR  
IC_FLUSH_ASM will not be retired simultaneously with the HW_RET/STALL. This  
can be ensured by inserting a conditional branch between the two (BNE R31, 0 cannot  
be mispredicted in PALmode), or by ensuring at least 10 instructions between the  
MTPR instruction and the HW_RET/STALL containing at least one instruction in each  
quad aligned group with a valid destination. Finally, the HW_RET/STALL that is used  
for CLEAR_MAP cannot trigger a cache flush. That is, if both a CLEAR_MAP and  
IC_FLUSH are desired, there must be two HW_RET/STALLs, one following each  
HW_MTPR.  
D.21 Restriction 25: HW_MTPR ITB_IA After Reset  
An HW_MTPR ITB_IA is required in the reset PALcode to initialize the ITB. It is also  
required that PALcode not be exited, even via a mispredicted path until this  
HW_MTPR ITB_IA has been retired. PALmode can change temporarily after fetching  
a HW_RET, regardless of the STALL qualifier, down a mispredicted path leading to use  
of the ITB before it is actually initialized.  
Unexpected instruction fetch and execution can occur following misprediction of any  
memory format control instruction (JMP, JSR, RET, JSR_CO, or HW_JMP, HW_JSR,  
HW_RET, HW_JSR_CO regardless of the STALL qualifier), or after any mispredicted  
conditional branch instruction. If the unexpected instruction flow contains a HW_RET  
instruction, PALmode may be exited prematurely.  
One way to ensure that PALmode is not exited is to place the HW_MTPR ITB_IA at  
least 80 instructions before any possible HW_RET instruction can be encountered via  
any fetch path. Since memory format control instructions can mispredict to any cache  
location, they should also be avoided within these 80 instructions.  
D.22 Guideline 26: Conditional Branches in PALcode  
To avoid pollution of the branch predictors and improve overall branch prediction accu-  
racy, conditional branch instructions in PALcode will be predicted to not be taken. The  
only exception to this rule are conditional branches within the first cache fetch (up to  
four instructions) of all PALcode flows except CALL_PAL flows. Conditional branches  
should be avoided in this window.  
D–14 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Restriction 27: Reset of ‘Force-Fail Lock Flag’ State in PALcode  
D.23 Restriction 27: Reset of ‘Force-Fail Lock Flag’ State in PALcode  
A virtual mode load or store is required in PALcode before the execution of any load-  
locked or store-conditional instructions. The virtual-mode load or store may not be a  
HW_LD, HW_ST, LDx_L, ECB, or WH64.  
D.24 Restriction 28: Enforce Ordering Between IPRs Implicitly Writ-  
ten by Loads and Subsequent Loads  
Certain IPRs, which are updated as a result of faulting memory operations, require soft-  
ware assistance to maintain ordering against newer instructions. Consider the following  
code sequence:  
HW_MFPR IPR_MM_STAT  
LDQ rx, (ry)  
These instructions would typically be issued in-order. The HW_MFPR is data-ready  
and both instructions use a lower subcluster. However, the HW_MFPRs (and  
HW_MTPRs) respond to certain resource-busy indications and are not issued when the  
Mbox informs the Ibox that a certain set of resources (store-bubbles) are busy. The LDs  
respond to a different set of resource-busy indications (load-bubbles) and could be  
issued around the HW_MFPR in the presence of the former. Software assistance is  
required to enforce the issue order. One sure way to enforce the issue order is to insert  
an MB instruction before the first load that occurs after the HW_MFPR MM_STAT.  
The VA, VA_FORM, and DC_CTL registers require a similar constraint. All LOAD  
instructions except HW_LD might modify any or all of these registers. HW_LD does  
not modify MM_STAT.  
D.25 Guideline 29 : JSR, JMP, RET, and JSR_COR in PALcode  
Unprivileged JSR, JMP, RET, and JSR_COR instructions will always mispredict when  
used in PALcode. In addition, HW_RET to a PALmode target will always mispredict  
since the JSR stack only predicts native-mode return addresses. HW_RET to a native-  
mode target uses the JSR stack for prediction and should usually be used when exiting  
PALmode in order to maintain JSR stack alignment since all PALmode traps also push  
the value of the EXC_ADDR on the JSR stack.  
Privileged versions of the JSR type instructions (HW_JSR,HW_JMP,HW_JSR_COR)  
can be used both within PALmode or to exit PALmode and generate a predicted target  
based on their hint bits and the current processor PALmode state.  
D.26 Restriction 30 : HW_MTPR and HW_MFPR to the Cbox CSR  
External bus activity must be isolated from writes and reads to the Cbox CSR. This  
requires that all Dstream and Istream fills must be avoided until after the HW_MTPR/  
HW_MFPR updates are completed. An MB instruction can block Dstream activity, but  
blocking all Istream fills, including prefetches, requires more extensive code. The fol-  
lowing code example blocks all Istream fill requests and stalls instruction fetch until  
after the desired MTPR/MFPR action is completed. This code disables Istream  
prefetching by way of a HW_MTPR to I_CTL[SBE], IC_FLUSH, and  
HW_RET_STALL sequence.  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–15  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Restriction 30 : HW_MTPR and HW_MFPR to the Cbox CSR  
ALIGN_FETCH_BLOCK  
sys__cbox:  
mb  
; quiet the dstream  
; (4,0L) get i_ctl  
hw_mfpr p6, EV6__I_CTL  
lda  
and  
p4, ^xFCFF(r31)  
p6, p4, p4  
; mask for clearing SBE bits  
; clear SBE bits  
sbe_off_offset = <sys__cbox_sbe_off_done - sys__cbox_sbe_off>  
hw_mtpr p4, EV6__I_CTL ; (4,0L) write new i_ctl  
br  
p6, sys__cbox_sbe_off  
sys__cbox_sbe_off:  
addq  
bsr  
p6, #<sbe_off_offset+1>, p6  
r31, .  
; past stall in palmode  
; stack push  
ALIGN_FETCH_BLOCK <^x47FF041F>; align  
hw_mtpr r31, EV6__IC_FLUSH  
; (4,0L) eliminate prefetches  
; pvc #24  
bne  
r31, .  
PVC_JSR sbe_off  
; synch and flush  
hw_ret_stall (p6)  
PVC_JSR sbe_off, dest=1  
; use ret, pop stack  
; br stops predictor  
sys__cbox_sbe_off_done:  
br r31, sys__cbox_touch1  
; now pull in the next block  
ALIGN_CACHE_BLOCK  
sys__cbox_over1:  
addq  
; block 1  
r31, #11, p6  
; initialize shift count (11x)  
; initialize shift data  
; go to block 2  
;
addq  
br  
r31, r31, p7  
r31, sys__cbox_over2  
sys__cbox_touch1:  
br  
r31, sys__cbox_touch2  
; touch block 2  
sys__cbox_over2:  
hw_mtpr r31, EV6__SHIFT_CONTROL  
; block 2  
; (6,0L) shift in 6 bits  
; decrement shift count  
; go to block 3  
;
subq  
br  
p6, #1, p6  
r31, sys__cbox_over3  
sys__cbox_touch2:  
br  
r31, sys__cbox_touch3  
; touch block 3  
sys__cbox_over3:  
hw_mtpr r31, <EV6__MM_STAT ! 64 >  
; block 3  
; (6,0L) wait for shift  
; return in pal mode  
; go to block 4  
;
bis  
br  
p5, #1, p5  
r31, sys__cbox_over4  
sys__cbox_touch3:  
br  
r31, sys__cbox_touch4  
; touch block 4  
sys__cbox_over4:  
hw_mfpr p4, EV6__DATA  
; block 4  
; (6,0L) read cbox data  
; nop  
; go to block 5  
;
; touch block 5  
bis  
br  
r31, r31, r31  
r31, sys__cbox_over5  
sys__cbox_touch4:  
br  
r31, sys__cbox_touch5  
sys__cbox_over5:  
; block 5  
and  
addq  
br  
p4, #^x3F, p4  
p4, p7, p7  
r31, sys__cbox_over6  
; clean to <5:0>  
; accumulate shift data  
; go to block 6  
;
sys__cbox_touch5:  
br  
r31, sys__cbox_touch6  
; touch block 6  
D–16 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 31 : I_CTL[VA_48] Update  
sys__cbox_over6:  
beq  
; block 6  
; branch if done  
; nop  
; go to block 7  
;
; touch block 7  
p6, sys__cbox_over8  
r31, r31, r31  
r31, sys__cbox_over7  
bis  
br  
sys__cbox_touch6:  
br  
r31, sys__cbox_touch7  
sys__cbox_over7:  
; block 7  
bis  
sll  
br  
p7, r31, p20  
p7, #6, p7  
r31, sys__cbox_over2  
; save before shifting  
; shift data 6 bits left  
; do next shift  
;
sys__cbox_touch7:  
br  
r31, sys__cbox_touch8  
; touch block 8  
sys__cbox_over8:  
beq  
; block 8  
; predict not taken  
r31, sys__cbox_cbox_done  
PVC_VIOLATE <1006>  
br  
r31, .-4  
; predict back to infinite loop  
bis  
r31, r31, r31  
;
sys__cbox_touch8:  
;
br  
r31, sys__cbox_over1  
; now start executing the shifts  
sys__cbox_cbox_done:  
; now restore i_ctl  
; (4,0L) get i_ctl  
hw_mfpr p6, EV6__I_CTL  
lda  
or  
p4, <3@EV6__I_CTL__SBE__S>(r31) ; sbe bits  
p6, p4, p4  
; set SBE bits  
bis  
r31, r31, r31  
hw_mtpr p4, EV6__I_CTL  
; (4,0L) restore i_ctl  
PVC_JSR cbox, bsr=1, dest=1  
hw_ret_stall (p5)  
; return to caller with stall  
D.27 Restriction 31 : I_CTL[VA_48] Update  
The VA_48 virtual address format cannot be changed while executing a JSR, JMP,  
GOTO, JSR_COROUTINE, or HW_RET instruction. A simple method of ensuring  
that the address does not change is to write I_CTL twice, in two separate fetch blocks,  
with the same data. The second write will stall the pipeline and ensure that the mode  
cannot change, even down a mispredicted path, while a following JSR type instruction  
might be using the address comparison logic.  
D.28 Restriction 32 : PCTR_CTL Update  
The performance counter must not be left in a state near overflow. If counting is dis-  
abled, the counters may produce multiple overflow signals if the counter output is not  
updated due to the counter being disabled. A repeated overflow signal with counters  
disabled can block other incoming interrupt requests while the overflow state persists.  
To avoid this situation, reads or writes to the counters should not leave a value near  
overflow. In normal operation, with counters enabled, a counter overflow will produce  
an overflow pulse, clear the counter, and produce a performance counter interrupt.  
Interrupts can only be blocked for one cycle.  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–17  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Restriction 33 : HW_LD Physical/Lock Use  
D.29 Restriction 33 : HW_LD Physical/Lock Use  
The HW_LD physical/lock instruction must be one of the first three instructions in a  
quad-instruction aligned fetch block. A pipeline error can occur if the HW_LD physi-  
cal/lock is fetched as the fourth instruction of the fetch block.  
D.30 Restriction 34 : Writing Multiple ITB Entries in the Same PAL-  
code Flow  
Before a PALcode flow writes multiple ITB entries, additional scoreboard bits should  
be set to avoid possible corruption of the TAG IPR prior to final update in the ITB. The  
addition of scoreboard bits 0 and 4 to the standard scoreboard bit 6 for ITB_TAG will  
prevent subsequent HW_MTPR ITB_TAG writes from changing the staging register  
TAG value prior to retirement of the HW_MTPR ITB_PTE that triggers the final ITB  
update.  
D.31 Guideline 35 : HW_INT_CLR Update  
When writing the HW_INT_CLR IPR to clear interrupt requests, it may be necessary to  
write the same value twice in distinct fetch blocks to ensure that the interrupt request is  
cleared before exiting PALcode. A second write will cause a scoreboard stall until the  
first write retires, creating a convenient synchronization with the PALmode exit.  
D.32 Restriction 36 : Updating I_CTL[SDE]  
A software interlock is required between updates of the I_CTL[SDE] and a subsequent  
instruction fetch that may use any destination registers. A suggested method of ensuring  
this interlock is to use two MTPR I_CTL instructions in separate fetch blocks, followed  
by three more fetch blocks of non-NOP instructions.  
D.33 Restriction 37 : Updating VA_CTL[VA_48]  
A software interlock is required between updates of the VA_CTL[VA_48] and follow-  
ing LD or ST instructions. This is necessary since the VA_CTL update will not occur  
until the HW_MTPR VA_CTL instruction retires. A sufficient method of ensuring this  
interlock is to write the VA_CTL with the same data in two successive fetch blocks,  
causing a mapper stall. The dependant LD or ST instructions can be placed in any loca-  
tion of the second fetch block.  
D.34 Restriction 38 : Updating PCTR_CTL  
When updating the PCTR_CTL, it may be necessary to write the update value twice. If  
the counter being updated is currently disabled by way of the respective I_CTL or  
PCTX bits, the value must be written twice to ensure that the counter overflow is prop-  
erly cleared. The overflow bit is conditionally latched using the same write enable as  
the counter update, so an additional write of the counter value will ensure that the over-  
flow logic accurately reflects the addition of the new counter value plus the input condi-  
tions. The new update value must not be within one cycle of overflow (within 16 for  
SL0, within 4 for SL1) as required by Section D.28.  
D–18 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
                         
Guideline 39: Writing Multiple DTB Entries in the Same PAL Flow  
D.35 Guideline 39: Writing Multiple DTB Entries in the Same PAL  
Flow  
If a PALcode flow intends to write multiple DTB entries (as would occur in a double  
miss), it must take care to keep subsequent HW_MTPR DTB_TAGx writes from cor-  
rupting the staging register TAG values prior to retirement of the HW_MTPR  
DTB_PTEx, which triggers the final DTB update.  
For example, in the double miss DTB flow, the following code could be used to hold up  
the return to the single miss flow (the numbers in parentheses are the scoreboard bits):  
hw_mtpr r4, EV6__DTB_TAG0  
hw_mtpr r4, EV6__DTB_TAG1  
hw_mtpr r5, EV6__DTB_PTE0  
hw_mtpr r5, EV6__DTB_PTE1  
; (2&6) write tag0  
; (1&5) write tag 1  
; (0&4) write pte0  
; (3&7) write pte1  
bis r31, r31, r31  
; force new fetch block  
bis r31, r31, r31  
bis r31, r31, r31  
hw_mtpr r31, <EV6__MM_STAT ! ^x80>  
; (7) wait for pte write  
; return to single miss  
hw_ret (r6)  
D.36 Restriction 40: Scrubbing a Single-Bit Error  
On Bcache and Memory single bit errors on Icache fills, the hardware flushes the  
Icache, but the PALcode must scrub the block in the Bcache and memory. On Bcache  
and Memory single bit errors on Dcache fills, the hardware scrubs the Dcache as long  
as the error was on a target quadword, but the PALcode must scrub the Dcache for non-  
target quadwords, and must in general scrub the block in the Bcache and memory.  
The scrub consists of reading each quadword in the block, with at least one exclusive  
access load/store to ensure the corrected data will be scrubbed in Bcache and memory.  
The scrub itself causes a CRD to be flagged, which is cleared by the PALcode before  
exiting to native mode.  
; Sample code for scrubbing a single bit error.  
;
; Since we only have the block address, and the hardware only corrects  
;
target quadwords, we read each quadword.  
; In order to ensure eviction to bcache and memory, a store  
;
;
;
;
is needed to mark the block dirty. An exclusive access is  
used to ensure we scrub in main memory. Virtual access is  
used because of restrictions in use of hw_ld/hw_st lock  
instructions.  
; After the scrub, read the cbox chain again.  
; The scrub will cause a crd, but will get cleared with a write  
; to hw_int_clr.  
;
; Current state:  
;
;
r5  
base of crd logout frame  
hw_ldq/p r4, MCHK_CRD__C_ADDR(r5)  
; get address back  
bis  
bis  
bis  
r31, r31, r31  
r31, r31, r31  
r31, r31, r31  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–19  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Restriction 40: Scrubbing a Single-Bit Error  
hw_mtpr r31, EV6__DTB_IA  
; (7,1L) flush dtb  
; set WE, RE  
lda  
bis  
bis  
r20, ^x3301(r31)  
r31, r31, r31  
r31, r31, r31  
hw_mtpr r31, <EV6__MM_STAT ! ^x80>  
; wait for retire  
srl  
sll  
bis  
r4, #13, r6  
; shift byte offset  
r6, #EV6__DTB_PTE0__PFN__S, r6 ; shift into position  
r6, r20, r6  
; produce pte  
hw_mtpr r4, EV6__DTB_TAG0  
hw_mtpr r4, EV6__DTB_TAG1  
hw_mtpr r6, EV6__DTB_PTE0  
hw_mtpr r6, EV6__DTB_PTE1  
; (2&6,0L) write tag0  
; (1&5,1L) write tag1  
; (0&4,0L) write pte0  
; (3&7,1L) write pte1  
mb  
; quiet before we start  
bis  
bis  
bis  
r31, r31, r31  
r31, r31, r31  
r31, r31, r31  
ldq  
ldq  
ldq  
ldq  
ldq  
ldq  
ldq  
mb  
r6, ^x00(r4)  
r6, ^x08(r4)  
r6, ^x10(r4)  
r6, ^x18(r4)  
r6, ^x20(r4)  
r6, ^x28(r4)  
r6, ^x30(r4)  
; re-read the bad block QW #0  
; re-read the bad block QW #1  
; re-read the bad block QW #2  
; re-read the bad block QW #3  
; re-read the bad block QW #4  
; re-read the bad block QW #5  
; re-read the bad block QW #6  
; no other mem-ops till done  
ldq_l  
stq_c  
mb  
r6, ^x38(r4)  
r6, ^x38(r4)  
; re-read the bad block QW #7  
; now store it to force scrub  
and  
r6, r31, r6  
; consumer of above  
beq  
br  
r6, sys__crd_scrub_done  
r31, .-4  
; these 2 lines......  
; .....stop pre-fetching  
sys__crd_scrub_done:  
bsr  
bis  
r7, sys__cbox  
r31, r31, r31  
; clean the cbox error chain  
; (7,1L) flush dtb  
hw_mtpr r31, EV6__DTB_IA  
bis  
bis  
bis  
r31, r31, r31  
r31, r31, r31  
r31, r31, r31  
hw_mtpr r31, <EV6__MM_STAT ! ^x80>  
; wait for retire  
; get a 1  
bis  
sll  
r31, #1, r7  
r7, #EV6__HW_INT_CLR__CR__S, r7 ; shift into position  
hw_mtpr r7, EV6__HW_INT_CLR  
; (4,0L) clear crd  
lda  
r7, EV6__DC_STAT_W1C_CRD(r31)  
; W1C bits  
; (6,0L)  
hw_mtpr r7, EV6__DC_STAT  
bis  
bis  
r31, r31 ,r31  
r31, r31 ,r31  
hw_mtpr r31, <EV6__MM_STAT ! ^x50>  
; stall till they retire  
D–20 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 41: MTPR ITB_TAG, MTPR ITB_PTE Must be in the Same Fetch Block  
D.37 Restriction 41: MTPR ITB_TAG, MTPR ITB_PTE Must be in the  
Same Fetch Block  
Write the ITB_TAG and ITB_PTE registers in the same fetch block. This avoids a  
mispredict path write of invalid data to the ITB_TAG register.  
D.38 Restriction 42: Updating VA_CTL, CC_CTL, or CC IPRs  
When writing to the VA_CTL, CC_CTL, or CC IPRs, write the same value twice in dis-  
tinct fetch blocks. This ensures that the instruction is retired before any mispredict from  
a younger branch, DTB miss trap, or hw_ret_stall.  
D.39 Restriction 43: No Trappable Instructions Along with  
HW_MTPR  
There cannot be any mispredictable/trappable instructions together with an HW_MTPR  
in the current fetch block.  
D.40 Restriction 44: Not Applicable to the 21264/EV68A  
D.41 Restriction 45: No HW_JMP or JMP Intructions in PALcode  
Do not include HW_JMP or JMP instructions in PALcode; use HW_RET instead.  
HW_JMP always predicts in PALmode, and may mispredict to random cache blocks.  
This may cause speculative code to begin executing in PALmode and may have unex-  
pected side effects such as I/O stream references.  
HW_RET always predicts in native mode, and when it mispredicts, it avoids specula-  
tive execution in PALmode.  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–21  
Download from Www.Somanuals.com. All Manuals Search And Download.  
         
Restriction 46: Avoiding Livelocks in Speculative Load CRD Handlers  
D.42 Restriction 46: Avoiding Livelocks in Speculative Load CRD  
Handlers  
Speculative load CRD handlers that release from the interrupt without scrubbing a  
cache block could suffer from the following livelock condition:  
1. An initial error on a speculative load forces a CRD interrupt.  
2. The CRD releases without scrubbing the block. A speculative load in the shadow of  
the hw_ret (or hw_ret_stall) touches a Dcache location that has the single-bit error,  
forcing a CRD.  
3. The CRD handler is entered again immediately.  
4. Go to (2).  
This problem can be avoided if all jumps in the CRD handler path for speculative loads  
use the following sequence:  
mb  
; make sure hw_ret goes  
ALIGN_FETCH_BLOCK <^x47FF041F>  
mulq  
mulq  
p6, #1, p6  
p6, #1, p6  
; Hold up loads  
; Hold up loads  
; Hold up loads  
; Ignore restriction 43  
; Return  
hw_mtpr p6, <EV6__MM_STAT ! ^x44>  
PVC_VIOLATE<43>  
hw_ret_stall (p23)  
This sequence prevents speculative loads from issuing in the shadow of the  
hw_ret_stall. Note that it is a violation of restriction 4 to have in the same fetch block a  
MTPR that specifies scoreboard bit 2 (an explicit writer in the memory operation  
group) and a HW_RET (an implicit reader in the memory operation group). Under nor-  
mal circumstances, the intention would be for a HW_RET to wait until the MTPR  
issues, and that can only be enforced by putting the two instructions in different fetch  
blocks. In this case, the intention is for the HW_RET to issue before the MTPR. The  
hardware does not enforce the scoreboarding when the two instructions are in the same  
fetch block, and thus the HW_RET can issue and mispredict before any speculative  
loads (which are held up by the MTPR) can issue.  
D.43 Restriction 47: Cache Eviction for Single-Bit Cache Errors  
A live lock can occur if issuing instructions out-of-order causes a floating-point store  
instruction (with sberr) to replay trap.  
A hardware mechanism exists that keeps track of replayed floating-point store instruc-  
tions, and cancels the dirty register check. See Section D.5 for more details.  
If the floating-point store instruction has an sberr and the CRD_HANDLER is entered/  
exited before the instruction is replayed, the mechanism will lose track of the instruc-  
tion. When the instruction is replayed, the dirty register check is not canceled, and a  
replay trap occurs, causing the floating-point store instruction to continually replay the  
trap until the sberr is evicted from cache. The sberr will not evict, because the floating-  
point store instruction is killed by the replay trap. Killed instructions are not scrubed by  
the Error Recovery Machine, and CBOX_ERR[C_ADDR] may not contain the address  
of the sberr. Because CBOX_ERR[C_ADDR] is not guaranteed, the CRD_HANDLER  
might not evict the sberr.  
D–22 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
Restriction 47: Cache Eviction for Single-Bit Cache Errors  
If "CBOX_ERR[C_ADDR]" has not changed when the CRD_HANDLER is re-  
entered, or "CBOX_ERR[C_STAT] == 0x0", all cache locations should to be evicted to  
avoid the live lock described above.  
; Sample code for evicting cache.  
; This method loads a 64K block, then exits the CRD_HANDLER  
;
to check if the sberr has been evicted. If not it loads the next 64K block.  
; In the sample code below,  
;
;
sx is a shadow register  
ldi is a macro that loads a 64-bit constant into the specified register  
full_scrub:  
hw_ldq/p  
s5, 104(r31)  
ldi  
ldi  
ldi  
ldi  
s1, ^x200  
s2, ^x1C0  
s3, ^x10000  
s4, ^x2000000  
; Loop dec value  
; Start offset  
; Block size (64K -> size of dcache)  
; 2X bcache size  
addq  
ble  
s3,s5,s5  
s5, s4, <.+4>  
; Skip next instruction if ADDR  
;
.le. 2X bcache  
bis  
hw_stq/p  
subq  
mb  
r31, r31, s5  
s5, 104(r31)  
s5, s2, s5  
; Set ADDR = 0x0  
; Store ADDR for next pass thru  
; Make sure no speculative loads  
;
happen in the CRD handler  
.align 4, NOP_OPCODE  
;|  
;|  
;V  
blbc  
br  
r31, <.+4>  
r31, <.-4>  
.align 4, NOP_OPCODE  
; Make sure no speculative loads  
happen in the CRD handler  
;
next_reread:  
; ******** four cache blocks  
; Evict dcache by prefetching to all dcache indexes.  
; use 'hw_ldl r31 xxxx' Normal Prefetch  
; Do not use 'hw_ldq/p r31 xxx' Prefetch,  
; Evict Next because this will always access the same set in dcache.  
hw_ldl/p  
hw_ldl/p  
hw_ldl/p  
hw_ldl/p  
r31,^x1C0(s5)  
r31,^x180(s5)  
r31,^x140(s5)  
r31,^x100(s5)  
; Re-read the bad block QW #0  
; Re-read the bad block QW #0  
; Re-read the bad block QW #0  
; Re-read the bad block QW #0  
hw_ldl/p  
hw_ldl/p  
hw_ldl/p  
hw_ldl/p  
r31,^xC0(s5)  
r31,^x80(s5)  
r31,^x40(s5)  
r31,^x00(s5)  
; Re-read the bad block QW #0  
; Re-read the bad block QW #0  
; Re-read the bad block QW #0  
; Re-read the bad block QW #0  
subq  
subq  
ble  
br  
s5, s1, s5  
s3, s1, s3  
s3, <.+4>  
r31, next_reread  
s7, sys__cbox  
; Decrement addr  
; Decrement counter  
bsr  
; Read and clean cbox error ipr  
21264/EV68A Hardware Reference Manual  
PALcode Restrictions and Guidelines D–23  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Restriction 48: MB Bracketing of Dcache Writes to Force Bad Data ECC and Force  
D.44 Restriction 48: MB Bracketing of Dcache Writes to Force Bad  
Data ECC and Force Bad Tag Parity  
Writes to DC_CTL[F_BAD_DECC] and DC_CTL[DCDAT_ERR_EN] must be brack-  
eted by MB instructions to quiesce the memory system. The Istream must also be qui-  
esced before and during the sequence, as described in Section D.26.  
D–24 PALcode Restrictions and Guidelines  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
E
21264/EV68A-to-Bcache Pin Interface  
This appendix provides the pin interface between the21264/EV68A and Bcache  
SSRAMs.  
E.1 Forwarding Clock Pin Groupings  
Table E–1 lists the correspondance between the clock signals for the 21264/EV68A and  
Bcache (late-write non-bursting and dual-data rate) SSRAMs.  
Table E–1 Bcache Forwarding Clock Pin Groupings  
Pad and Pin  
Input Clock  
Output Clocks  
BcData_H[71:64,7:0]  
BcCheck_H[8,0]  
BcDataInClk_H[0]  
BcDataInClk_H[0]  
BcDataInClk_H[1]  
BcDataInClk_H[1]  
BcDataInClk_H[2]  
BcDataInClk_H[2]  
BcDataInClk_H[3]  
BcDataInClk_H[3]  
BcDataInClk_H[4]  
BcDataInClk_H[4]  
BcDataInClk_H[5]  
BcDataInClk_H[5]  
BcDataInClk_H[6]  
BcDataInClk_H[6]  
BcDataInClk_H[7]  
BcDataInClk_H[7]  
BcTagInClk_H  
BcDataOutClk_x[0]  
BcDataOutClk_x[0]  
BcDataOutClk_x[0]  
BcDataOutClk_x[0]  
BcDataOutClk_x[1]  
BcDataOutClk_x[1]  
BcDataOutClk_x[1]  
BcDataOutClk_x[1]  
BcDataOutClk_x[2]  
BcDataOutClk_x[2]  
BcDataOutClk_x[2]  
BcDataOutClk_x[2]  
BcDataOutClk_x[3]  
BcDataOutClk_x[3]  
BcDataOutClk_x[3]  
BcDataOutClk_x[3]  
BcTagOutClk_x  
BcData_H[79:72,15:8]  
BcCheck_H[9,1]  
BcData_H[87:80,23:16]  
BcCheck_H[10,2]  
BcData_H[95:88,31:24]  
BcCheck_H[11,3]  
BcData_H[103:96,39:32]  
BcCheck_H[12,4]  
BcData_H[111:104,47:40]  
BcCheck_H[13,5]  
BcData_H[119:112,55:48]  
BcCheck_H[14.6]  
BcData_H[127:120,63:56]  
BcCheck_H[15,7]  
BcTag_H[42:20]  
BcTagParity_H  
BcTagInClk_H  
BcTagOutClk_x  
21264/EV68A Hardware Reference Manual  
21264/EV68A-to-Bcache Pin Interface  
E–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
             
Late-Write Non-Bursting SSRAMs  
Table E–1 Bcache Forwarding Clock Pin Groupings (Continued)  
Pad and Pin  
Input Clock  
Output Clocks  
BcTagOutClk_x  
BcTagOutClk_x  
BcTagOutClk_x  
BcTagShared_H  
BcTagDirty_H  
BcTagValid_H  
BcTagInClk_H  
BcTagInClk_H  
BcTagInClk_H  
E.2 Late-Write Non-Bursting SSRAMs  
Table E–2 provides the data pin connections between late-write non-bursting SSRAMs  
and the 21264/EV68A or the system board. Table E–3 provides the same information  
for the tag pins.  
Data Pin Usage  
Table E–2 Late-Write Non-Bursting SSRAMs Data Pin Usage  
21264/EV68A Signal Name or Board Connection  
BcAdd_H[21:4]  
Late-Write SSRAM Data Pin Name  
SA_H[17:0]  
CK_H  
BcDataOutClk_H[3:0]  
Set from board to 1/2 the 21264/EV68A core voltage CK_L  
BcData_H[127:0]/BcCheck_H[15:0]  
BcDataWr_L  
DQx  
SW_L  
Unconnected  
Tck_H  
Unconnected  
Tdo_H  
Unconnected  
Tms_H  
Unconnected  
Tdi_H  
From board, pull down to VSS  
From board, pull down to VSS  
From board, pull down to VSS or BcDataOE_L  
G_L  
SBx_L  
SS_L (Vendor dependent)  
Tag Pin Usage  
Unused Bcache tag pins should be pulled to ground through a 200-ohm resistor.  
Table E–3 Late-Write Non-Bursting SSRAMs Tag Pin Usage  
21264/EV68A Signal Name or Board Connection  
BcAdd_H[22:6]  
Late-Write SSRAM Tag Pin Name  
SA_H[16:0]  
BcTag_H[42:20]  
DQx  
BcTagOE_L or from board, pull down to VSS  
BcTagWr_L  
SS_L (Vendor dependent)  
SW_L  
SBx_L  
CK_H  
From board, pull down toVSS  
BcTagOutClk_H  
E–2 21264/EV68A-to-Bcache Pin Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
       
Dual-Data Rate SSRAMs  
Table E–3 Late-Write Non-Bursting SSRAMs Tag Pin Usage (Continued)  
21264/EV68A Signal Name or Board Connection Late-Write SSRAM Tag Pin Name  
Set from board to 1/2 the 21264/EV68A core voltage CK_L  
Set from board to 1/2 the 21264/EV68A core voltage VREF1_H  
VREF2_H  
Set from board (implementation dependent)  
BcTagValid_H  
ZQ_H  
DQx  
BcTagDirty_H  
DQx  
BcTagShared_H  
Unconnected  
DQx  
TMS_H  
TDI_H  
TCK_H  
TDC_H  
Unconnected  
Unconnected  
Unconnected  
E.3 Dual-Data Rate SSRAMs  
Table E–4 provides the data pin connections between dual-data rate SSRAMs and the  
21264/EV68A or the system board. Table E–5 provides the same information for the  
tag pins.  
Data Pin Usage  
Table E–4 Dual-Data Rate SSRAM Data Pin Usage  
21264/EV68A Signal Name or Board  
Connection  
Dual-Data Rate SSRAM Data Pin Name  
BcAdd_H[21:4]  
SA_H[17:0]  
BcData_H[33:20]/  
BcCheck_H[15:0]  
DQx  
BcLoad_L  
LD_L (B1)  
R/W_L(B2)  
LBO_L  
Q_L  
BcDataWr_L  
From board, pulled up to VDD  
From board, pulled down to VSS  
BcDataInClk_H  
CQ_H  
BcDataOutClk_H  
BcDataOutClk_L  
CK_H  
CK_L  
Set from board to 1/2 the 21264/EV68A core VREF1_H  
voltage  
VREF2_H  
Set from board (implementation-dependent)  
Unconnected or terminated  
From board, pulled up toVDD  
Unconnected  
ZQ_H  
CQ_L  
TCK_H  
TDO_H  
21264/EV68A Hardware Reference Manual  
21264/EV68A-to-Bcache Pin Interface  
E–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
     
Dual-Data Rate SSRAMs  
Table E–4 Dual-Data Rate SSRAM Data Pin Usage (Continued)  
21264/EV68A Signal Name or Board  
Connection  
Dual-Data Rate SSRAM Data Pin Name  
From board, pulled up to VDD  
From board, pulled up to VDD  
Unconnected or pulled down to VSS  
BcDataOE_L  
TMS_H  
TDI_H  
TRST_L  
OE_L (G_L)  
SD/DD_L (B3)  
From board, pulled down to VSS  
Tag Pin Usage  
Unused Bcache tag pins should be pulled to ground through a 200-ohm resister.  
Table E–5 Dual-Data Rate SSRAM Tag Pin Usage  
21264/EV68A Signal Name or Board  
Connection  
Dual-Data Rate SSRAM Tag Pin Name  
BcAdd_H[23:6]  
SA_H[17:0]  
DQx  
BcTag_H[33:20]  
BcTagOE_L  
LD_L (B1)  
R/W_L (B2)  
LBO_L  
BcTagWr_L  
From board, pulled up to VDD  
From board, pulled down to VSS  
Q_L  
SA[19:18]  
BcTagInClk_H  
CQ_H  
CK_H  
CK_L  
BcTagOutClk_H  
BcTagOutClk_L  
Set from board to 1/2 core voltage  
VREF1_H  
VREF2_H  
Set from board (implementation-dependent)  
BcTagValid_H  
ZQ_H  
DQx  
BcTagDirty_H  
DQx  
BcTagShared_H  
DQx  
BcTagParity_H  
DQx  
Unconnected or terminated  
From board, pulled up to VDD  
Unconnected  
CQ_L  
TCK_H  
TDO_H  
TMS_H  
TDI_H  
From board, pulled up to VDD  
From board, pulled up to VDD  
E–4 21264/EV68A-to-Bcache Pin Interface  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
Dual-Data Rate SSRAMs  
Table E–5 Dual-Data Rate SSRAM Tag Pin Usage (Continued)  
21264/EV68A Signal Name or Board  
Connection  
Dual-Data Rate SSRAM Tag Pin Name  
TRST_L  
Unconnected  
From board, pulled down to VSS  
From board, pulled up to VDD  
OE_L (G_L)  
SD/DD_L (B3)  
21264/EV68A Hardware Reference Manual  
21264/EV68A-to-Bcache Pin Interface  
E–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Glossary  
This glossary provides definitions for specific terms and acronyms associated with the  
Alpha 21264/EV68A microprocessor and chips in general.  
abort  
The unit stops the operation it is performing, without saving status, to perform some  
other operation.  
address space number (ASN)  
An optionally implemented register used to reduce the need for invalidation of cached  
address translations for process-specific addresses when a context switch occurs. ASNs  
are processor specific; the hardware makes no attempt to maintain coherency across  
multiple processors.  
address translation  
The process of mapping addresses from one address space to another.  
ALIGNED  
A datum of size 2**N is stored in memory at a byte address that is a multiple of 2**N  
(that is, one that has N low-order zeros).  
ALU  
Arithmetic logic unit.  
ANSI  
American National Standards Institute. An organization that develops and publishes  
standards for the computer industry.  
ASIC  
ASM  
ASN  
Application-specific integrated circuit.  
Address space match.  
See address space number.  
assert  
AST  
To cause a signal to change to its logical true state.  
See asynchronous system trap.  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
   
asynchronous system trap (AST)  
A software-simulated interrupt to a user-defined routine. ASTs enable a user process to  
be notified asynchronously, with respect to that process, of the occurrence of a specific  
event. If a user process has defined an AST routine for an event, the system interrupts  
the process and executes the AST routine when that event occurs. When the AST rou-  
tine exits, the system resumes execution of the process at the point where it was inter-  
rupted.  
bandwidth  
Bandwidth is often used to express the rate of data transfer in a bus or an I/O channel.  
barrier transaction  
A transaction on the external interface as a result of an MB (memory barrier) instruc-  
tion.  
Bcache  
See second-level cache.  
bidirectional  
Flowing in two directions. The buses are bidirectional; they carry both input and output  
signals.  
BiSI  
Built-in self-initialization.  
BiST  
bit  
Built-in self-test.  
Binary digit. The smallest unit of data in a binary notation system, designated as 0 or 1.  
bit time  
The total time that a signal conveys a single valid piece of information (specified in ns).  
All data and commands are associated with a clock and the receiver’s latch on both the  
rise and fall of the clock. Bit times are a multiple of the 21264/EV68A clocks. Systems  
must produce a bit time identical to 21264/EV68A’s bit time. The bit time is one-half  
the period of the forwarding clock.  
BIU  
Bus interface unit. See Cbox.  
Block exchange  
Memory feature that improves bus bandwidth by paralleling a cache victim write-back  
with a cache miss fill.  
board-level cache  
See second-level cache.  
Glossary–2  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
boot  
BSR  
Short for bootstrap. Loading an operating system into memory is called booting.  
Boundary-scan register.  
buffer  
An internal memory area used for temporary storage of data records during input or  
output operations.  
bugcheck  
bus  
A software condition, usually the response to software’s detection of an “internal incon-  
sistency,” which results in the execution of the system bugcheck code.  
A group of signals that consists of many transmission lines or wires. It interconnects  
computer system components to provide communications paths for addresses, data, and  
control information.  
byte  
Eight contiguous bits starting on an addressable byte boundary. The bits are numbered  
right to left, 0 through 7.  
byte granularity  
Memory systems are said to have byte granularity if adjacent bytes can be written con-  
currently and independently by different processes or processors.  
cache  
See cache memory.  
cache block  
The smallest unit of storage that can be allocated or manipulated in a cache. Also  
known as a cache line.  
cache coherence  
Maintaining cache coherence requires that when a processor accesses data cached in  
another processor, it must not receive incorrect data and when cached data is modified,  
all other processors that access that data receive modified data. Schemes for maintain-  
ing consistency can be implemented in hardware or software. Also called cache consis-  
tency.  
cache fill  
An operation that loads an entire cache block by using multiple read cycles from main  
memory.  
cache flush  
An operation that marks all cache blocks as invalid.  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
cache hit  
The status returned when a logic unit probes a cache memory and finds a valid cache  
entry at the probed address.  
cache interference  
The result of an operation that adversely affects the mechanisms and procedures used to  
keep frequently used items in a cache. Such interference may cause frequently used  
items to be removed from a cache or incur significant overhead operations to ensure  
correct results. Either action hampers performance.  
cache line  
See cache block.  
cache line buffer  
A buffer used to store a block of cache memory.  
cache memory  
A small, high-speed memory placed between slower main memory and the processor. A  
cache increases effective memory transfer rates and processor speed. It contains copies  
of data recently used by the processor and fetches several bytes of data from memory in  
anticipation that the processor will access the next sequential series of bytes. The  
21264/EV68A microprocessor contains two onchip internal caches. See also write-  
through cache and write-back cache.  
cache miss  
The status returned when cache memory is probed with no valid cache entry at the  
probed address.  
CALL_PAL instructions  
Special instructions used to invoke PALcode.  
Cbox  
External cache and system interface unit. Controls the Bcache and the system ports.  
central processing unit (CPU)  
The unit of the computer that is responsible for interpreting and executing instructions.  
CISC  
clean  
clock  
Complex instruction set computing. An instruction set that consists of a large number  
of complex instructions. Contrast with RISC.  
In the cache of a system bus node, refers to a cache line that is valid but has not been  
written.  
A signal used to synchronize the circuits in a computer.  
Glossary–4  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
clock offset (or clkoffset)  
The delay intentionally added to the forwarded clock to meet the setup and hold  
requirements at the Receive Flop.  
CMOS  
Complementary metal-oxide semiconductor. A silicon device formed by a process that  
combines PMOS and NMOS semiconductor material.  
conditional branch instructions  
Instructions that test a register for positive/negative or for zero/nonzero. They can also  
test integer registers for even/odd.  
control and status register (CSR)  
A device or controller register that resides in the processor’s I/O space. The CSR ini-  
tiates device activity and records its status.  
CPI  
Cycles per instruction.  
CPU  
See central processing unit.  
See control and status register.  
One clock interval.  
CSR  
cycle  
data bus  
Dcache  
A group of wires that carry data.  
Data cache. A cache reserved for storage of data. The Dcache does not contain instruc-  
tions.  
DDR  
Dual-data rate. A dual-data rate SSRAM can provide data on both the rising and falling  
edges of the clock signal.  
denormal  
DIP  
An IEEE floating-point bit pattern that represents a number whose magnitude lies  
between zero and the smallest finite number.  
Dual inline package.  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
direct-mapping cache  
A cache organization in which only one address comparison is needed to locate any  
data in the cache, because any block of main memory data can be placed in only one  
possible position in the cache.  
direct memory access (DMA)  
Access to memory by an I/O device that does not require processor intervention.  
dirty  
One status item for a cache block. The cache block is valid and has been written so that  
it may differ from the copy in system main memory.  
dirty victim  
Used in reference to a cache block in the cache of a system bus node. The cache block  
is valid but is about to be replaced due to a cache block resource conflict. The data must  
therefore be written to memory.  
DMA  
See direct memory access.  
DRAM  
Dynamic random-access memory. Read/write memory that must be refreshed (read  
from or written to) periodically to maintain the storage of information.  
DTB  
Data translation buffer. Also defined as Dstream translation buffer.  
DTL  
Diode-transistor logic.  
dual issue  
Two instructions are issued, in parallel, during the same microprocessor cycle. The  
instructions use different resources and so do not conflict.  
ECC  
Error correction code. Code and algorithms used by logic to facilitate error detection  
and correction. See also ECC error.  
ECC error  
An error detected by ECC logic, to indicate that data (or the protected “entity”) has  
been corrupted. The error may be correctable (soft error) or uncorrectable (hard error).  
ECL  
Emitter-coupled logic.  
EEPROM  
Electrically erasable programmable read-only memory. A memory device that can be  
byte-erased, written to, and read from. Contrast with FEPROM.  
Glossary–6  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
external cache  
See second-level cache.  
FEPROM  
Flash-erasable programmable read-only memory. FEPROMs can be bank- or bulk-  
erased. Contrast with EEPROM.  
FET  
FEU  
Field-effect transistor.  
The unit within the 21264/EV68A microprocessor that performs floating-point calcula-  
tions.  
firmware  
Machine instructions stored in nonvolatile memory.  
floating point  
A number system in which the position of the radix point is indicated by the exponent  
part and another part represents the significant digits or fractional part.  
flush  
See cache flush.  
forwarded clock  
A single-ended differential signal that is aligned with its associated fields. The for-  
warded clock is sourced and aligned by the sender with a period that is two times the bit  
time. Forwarded clocks must be 50% duty cycle clocks whose rising and falling edges  
are aligned with the changing edge of the data.  
FPGA  
FPLA  
FQ  
Field-programmable gate array.  
Field-programmable logic array.  
Floating-point issue queue.  
framing clock  
The framing clock defines the start of a transmission either from the system to the  
21264/EV68A or from the 21264/EV68A to the system. The framing clock is a power-  
of-2 multiple of the 21264/EV68A GCLK frequency, and is usually the system clock.  
The framing clock and the input oscillator can have the same frequency. The  
add_frame_select IPR sets that ratio of bit times to framing clock. The frame clock  
could have a period that is four times the bit time with a add_frame_select of 2X.  
Transfers begin on the rising and falling edge of the frame clock. This is useful for sys-  
tems that have system clocks with a period too small to perform the synchronous reset  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
of the clock forward logic. Additionally, the framing clock can have a period that is  
less than, equal to, or greater than the time it takes to send a full four cycle command/  
address.  
GCLK  
Global clock within the 21264/EV68A.  
granularity  
A characteristic of storage systems that defines the amount of data that can be read and/  
or written with a single instruction, or read and/or written independently.  
hardware interrupt request (HIR)  
An interrupt generated by a peripheral device.  
high-impedance state  
An electrical state of high resistance to current flow, which makes the device appear not  
physically connected to the circuit.  
hit  
See cache hit.  
Icache  
Instruction cache. A cache reserved for storage of instructions. One of the three areas of  
primary cache (located on the 21264/EV68A) used to store instructions. The Icache  
contains 8KB of memory space. It is a direct-mapped cache. Icache blocks, or lines,  
contain 32 bytes of instruction stream data with associated tag as well as a 6-bit ASM  
field and an 8-bit branch history field per block. Icache does not contain hardware for  
maintaining cache coherency with memory and is unaffected by the invalidate bus.  
IDU  
A logic unit within the 21264/EV68A microprocessor that fetches, decodes, and issues  
instructions. It also controls the microprocessor pipeline.  
IEEE Standard 754  
A set of formats and operations that apply to floating-point numbers. The formats cover  
32-, 64-, and 80-bit operand sizes.  
IEEE Standard 1149.1  
A standard for the Test Access Port and Boundary Scan Architecture used in board-  
level manufacturing test procedures.  
Inf  
Infinity.  
INT nn  
The term INTnn, where nn is one of 2, 4, 8, 16, 32, or 64, refers to a data field size of nn  
contiguous NATURALLY ALIGNED bytes. For example, INT4 refers to a NATU-  
RALLY ALIGNED longword.  
Glossary–8  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
interface reset  
A synchronously received reset signal that is used to preset and start the clock forward-  
ing circuitry. During this reset, all forwarded clocks are stopped and the presettable  
count values are applied to the counters; then, some number of cycles later, the clocks  
are enabled and are free running.  
Internal processor register (IPR)  
Special registers that are used to configure options or report status.  
IOWB  
IPGA  
IQ  
I/O write buffer.  
Interstitial pin grid array.  
Integer issue queue.  
ITB  
Instruction translation buffer.  
JFET  
latency  
LCC  
Junction field-effect transistor.  
The amount of time it takes the system to respond to an event.  
Leadless chip carrier.  
LFSR  
Linear feedback shift register.  
load/store architecture  
A characteristic of a machine architecture where data items are first loaded into a pro-  
cessor register, operated on, and then stored back to memory. No operations on memory  
other than load and store are provided by the instruction set.  
longword (LW)  
Four contiguous bytes starting on an arbitrary byte boundary. The bits are numbered  
from right to left, 0 through 31.  
LQ  
Load queue.  
LSB  
Least significant bit.  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
machine check  
An operating system action triggered by certain system hardware-detected errors that  
can be fatal to system operation. Once triggered, machine check handler software ana-  
lyzes the error.  
MAF  
Miss address file.  
main memory  
The large memory, external to the microprocessor, used for holding most instruction  
code and data. Usually built from cost-effective DRAM memory chips. May be used in  
connection with the microprocessor’s internal caches and an external cache.  
masked write  
A write cycle that only updates a subset of a nominal data block.  
MBO  
See must be one.  
Mbox  
This section of the processor unit performs address translation, interfaces to the  
Dcache, and performs several other functions.  
MBZ  
See must be zero.  
MESI protocol  
A cache consistency protocol with full support for multiprocessing. The MESI protocol  
consists of four states that define whether a block is modified (M), exclusive (E), shared  
(S), or invalid (I).  
MIPS  
Millions of instructions per second.  
miss  
See cache miss.  
module  
A board on which logic devices (such as transistors, resistors, and memory chips) are  
mounted and connected to perform a specific system function.  
module-level cache  
See second-level cache.  
MOS  
Metal-oxide semiconductor.  
MOSFET  
Metal-oxide semiconductor field-effect transistor.  
Glossary–10  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
MSI  
Medium-scale integration.  
multiprocessing  
A processing method that replicates the sequential computer and interconnects the col-  
lection so that each processor can execute the same or a different program at the same  
time.  
must be one (MBO)  
A field that must be supplied as one.  
must be zero (MBZ)  
A field that is reserved and must be supplied as zero. If examined, it must be assumed to  
be UNDEFINED.  
NaN  
Not-a-Number. An IEEE floating-point bit pattern that represents something other than  
a number. This comes in two forms: signaling NaNs (for Alpha, those with an initial  
fraction bit of 0) and quiet NaNs (for Alpha, those with an initial fraction bit of 1).  
NATURALLY ALIGNED  
See ALIGNED.  
NATURALLY ALIGNED data  
Data stored in memory such that the address of the data is evenly divisible by the size of  
the data in bytes. For example, an ALIGNED longword is stored such that the address  
of the longword is evenly divisible by 4.  
NMOS  
N-type metal-oxide semiconductor.  
Nonvolatile random-access memory.  
Observability linear feedback shift register.  
NVRAM  
OBL  
octaword  
Sixteen contiguous bytes starting on an arbitrary byte boundary. The bits are numbered  
from right to left, 0 through 127.  
OpenVMS Alpha operating system  
The version of the open VMS operating system for Alpha platforms.  
operand  
The data or register upon which an operation is performed.  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
output mux counter  
Counter used to select the output mux that drives address and data. It is reset with the  
Interface Reset and incremented by a copy of the locally generated forwarded clock.  
PAL  
Privileged architecture library. See also PALcode. See also Programmable array logic  
(hardware). A device that can be programmed by a process that blows individual fuses  
to create a circuit.  
PALcode  
Alpha privileged architecture library code, written to support Alpha microprocessors.  
PALcode implements architecturally defined behavior.  
PALmode  
parameter  
parity  
A special environment for running PALcode routines.  
A variable that is given a specific value that is passed to a program before execution.  
A method for checking the accuracy of data by calculating the sum of the number of  
ones in a piece of binary data. Even parity requires the correct sum to be an even num-  
ber, odd parity requires the correct sum to be an odd number.  
PGA  
Pin grid array.  
pipeline  
A CPU design technique whereby multiple instructions are simultaneously overlapped  
in execution.  
PLA  
PLCC  
PLD  
PLL  
Programmable logic array.  
Plastic leadless chip carrier or plastic-leaded chip carrier.  
Programmable logic device.  
Phase-locked loop.  
PMOS  
PQ  
P-type metal-oxide semiconductor.  
Probe queue.  
Glossary–12  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
PQFP  
Plastic quad flat pack.  
primary cache  
The cache that is the fastest and closest to the processor. The first-level caches, located  
on the CPU chip, composed of the Dcache and Icache.  
program counter  
That portion of the CPU that contains the virtual address of the next instruction to be  
executed. Most current CPUs implement the program counter (PC) as a register. This  
register may be visible to the programmer through the instruction set.  
PROM  
Programmable read-only memory.  
pull-down resistor  
A resistor placed between a signal line and a negative voltage.  
pull-up resistor  
A resistor placed between a signal line to a positive voltage.  
QNaN  
Quiet Nan. See NaN.  
quad issue  
Four instructions are issued, in parallel, during the same microprocessor cycle. The  
instructions use different resources and so do not conflict.  
quadword  
Eight contiguous bytes starting on an arbitrary byte boundary. The bits are numbered  
from right to left, 0 through 63.  
RAM  
RAS  
RAW  
Random-access memory.  
Row address select.  
Read-after-write.  
READ_BLOCK  
A transaction where the 21264/EV68A requests that an external logic unit fetch read  
data.  
read data wrapping  
System feature that reduces apparent memory latency by allowing read data cycles to  
differ the usual low-to-high sequence. Requires cooperation between the 21264/EV68A  
and external hardware.  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
read stream buffers  
Arrangement whereby each memory module independently prefetches DRAM data  
prior to an actual read request for that data. Reduces average memory latency while  
improving total memory bandwidth.  
receive counter  
Counter used to enable the receive flops. It is clocked by the incoming forwarded clock  
and reset by the Interface Reset.  
receive mux counter  
The receive mux counter is preset to a selectable starting point and incremented by the  
locally generated forward clock.  
register  
A temporary storage or control location in hardware logic.  
reliability  
The probability a device or system will not fail to perform its intended functions during  
a specified time interval when operated under stated conditions.  
reset  
RISC  
An action that causes a logic unit to interrupt the task it is performing and go to its ini-  
tialized state.  
Reduced instruction set computing. A computer with an instruction set that is paired  
down and reduced in complexity so that most can be performed in a single processor  
cycle. High-level compilers synthesize the more complex, least frequently used instruc-  
tions by breaking them down into simpler instructions. This approach allows the RISC  
architecture to implement a small, hardware-assisted instruction set, thus eliminating  
the need for microcode.  
ROM  
Read-only memory.  
RTL  
Register-transfer logic.  
SAM  
Serial access memory.  
SBO  
Should be one.  
SBZ  
Should be zero.  
scheduling  
The process of ordering instruction execution to obtain optimum performance.  
Glossary–14  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
SDRAM  
Synchronous dynamic random-access memory.  
second-level cache  
A cache memory provided outside of the microprocessor chip, usually located on the  
same module. Also called board-level, external, or module-level cache.  
set-associative  
A form of cache organization in which the location of a data block in main memory  
constrains, but does not completely determine, its location in the cache. Set-associative  
organization is a compromise between direct-mapped organization, in which data from  
a given address in main memory has only one possible cache location, and fully asso-  
ciative organization, in which data from anywhere in main memory can be put any-  
where in the cache. An “n-way set-associative” cache allows data from a given address  
in main memory to be cached in any of n locations.  
SIMM  
SIP  
Single inline memory module.  
Single inline package.  
SIPP  
Single inline pin package.  
Surface mount device.  
SMD  
SNaN  
SRAM  
SROM  
SSI  
Signaling NaN. See Nan.  
See SSRAM.  
Serial read-only memory.  
Small-scale integration.  
Synchronous static random-access memory.  
SSRAM  
stack  
An area of memory set aside for temporary data storage or for procedure and interrupt  
service linkages. A stack uses the last-in/first-out concept. As items are added to  
(pushed on) the stack, the stack pointer decrements. As items are retrieved from  
(popped off) the stack, the stack pointer increments.  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
STRAM  
Self-timed random-access memory.  
superpipelined  
Describes a pipelined machine that has a larger number of pipe stages and more com-  
plex scheduling and control. See also pipeline.  
superscalar  
system clock  
tag  
Describes a machine architecture that allows multiple independent instructions to be  
issued in parallel during a given clock cycle.  
The primary skew controlled clock used throughout the interface components to clock  
transfer between ASICs, main memory, and I/O bridges.  
The part of a cache block that holds the address information used to determine if a  
memory operation is a hit or a miss on that cache block.  
target clock  
TB  
Skew controlled clock that receives the output of the RECEIVE MUX.  
Translation buffer.  
tristate  
TTL  
Refers to a bused line that has three states: high, low, and high-impedance.  
Transistor-transistor logic.  
UART  
Universal asynchronous receiver-transmitter.  
UNALIGNED  
A datum of size 2**N stored at a byte address that is not a multiple of 2**N.  
unconditional branch instructions  
Instructions that change the flow of program control without regard to any condition.  
Contrast with conditional branch instructions.  
UNDEFINED  
An operation that may halt the processor or cause it to lose information. Only privileged  
software (that is, software running in kernel mode) can trigger an UNDEFINED opera-  
tion. (This meaning only applies when the word is written in all upper case.)  
Glossary–16  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
UNPREDICTABLE  
Results or occurrences that do not disrupt the basic operation of the processor; the pro-  
cessor continues to execute instructions in its normal manner. Privileged or unprivi-  
leged software can trigger UNPREDICTABLE results or occurrences. (This meaning  
only applies when the word is written in all upper case.)  
UVPROM  
VAF  
Ultraviolet (erasable) programmable read-only memory.  
See victim address file.  
valid  
Allocated. Valid cache blocks have been loaded with data and may return cache hits  
when accessed.  
VDF  
See victim data file.  
VHSIC  
victim  
Very-high-speed integrated circuit.  
Used in reference to a cache block in the cache of a system bus node. The cache block  
is valid but is about to be replaced due to a cache block resource conflict.  
victim address file  
The victim address file and the victim data file, together, form an 8-entry buffer used to  
hold information for transactions to the Bcache and main memory.  
victim data file  
The victim address file and the victim data file, together, form an 8-entry buffer used to  
hold information for transactions to the Bcache and main memory.  
virtual cache  
A cache that is addressed with virtual addresses. The tag of the cache is a virtual  
address. This process allows direct addressing of the cache without having to go  
through the translation buffer making cache hit times faster.  
VLSI  
VPC  
Very-large-scale integration.  
Virtual program counter.  
VRAM  
Video random-access memory.  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
WAR  
word  
Write-after-read.  
Two contiguous bytes (16 bits) starting on an arbitrary byte boundary. The bits are num-  
bered from right to left, 0 through 15.  
write-back  
A cache management technique in which write operation data is written into cache but  
is not written into main memory in the same operation. This may result in temporary  
differences between cache data and main memory data. Some logic unit must maintain  
coherency between cache and main memory.  
write-back cache  
Copies are kept of any data in the region; read and write operations may use the copies,  
and write operations use additional state to determine whether there are other copies to  
invalidate or update.  
WRITE_BLOCK  
A transaction where the 21264/EV68A requests that an external logic unit process write  
data.  
write data wrapping  
System feature that reduces apparent memory latency by allowing write data cycles to  
differ the usual low-to-high sequence. Requires cooperation between the 21264/EV68A  
and external hardware.  
write-through cache  
A cache management technique in which a write operation to cache also causes the  
same data to be written in main memory during the same operation. Copies are kept of  
any data in a region; read operations may use the copies, but write operations update the  
actual data location and either update or invalidate all copies.  
Glossary–18  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Index  
BC_CLOCK_OUT Cbox CSR, 4–45  
BC_CPU_CLK_DELAY Cbox CSR, 4–44, 4–45  
defined, 5–38  
Numerics  
21264/EV68A, features of, 1–3  
32_BYTE_IO Cbox CSR  
defined, 5–34  
BC_CPU_LATE_WRITE_NUM Cbox CSR  
defined, 5–35  
BC_DDM_FALL_EN Cbox CSR, 4–47  
defined, 5–36  
A
BC_DDM_RISE_EN Cbox CSR, 4–47  
Abbreviations, xix  
defined, 5–36  
binary multiples, xix  
BC_DDMF_ENABLE Cbox CSR, 4–47  
defined, 5–35  
register access, xix  
AC characteristics, 9–6  
Address conventions, xx  
Aggregate mode, 6–18  
BC_DDMR_ENABLE Cbox CSR, 4–47  
defined, 5–35  
BC_ENABLE Cbox CSR, 4–51, 5–39, 7–12  
BC_FDBK_EN Cbox CSR, 4–46  
defined, 5–38  
Aligned convention, xx  
Alpha instruction summary, A–1  
AMASK instruction values, 2–37  
ARITH synchronous trap, 6–14  
BC_FRM_CLK Cbox CSR, 4–47  
defined, 5–35  
BC_LAT_DATA_PATTERN Cbox CSR, 4–48  
defined, 5–35  
B
BC_LAT_TAG_PATTERN Cbox CSR, 4–48  
B_DA_OD pin type, 3–3, 9–2  
values for, 9–4  
defined, 5–35  
BC_LATE_WRITE_NUM Cbox CSR, 4–49  
defined, 5–35  
BC_LATE_WRITE_UPPER Cbox CSR  
B_DA_PP pin type, 3–3, 9–2  
values for, 9–4  
BC_BANK_ENABLE Cbox CSR, 4–51, 5–39,  
defined, 5–35  
BC_PENTIUM_MODE Cbox CSR, 4–51  
defined, 5–35  
BC_BPHASE_LD_VECTOR Cbox CSR, 4–45  
defined, 5–38  
BC_PERR error status in C_STAT, 5–42  
BC_BURST_MODE_ENABLE Cbox CSR, 4–51  
BC_RCV_MUX_CNT_PRESET Cbox CSR  
defined, 5–36  
BC_RCV_MUX_PRESET_CNT Cbox CSR, 4–48  
BC_RD_RD_BUBBLE Cbox CSR  
defined, 5–34  
defined, 5–35  
BC_CLEAN_VICTIM Cbox CSR, 4–23  
defined, 5–34  
BC_CLK_DELAY Cbox CSR, 4–45  
defined, 5–35  
BC_RD_WR_BUBBLES Cbox CSR, 4–49  
BC_CLK_LD_VECTOR Cbox CSR, 4–45  
defined, 5–38  
defined, 5–34  
BC_RDVICTIM Cbox CSR, 4–23, 4–26  
defined, 5–34  
BC_CLKFWD_ENABLE Cbox CSR, 4–47  
defined, 5–36  
21264/EV68A Hardware Reference Manual  
Index–1  
Download from Www.Somanuals.com. All Manuals Search And Download.  
 
BC_SJ_BANK_ENABLE Cbox CSR  
defined, 5–34  
BcTagParity_H signal pin, 3–4, 4–44  
BcTagShared_H signal pin, 3–4, 4–44  
BcTagValid_H signal pin, 3–4, 4–44  
BcTagWr_L signal pin, 3–4, 4–44  
BcVref signal pin, 3–4, 4–44  
BC_TAG_DDM_FALL_EN Cbox CSR, 4–47  
defined, 5–36  
BC_TAG_DDM_RISE_EN Cbox CSR, 4–47  
defined, 5–36  
BC_WR_RD_BUBBLES Cbox CSR, 4–49  
defined, 5–35  
BC_WR_WR_BUBBLE Cbox CSR, 4–53  
defined, 5–34  
BC_WRT_STS Cbox CSR, 5–39, 7–13  
Bcache  
Binary multiple abbreviations, xix  
banking, 4–53  
Bit notation conventions, xx  
Bounder-scan register, B–1  
Branch history table, initialized by BiST, 7–12  
Branch mispredication, pipeline abort delay from,  
clocking, 4–44  
control pins, 4–51  
data read transactions, 4–47  
data single-bit correctable ECC error, 8–5  
Branch predictor, 2–3  
BSDL description of the boundary-scan register,  
error case summary for, 8–10  
filling Dcache error, 8–6  
filling Icache error, 8–5  
forwarding clock pin groupings, E–1  
port, 4–42  
Built-in self-test, 11–5  
load, 7–6  
C
setting clock period, 4–45  
structure of, 4–7  
C_ADDR Cbox read register field, 5–42  
C_DATA Cbox data register, 5–33  
at power-on reset state, 7–16  
tag parity errors, 8–5  
tag read transactions, 4–47  
C_SHFT Cbox shift register, 5–33  
at power-on reset state, 7–16  
C_STAT Cbox read register field, 5–42  
C_STS Cbox read register field, 5–42  
C_SYNDROME_0 Cbox read register field, 5–41  
C_SYNDROME_1 Cbox read register field, 5–41  
Cache block states, 4–9  
BcAdd_H signal pins, 3–3, 4–43  
characteristics, 4–51  
BcCheck_H signal pins, 3–3, 4–43  
BcData_H signal pins, 3–3, 4–43  
BcDataInClk_H signal pins, 3–3, 4–43  
using, 4–53  
BcDataOE_L signal pin, 3–3, 4–43  
BcDataOutClk_x signal pins, 3–4, 4–43  
BcDataWr_L signal pin, 3–4, 4–44  
BcLoad_L signal pin, 3–4, 4–44  
BcTag_H signal pins, 3–4, 4–44  
BcTagDirty_H signal pin, 3–4, 4–44  
transitions, 4–10  
Cache coherency, 4–8  
CALL_PAL entry points, 6–12  
Caution convention, xx  
BcTagInClk_H signal pin, 3–4, 4–44  
using, 4–53  
BcTagOE_L signal pin, 3–4, 4–44  
BcTagOutClk_x signal pins, 3–4, 4–44  
Index–2  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Cbox  
Conventions, xix  
abbreviations, xix  
described, 2–11, 4–3  
address, xx  
duplicate Dcache tag array with, 4–13  
I/O write buffer, 2–11  
aligned, xx  
bit notation, xx  
caution, xx  
data units, xxi  
do not care, xxi  
external, xxi  
probe queue, 2–11  
read register, 5–41  
note, xxi  
shift register C_SHFT, 5–33  
victim address file, 2–11  
ranges and extents, xxi  
register figures, xxi  
signal names, xxi  
WRITE_MANY chain, 5–38  
WRITE_MANY chain example, 5–39  
WRITE_ONCE chain, 5–33  
CC cycle counter register, 5–3  
at power-on reset state, 7–15  
CTAG, 4–13  
CC_CTL cycle counter control register, 5–3  
at power-on reset state, 7–15  
D
CFR_EV6CLK_DELAY Cbox CSR, defined, 5–38  
CFR_FRMCLK_DELAY Cbox CSR, defined, 5–38  
CFR_GCLK_DELAY Cbox CSR, defined, 5–38  
Data merging  
load instructions in I/O address space, 2–28  
ChangeToDirtyFail, SysDc command, 4–10, 4–11,  
store instructions in I/O address space, 2–29  
Data transfer commands, system, 4–28  
Data types  
ChangeToDirtySuccess, SysDc command, 4–10,  
Choice predictor, 2–5  
ChxToDirty, 21264/EV68A command, 4–12  
CLAMP public instruction, B–1  
Clean cache block state, 4–9  
floating point support, 1–2  
supported, 1–1  
Data units convention, xxi  
Data wrap, 4–36  
interleaved, 4–37  
Clean/Shared cache block state, 4–10  
CleanToDirty, 21264/EV68A command, 4–22,  
DATA_VALID_DLY Cbox CSR, defined, 5–38  
dc  
system probes, with, 4–41  
CleanVictimBlk, 21264/EV68A command, 4–22,  
characteristics of, 9–2  
input pin capacitance defined, 9–2  
test load defined, 9–2  
DC_CTL Dcache control register, 5–30  
ClkFwdRst_H signal pin, 3–4, 4–30  
with system initialization, 7–7  
ClkIn_x signal pins, 3–4  
Clock forwarding, 7–4  
CLR_MAP clear virtual-to-physical map register,  
at power-on reset state, 7–16  
error correction and, 8–2  
DC_PERR error status in C_STAT, 5–42  
DC_STAT Dcache status register, 5–31  
at power-on reset state, 7–16  
at power-on reset state, 7–15  
CMOV instruction, special cases of, 2–26  
COLD reset machine state, 7–17  
Commands  
21264/EV68A to system, 4–19  
system to 21264/EV68A, 4–26  
when to NXM, 4–38  
21264/EV68A Hardware Reference Manual  
Index–3  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Dcache  
described, 2–12  
DTB_ASN0 address space number register 0  
at power-on reset state, 7–16  
DTB_ASN0 address space number registers 0, 5–28  
DTB_ASN1 address space number register 1, 5–28  
at power-on reset state, 7–16  
duplicate tags with, 4–13  
error case summary for, 8–10  
fill from Bcache error, 8–6  
fill from memory errors, 8–8  
initialized by BiST, 7–12  
pipelined, 2–16  
DTB_IA invalidate-all process register, 5–27  
at power-on reset state, 7–15  
DTB_IAP invalidate-all (ASM=0) process register,  
store second error, 8–4  
tag parity errors, 8–2  
victim extracts, 8–4  
DTB_IS0 invalidate single (array 0) register, 5–27  
Dcache data single-bit correctable ECC errors, 8–3  
Dcache tag, initialized by BiST, 7–12  
at power-on reset state, 7–16  
DTB_IS1 invalidate single (array 1) register, 5–27  
at power-on reset state, 7–16  
DTB_PTE0 array write 0 register  
DCOK_H signal pin, 3–4  
power-on reset flow, 7–1  
DCVIC_THRESHOLD Cbox CSR, defined, 5–34  
DFAULT fault, 6–13  
MTPR to, D–12  
DTB_PTE0 array write register 0, 5–26  
DTB_PTE1 array write 1 register, 5–26  
Differential 21264/EV68A clocks, 7–19  
Differential reference clocks, 7–19  
Dirty cache block state, 4–10  
Dirty/Shared cache block state, 4–10  
Do not care convention, xxi  
MTPR to, D–12  
DTB_TAG0 array write 0 register, 5–25  
MTPR to, D–12  
Double-bit fill errors, 8–9  
DTB_TAG1 array write 1 register, 5–25  
DOWN1 reset machine state, 7–18  
DOWN2 reset machine state, 7–19  
DOWN3 reset machine state, 7–19  
MTPR to, D–12  
DTBM_DOUBLE_3 fault, 6–13  
DTBM_DOUBLE_4 fault, 6–13  
DTBM_SINGLE fault, 6–13  
Dstream translation buffer, 2–13  
Dual-data rate SSRAM pin assignments, E–3  
DUP_TAG_ENABLE Cbox CSR, defined, 5–34  
Duplicate Dcache tag array, 2–11  
Duplicate Dcache, initialized by BiST, 7–12  
Duplicate tag stores, Bcache, 4–7  
DSTREAM_BC_DBL error status in C_STAT,  
DSTREAM_BC_ERR error status in C_STAT,  
DSTREAM_DC_ERR error status in C_STAT,  
DSTREAM_MEM_DBL error status in C_STAT,  
E
DSTREAM_MEM_ERR error status in C_STAT,  
Ebox  
cycle counter control register CC_CTL, 5–3  
described, 2–8  
DTB entries, writing multiple in same PAL flow,  
slotting, 2–18  
DTB fill, 6–14  
DTB, pipeline abort delay with, 2–16  
subclusters, 2–18  
virtual address format register VA_FORM, 5–5  
virtual address register, 5–4  
DTB_ALTMODE alternate processor mode register,  
ECB instruction, external interface reference, 4–5  
Index–4  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
ECC  
FetchBlkSpec, 21264/EV68A command, 4–22,  
64-bit data and check bit code, 8–2  
Dcache data single-bit correctable errors, 8–3  
for system data bus, 8–2  
Field notation convention, xxi  
Floating-point arithmetic trap, pipeline abort delay  
with, 2–16  
Floating-point control register, 2–35  
PALcode emulation of, 6–11  
errors, 8–7  
store instructions, 8–4  
ENABLE_EVICT Cbox CSR, 4–23, 5–39  
ENABLE_PROBE_CHECK Cbox CSR, 8–2  
defined, 5–35  
Floating-point instructions  
ENABLE_STC_COMMAND Cbox CSR, defined,  
IEEE, A–9  
Energy star certification, 7–9  
Error case summary, 8–10  
Floating-point issue queue, 2–7  
Forwarding clock pin groupings, E–1  
FrameClk_x signal pins, 3–5, 4–30  
Error detection mechanisms, 8–1  
EV6Clk_x signal pins, 3–4  
Evict, 21264/EV68A command, 4–13, 4–22, 4–39  
EVICT_ENABLE Cbox CSR, 7–13  
EXC_ADDR exception address register, 5–8  
G
after fault reset, 7–8  
GCLK, 7–19  
Global predictor, 2–4  
at power-on reset state, 7–15  
EXC_SUM exception summary register, 5–13  
at power-on reset state, 7–15  
H
Exception and interrupt logic, 2–8  
Exception condition summary, A–15  
External convention, xxi  
External interface initialization, 7–14  
EXTEST public instruction, B–1  
Heat sink center temperature, 10–1  
Heat sink specifications, 10–3  
HW_INT_CLR hardware interrupt clear register,  
updating, D–18  
HW_LD PALcode instruction, 6–3, A–9, D–18  
HW_MFPR PALcode instruction, 6–6, A–9  
HW_MTPR PALcode instruction, 6–6, A–9  
HW_REI PALcode instruction, A–9  
HW_RET PALcode instruction, 6–5  
HW_ST PALcode instruction, 6–4, A–9  
F
F31  
load instructions with, 2–23  
retire instructions with, 2–22  
Fast data disable mode, 4–33  
Fast data mode, 4–30, 4–31  
FAST_MODE_DISABLE Cbox CSR, 4–30  
defined, 5–34  
I
Fault reset flow, 7–8  
I/O address space  
Fault reset sequence of operations, 7–9  
FAULT_RESET reset machine state, 7–18  
Fbox  
instruction data merging, 2–29  
load instruction data merging, 2–28  
load instructions with, 2–27  
store instructions with, 2–29  
described, 2–10  
I/O write buffer, 2–11  
defined, 2–32  
FEN fault, 6–13  
FetchBlk, 21264/EV68A command, 4–22, 4–39  
system probes, with, 4–41  
21264/EV68A Hardware Reference Manual  
Index–5  
Download from Www.Somanuals.com. All Manuals Search And Download.  
I_CTL Ibox control register, 5–15  
IC_FLUSH Icache flush register  
at power-on reset state, 7–15  
IC_FLUSH_ASM Icache flush ASM register, 5–21  
after fault reset, 7–8  
after warm reset, 7–11  
at power-on reset state, 7–15  
PALshadow registers, 6–11  
through sleep mode, 7–10  
VA_48 field update, D–17  
Icache  
data errors, 8–2  
fill from Bcache error, 8–5  
fill from memory error, 8–7  
flush register IC_FLUSH, 5–21  
initialized by BiST, 7–12  
tag, initialized by BiST, 7–12  
I_DA pin type, 3–3, 9–2  
values for, 9–3  
I_DA_CLK pin type, 3–3, 9–2  
values for, 9–3  
I_DC_POWER pin type, 9–2  
I_DC_REF pin type, 3–3, 9–2  
values for, 9–3  
I_STAT Ibox status register, 5–18  
at power-on reset state, 7–15  
IACV fault, 6–13  
IEEE 1149.1  
test port reset, 7–16  
test port, operation of, 11–3  
IEEE floating-point conformance, A–14  
IEEE floating-point instruction opcodes, A–9  
IER_CM interrupt enable and current processor mode  
register, 5–9  
IMPLVER instruction values, 2–38  
Independent floating-point function codes, A–11  
INIT_MODE Cbox CSR, 5–39, 7–12  
Initialization mode processing, 7–12  
Ibox  
branch predictor, 2–3  
clear virtual-to-physical map register  
CLR_MAP, 5–21  
exception address register EXC_ADDR, 5–8  
exception and interrupt logic, 2–8  
exception summary register EXC_SUM, 5–13  
floating-point issue queue, 2–7  
Ibox process context register PCTX, 5–21  
Ibox status register I_STAT, 5–18  
Instruction fetch logic, 2–6  
instruction fetch logic, 2–6  
instruction virtual address format register  
IVA_FORM, 5–9  
Instruction fetch, pipelined, 2–14  
Instruction issue rules, 2–16  
Instruction latencies, pipelined, 2–20  
Instruction ordering, 2–30  
instruction-stream translation buffer, 2–5  
integer issue queue, 2–6  
internal processor registers, 5–1  
interrupt enable and current processor mode  
register IER_CM, 5–9  
interrupt summary register ISUM, 5–11  
ITB invalidate single register ITB_IS, 5–7  
ITB invalidate-all ASM (ASM=0) register  
ITB_IAP, 5–7  
Instruction retire latencies, minimum, 2–21  
Instruction retire rules  
ITB invalidate-all register ITB_IA, 5–7  
ITB PTE array write register ITB_PTE, 5–6  
ITB tag array write register ITB_TAG, 5–6  
PAL base register PAL_BASE, 5–15  
performance counter control register  
PCTR_CTL, 5–23  
Instruction slot, pipelined, 2–14  
Instruction-stream translation buffer, 2–5  
Int_Add_BcClk internal forwarded clock, 4–44,  
ProfileMe register PMPC, 5–8  
register rename maps, 2–6  
retire logic, 2–8  
subsections in, 2–2  
INT_FWD_CLK clock queue, 4–30  
Integer arithmetic trap, pipeline abort delay with,  
virtual program counter logic, 2–2  
Index–6  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
ITB_TAG array write register, 5–6  
at power-on reset state, 7–14  
IVA_FORM instruction virtual address format  
register, 5–9  
Integer issue queue, 2–6  
pipelined, 2–15  
at power-on reset state, 7–15  
Internal processor registers, 5–1  
accessing, 6–7  
J
implicitly written, 6–9  
ordering access, 6–9  
paired fetch order, 6–9  
scoreboard bits for, 6–8  
JITTER_CMD Cbox CSR, defined, 5–38  
JMP misprediction, in PALcode, D–15  
JSR misprediction  
INTERRUPT interrupt, 6–14  
INVAL_TO_DIRTY Cbox CSR, 4–23  
programming, 4–23  
in PALcode, D–15  
pipeline abort delay with, 2–16  
JSR_COR misprediction, in PALcode, D–15  
Junction temperature, 9–1  
INVAL_TO_DIRTY_ENABLE Cbox CSR, 5–39,  
InvalToDirty, 21264/EV68A command, 4–12,  
L
InvalToDirtyVic, 21264/EV68A command, 4–22,  
Late-write non-bursting SSRAM pin assignments,  
LDBU instruction, normal prefetch with, 2–23  
LDF instruction, normal prefetch with, 2–23  
LDG instruction, normal prefetch with, 2–23  
LDQ instruction, prefetch with evict next, 2–24  
LDS instruction, prefetch with modify intent, 2–23  
LDT instruction, normal prefetch with, 2–23  
LDWU instruction, normal prefetch with, 2–23  
LDx_L instructions  
IRQ_H signal pins, 3–5  
Istream, 2–5  
Istream memory references  
translation to external references, 4–5  
ISTREAM_BC_DBL error status in C_STAT, 5–42  
in-order processing for, 4–15  
locking mechanism for, 4–14  
ISTREAM_BC_ERR error status in C_STAT, 5–42  
Load hit speculation, 2–24  
Load instructions  
ISTREAM_MEM_DBL error status in C_STAT,  
ECC with, 8–3  
ISTREAM_MEM_ERR error status in C_STAT,  
Mbox order traps, 2–31  
ISUM interrupt summary register, 5–11  
at power-on reset state, 7–15  
ITB fill, 6–16  
ITB miss, pipeline abort delay with, 2–16  
memory reference ordering, 2–30  
translation to external interface, 4–5  
Load queue, described, 2–13  
Load-load order trap, 2–31  
Local predictor, 2–4  
Lock mechanism, 4–14  
Logic symbol, the 21264/EV68A, 3–2  
ITB_IA invalidate-all register, 5–7  
at power-on reset state, 7–15  
ITB_IAP invalidate-all (ASM=0) register, 5–7  
at power-on reset state, 7–15  
ITB_IS invalidate single register, 5–7  
at power-on reset state, 7–15  
ITB_MISS fault, 6–14  
ITB_PTE array write register, 5–6  
at power-on reset state, 7–14  
M
M_CTL Mbox control register, 5–29  
at power-on reset state, 7–16  
MB instruction processing, 2–33  
21264/EV68A Hardware Reference Manual  
Index–7  
Download from Www.Somanuals.com. All Manuals Search And Download.  
MB, 21264/EV68A command, 4–13, 4–21  
MB_CNT Cbox CSR, operation, 2–32  
MBDone, SysDc command, 4–13  
Mbox  
MM_STAT memory management status register,  
MT_FPCR instruction, 6–12  
MT_FPCR synchronous trap, 6–14  
Dcache control register DC_CTL, 5–30  
described, 2–12  
N
DTB address space number registers 0 and 1  
DTB_ASNx, 5–28  
NoConnect pin type, 3–3  
Nonexistent memory  
processing, 4–38  
NOP, 21264/EV68A command, 4–21  
DTB alternate processor mode register  
DTB_ALTMODE, 5–26  
DTB invalidate-all (ASM=0) process register  
DTB_IAP, 5–27  
Note convention, xxi  
Numbering convention, xxi  
NZNOP, 21264/EV68A command, 4–21  
DTB_ISx, 5–27  
DTB PTE array write registers 0 and 1  
DTB_PTEx, 5–26  
DTB tag array write registers 0 and 1  
DTB_TAGx, 5–25  
O
load queue, 2–13  
O_OD pin type, 3–3, 9–2  
values for, 9–4  
memory management status register  
O_OD_TP pin type, 3–3, 9–2  
values for, 9–4  
miss address file, 2–13  
O_PP pin type, 3–3, 9–2  
values for, 9–5  
order traps, 2–31  
pipeline abort delays, 2–16  
store queue, 2–13  
O_PP_CLK pin type, 3–3, 9–2  
values for, 9–5  
MBOX_BC_PRB_STALL Cbox CSR, defined,  
OPCDEC fault, 6–13  
Opcodes  
MCHK interrupt, 6–14  
Mechanical specifications, 3–17  
Memory  
IEEE floating-point, A–9  
independent floating-point, A–11  
reserved for Compaq, A–8  
reserved for PALcode, A–9  
summary of, A–12  
error case summary for, 8–11  
filling Dcache errors, 8–8  
filling Icache errors, 8–7  
VAX floating-point, A–11  
Operating temperature, 10–1  
Memory address space  
load instructions with, 2–27  
merging rules, 2–30  
store instructions with, 2–28  
Memory barrier instructions  
translation to external interface, 4–5  
Memory barriers, 2–32  
P
Packaging, 3–18  
Paired instruction fetch order, 6–9  
PAL_BASE register, 5–15  
MF_FPCR instruction, 6–12  
Microarchitecture  
summarized, 2–1  
MiscVref signal pin, 3–5  
after fault reset, 7–8  
after warm reset, 7–11  
at power-on reset state, 7–15  
through sleep mode, 7–10  
Miss address file, 2–13  
I/O address space loads, 2–27  
memory address space loads, 2–27  
memory address space stores, 2–28  
Index–8  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
PALcode  
Ports  
IEEE 1149.1, 11–3  
described, 6–1  
serial terminal, 11–2  
SROM load, 11–2  
exception entry points, 6–13  
guidelines for, D–1  
Power  
maximum, 9–1  
HW_LD instruction, 6–3  
HW_MFPR instruction, 6–6  
HW_MTPR instruction, 6–6  
HW_RET instruction, 6–5  
HW_ST instruction, 6–4  
required function codes, 6–3  
reserved opcodes for, 6–3  
restrictions for, D–1  
sleep defined, 9–3  
Power supply sequencing, 9–5  
Power-on  
reset flow, 7–1  
timing sequence, 7–3  
PALmode environment, 6–2  
PALshadow registers, 6–11  
PRB_TAG_ONLY Cbox CSR, 4–28  
defined, 5–34  
PCTR_CTL performance counter control counter  
Privileged architecture library code  
register  
See PALcode  
updating, D–17  
Probe commands, system, 4–26, 4–40  
Probe queue, 2–11  
PROBE_BC_ERR error status in C_STAT, 5–42  
ProbeResponse, 21264/EV68A command, 4–21,  
PCTR_CTL performance counter control register,  
updating, D–18  
PCTX Ibox process context register, 5–21  
after fault reset, 7–8  
ProfileMe mode, 6–20  
after warm reset, 7–11  
at power-on reset state, 7–15  
through sleep mode, 7–10  
Physical address considerations, 4–4  
Pipeline  
R
R31  
abort delay, 2–16  
load instructions with, 2–23  
retire instructions with, 2–22  
speculative loads to, 2–25  
Dcache access, 2–16  
Ebox execution, 2–16  
Ebox slotting, 2–18  
RAMP1 reset machine state, 7–17  
RAMP2 reset machine state, 7–18  
Ranges and extents convention, xxi  
Fbox execution, 2–16  
instruction fetch, 2–14  
instruction group definitions, 2–17  
instruction issue rules, 2–16  
instruction latencies, 2–20  
instruction retire rules, 2–21  
instruction slot, 2–14  
issue queue, 2–15  
RdBlk, 21264/EV68A command, 4–39  
RdBlkI, 21264/EV68A command, 4–39  
RdBlkMod, 21264/EV68A command, 4–39  
RdBlkModSpec, 21264/EV68A command, 4–39  
RdBlkModVic, 21264/EV68A command, 4–39  
RdBlkSpec, 21264/EV68A command, 4–39  
RdBlkSpecI, 21264/EV68A command, 4–39  
RdBlkVic, 21264/EV68A command, 4–39  
RdBlkVicI, 21264/EV68A command, 4–39  
RdBytes, 21264/EV68A command, 4–39  
RdLWs, 21264/EV68A command, 4–39  
RdQWs, 21264/EV68A command, 4–39  
organization, 2–13  
register maps, 2–15  
register reads, 2–16  
PLL  
description, 7–19  
ramp up, 7–6  
PLL_IDD, values for, 9–3  
PLL_VDD signal pin, 3–5  
PLL_VDD, values for, 9–3  
PllBypass_H signal pin, 3–5  
PMPC ProfileMe register, 5–8  
RDVIC_ACK_INHIBIT Cbox CSR, 4–25, 4–26  
defined, 5–34  
21264/EV68A Hardware Reference Manual  
Index–9  
Download from Www.Somanuals.com. All Manuals Search And Download.  
ReadBlk, 21264/EV68A command, 4–21  
system probes, with, 4–41  
ReadBlkI, 21264/EV68A command, 4–22  
Security holes  
with UNPREDICTABLE results, xxii  
Serial terminal port, 11–2  
ReadBlkMod, 21264/EV68A command, 4–22  
system probes, with, 4–41  
ReadBlkModSpec, 21264/EV68A command, 4–22  
SET_DIRTY_ENABLE Cbox CSR, 4–23, 5–39,  
programming, 4–24  
SharedToDirty, 21264/EV68A command, 4–22,  
ReadBlkModVic, 21264/EV68A command, 4–22  
ReadBlkSpec, 21264/EV68A command, 4–22  
ReadBlkSpecI, 21264/EV68A command, 4–22  
ReadBlkVic, 21264/EV68A command, 4–22  
ReadBlkVicI, 21264/EV68A command, 4–22  
ReadBytes, 21264/EV68A command, 4–22  
ReadData, SysDc command, 4–10, 4–11, 4–12  
ReadDataDirty, SysDc command, 4–10, 4–11, 4–12  
system probes, with, 4–41  
Signal name convention, xxi  
Signal pin types, defined, 3–3  
Signal pins  
test, 11–1  
Single-bit error scribbing, D–19  
Single-bit errors in hardware, correcting, 8–2  
SIRR software interrupt request register, 5–10  
at power-on reset state, 7–15  
Sleep mode  
ReadDataError, SysDc command, 4–10, 4–11,  
ReadDataShared, SysDc command, 4–10, 4–11,  
flow, 7–9  
ReadDataShared/Dirty, SysDc command, 4–10,  
SLEEP mode register, 5–21  
at power-on reset state, 7–15  
Spare pin type, 3–3  
SPEC_READ_ENABLE Cbox CSR, 4–23  
defined, 5–35  
ReadLWs, 21264/EV68A command, 4–22  
ReadQWs, 21264/EV68A command, 4–22  
Register access abbreviations, xix  
Register figure conventions, xxi  
Register maps, pipelined, 2–15  
Register rename maps, 2–6  
SROM content map, 11–6  
SROM initialization, 11–5  
SROM interface, in microarchitecture, 2–13  
SROM line, Icache bit fields in a, 11–6  
SROM load, 7–6  
Replay traps, 2–31  
RESET interrupt, 6–14  
Reset state machine  
major operations of, 7–1  
Reset_L signal pin, 3–5  
power-on reset flow, 7–1  
RET misprediction, in PALcode, D–15  
Retire logic, 2–8, D–1  
RO,n convention, xix  
RUN reset machine state, 7–18  
RW,n convention, xx  
SROM load operation, 11–2  
SromClk_H signal pin, 3–5, 11–2  
SromData_H signal pin, 3–5, 11–2  
SromOE_L signal pin, 3–5, 11–2  
SSRAMs  
dual-data rate pin assignments, E–3  
late-write non-bursting pin assignments, E–2  
STC_ENABLE Cbox CSR, 4–24  
STCChangeToDirty, 21264/EV68A command,  
Storage temperature, 9–1  
S
SAMPLE public instruction, B–1  
Scrubbing single-bit errors, D–19  
I_CTL Ibox control register  
updating I_CTL, D–18  
Index–10  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Store instructions  
SysAddInClk_L signal pin, 3–5  
SysAddOut_L signal pins, 3–5  
SysAddOutClk_L signal pin, 3–5  
SYSBUS_ACK_LIMIT Cbox CSR, 4–25  
defined, 5–34  
Dcache ECC errors with, 8–4  
I/O address space, 2–29  
I/O reference ordering, 2–31  
Mbox order traps, 2–31  
memory address space, 2–28  
memory reference ordering, 2–30  
translation to external interface, 4–5  
SYSBUS_FORMAT Cbox CSR, 4–21  
Store queue, 2–13  
Store-load order trap, 2–31  
STx_C instructions  
SYSBUS_MB_ENABLE Cbox CSR, 4–23  
defined, 5–34  
operation, 2–32  
SYSBUS_VIC_LIMIT Cbox CSR, 4–26  
in-order processing for, 4–15  
defined, 5–34  
locking mechanism for, 4–14  
SysCheck_L signal pin, 3–5  
SYSCLK, 4–31  
SysData_L signal pin, 3–5  
SysDataInClk_H signal pin, 3–5  
Summary bit, in FPCR, 2–36  
SysDataInValid_L signal pin, 3–5  
rules for, 4–34  
SysDataOutClk_L signal pin, 3–5  
SysDataOutValid_L signal pin, 3–5  
rules for, 4–35  
SYS_BPHASE_LD_VECTOR Cbox CSR, 4–18  
defined, 5–38  
SYS_BUS_FORMAT Cbox CSR, defined, 5–34  
SysDc commands, 4–11  
system probes, with, 4–42  
SysDc field, system to 21264/EV68A commands,  
SYS_BUS_SIZE Cbox CSR, 4–21  
defined, 5–34  
SYS_CLK_DELAY Cbox CSR, defined, 5–36  
SYS_CLK_LD_VECTOR Cbox CSR, 4–18  
defined, 5–38  
SYS_CLK_RATIO Cbox CSR, defined, 5–34  
SYSDC_DELAY Cbox CSR, 4–32  
defined, 5–38  
SysFillValid_L signal pin, 3–5  
SYS_CLKFWD_ENABLE Cbox CSR, defined,  
rules for, 4–35  
System clock ratio configuration, 7–4  
SYS_CPU_CLK_DELAY Cbox CSR  
defined, 5–38  
System initialization, 7–7  
System interface clocks, programming, 4–18  
System port, 4–16  
SYS_DDM_FALL_EN Cbox CSR, 4–18  
defined, 5–36  
SYS_DDM_RD_FALL_EN Cbox CSR, 4–18  
SysVref signal pin, 3–6  
SYS_DDM_RD_RISE_EN Cbox CSR, 4–19  
SYS_DDM_RISE_EN Cbox CSR, 4–18  
defined, 5–36  
T
Tag parity errors, 8–2  
TB fill flow, 2–34, 6–14  
Tck_H signal pin, 3–6  
Tdi_H signal pin, 3–6  
Tdo_H signal pin, 3–6  
Temperatures  
SYS_DDMF_ENABLE Cbox CSR, 4–19  
defined, 5–36  
SYS_DDMR_ENABLE Cbox CSR, 4–19  
defined, 5–36  
SYS_FDBK_EN Cbox CSR, 4–18  
defined, 5–38  
SYS_FRAME_LD_VECTOR Cbox CSR, 4–19,  
operating, 10–1  
Terminology, xix  
SYS_RCV_MUX_CNT_PRESET Cbox CSR, 4–31  
TestStat_H signal pin, 3–6  
defined, 5–36  
purpose for, 11–4  
SYS_RCV_MUX_PRESET Cbox CSR, 4–33  
SysAddIn_L signal pins, 3–5  
21264/EV68A Hardware Reference Manual  
Index–11  
Download from Www.Somanuals.com. All Manuals Search And Download.  
Thermal design characteristics, 10–6  
Tms_H signal pin, 3–6  
Traps  
WAIT_ClkFwdRst0 reset machine state, 7–18  
WAIT_ClkFwdRst1 reset machine state, 7–18  
WAIT_INTERRUPT reset machine state, 7–19  
WAIT_NOMINAL reset machine state, 7–17  
WAIT_RESET reset machine state, 7–18  
WAIT_SETTLE reset machine state, 7–17  
WAKEUP interrupt, 6–14  
load-load order, 2–31  
replay, 2–31  
Trst_L signal pin, 3–6  
WAR, eliminating, 2–6  
Warm reset flow, 7–11  
U
UNALIGN fault, 6–13  
Unaligned convention, xx  
UNDEFINED, xxii  
WAW  
eliminating, 2–6  
WMB instruction processing, 2–33  
WO,n convention, xx  
Wrap order  
UNPREDICTABLE, xxii  
double-pumped, 4–38  
V
interleaved, 4–37  
VA virtual address register, 5–4  
WrBytes, 21264/EV68A command, 4–22, 4–39  
Write hint instructions, translation to external  
interface, 4–5  
at power-on reset state, 7–15  
VA_CTL virtual address control register, 5–4  
at power-on reset state, 7–15  
WRITE_MANY chain, 5–38  
updating VA_48 field, D–18  
example, 5–39  
VA_FORM virtual address format register, 5–5  
at power-on reset state, 7–15  
WRITE_MANY register  
after fault reset, 7–8  
VAX floating-point instruction opcodes, A–11  
VBIAS defined, 9–2  
VDBFlushRequest, 21264/EV68A command, 4–21  
VDD signal pin list, 3–16  
after warm reset, 7–11  
through sleep mode, 7–10  
WRITE_ONCE chain description, 5–33  
WrLWs, 21264/EV68A command, 4–22, 4–39  
WrQWs, 21264/EV68A command, 4–22, 4–39  
VDD, values for, 9–3  
WrVictimBlk, 21264/EV68A command, 4–22,  
Vdiff defined, 9–2  
Victim address file  
system probes, with, 4–41  
described, 2–11  
Victim address file, described, 2–11  
X
Victim data buffer (VDB), 4–8  
Virtual address support, 1–2  
Virtual program counter logic, 2–2  
VREF, values for, 9–3  
X convention, xxi  
VSS signal pin list, 3–16  
W
WAIT_BiSI reset machine state, 7–18  
WAIT_BiST reset machine state, 7–18  
Index–12  
21264/EV68A Hardware Reference Manual  
Download from Www.Somanuals.com. All Manuals Search And Download.  

Craftsman Blower 358748260 User Manual
Craftsman Lawn Mower 917387258 User Manual
Crate Amplifiers Stereo Amplifier BX 4115 User Manual
Delfield Refrigerator SAHRI1 S User Manual
Dyson Vacuum Cleaner DC17 User Manual
Eaton Electrical Network Card 66103 User Manual
Exido Fan 271 005 User Manual
Extron electronic TV Cables MHRHF 5 User Manual
Faber Ventilation Hood Diamante User Manual
Global Machinery Company Sander SA146 User Manual