AMD R700 Series Processors

Upload: moira

Post on 12-Jan-2016

TRANSCRIPT

Page 1: AMD R700 Series Processors

AMD R700 Series Processors

Page 2: AMD R700 Series Processors

AMD R700 Series History

• The AMD 700 chipset series – also known as the AMD 7-Series Chipsets

• A set of chipsets designed by ATI for AMD Phenom processors

• GA (general availability): late 2007 to end of 2008

Page 3: AMD R700 Series Processors

CPU versus GPU

CPU

• Typically uses basic load instructions for data loads

• Processes instructions one at a time

• Located on the motherboard

GPU

• Typically uses texture-fetch and vertex-fetch instructions for data loads

• Processes hundreds of instructions simultaneously

• Typically located on an I/O card attached to the bus

Page 4: AMD R700 Series Processors

AMD R700 Series Processor

– Design philosophy/rationale of the AMD R700, as related to the good design policies studied in class

Page 5: AMD R700 Series Processors

AMD R700 Instructions Control-flow

A program consists of two sections: control flow and clauses.

• Control-flow instructions can initiate execution of the following:
– ALU (by referring to an appropriate clause)
– Texture-fetch
– Vertex-fetch

• A clause is a homogeneous group of instructions of one of these types:
– ALU
– Texture-fetch
– Vertex-fetch
– Local data share
– Memory read
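The two-section program structure on this slide can be modeled roughly as follows. This is only an illustrative sketch in Python, not R700 microcode: the class names, the `run` helper, and the instruction strings are all hypothetical, and the model assumes each control-flow instruction simply refers to one clause by index.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class ClauseType(Enum):
    """Clause types named on the slide."""
    ALU = "ALU"
    TEXTURE_FETCH = "texture-fetch"
    VERTEX_FETCH = "vertex-fetch"
    LOCAL_DATA_SHARE = "local data share"
    MEMORY_READ = "memory read"


@dataclass
class Clause:
    """A homogeneous group of instructions: all entries share one type."""
    kind: ClauseType
    instructions: List[str] = field(default_factory=list)


@dataclass
class ControlFlowInstruction:
    """Hypothetical control-flow entry that starts execution of one clause."""
    clause_index: int


def run(control_flow: List[ControlFlowInstruction],
        clauses: List[Clause]) -> List[Tuple[ClauseType, str]]:
    """Walk the control-flow section and list clause instructions in issue order."""
    trace = []
    for cf in control_flow:
        clause = clauses[cf.clause_index]
        for inst in clause.instructions:
            trace.append((clause.kind, inst))
    return trace


# Made-up instruction strings, for illustration only.
clauses = [
    Clause(ClauseType.VERTEX_FETCH, ["fetch v0"]),
    Clause(ClauseType.ALU, ["add r1, r0, r0", "mul r2, r1, r1"]),
]
control_flow = [ControlFlowInstruction(0), ControlFlowInstruction(1)]
trace = run(control_flow, clauses)
```

Keeping the clause type on the clause rather than on each instruction mirrors the "homogeneous group" wording: a clause cannot mix instruction types by construction.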

Page 6: AMD R700 Series Processors

AMD R700 Registers

• 128 general-purpose registers
– 128 bits wide
– Organized as four 32-bit values

• 512 constant registers
– 128 bits wide
– Organized as four 32-bit values

• Address register
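As a quick check of the sizes these figures imply, the arithmetic can be written out in Python. The constant names are made up for this sketch; only the counts and widths come from the slide.

```python
# Counts and widths as stated on the slide.
GPR_COUNT = 128        # general-purpose registers
CONST_COUNT = 512      # constant registers
REGISTER_BITS = 128    # width of each register
ELEMENT_BITS = 32      # width of each value inside a register

# Each register holds this many 32-bit values.
elements_per_register = REGISTER_BITS // ELEMENT_BITS

# Total storage implied by the counts above, in bytes.
gpr_bytes = GPR_COUNT * REGISTER_BITS // 8
const_bytes = CONST_COUNT * REGISTER_BITS // 8

print(elements_per_register, gpr_bytes, const_bytes)
```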

Page 7: AMD R700 Series Processors

AMD R700 Registers

• Loop index
– Initialized by software
– Incremented by hardware on each iteration of a loop

• Integer constant register
– 96 bits wide (3x32)
– GPU has read access
– Main CPU has write access
– Specified in the CF_CONST field of the CF_DWORD1 microcode format for the current LOOP* instruction

Page 8: AMD R700 Series Processors

AMD R700 Addressing

Addressing modes

• Absolute
• Loop-index-relative
• Relative
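The slide names the three modes without defining them. The Python sketch below shows one plausible reading, under the assumption that loop-index-relative adds the loop index to a base register number and that relative addressing adds the address register. The function name, its parameters, and that interpretation are assumptions, not taken from the slides.

```python
from enum import Enum


class AddressingMode(Enum):
    ABSOLUTE = "absolute"
    LOOP_INDEX_RELATIVE = "loop-index-relative"
    RELATIVE = "relative"


def effective_index(mode: AddressingMode, base: int,
                    loop_index: int = 0, address_register: int = 0) -> int:
    """Return the register index selected under each mode (assumed semantics)."""
    if mode is AddressingMode.ABSOLUTE:
        # Assumed: the encoded index is used unchanged.
        return base
    if mode is AddressingMode.LOOP_INDEX_RELATIVE:
        # Assumed: offset by the hardware-incremented loop index.
        return base + loop_index
    if mode is AddressingMode.RELATIVE:
        # Assumed: offset by the address register.
        return base + address_register
    raise ValueError(f"unknown addressing mode: {mode}")


absolute = effective_index(AddressingMode.ABSOLUTE, 5)
looped = effective_index(AddressingMode.LOOP_INDEX_RELATIVE, 5, loop_index=3)
relative = effective_index(AddressingMode.RELATIVE, 5, address_register=2)
```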

Page 9: AMD R700 Series Processors

AMD R700 Operands

• Three source operands and one destination operand, each with an absolute addressing mode that lets it be accessed relative to address zero.

• Supported data types:
– Float
– Double
– Half
– Signed/unsigned integer

Page 10: AMD R700 Series Processors

AMD R700 Operation Repertoire

Arithmetic operations on built-in integer, floating-point scalar, and vector data types:

• Add
• Subtract
• Multiply
• Divide
• Basic Linear Algebra Subprograms (BLAS)
• Linear Algebra Package (LAPACK)
• Fast Fourier Transform
• Transcendental math functions
• Random number generator routines
• Stream-processing backend for load balancing of computations between the CPU and the stream processor

Page 11: AMD R700 Series Processors

AMD R700 Features

Instructions operate on 32-bit or 64-bit IEEE floating-point values and signed/unsigned integers.

• Instruction set
– Control-flow
– ALU clause
– Vertex-fetch
– Texture-fetch
– Memory read
– Data-share read/write

Page 12: AMD R700 Series Processors

AMD R700 Instructions Memory Read

• Software-initiated with the VTX or VTX_TC instructions

• Fetch data from one of three types of buffers:
– Scratch
– Reduction
– Scatter (general read/write)

• Reads from these buffers can be intermixed within a clause of as many as 16 memory read instructions. Memory read instructions cannot be in the same clause as texture-fetch or vertex-fetch instructions, or with local data share instructions.
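The clause constraints above can be expressed as a small validity check. This is a sketch only: the tuple representation, the label strings, and the function name are hypothetical. The 16-instruction limit, the three buffer types, and the no-mixing rule are the ones stated on the slide.

```python
MAX_MEMORY_READS_PER_CLAUSE = 16
MEMORY_READ_BUFFERS = {"scratch", "reduction", "scatter"}


def validate_memory_read_clause(instructions):
    """Check a clause given as (kind, buffer) tuples against the slide's rules.

    kind is a hypothetical label such as "memory_read" or "texture_fetch".
    Raises ValueError on the first violated rule, otherwise returns True.
    """
    if len(instructions) > MAX_MEMORY_READS_PER_CLAUSE:
        raise ValueError("a memory read clause holds at most 16 instructions")
    for kind, buffer in instructions:
        if kind != "memory_read":
            # Texture-fetch, vertex-fetch, and LDS instructions are not allowed here.
            raise ValueError(f"{kind} cannot share a clause with memory reads")
        if buffer not in MEMORY_READ_BUFFERS:
            raise ValueError(f"unknown buffer type: {buffer}")
    return True


# Reads from different buffer types may be intermixed in one clause.
mixed_ok = validate_memory_read_clause(
    [("memory_read", "scratch"), ("memory_read", "scatter")]
)
```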

Page 13: AMD R700 Series Processors

AMD R700 Instructions Data-Share Read/Write

• Software-initiated with the TEX control-flow instruction

• Within the clause, LDS (local data share) uses common instruction encodings:
– MEM_DSR: reads
– MEM_DSW: writes

An LDS clause contains instructions that are issued sequentially. When a write instruction is followed by a read, all of the write data is posted before the read, so data sharing within a clause can use a location repeatedly to exchange data.
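The write-then-read ordering described here can be illustrated with a toy model in Python. It only mimics the sequential-issue guarantee; the tuple format, the `run_lds_clause` name, and the four-cell memory are invented for the example and say nothing about the real hardware interface.

```python
def run_lds_clause(instructions, size=4):
    """Execute a toy LDS clause strictly in order and return the values read.

    Each instruction is ("MEM_DSW", address, value) for a write
    or ("MEM_DSR", address) for a read.
    """
    lds = [0] * size          # toy local data share, zero-initialized
    reads = []
    for inst in instructions:
        op, address = inst[0], inst[1]
        if op == "MEM_DSW":
            lds[address] = inst[2]      # write is fully posted before the next instruction
        elif op == "MEM_DSR":
            reads.append(lds[address])  # read observes every earlier write
        else:
            raise ValueError(f"not an LDS instruction: {op}")
    return reads


# The same location is reused twice to exchange two different values.
reads = run_lds_clause([
    ("MEM_DSW", 0, 7),
    ("MEM_DSR", 0),
    ("MEM_DSW", 0, 9),
    ("MEM_DSR", 0),
])
```

Because each write completes before the following read, the two reads return the first and then the second written value rather than a stale one.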

Page 14: AMD R700 Series Processors

AMD R700 Instructions Vertex-fetch

• Software initiated with the VTX or VTX_TC instruction.

• Fetch vertices from the vertex buffer based on a GPR address.

• At most eight instructions long

[Figure label: relative byte offset of the word in memory]

Page 15: AMD R700 Series Processors

AMD R700 Instructions Texture-fetch

• Software-initiated with the TEX instruction

• Consists of instructions that look up texture elements, known as texels, based on a GPR address or constant-fetch operations

• At most eight instructions long

[Figure label: relative byte offset of the word in memory]

Page 16: AMD R700 Series Processors

AMD R700 ALU Instructions

ALU instructions are organized in pairs of 32-bit doublewords.

• OP2 instruction: the ALU_INST field uses a seven-bit opcode, with the high three bits set to 000b.

• OP3 instruction: at least one of the three high bits of the ALU_INST field has a nonzero value.

Choice of 2 or 3 source operands

Byte offset of the double words
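The OP2/OP3 distinction comes down to a test on the three high bits of the ALU_INST field. The Python sketch below assumes the field is 10 bits wide (three high bits above a seven-bit opcode); that width, like the function name, is an assumption made for illustration and should be checked against the ISA reference.

```python
ALU_INST_FIELD_BITS = 10   # assumption: 3 high bits + 7 opcode bits
HIGH_BITS = 3


def classify_alu_inst(alu_inst: int) -> str:
    """Return "OP2" if the three high bits are 000b, otherwise "OP3"."""
    if not 0 <= alu_inst < (1 << ALU_INST_FIELD_BITS):
        raise ValueError("ALU_INST value does not fit in the assumed field width")
    high = alu_inst >> (ALU_INST_FIELD_BITS - HIGH_BITS)
    return "OP2" if high == 0 else "OP3"


two_operand = classify_alu_inst(0b0000101010)    # high bits 000
three_operand = classify_alu_inst(0b0100000000)  # high bits 010
```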

Page 17: AMD R700 Series Processors

AMD R700 ALU Instructions

The processor contains multiple sets of five scalar ALUs.

Four of the five are called ALU.[X, Y, Z, W] and perform scalar operations on as many as three 32-bit data elements.

128 bits containing four 32-bit elements in little-endian order

Most-significant element Least-significant element
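A 128-bit value made of four 32-bit elements in little-endian order can be packed and unpacked with Python's standard `struct` module. The sketch assumes X is the least-significant element and W the most-significant; that assignment and the helper names are assumptions, not stated on the slide.

```python
import struct


def pack_register(x: int, y: int, z: int, w: int) -> bytes:
    """Pack four unsigned 32-bit elements into 16 little-endian bytes (X first)."""
    return struct.pack("<4I", x, y, z, w)


def unpack_register(raw: bytes):
    """Split 16 little-endian bytes back into the four 32-bit elements."""
    return struct.unpack("<4I", raw)


raw = pack_register(1, 2, 3, 4)
as_integer = int.from_bytes(raw, "little")   # the whole register as one 128-bit integer
lowest = as_integer & 0xFFFFFFFF             # least-significant element (assumed X)
highest = as_integer >> 96                   # most-significant element (assumed W)
```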

Page 18: AMD R700 Series Processors

AMD R700 Procedure Calls

Control-flow microcode fields, from bit 31 down to bit 0:

31 | 30 | 29:23 | 22 | 21 | 20 | 19 | 18:13 | 12:10 | 9:8 | 7:3 | 2:0

• Bit 31, sync barrier: 1 - can run in parallel with prior instruction
• Bit 30, Whole_Quad_Mode (WQM): 1 - execute instruction if ALL pixels are active and valid. WQM and VPM are mutually exclusive (only one of them can be set to 1).
• Bits 29:23, control-flow instruction, e.g. CF_INST_JUMP (execute jump statement)
• Bit 22, Valid_Pixel_Mode (VPM): 1 - execute instruction if invalid pixels are inactive
• Bit 21, end of program
• Bit 19, MSB of the COUNT field
• Bits 18:13, CALL_COUNT: amount to increment the call nesting counter by when executing a call statement (the call is skipped if the nesting depth + CALL_COUNT > 32); range 0-31
• Bits 12:10, COUNT: number of instruction slots to execute in the clause (values 1-16)
• Bits 9:8, condition: specifies how to evaluate the condition test for each pixel
• Bits 7:3, control-flow constant to use for flow-control statements
• Bits 2:0, pop count

Second doubleword, bits 31:0: address

Offsets +4 and +0 are relative to the byte address specified in the host-written PGM_START_* register. Texture and vertex clauses must start on 16-byte aligned addresses.
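Extracting these fields from a 32-bit control-flow doubleword is a matter of shifting and masking. The Python sketch below assumes the slide's labels map onto the bit ranges in descending order (barrier at bit 31, Whole_Quad_Mode at bit 30, and so on down to the pop count at bits 2:0). The dictionary keys and the function name are hypothetical, and the mapping should be verified against the ISA reference before being relied on.

```python
# (high bit, low bit) for each field; names are illustrative labels only.
CF_DWORD1_FIELDS = {
    "barrier": (31, 31),
    "whole_quad_mode": (30, 30),
    "cf_inst": (29, 23),
    "valid_pixel_mode": (22, 22),
    "end_of_program": (21, 21),
    "count_msb": (19, 19),
    "call_count": (18, 13),
    "count": (12, 10),
    "cond": (9, 8),
    "cf_const": (7, 3),
    "pop_count": (2, 0),
}


def decode_cf_dword1(word: int) -> dict:
    """Split a 32-bit word into the fields listed in CF_DWORD1_FIELDS."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit unsigned value")
    fields = {}
    for name, (high, low) in CF_DWORD1_FIELDS.items():
        width = high - low + 1
        fields[name] = (word >> low) & ((1 << width) - 1)
    return fields


# Build a sample word by hand: barrier=1, cf_inst=5, call_count=3, count=2, pop_count=1.
word = (1 << 31) | (5 << 23) | (3 << 13) | (2 << 10) | 1
fields = decode_cf_dword1(word)
```

A table-driven decoder keeps the bit ranges in one place, so correcting a range only means editing the dictionary.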

Page 19: AMD R700 Series Processors

AMD R700 CISC or RISC

• CISC characteristics:
– Number of operands per instruction
– Complex set of operations in the ISA
– Instructions work out of both on-chip and off-chip memory

• RISC characteristics:
– Large number of registers
– Separate instructions for load/store and data processing

Page 20: AMD R700 Series Processors

Design Policies

The Good and the Bad

1. Simplicity favors regularity
– The R700 series specializes in processing graphics instructions quickly and in parallel

2. Smaller is faster
– Not so good: it's all about trade-offs

3. Make the common case fast
– The R700 series processes graphics efficiently at high speeds

4. Good design demands good compromise
– Trade error handling for high speed

Page 21: AMD R700 Series Processors

Conclusion

Pros

• Multiple parallel stream processing units (SPU)

• Each single-instruction, multiple-data (SIMD) pipeline maintains a separate interface to memory

• Speed

Cons

• Cost

• R700 programs do not support:
– Exceptions
– Interrupts
– Errors
– Any event that can interrupt pipeline operations

• Size of the circuit board

Page 22: AMD R700 Series Processors

Conclusion

• The AMD R700 GPU is a specialized processing unit

• Depending on the application/use, the trade-offs can be worth it

Page 23: AMD R700 Series Processors

References

• Ali Umut İrtürk. "GUSTO: General Architecture Design Utility and Synthesis Tool for Optimization." Thesis. University of California, San Diego, 2009. Web. 20 Apr. 2010. <http://cseweb.ucsd.edu/~kastner/papers/phd-thesis-irturk.pdf>.

• AMD 700 Chipset Series. 14 Apr. 2010. Web. 16 Apr. 2010. <http://en.wikipedia.org/wiki/AMD_700_chipset_series>.

• AMD 700 Chipset Series. Advanced Micro Devices, 2009. Print.

• ATI CTM Guide. Advanced Micro Devices, Inc., 2006. Print.