csci1412 lecture 3

CSCI1412 Lecture 3 Hardware 3 More Architecture Dr John Cowell phones off (please)

Upload: urban

Post on 23-Feb-2016



TRANSCRIPT

CSCI1412 - More Processor Architecture

CSCI1412 Lecture 3
Hardware 3: More Architecture
Dr John Cowell

phones off (please)

Overview
  How it works!
    the fetch / execute cycle in detail
  Measuring speed
    system clock, GHz, MIPS and FLOPS
  Advanced concepts
    cache, pipelining, parallelism
  Memory issues
    dynamic and static RAM, SIMMs, DIMMs, and specialist memory
  Motherboards
    component layout

De Montfort University, 2007 (CSCI1412-HW-3)

The Fetch / Execute Cycle

[Diagram: the fetch / execute cycle, showing the control unit, RAM and the arithmetic / logic unit linked by the fetch, decode, execute and (store) steps]

Buses
  Computer memory is made up of a set of locations. Each has a unique address.
  The address bus specifies the location.
  The data bus transfers the data.
  The control bus determines the kind of operation, e.g. read or write.
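The roles of the three buses can be sketched in a few lines of Python. This is an illustration only, not part of the original slides: the class name Memory, the method access and the "read" / "write" strings are hypothetical, chosen to mirror the address, data and control buses described above.

```python
# Hypothetical model of memory behind the three buses (names are illustrative).
class Memory:
    def __init__(self, size):
        # One stored value per uniquely addressed location.
        self.cells = [0] * size

    def access(self, address, control, data=None):
        """address plays the address bus, control the control bus, data the data bus."""
        if control == "read":
            return self.cells[address]      # value travels back on the data bus
        if control == "write":
            self.cells[address] = data      # value arrives on the data bus
            return None
        raise ValueError(f"unknown control signal: {control}")


ram = Memory(0x2000)
ram.access(0x1234, "write", 0xC6)
print(hex(ram.access(0x1234, "read")))  # 0xc6
```

The same address is used for both operations; only the control signal decides which way the data moves.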

Registers
  A CPU contains special purpose registers (typically 32)
    very high speed memory within the processor chip
    each register contains a fixed number of bits
      e.g. each register in a 32-bit processor has 32 bits
    contain instructions to be executed, data being operated on, etc.

  Typically there are several named registers
    SCR: sequence control register
      holds the location of the next piece of information to be fetched
      controls the sequence of instructions
      each time it is accessed, it is automatically incremented (increased) by one
    CIR: current instruction register
      holds the instruction about to be processed

More Registers
  Registers, continued ...
    MAR: memory address register
      holds the location (the address) of information about to be read from or written to RAM
    MDR: memory data register
      holds the value of information just read from or about to be written to RAM
    ACC: accumulator(s)
      hold result(s) of processing
  Sometimes a processor also has one or more
    STO: general purpose store(s)
      hold temporary data value(s) for processing

Machine Code
  Very simple low level instructions.
  A single high level language instruction (e.g. in VB) may require many machine code instructions.
  An integral part of the processor.
  An instruction has an operation code (opcode), followed by zero or more items of data (operands).

Machine Code
  For example
    in Zilog Z80 machine code (8-bit processor)
    instruction 0xC6 (C6 in hexadecimal) means add the data held at the following location to the current accumulator
    suppose that the SCR currently holds 0x1234, ACC holds 0x5 and the contents of memory are as shown below.

What is the sequence the registers are used in?
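One way to explore this question is to simulate the registers in a few lines of Python. The sketch below is an illustration under stated assumptions rather than the lecturer's own material: it uses the register names and starting values from the example (SCR = 0x1234, ACC = 0x5, opcode 0xC6 at 0x1234, operand 0x10 at 0x1235), while the function names run_add_example, step and fmt are hypothetical. It models only this one instruction.

```python
# Minimal sketch: one fetch / decode / execute / store pass for opcode 0xC6.
ADD_IMMEDIATE = 0xC6  # "add the value at the following location to ACC"


def fmt(value):
    """Show a register as hexadecimal, or '-' when it has not been loaded yet."""
    return "-" if value is None else format(value, "X")


def run_add_example():
    ram = {0x1234: 0xC6, 0x1235: 0x10}
    reg = {"SCR": 0x1234, "MAR": None, "MDR": None, "CIR": None, "ACC": 0x05}
    trace = [dict(reg)]  # snapshot of every register state

    def step(**changes):
        reg.update(changes)
        trace.append(dict(reg))

    # Fetch
    step(MAR=reg["SCR"])        # address of the instruction into the MAR
    step(SCR=reg["SCR"] + 1)    # SCR now points at the next location
    step(MDR=ram[reg["MAR"]])   # read RAM at the MAR address into the MDR
    step(CIR=reg["MDR"])        # copy the instruction into the CIR
    # Decode
    if reg["CIR"] != ADD_IMMEDIATE:
        raise ValueError("only opcode 0xC6 is modelled in this sketch")
    # Execute
    step(MAR=reg["SCR"])        # address of the operand into the MAR
    step(SCR=reg["SCR"] + 1)    # SCR now points at the next instruction
    step(MDR=ram[reg["MAR"]])   # read the operand into the MDR
    # Store
    step(ACC=(reg["ACC"] + reg["MDR"]) & 0xFF)  # 8-bit accumulator
    return reg, trace


final, trace = run_add_example()
for row in trace:
    print("  ".join(f"{name}={fmt(row[name])}" for name in ("SCR", "MAR", "MDR", "CIR", "ACC")))
print(fmt(final["ACC"]))  # 15
```

Each printed row is the register state after one transfer, so the output can be read as the order in which the registers are used.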

Memory contents (all values in hexadecimal)

  location   value
  1234       C6      operation code (opcode)
  1235       10      operand

Adding data to the Acc.

  SCR (address)   MAR (address)   MDR (data)   CIR (instruction)   Acc (data)
  1234            -               -            -                   5
  1234            1234            -            -                   5
  1235            1234            -            -                   5
  1235            1234            C6           -                   5
  1235            1234            C6           C6                  5
  1235            1235            C6           C6                  5
  1236            1235            C6           C6                  5
  1236            1235            10           C6                  5
  1236            1236            10           C6                  15

Sequence of Actions
  Fetch
    SCR → MAR: put address of next instruction into the MAR
    SCR + 1 → SCR: point to the next memory location
    MAR → RAM → MDR → CIR: read from RAM address (MAR), into the MDR, into the CIR
  Decode
    Contents of CIR: instruction number C6 (hexadecimal) means ... data required ...
  Execute
    SCR → MAR: put address of data into the MAR
    SCR + 1 → SCR: point to the next instruction
    MAR → RAM → MDR: read from RAM address (MAR), into the MDR
  Store
    MDR + ACC → ACC: add the MDR and Acc contents
    in this case, the result is stored in the accumulator

Measuring Speeds

The System Clock
  What controls the fetch / execute cycle?
    the system clock
    this is a quartz crystal that provides pulses at a regular, rapid rate, like a metronome
    n.b. not the same as the real date / time clock
  The first microprocessor originally ran at 100 kHz; the Pentium IV now runs at 1.2 to 4.0 GHz
  A clock tick starts the fetch / execute cycle
    it may take several (perhaps tens of) clock ticks to complete one complex instruction

Gigahertz
  The simplest measure of speed is just the rate at which the system clock ticks
    usually quoted in Gigahertz (GHz)
    1 Hertz = 1 cycle per second
    1 Megahertz = 1 million cycles per second
    1 Gigahertz = 1 billion cycles per second
  This is meaningful within one type of processor
    e.g. a 2.4 GHz Pentium is roughly twice as quick as a 1.2 GHz Pentium
  But it is not meaningful for comparing different processor types
    different processors may take different numbers of cycles to fetch / execute the same instruction, e.g.
    a Pentium takes X cycles to load a number into the accumulator, whereas a 68040 takes Y cycles

MIPS
  In order to overcome the limitations of GHz, some manufacturers prefer to use MIPS
    millions of instructions per second
    found by counting the number of cycles (on average) that a processor takes to execute an instruction
  However, this is still not very helpful
    which instructions!?
    some instructions may be very short: LOAD ACC,0
    some instructions may be very long
      store value zero into RAM from location 0x1000 to 0x1FFF
  Can be found by standard benchmarks

FLOPS
  Perhaps, as computers are often used for mathematical calculations, a better measure would be the number of floating point operations that can be carried out per second
    FLOPS: floating point operations per second
    found by running standard mathematical benchmarks
  However, what use are FLOPS to
    a business person using a spreadsheet?
    a secretary writing letters on a word processor?
    a computer scientist compiling programs in C++?

Benchmarking
  There is no satisfactorily agreed single method of measuring the speed of computers
    actual system speed also depends on RAM speed, bus speeds, video performance, hard disk speeds, etc.
  Many magazines set up standard tasks simulating general office / scientific use
    e.g. Excel / Word running under Windows Vista
    these may provide a good comparison of systems, but may only be applicable to one type of computer (Windows PC) for a short amount of time
    what happens when Windows Vista becomes obsolete!?
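The relationship between clock rate and MIPS described in the Gigahertz and MIPS slides can be made concrete with a small calculation. This is a sketch only: the function name mips is hypothetical, and the clock rates and cycle counts are invented purely for illustration. They are not the X and Y values for the Pentium and 68040, which the slides leave unstated.

```python
def mips(clock_hz, avg_cycles_per_instruction):
    """Millions of instructions per second for a given clock rate and average cycle count."""
    return clock_hz / avg_cycles_per_instruction / 1_000_000


# Two made-up processors (illustrative numbers, not measurements).
processor_a = mips(2.4e9, 4)  # faster clock, more cycles per instruction
processor_b = mips(1.8e9, 2)  # slower clock, fewer cycles per instruction

print(processor_a)  # 600.0
print(processor_b)  # 900.0
```

In this made-up case the processor with the slower clock completes more instructions per second, which is why GHz alone is a poor way to compare different processor types.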
Other Architectural Aspects

Caching
  Intermediate storage: uses high-speed SRAM
  Holds recently accessed instructions / data
    high probability that these will be re-used
  Different types of cache:
    primary cache (Level 1): in the processor
      8 KB to 32 KB
      fastest type of cache
    secondary cache (Level 2): also now in the processor
      512 KB to 1 MB
      (used to be called cache-on-a-stick, COAST)
    disk cache (Level 3): section of RAM
      specified by the user (or automatically by the operating system)

Pipelining
  Technique used to increase processing speed
  Processor begins to execute a second instruction before the first has been completed
  Therefore several instructions are in the pipeline
    up to six instructions in the Pentium
  The pipeline is divided into segments
    segments are processed concurrently
  Also used in RAM to preload the next requested memory content

Parallelism
  Intel Pentium processors have a form of parallelism called single instruction multiple data (SIMD)
  The same instruction is run on multiple data at the same time
    improves the speed at which sets of data requiring the same operation can be processed
    most of these extensions are for floating-point operations
  Typically used for complex co-ordinate transforms
    found in e.g. 3-D games graphics when a picture is being updated to form the next frame in a motion

RAM
  Random Access Memory
  Volatile memory which loses its data when the power is switched off.
  Two main types:
    SRAM: Static RAM
    DRAM: Dynamic RAM

SRAM and DRAM
  Differences between static and dynamic RAM:
    Dynamic RAM must be refreshed or it will lose its data.
    Static RAM only needs power to be applied; its bits do not need to be refreshed.
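Returning to the Pipelining slide above, the benefit of overlapping instructions can be estimated with a simple cycle count. The sketch below is an idealised model, not a description of any real Pentium: it assumes every pipeline segment takes exactly one clock tick and that the pipeline never stalls. The function names are hypothetical, and the figure of six segments is borrowed from the slide only as an illustrative value.

```python
def cycles_unpipelined(n_instructions, n_segments):
    """Each instruction passes through every segment before the next one starts."""
    return n_instructions * n_segments


def cycles_pipelined(n_instructions, n_segments):
    """The first instruction fills the pipeline; each later one finishes one tick after it."""
    if n_instructions == 0:
        return 0
    return n_segments + (n_instructions - 1)


n, k = 100, 6  # illustrative: 100 instructions, 6 segments
print(cycles_unpipelined(n, k))  # 600
print(cycles_pipelined(n, k))    # 105
```

Under these assumptions the speed-up approaches the number of segments as the instruction count grows; real pipelines fall short of this because of stalls and branches.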

Both SRAM and DRAM are volatile.

Most modern computers use some form of DRAM for the main memory.

SRAM
  Used in small amounts in computers where very fast RAM is required, such as in the cache of many CPUs.
  DRAM is much less expensive than SRAM, but is usually slower and must constantly be refreshed in order to preserve its contents.
  Types of SRAM include:
    Asynchronous Static RAM
    Synchronous Burst Static RAM
    Pipeline Burst Static RAM

DRAM
  In DRAM, each data bit is stored in a separate capacitor. The benefit of this is the avoidance of corruption.
  Dynamic because it requires refreshing to maintain data integrity.
  Types of DRAM include:
    SDRAM: Synchronous Dynamic Random Access Memory
    DDR SDRAM: Double Data Rate SDRAM

SDRAM
  SDRAM: Synchronous Dynamic Random Access Memory.
  Dynamic because it requires refreshing to maintain data integrity.
  Synchronous because it lines itself up with the computer system bus and processor. The computer's internal clock drives the entire mechanism.
  Can accept more than one write command at a time (pipelining).

DDR SDRAM
  DDR SDRAM: Double Data Rate Synchronous Dynamic Random Access Memory.
  Achieves nearly twice the bandwidth of single data rate SDRAM by double pumping (transferring data on the rising and falling edges of the clock signal) without increasing the clock frequency.

DDR2 and DDR3
  An evolution of DDR, with higher internal bus speeds.
  The DDR2 bus runs at twice the speed of DDR memory.
  DDR3 runs at even higher speeds.

  Most modern computers use DDR, DDR2 or DDR3 packaged in DIMMs (Dual In-line Memory Modules), whose electrical contacts plug directly into the main board.
    DIMMs have a 64-bit data bus (as do Pentium processors)
    SIMMs (now obsolete) have a 32-bit bus
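The effect of double pumping and of the 64-bit DIMM data bus can be checked with a short calculation of theoretical peak bandwidth. This is a sketch under stated assumptions: the function name peak_bandwidth_mb_s is hypothetical, the 133 MHz clock is an illustrative value not taken from the slides, and the result is an upper bound that real modules do not sustain (the slides say "nearly twice", not exactly twice).

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits=64):
    """Theoretical peak in MB/s: transfers per second times bytes moved per transfer."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


# Same illustrative 133 MHz clock, 64-bit DIMM data bus.
sdr = peak_bandwidth_mb_s(133, 1)  # single data rate: one transfer per clock tick
ddr = peak_bandwidth_mb_s(133, 2)  # double data rate: rising and falling edges

print(sdr)  # 1064.0
print(ddr)  # 2128.0
```

The clock frequency is the same in both cases; only the number of transfers per tick changes. Passing bus_width_bits=32 gives the corresponding figure for a 32-bit SIMM.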

Mainboard Layout

Intel D945GNT
  Dual-channel DDR2 667 / 533 / 400 memory support
  PCI Express x16 graphics connector
  Two PCI Express x1 connectors
  Four Serial ATA ports (3.0 Gb/s)
  Integrated Intel PRO 10/100 Network Connection
  Intel High Definition Audio with 5.1 Surround Sound
  Eight Hi-Speed USB 2.0 ports
  Intel Precision Cooling Technology

Mainboard Layout

  A   Auxiliary fan connector (optional)
  B   Speaker
  C   PCI Express x1 bus add-in card connectors [2]
  D   Audio codec
  E   Front panel audio connector
  F   Ethernet device
  G   PCI Conventional bus add-in card connectors [2]
  H   PCI Express x16 bus add-in card connector
  I   Back panel connectors
  J   +12V power connector (ATX12V)
  K   Rear chassis fan connector
  L   LGA775 processor socket
  M   Intel 82945G GMCH
  N   Processor fan connector
  O   DIMM Channel A sockets [2]
  P   DIMM Channel B sockets [2]
  Q   SCSI LED connector (optional)
  R   Legacy I/O controller
  S   Power connector
  T   Diskette drive connector
  U   Parallel ATA IDE connector
  V   Battery
  W   Front chassis fan connector
  X   BIOS Setup configuration jumper block
  Y   Serial ATA connectors [4]
  Z   Auxiliary front panel power LED connector
  AA  Front panel connector
  BB  Front panel USB connectors [2]
  CC  Chassis intrusion connector
  DD  Intel 82801G I/O Controller Hub (ICH7)
  EE  SPI flash device
  FF  IEEE-1394a controller (optional)
  GG  Front panel IEEE-1394a connectors (optional) [2]
  HH  PCI Conventional bus add-in card connectors

Motherboard in Situ

Cooling can be a problem....

Summary
  How it works!
    the fetch / execute cycle in detail
  Measuring speed
    system clock, GHz, MIPS and FLOPS
  Advanced concepts
    cache, pipelining, parallelism
  Memory issues
    dynamic and static RAM, SIMMs and DIMMs
  Motherboards
    component layout