Intel® Core™ Duo Processor
Behrooz Jafarnejad
Winter 2006
2006 PC World World Class Award
July 2006: Intel® Core™ Duo processor named Product of the Year by PC World.
Outline
• Microprocessor After Pentium® Pro
• Intel® Core™ Duo Processor Overview
• Microarchitecture
• Intel Core 2 Duo Vs. AMD AM2
• Resources
Microprocessor Hall Of Fame
1995: Intel® Pentium® Pro Processor
– Released in the Fall of 1995.
– 5.5 million transistors.
– Designed for 32-bit server and workstation applications.
– Packaged with a second speed-enhancing cache memory chip.
Microprocessor Hall Of Fame
1997: Intel® Pentium® II Processor
– 7.5 million transistors.
– Incorporates Intel® MMX™ technology, which is designed specifically to process video, audio and graphics data efficiently.
– Includes a high-speed cache memory chip.
Microprocessor Hall Of Fame
1999: Intel® Pentium® III Processor
– 9.5 million transistors.
– Using 0.25-micron technology.
– 70 new instructions that enhance the performance of:
• Advanced imaging
• 3D
• Streaming audio, video
Microprocessor Hall Of Fame
2000: Intel® Pentium® 4 Processor
– 42 million transistors.
– Circuit lines of 0.18 microns.
– Intel's first microprocessor, the 4004, ran at 108 kHz, compared to the Intel® Pentium® 4 processor's initial speed of 1.5 GHz. If automobile speed had increased similarly over the same period, you could now drive from San Francisco to New York (about 4,100 km) in about 13 seconds.
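The comparison above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted on the slide (108 kHz, 1.5 GHz, about 4,100 km, about 13 seconds). The implied starting car speed is derived here for illustration and is not stated in the source.

```python
# Clock-speed figures quoted on the slide.
f_4004_hz = 108e3   # Intel 4004: 108 kHz
f_p4_hz = 1.5e9     # Pentium 4 at launch: 1.5 GHz

# How many times faster the Pentium 4 clock is.
speedup = f_p4_hz / f_4004_hz

# Trip figures quoted on the slide.
distance_km = 4100
trip_seconds = 13

# Speed needed to cover the distance in 13 seconds, in km/h.
fast_kmh = distance_km / trip_seconds * 3600

# Derived, not from the source: the car speed that, scaled by the same
# factor, would produce that trip time.
implied_start_kmh = fast_kmh / speedup

print(f"clock speedup: {speedup:,.0f}x")
print(f"implied starting speed: {implied_start_kmh:.0f} km/h")
```

The clock ratio is close to 14,000, and the 13-second figure is consistent with a starting speed of a little over 80 km/h, which is a plausible highway speed.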
Microprocessor Hall Of Fame
2006: The Intel® Core™ Duo processor
– 151 million transistors.
– Using 65 nm technology.
– 2.33–2.50 GHz clock frequency.
– 4-wide, 14-stage pipeline.
– Low power consumption.
Benefits
• New Microarchitecture:
– Low power.
– Higher performance.
• At Home:
– Ultra-quiet.
– Sleek and low-power computing.
• For IT:
– Reduced footprints.
– Lower power.
– Energy efficiency across client and server platforms.
• For Mobile Users:
– Greater computer performance and battery life to enable a variety of small form factors that enable world-class computing "on the go."
Performance for an Enhanced Digital Entertainment Experience
Intel® Core™2 Duo Processor (formerly known by the codename Conroe)
• New generation of technology
• Intel® Core™ microarchitecture
• High energy efficiency
• Revolutionary performance
Five Key Innovations
• Intel® Wide Dynamic Execution: 4-wide, 14-stage pipeline; macro-fusion
• Intel® Advanced Digital Media Boost: single-cycle 128-bit SSE
• Intel® Advanced Smart Cache: shared L2 cache
• Intel® Smart Memory Access: advanced pre-fetch; memory disambiguation
• Intel® Intelligent Power Capability: advanced power gating
Intel® Wide Dynamic Execution
• Fetch
• Dispatch: Decode + (Read from Memory)
• Execute
• Retire: Write Back
• Macro-Fusion: combines certain common pairs of x86 instructions into a single instruction for execution.
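Macro-fusion can be pictured with a toy model. The sketch below assumes the fusible pairs are a compare or test immediately followed by a conditional jump. The mnemonic sets and the `fuse` helper are illustrative assumptions, not a description of the actual decoder.

```python
# Toy model of macro-fusion: a compare (or test) immediately followed by a
# conditional jump is merged into one internal operation.
# These mnemonic sets are illustrative assumptions, not an exhaustive list.
COMPARE_OPS = {"cmp", "test"}
COND_JUMPS = {"je", "jne", "jl", "jg", "jle", "jge"}


def fuse(instructions):
    """Return the instruction stream with eligible adjacent pairs fused."""
    fused = []
    i = 0
    while i < len(instructions):
        op = instructions[i].split()[0]
        has_next = i + 1 < len(instructions)
        next_op = instructions[i + 1].split()[0] if has_next else None
        if op in COMPARE_OPS and next_op in COND_JUMPS:
            # Two x86 instructions become a single operation.
            fused.append(f"{instructions[i]} + {instructions[i + 1]}")
            i += 2
        else:
            fused.append(instructions[i])
            i += 1
    return fused


program = ["mov eax, ecx", "cmp eax, ebx", "jne target", "add edx, 1"]
result = fuse(program)
print(len(program), "instructions ->", len(result), "operations")
```

In this example four instructions occupy three slots after fusion, which is how fusing frees decode and execution bandwidth for other work.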
Pipeline Concept
In computing, a pipeline is a set of data processing elements connected in series, so that the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion; in that case, some amount of buffer storage is often inserted between elements.
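The benefit of running pipeline elements in parallel can be shown with an idealized timing model. The sketch assumes one clock cycle per stage, one instruction entering per cycle, and no stalls. These are simplifications for illustration, and the function names are hypothetical.

```python
def unpipelined_cycles(n_instructions, n_stages):
    # Each instruction passes through every stage before the next one starts.
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    # The first instruction needs n_stages cycles to travel through the
    # pipeline; after that, one instruction completes every cycle.
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


n, stages = 1000, 14  # 14 stages, as in the pipeline described in this deck
print(unpipelined_cycles(n, stages), pipelined_cycles(n, stages))
```

With 1000 instructions and 14 stages the model gives 14000 cycles without pipelining and 1013 with it. Real processors fall short of this ideal because of dependencies, branch mispredictions and cache misses.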
Intel® Wide Dynamic Execution
• Dynamic execution is a combination of the following techniques:
– Data-Flow Analysis.
– Out-of-Order Execution (OoOE).
– Speculative Execution.
– Superscalar execution.
Intel first implemented these techniques in the P6 microarchitecture used in the Pentium® Pro, Pentium® II and Pentium® III processors.
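Two of these techniques, data-flow analysis and out-of-order execution, can be sketched with a toy scheduler. The instruction format, the latencies and the `schedule` function are hypothetical. Speculative execution is not modeled, and each register is assumed to be written once, before it is read.

```python
# Each instruction: (name, destination register, source registers, latency).
program = [
    ("load", "r1", [], 3),      # slow load
    ("add", "r2", ["r1"], 1),   # must wait for the load
    ("mul", "r3", [], 1),       # independent, so it can run ahead of "add"
    ("sub", "r4", ["r3"], 1),   # depends on "mul"
]


def schedule(program, width):
    """Return {name: issue cycle} for a toy out-of-order core.

    Up to `width` instructions issue per cycle, oldest first, as soon as
    their source registers are available (data-flow order, not program order).
    """
    produced = {dest for _, dest, _, _ in program}
    ready_at = {}   # register -> cycle at which its value is available
    issued = {}
    pending = list(program)
    cycle = 0
    while pending:
        slots = width
        for instr in list(pending):
            name, dest, srcs, latency = instr
            # Registers never written by the program are treated as ready.
            operands_ready = all(
                s not in produced or ready_at.get(s, float("inf")) <= cycle
                for s in srcs
            )
            if slots > 0 and operands_ready:
                issued[name] = cycle
                ready_at[dest] = cycle + latency
                pending.remove(instr)
                slots -= 1
        cycle += 1
    return issued


issue_cycles = schedule(program, width=2)
print(issue_cycles)
```

In this run "mul" and "sub" issue before "add" even though they come later in program order, because their operands are ready while "add" is still waiting for the load.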
Intel® Wide Dynamic Execution
• It enables delivery of more instructions per clock cycle to improve execution time and energy efficiency.
• Every execution core is 33 percent wider than previous generations, allowing each core to fetch, dispatch, execute and retire up to four full instructions simultaneously.
Intel® Advanced Digital Media Boost
• SIMD:
– In computing, SIMD (Single Instruction, Multiple Data) is a technique employed to achieve data-level parallelism, as in a vector or array processor.
• SSE (Streaming SIMD Extensions):
– A SIMD instruction set designed by Intel and introduced in 1999 in its Pentium III series processors as a reply to AMD's 3DNow! (which had debuted a year earlier).
– Contains 70 new instructions.
– SSE2/SSE3 are later versions of SSE.
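The SIMD idea can be illustrated by treating a 128-bit value as four 32-bit floating-point lanes and applying one addition to all lanes. The helper name `simd_add_ps` is hypothetical and only mirrors the semantics of a packed single-precision add. Python still loops over the lanes one at a time, so this shows the data layout, not the hardware speedup.

```python
import struct


def simd_add_ps(a_bytes, b_bytes):
    """Add two 128-bit values lane by lane, as four 32-bit floats each."""
    a = struct.unpack("<4f", a_bytes)
    b = struct.unpack("<4f", b_bytes)
    return struct.pack("<4f", *(x + y for x, y in zip(a, b)))


# Pack four floats into one 16-byte (128-bit) value per operand.
a = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
b = struct.pack("<4f", 10.0, 20.0, 30.0, 40.0)

register_bits = len(a) * 8
result = struct.unpack("<4f", simd_add_ps(a, b))
print(register_bits, result)
```

One logical operation produces four results, which is the data-level parallelism that SSE instructions expose on 128-bit registers.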
Intel® Advanced Digital Media Boost
• Enables 128-bit SSE instructions to be completely executed at a throughput rate of one per clock cycle, effectively doubling the speed of execution for these instructions as compared to previous generations.
• This feature significantly improves performance when executing Streaming SIMD Extension (SSE/SSE2/SSE3) instructions:
– Video, speech and image (MPEG).
– Photo processing.
– Encryption.
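The doubling claim follows from simple cycle counting. The sketch below assumes a buffer processed 16 bytes (128 bits) at a time, with two cycles per 128-bit operation on the earlier designs (the figure given in this deck's feature table) and one cycle on the new core. The buffer size and function name are hypothetical.

```python
BYTES_PER_OP = 16  # one 128-bit operation covers 16 bytes


def cycles_for_buffer(n_bytes, cycles_per_op):
    """Cycles to process a buffer when each 128-bit op takes cycles_per_op."""
    n_ops = -(-n_bytes // BYTES_PER_OP)  # ceiling division
    return n_ops * cycles_per_op


buffer_bytes = 1_000_000  # hypothetical workload size

previous_gen = cycles_for_buffer(buffer_bytes, cycles_per_op=2)
single_cycle = cycles_for_buffer(buffer_bytes, cycles_per_op=1)

print(previous_gen, single_cycle, previous_gen / single_cycle)
```

The ratio is exactly 2 in this model because it counts only the SSE operations themselves. Whole-application gains depend on how much of the workload is SSE code.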
Intel® Advanced Smart Cache
The Intel® Advanced Smart Cache is a multi-core optimized cache that significantly reduces latency to frequently used data, thus improving performance and efficiency by increasing the probability that each execution core of a multi-core processor can access data from a higher-performance, more efficient cache subsystem.
Intel® Smart Memory Access
• Optimizes the use of the available data bandwidth from the memory subsystem.
• Includes a new capability called "Memory Disambiguation", which increases the efficiency of out-of-order processing by providing the execution cores with the built-in intelligence to speculatively load data for instructions that are about to execute before all previous store instructions are executed.
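The check behind this kind of speculation can be sketched as follows. The model covers only "load early, then verify against the earlier stores and replay on conflict". The function name, the dictionary used as memory and the return shape are all illustrative assumptions, and the hardware's decision about when to speculate is not modeled.

```python
def speculative_load(memory, load_addr, pending_stores):
    """Load before earlier stores execute, then verify the guess.

    pending_stores: earlier stores, in program order, as (addr, value)
    pairs that had not yet executed when the load was issued.
    Returns (value, replayed).
    """
    # 1. Speculate: read memory before the earlier stores have executed.
    speculative_value = memory[load_addr]

    # 2. The earlier stores now execute; note any that hit the load address.
    conflict = False
    for addr, value in pending_stores:
        memory[addr] = value
        if addr == load_addr:
            conflict = True

    # 3. Verify: on a conflict the speculative value is stale, so the load
    #    is replayed and picks up the stored value.
    if conflict:
        return memory[load_addr], True
    return speculative_value, False


memory = {0x10: 1, 0x20: 2}
no_conflict = speculative_load(memory, 0x10, [(0x20, 99)])
with_conflict = speculative_load(memory, 0x10, [(0x10, 7)])
print(no_conflict, with_conflict)
```

When the stores touch other addresses, the early load is kept and the core gains time. Only in the conflicting case is the work repeated.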
Intel® Intelligent Power Capability
• A set of capabilities designed to reduce power consumption and design requirements.
• This feature manages the runtime power consumption of all the processor's execution cores and delivers power only to the parts that need it.
Feature / Description / Function / Benefit

Intel® Advanced Smart Cache
– Description: Up to 4 MB shared, multi-core optimized L2 cache; higher L2 cache to processor core bandwidth.
– Function: Improves execution core access to data in the high-performance L2 cache. Dynamically allocates cache based on core workload: the entire L2 cache can be allocated to one core (versus a dedicated L2 for each core in PDP and K8 DC).
– Benefit: Better performance on single and multithreaded applications.

Intel® Advanced Digital Media Boost
– Description: Single-cycle SSE/2/3 instruction execution.
– Function: Allows 128-bit SSE/2/3 instructions to execute in a single clock cycle (versus 2 cycles for PDP, Yonah-DC, and K8 DC).
– Benefit: Better performance on video, gaming and multimedia applications (applications that rely on SSE instructions).

Intel® Wide Dynamic Execution
– Description: Efficient 4-wide, 14-stage pipeline.
– Function: Executes 4 instructions per clock (versus 3 per clock with PDP, Yonah-DC, and K8 DC).
– Benefit: Better performance on multiple application types and user environments.

Intel® Intelligent Power Capability
– Description: Powers on processor elements only when needed; more precise control of power to buses and arrays.
– Function: Conroe 65 W desktop mainstream TDP; Merom continues low-power mobile processor direction.
– Benefit: Can help enable quieter, lower-power system designs.

Intel® Smart Memory Access
– Description: Improved pre-fetchers; out-of-order memory access.
– Function: Feeds the Intel Wide Dynamic Execution engine (i.e., "fuel injection" for the Core engine); reduces latency for memory operations.
– Benefit: Better performance on all types of applications and user environments.
New levels of performance and power efficiency based on Intel® Core™ Microarchitecture
Intel Core 2 Duo Vs. AMD AM2
The results are from SYSmark 2004 SE, which simulates real-life workloads for both Internet Content Creation and Office Productivity. The content-creation part uses apps like Photoshop, 3ds Max, Dreamweaver, and more, while the office-productivity tests use typical office apps, such as PowerPoint, Word, and Excel.
Resources
• Intel.com
• PCWorld.com
• ExtremeTech.com
• Wikipedia.org
• Microsoft.com