1. 2 cad challenges for designing a high frequency multi-core soc implementation of the...

Upload: calvin-rich

Post on 28-Dec-2015


Page 1: 1. 2 CAD Challenges For Designing A High Frequency Multi-Core SoC Implementation Of The First-Generation CELL Processor Neeraj Paliwal Senior Engineering


CAD Challenges For Designing A High Frequency Multi-Core SoC Implementation Of The First-Generation CELL Processor

Neeraj Paliwal
Senior Engineering Manager
Advanced Processor Development
IBM Corporation, Austin TX

Outline

Introduction               Design Goals
Design Goal                Design Challenges
Challenges                 CAD Methodology
CAD Methodology Details
Lessons Learned            Recommendation
Conclusion

Digital Media Applications

Design Goals

Design for natural human interaction
– Realism requires Supercomputer attributes with extreme floating point capabilities
    2 TFLOPS in the new PlayStation 3 System
Set new performance standard
– Exploits parallelism while achieving high frequency
    Multiple HF Cores
Foster innovation in Design & Methodology
– Holistic Design approach
– Scalability and Flexibility through Modular design

Design Challenges

Triple Constraints
– Power
– Frequency
– Cost
Design Trends
– SoC and Giga Scale Integration
– Multi-Core on a Chip
Time to Market

System Trends Toward Integration

Increased integration is driving processors to take on many functions typically associated with systems
– Integration forces processor developers to address off-load and acceleration in the design of the processor
– Integration of bridge chip functionality

[Figure: block diagram with the labels Memory, Accel, Southbridge, Processor, Northbridge, Memory, Cell Processor, IO, IO]

Giga Scale Integration

[Figure: diagram with the block labels CPU, Media, Security, Config., IO, Synergistic Processor, Mem. Contr., 64b Power Processor, Media Processor, Security Processor, Network Processor, Streaming Graphics Processor, NIC, GPU, and the category labels Hardwired Function, Programmable, ASIC, Cell]

Need an innovative Design Methodology for High Frequency Multi-Core SoC

Implementation Challenges

Technology Scaling
– Minimize cross chip variations in delay and leakage
– Array bit cell stability, writability, yield
– Growing impact of wire RC vs. device speed
11 FO4 design within air-cooled power envelope
– Power, Clock, Signal Distribution variation due to hot spots, inductance effects, etc.
– Multi Clock domains
– Intra-Chip interconnections
– Global Optimization with “triple constraints”: Frequency, Power, Cost (Die Size and Yield)
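To give a feel for what an 11 FO4 cycle implies, the sketch below divides the clock period by the number of FO4 delays per cycle. It is only arithmetic: the 3.2 GHz figure comes from the Conclusions slide ("over 3.2GHz"), and the function name `fo4_budget_ps` is made up for illustration. The deck does not state the technology's actual FO4 delay, so the result is an upper bound implied by those two numbers, not a measured value.

```python
def fo4_budget_ps(frequency_ghz: float, fo4_per_cycle: float = 11.0) -> float:
    """Return the largest FO4 delay (in ps) that fits the given cycle budget.

    Assumes the whole clock period is divided evenly into `fo4_per_cycle`
    fanout-of-4 inverter delays, as in an "11 FO4" design point.
    """
    cycle_ps = 1000.0 / frequency_ghz  # a 1 GHz clock has a 1000 ps period
    return cycle_ps / fo4_per_cycle


if __name__ == "__main__":
    # 3.2 GHz -> 312.5 ps period -> about 28.4 ps per FO4 delay
    print(round(fo4_budget_ps(3.2), 1))
```

At any frequency above 3.2 GHz the same 11 FO4 budget requires a proportionally smaller FO4 delay.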

Holistic Design Approach

Design
– Cover all aspects of the design
    Circuits, Cores, Chips, System, Software
Development process
– Fast Convergence
    Top Down / Bottom Up
    Early Design Planning / Final Convergence
– Adaptability and Scalability
    Long-duration projects need to allow for refinement of ideas
Organizational structure
– Building the best processor development team spans the globe
– Enable learning and adapting to changes in the market

Design Methodology Philosophy

Micro architecture definition must go hand-in-hand with physical floorplan definition – wire delays are major component of performance
“Divide and Conquer”
– Chip hierarchy: macros, units, islands, partitions and chip
– Macro is lowest level floorplannable object
– Physical partitioning represented in RTL
– Each level of hierarchy verified independently (DRC, LVS, Equivalence checking)
Formal Equivalence Checking required between RTL and schematic
– Latch points must match – no retiming
– Performed hierarchically up to the chip level
VHDL drives physical design
Derived data is audited
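The "latch points must match" rule can be pictured as a name-correspondence check that runs before any logic comparison. The sketch below is an illustration only, not the equivalence checker used in the flow. The function names, the latch naming scheme, and the macro data layout are all hypothetical. It compares the set of latch names in the RTL view with the set in the schematic view for each macro and reports any macro where they differ, which is what a retimed latch would cause.

```python
def latch_point_mismatches(rtl_latches, schematic_latches):
    """Compare latch names between two views of the same block.

    Returns a dict of latches present in only one view; both lists are
    empty when the latch points correspond one-to-one by name.
    """
    rtl, sch = set(rtl_latches), set(schematic_latches)
    return {
        "missing_in_schematic": sorted(rtl - sch),
        "extra_in_schematic": sorted(sch - rtl),
    }


def blocks_with_mismatches(blocks):
    """Check every block independently, mirroring per-level verification.

    `blocks` maps a block name to a (rtl_latches, schematic_latches) pair.
    Only blocks with at least one mismatch appear in the result.
    """
    report = {}
    for name, (rtl, sch) in blocks.items():
        diff = latch_point_mismatches(rtl, sch)
        if diff["missing_in_schematic"] or diff["extra_in_schematic"]:
            report[name] = diff
    return report


if __name__ == "__main__":
    # Hypothetical macros: macro_b has a latch that moved (was retimed).
    blocks = {
        "macro_a": (["a_q", "b_q"], ["a_q", "b_q"]),
        "macro_b": (["a_q", "b_q", "c_q"], ["a_q", "c_q", "b_retimed_q"]),
    }
    print(blocks_with_mismatches(blocks))
```

Matching latch names is only a precondition. Proving that the logic between corresponding latches is equivalent still requires a formal tool.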

[Figure: Schematic Illustration of Design Hierarchy]

[Figure: STI Development Process. Labels in the diagram: High-Level Design, Logic Design, Circuit/Physical Design & Integration, Verification, Global Processes, Hardware Validation, Software Development, Design Specs, Customer Reqs., Business Plan, RTL Design, Mfg. Data, Workloads, S/W Dev. Kit, To Manufacturing, Sample Hardware To Customers]

[Figure: STI Chip Design Flow. Labels in the flow diagram, in order of appearance with repeats omitted: Chip/Unit VHDL, Custom VHDL, Array VHDL, RLM VHDL, Portals, DADB, MESA, AWAN, Sim env (Fusion, Specman), Testcases, GenesysPro, XGEN, Portals/BooleDozer, TestPat, TECH, ChipBench or Cadence Floorplan, Routing, Einstimer, Layout, Cadence Composer, DeviceVIM, PowerSpice, Cadence/GYM Layout Editor, Verity, LVS, ERIE, Placement, PDSrtl, Cadence Route, DCM Rules, 3DX, PDM, Global Noise, EinsTLT, DCM Timing Rule, Gatemaker, TPG, Macro Noise, Noise Rules, Echk, Merged Layout, Niagara DRC, PhysVIM, Design Audit, CPAM, LAVA, Power Rule, TexPower, Ultrasim, ESPCV, SVV]

Design Data Management

Seven sites & 450+ designers
– Need a way to verify that every check has been run on every piece of data that is going on the chip => this process is called Audit
– Over the course of the chip development, snapshots of the chip data are going to be needed so that different design teams can work with data that is of a certain quality. A level can be created to identify that data => this process is called Promote
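The Audit and Promote ideas can be sketched in a few lines. Everything here is an assumption made for illustration: the `DesignData` record, the check names, and the rule that a level is created only when the audit for that level's required checks comes back clean. The deck does not describe the actual data model or tooling.

```python
from dataclasses import dataclass, field


@dataclass
class DesignData:
    """One piece of design data and the checks it has passed so far."""
    name: str
    passed_checks: set = field(default_factory=set)


def audit(items, required_checks):
    """Return {name: [missing checks]} for every item that lacks a check."""
    required = set(required_checks)
    report = {}
    for item in items:
        missing = required - item.passed_checks
        if missing:
            report[item.name] = sorted(missing)
    return report


def promote(items, level_name, required_checks):
    """Create a named level (snapshot) if every item passes the audit."""
    failures = audit(items, required_checks)
    if failures:
        raise ValueError(f"cannot promote to {level_name}: {failures}")
    return {"level": level_name, "members": sorted(i.name for i in items)}


if __name__ == "__main__":
    macros = [
        DesignData("macro_a", {"DRC", "LVS"}),
        DesignData("macro_b", {"DRC"}),
    ]
    print(audit(macros, ["DRC", "LVS"]))      # macro_b is missing LVS
    macros[1].passed_checks.add("LVS")
    print(promote(macros, "level_1", ["DRC", "LVS"]))
```

Tying each level to its own list of required checks is one way to express "data of a certain quality". Early levels can require fewer checks than the final one.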

Circuit Design Philosophy

Strict design guidelines to minimize design variations
– Layout topology check and DFM rules for yield
– Circuit topology and electrical checks
– Global active clock pulse limiter for dynamic circuits
– Hold time margin scales with clock path delay
Reduce design sensitivity to technology leakage
– Limited dynamic logic circuit usage
– No Low-Vt devices
Array yield focus
– Array redundancy for bit cell stability fails
– Reduced cell stress during read

Clock Philosophy

Clock Distribution using Grid-Tree approach
– Minimal global clock skew – HOLD margin built into latch timing rule
– Do not include clock arrival times in chip static timing – eliminates dependency on clock distribution analysis
– Clock Distribution area is pre-allocated and tuned concurrently with unit integration

[Figure: clock distribution, labeled “Main Mesh”]

Timing Practices – “Fast Convergence”

Macro partitioning encouraged to be on timing/latch boundaries
Unit/Partition/Chip level static timing done early and often - progressively improving accuracy
– Shell rules -> schematic based rules -> layout extracted rules
– Steiner routes -> add wire codes -> 3D extraction -> noise uplift
All latches treated as hard timing boundaries, no transparency
Transistor level static timing required for all macros
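Treating every latch as a hard boundary turns static timing into a longest-path problem on an acyclic graph: each path starts at a latch output at time zero and must reach the next latch input within one cycle. The sketch below shows that idea under simplifying assumptions. It uses one delay per edge, omits setup time and clock skew, and models latch outputs (`.Q`) and inputs (`.D`) as separate nodes. The node names and delays are invented, and this is not how Einstimer or EinsTLT work internally.

```python
from graphlib import TopologicalSorter


def capture_slacks_ps(edges, launch_nodes, capture_nodes, cycle_ps):
    """Worst-case slack at each capturing latch input.

    edges: {(src, dst): delay_ps} describing an acyclic timing graph.
    launch_nodes: latch outputs; arrival time is forced to 0 there, which
    is what "hard boundary, no transparency" means in this sketch.
    """
    preds = {}
    for (src, dst), delay in edges.items():
        preds.setdefault(src, {})
        preds.setdefault(dst, {})[src] = delay

    arrival = {}
    order = TopologicalSorter({n: set(p) for n, p in preds.items()}).static_order()
    for node in order:  # predecessors are always visited first
        if node in launch_nodes or not preds[node]:
            arrival[node] = 0.0
        else:
            arrival[node] = max(arrival[s] + d for s, d in preds[node].items())

    return {node: cycle_ps - arrival[node] for node in capture_nodes}


if __name__ == "__main__":
    edges = {
        ("A.Q", "g1"): 100.0,
        ("g1", "g2"): 80.0,
        ("A.Q", "g2"): 50.0,
        ("g2", "B.D"): 60.0,
    }
    # Longest path A.Q -> g1 -> g2 -> B.D is 240 ps; a 312.5 ps cycle
    # (3.2 GHz) leaves 72.5 ps of slack.
    print(capture_slacks_ps(edges, {"A.Q"}, {"B.D"}, 312.5))
```

Because no path crosses a latch, a macro's internal latch-to-latch paths can be closed separately and the results remain valid when the same paths appear in the chip-level run.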

Hierarchical Timing Example

Timing at 4 Levels of Hierarchy:
    Unit (eg: sfx)
    Island (eg: spu core)
    Partition (eg: spc)
    Chip
Hierarchical approach breaks down larger problem into manageable pieces (Units)
Chip Timing run times all paths across all hierarchies.
Internal Macro Timing Closed via EinsTLT but ALL paths visible in chip run

[Figure: hierarchy diagram with boxes labeled Chip, Partition, Island, Unit A, Unit B, and Macro]

Noise Analysis Example

Macro Analysis: Noise analysis with focus on transistors and wires
Unit/Chip Analysis: Global analysis with focus on behavior of wires

Power Management Practices

Dynamic power is controlled by fine-grain clock gating
Leakage power is managed by adding lower vt devices only where necessary
Accurate power estimation
– Macro level uses circuit simulation and generates a power rule (0-50% input switching)
– Partition/Chip level uses behavior simulation with specific workloads and macro level power rules
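One way to picture the two-level power estimate is a per-macro "power rule" tabulated against input switching activity, which the chip-level step looks up using activity taken from a workload simulation. The sketch below assumes a rule is a small table over the 0 to 50% switching range with linear interpolation between points. The table format, the interpolation, the class name, and all numbers are hypothetical; the deck does not describe the real rule format.

```python
from bisect import bisect_right


class PowerRule:
    """Macro power (mW) as a function of input switching fraction."""

    def __init__(self, points):
        # points: [(switching_fraction, power_mw), ...], e.g. 0.0 to 0.5
        self.points = sorted(points)
        self.xs = [x for x, _ in self.points]

    def power_mw(self, switching):
        if not self.xs[0] <= switching <= self.xs[-1]:
            raise ValueError("switching activity outside characterized range")
        i = bisect_right(self.xs, switching)
        if i == len(self.xs):          # exactly at the last table point
            return self.points[-1][1]
        (x0, y0), (x1, y1) = self.points[i - 1], self.points[i]
        return y0 + (y1 - y0) * (switching - x0) / (x1 - x0)


def chip_power_mw(rules, activity):
    """Sum macro powers using per-macro activity from a workload run."""
    return sum(rules[name].power_mw(sw) for name, sw in activity.items())


if __name__ == "__main__":
    rules = {
        "macro_a": PowerRule([(0.0, 2.0), (0.5, 10.0)]),
        "macro_b": PowerRule([(0.0, 1.0), (0.5, 5.0)]),
    }
    workload_activity = {"macro_a": 0.25, "macro_b": 0.1}
    print(chip_power_mw(rules, workload_activity))  # about 7.8 mW
```

In this sketch the nonzero power at 0% switching stands in for clock and leakage power that does not depend on data activity.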

Integration Flow
VHDL To Finished Layout

Common Code And Methodology Infrastructure With RLM
Additional Steps Unique To Unit Construction
– Generate Power Busses
– Buffer Planning/Insertion
– Generate hierarchy design constraints
– Decap Insertion
– Unit Clock Router, minimize power
– Routing with noise awareness, wire bending
– Generate Power and Redundant Vias
– Verification and Analysis: Extraction, Timing, IREM, Noise, Meth Check, Density Check, Yield Rule Check, DRC/LVS, Verity
Saved Parameters For Each Design Making Rebuild Simple
– Use Of Existing Designs As Template For New Designs

Hot Spot Analysis

Extensive thermal analysis early in the design cycle
Power maps created for use with package and heat sink models.
Steady state and transient thermal behavior simulated
Analysis feedback to chip floorplan and thermal sensor design
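The difference between steady-state and transient thermal behavior can be shown with the simplest possible model: a single lumped thermal resistance and capacitance per block. The sketch below is a textbook first-order RC approximation, not the package and heat-sink models the slide refers to. It ignores heat flow between neighboring blocks, and every number in it is invented.

```python
import math


def steady_state_c(power_w, r_th_c_per_w, ambient_c=25.0):
    """Final temperature once the block has fully warmed up."""
    return ambient_c + power_w * r_th_c_per_w


def transient_c(power_w, r_th_c_per_w, c_th_j_per_c, t_s, ambient_c=25.0):
    """Temperature t_s seconds after power is applied to a block at ambient."""
    tau = r_th_c_per_w * c_th_j_per_c          # thermal time constant, seconds
    rise = power_w * r_th_c_per_w              # steady-state rise above ambient
    return ambient_c + rise * (1.0 - math.exp(-t_s / tau))


if __name__ == "__main__":
    # Hypothetical power map: block name -> watts.
    power_map = {"core_0": 8.0, "core_1": 5.0, "io": 2.0}
    r_th, c_th = 2.0, 0.5                      # assumed per-block values
    for block, watts in power_map.items():
        print(block,
              round(transient_c(watts, r_th, c_th, t_s=1.0), 1),
              round(steady_state_c(watts, r_th), 1))
```

Blocks whose steady-state temperature stands out in such a map are natural candidates for floorplan changes or thermal sensor placement.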

Hierarchical Verification

Top Down Specification / Bottom up Implementation
Test Generation: provide simulation with good stimulus
Model Build, Simulation, and Analysis
Formal Verification

Test / Pervasive Design Practices

Distributed test functions
– LBIST engine for cores
– ABIST engine for arrays
Distributed debug features
– Common debug bus
– Centralized trace array
Centralized test and pervasive control
– Common strategy for logic debug and performance monitoring
– Monitor some activity externally
Early focus on design bring up
– At speed test (internal chip scan, ABIST, programmable LBIST)
– On chip logic analyzer for debug
– On chip performance monitor
– Isolate, start, stop, step controls for lab debug.
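Logic BIST is conventionally built from a pseudo-random pattern generator and a signature register that compresses the responses. The sketch below illustrates that general principle only. The 16-bit register width, the feedback taps (a well-known maximal-length choice), the 16-pattern session, and the two toy logic functions are all assumptions, and nothing here reflects the CELL LBIST engine itself.

```python
MASK = 0xFFFF  # 16-bit registers


def lfsr_step(state: int) -> int:
    """Advance a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) by one shift."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return ((state >> 1) | (bit << 15)) & MASK


def run_lbist(logic, patterns: int = 16, seed: int = 0xACE1) -> int:
    """Apply pseudo-random patterns to `logic` and return the signature.

    The signature register (a MISR) is shifted like the LFSR and then
    XORed with each response, so any response error disturbs it.
    """
    pattern, signature = seed, 0
    for _ in range(patterns):
        response = logic(pattern) & MASK
        signature = lfsr_step(signature) ^ response
        pattern = lfsr_step(pattern)
    return signature


def good_logic(x: int) -> int:
    """Hypothetical fault-free logic under test."""
    return (x ^ (x >> 3)) & MASK


def faulty_logic(x: int) -> int:
    """Same logic with output bit 0 inverted by a defect."""
    return good_logic(x) ^ 0x0001


if __name__ == "__main__":
    golden = run_lbist(good_logic)
    print(run_lbist(good_logic) == golden)    # True: fault-free part matches
    print(run_lbist(faulty_logic) == golden)  # False: the defect is detected
```

Only the final signature has to be compared against a known-good value, which is why this style of test can run at speed with very little external data.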

Lessons Learned                   Recommendation

Data Translation Time             Open Access DB
Early PDV Planning                Black box approach
Layout automation                 Migration and DFM friendly layouts
Synthesis to layout loop          Physical/DFM aware synthesis
Hardware resource                 Linux based CAD flow for better ROI and TAT
Communication                     Wiki based documentation system
Multiple sites and IT/OS Issues   Regression suite

Conclusions

The CELL processor, a multi-core design, was successfully implemented using
– Innovative design methodology
– Good design practices
– Rules for modularity and reuse
– Triple Constraints for optimum design point
Correct operation has been observed with good Frequency range (over 3.2GHz)
Sony/SCEI announced PS3 System in 5/05
Recommendations being implemented in the next generation chips!

Acknowledgement

The Authors: Dac Pham (APDAC 2006 Presentation), Han-Werner Anderson, Erwin Behnen, Mark Bolliger, Sanjay Gupta, Peter Hofstee, Paul Harvey, Charles Johns, Jim Kahle, Atsushi Kameyama, John Keaty, Bob Le, Sang Lee, Tuyen Nguyen, John Petrovick, Mydung Pham, Juergen Pille, Stephen Posluszny, Mack Riley, Joseph Verock, James Warnock, Steve Weitzel, Dieter Wendel.

Deep collaboration and many contributions from the entire SONY-Toshiba-IBM team who worked tirelessly side-by-side on the design of this processor.

The executive management teams of the three companies who provided management insight and created the right business conditions for this project.

Thank You