TRANSCRIPT
© 2018 Arm Limited
Ian Smythe, October 2018
Driving Performance
Beyond Moore’s Law
Innovation continues to drive growth and performance demands on our compute devices
5G transformation
Multi-day battery life
Security
Small to large screen
Untethered. Connected. Immersive.
Last year we announced new DynamIQ processors
• >20% more mobile performance vs Cortex-A73
• Same sustained performance as Cortex-A73
• +40% infrastructure performance vs Cortex-A72
Performance leadership in mobile
Best possible power profile
Improved performance in infrastructure
• Up to 2x more performance
• Up to 15% better power efficiency
• Up to 10x more configurable
For advanced use cases
Higher sustained performance
Edge to cloud scalability
Arm Cortex-A portfolio
Timeline: 2008 – 2013, 2014, 2015, 2016, 2017, 2018
Year of IP release, volume devices in the subsequent year
Rows: Cortex-A7x series, Cortex-A5x series, Cortex-A3x series
Legend: Arm big.LITTLE compatible
Armv7-A
• Cortex-A9 (32-bit): well-established, mid-range processor
• Cortex-A5/A7 (32-bit): smallest and lowest power Armv7-A
• Cortex-A15/A17 (32-bit): infrastructure performance; mobile efficiency
Armv8-A
• Cortex-A57 (64/32-bit): proven infrastructure performance
• Cortex-A72 (64/32-bit): for all applications
• Cortex-A73 (64/32-bit): for mobile and consumer
• Cortex-A75 (64/32-bit): ground-breaking performance for all markets
• Cortex-A53 (64/32-bit): balanced performance and efficiency
• Cortex-A55 (64/32-bit): highest efficiency mid-range processor
• Cortex-A35 (64/32-bit): smallest, lowest power Armv8-A
• Cortex-A32 (32-bit): smallest, lowest power 32-bit Armv8-A
Arm Cortex-A portfolio: 2018 addition
• Cortex-A76 (64/32-bit, Armv8-A, Cortex-A7x series): laptop-class performance with smartphone efficiency
Designing for user experience from small screens to large
Energy savings and power efficiency
Efficient scheduling
• Power efficient cores for low demand tasks and background services
• Fast switching between cores
Power optimization
• Finer-grained speed control
• Autonomous memory power management
• Fast power on/sleep/off management
Designing for user experience from small screens to large
System responsiveness
High single thread performance for fantastic user facing response
• App. launch and closing
• Web browsing
• Productivity applications
Scalability with area efficient octa-core solution
• High CPU availability for performance and responsiveness
Let’s take a closer look
Arm Cortex-A76 CPU
Laptop-class performance, smartphone experience
• Built from the ground up with new microarchitecture capabilities
• Built on innovative DynamIQ technology
• Battery life that can outlast your work day
Longer Battery Life
Better energy efficiency
Performance without compromise
Increasing Productivity
Increased Machine Learning performance
Intelligent Computing
Cortex-A76: Performance efficiency - focus on the user
Cortex-A76 CPU is focused on performance and also performance efficiency
Performance efficiency - extract significantly more performance than any other microarchitecture at similar complexity
Requires intense focus on every aspect of the microarchitecture
• More performance from every logic block
Focus on the end-user to enable sustained full-speed performance
• Yes, we also do well on benchmarks
Cortex-A76: Front-end
Front-end built to hide latency at high bandwidth
Multi-level branch-target caches
Hybrid indirect predictor - unparalleled prediction capability
Block diagram: Front-end (branch prediction, instruction fetch, L1-ITLB, 64K I-Cache) feeding Decode/Rename/Commit
Cortex-A76: Decode/Rename/Commit
4-instruction/cycle, power-optimized decode
High-density decode/rename
Dispatch to out-of-order core and commit unit
Block diagram: Decode/Rename/Commit (Decode, Register rename, DQ, Dispatch, Commit), connected to the Execution core and the L1 Data cache / MMU
Diagram rates: 4-8 instructions/cycle, 4 Mops/cycle, 8 uops/cycle dispatch
Cortex-A76: Execution core
Uops dispatched to 120-entry issue queue capacity
Dual 128-bit ASIMD/FP execution pipelines
Block diagram: Execution core with three issue queues (IQ), connected to the L1 Data cache / MMU
Integer pipelines: Branch, ALU, ALU, ALU/MAC/DIV
ASIMD pipelines: FMUL/FADD/FDIV/ALU/IMAC and FMUL/FADD/ALU
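As a rough illustration of what "dual 128-bit ASIMD/FP execution pipelines" means for peak vector width, the sketch below counts how many elements fit across both pipelines per cycle. It assumes each pipeline can start one full-width 128-bit operation every cycle, which the slide does not state; the element widths are illustrative choices, not figures from the deck.

```python
# Assumption: each pipeline starts one full-width 128-bit operation per cycle.
# The slide gives only the pipeline count and the vector width.
PIPELINES = 2      # "Dual ... ASIMD/FP execution pipelines"
VECTOR_BITS = 128  # "128-bit"


def lanes_per_cycle(element_bits: int) -> int:
    """Peak vector elements processed per cycle across both pipelines."""
    return PIPELINES * (VECTOR_BITS // element_bits)


# Illustrative element widths (hypothetical selection, not from the slide).
for name, bits in [("64-bit", 64), ("32-bit", 32), ("16-bit", 16), ("8-bit", 8)]:
    print(f"{name} elements: {lanes_per_cycle(bits)} lanes/cycle")
```

Under that assumption, 32-bit elements give 8 lanes per cycle. Sustained throughput depends on which operations each pipeline supports and on issue constraints that the slide does not describe.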
Cortex-A76: Cache hierarchy and performance
Full cache hierarchy is co-optimized for latency and bandwidth
Sophisticated 4th generation prefetcher
256KB-512KB private L2 with 9-cycle LD-use
2MB-4MB DynamIQ L3 with 26-31 cycle LD-use
50% performance uplift from Cortex-A75
Chart: Memory hierarchy bandwidth, Cortex-A76 vs. Cortex-A75 (L1 cache, L2 cache, L3 cache, DRAM)
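To put the load-to-use cycle counts in wall-clock terms, here is a small sketch that converts cycles to nanoseconds. It assumes the counts are measured in core clock cycles and uses 3.3 GHz, the Cortex-A76 clock from the configuration note elsewhere in this deck. The slide does not say which clock domain the L3 figures refer to, so treat the L3 results as indicative only.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in clock cycles to nanoseconds at the given clock."""
    return cycles / clock_ghz


# Assumption: latencies are counted in core clock cycles at 3.3 GHz.
CLOCK_GHZ = 3.3
LOAD_TO_USE_CYCLES = {
    "L2": 9,             # "9-cycle LD-use"
    "L3 (low end)": 26,  # "26-31 cycle LD-use"
    "L3 (high end)": 31,
}

for level, cycles in LOAD_TO_USE_CYCLES.items():
    print(f"{level}: {cycles} cycles = {cycles_to_ns(cycles, CLOCK_GHZ):.2f} ns")
```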
Accelerating the performance curve in any workload
Pushing the single-thread performance
• +25% more integer IPC than the Cortex-A75 CPU
• +35% higher ASIMD/FP performance
• +90% higher memory bandwidth
Boosting mobile experience
• +28% more Geekbench performance
• +35% more Javascript performance
Enabling intelligence at the edge
• 3.9x more AI performance
Baseline IPC - frequency upside from here
Chart: IPC comparison, iso-process/-frequency (Cortex-A73, Cortex-A75, Cortex-A76)
Benchmarks: SPECINT2K6, SPECFP2K6, Geekbench v4, Javascript, LMBench memcpy, GEMM lowp
Labeled values: 1.58x, 1.79x, 1.56x, 1.77x, 2.44x, 9.7x
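The chart labels and the bullet percentages on this slide can be cross-checked with simple division. The sketch below rests on two assumptions that the transcript does not confirm: the six labeled values are Cortex-A76 relative to Cortex-A73 in benchmark order, and each bullet percentage (Cortex-A76 relative to Cortex-A75) belongs to the benchmark it is paired with here. Under those assumptions, dividing one by the other gives the implied Cortex-A75 uplift over Cortex-A73.

```python
# Assumption: chart labels are Cortex-A76 vs Cortex-A73, in benchmark order.
A76_VS_A73 = {
    "SPECINT2K6": 1.58,
    "SPECFP2K6": 1.79,
    "Geekbench v4": 1.56,
    "Javascript": 1.77,
    "LMBench memcpy": 2.44,
    "GEMM lowp": 9.7,
}

# Assumption: bullet figures are Cortex-A76 vs Cortex-A75, paired by topic
# (integer IPC, ASIMD/FP, Geekbench, Javascript, memory bandwidth, AI).
A76_VS_A75 = {
    "SPECINT2K6": 1.25,
    "SPECFP2K6": 1.35,
    "Geekbench v4": 1.28,
    "Javascript": 1.35,
    "LMBench memcpy": 1.90,
    "GEMM lowp": 3.9,
}


def implied_a75_vs_a73(benchmark: str) -> float:
    """Cortex-A75 over Cortex-A73 ratio implied by the two sets of figures."""
    return A76_VS_A73[benchmark] / A76_VS_A75[benchmark]


for name in A76_VS_A73:
    print(f"{name}: implied Cortex-A75 vs Cortex-A73 = {implied_a75_vs_a73(name):.2f}x")
```

If the pairing holds, the implied ratios are a consistency check on the slide's numbers rather than independent measurements.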
Cortex-A76 CPU delivers premium performance
Charts: Peak single-thread performance; big.LITTLE performance 5W
Performance (relative scores based on AArch64 SpecInt2K6)
Cortex-A73 16nm
Cortex-A75 10nm
Cortex-A76 7nm
Configuration:
• Cortex-A73: 2.45 GHz, L1 64KB, L3 2MB
• Cortex-A75: 2.8 GHz, L1 64KB, L2 512KB, L3 2MB
• Cortex-A76: 3.3 GHz, L1 64KB, L2 512KB, L3 4MB
2x performance improvement
Labeled values: 2.1x, 1.9x
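One way to read the peak single-thread figure is as IPC uplift multiplied by clock uplift. The sketch below takes the 1.58x SPECINT2K6 value from the IPC comparison chart and assumes it is Cortex-A76 relative to Cortex-A73 at equal frequency. It scales that value by the clocks in the configuration note and ignores memory-system and process effects, so it is a first-order estimate, not Arm's methodology.

```python
def relative_performance(ipc_ratio: float, clock_ghz: float,
                         baseline_clock_ghz: float) -> float:
    """First-order estimate: performance scales with IPC times frequency."""
    return ipc_ratio * (clock_ghz / baseline_clock_ghz)


# Assumption: 1.58x is the iso-frequency SPECINT2K6 ratio, Cortex-A76 vs Cortex-A73.
IPC_RATIO_A76_VS_A73 = 1.58
A76_CLOCK_GHZ = 3.3   # from the configuration note
A73_CLOCK_GHZ = 2.45  # from the configuration note

estimate = relative_performance(IPC_RATIO_A76_VS_A73, A76_CLOCK_GHZ, A73_CLOCK_GHZ)
print(f"Estimated Cortex-A76 vs Cortex-A73 single-thread: {estimate:.2f}x")
```

The estimate comes out close to the 2.1x label, which suggests (but does not prove) that the label is the peak single-thread comparison between those two cores.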
Cortex-A76
Building the premium experience on advanced process nodes
High-performance Cortex-A76 implementation: 3+ GHz in 7nm
Increasing Cortex-A55 CPU private L2 cache
Implementing 4MB L3 cache
Optimized memory system
System diagram labels: Cortex-A55, DynamIQ Shared Unit, CoreLink CCI-550, DMC (two instances), LPDDR4x, Memory System, Integrated TrustZone technology, Other IPs
It starts with an ecosystem
Always On, Always Connected PCs powered by Snapdragon
Pace of innovation
Devices: Miix 630, NovaGo, Envy x2, Yoga C630, and more…
Label: 835
*Requires network connection and will support up to 20 hours of battery life
Credit: Qualcomm Technologies, Inc.
Process Technology
Mobile leadership
Chart: process node (y-axis) over time (x-axis), 2008 to 2018
Snapdragon: 45nm Snapdragon S1, 28nm Snapdragon S4, 20nm Snapdragon 810, 14nm Snapdragon 820, 10nm 1st Gen Snapdragon 835, 10nm 2nd Gen Snapdragon 850
Other plotted nodes: 45nm, 32nm, 22nm, 14/16nm, 14/16nm; 40nm, 28nm, 14nm, 12nm
Credit: Qualcomm Technologies, Inc.
Performance
Performance for scenarios that matter
Balancing power and performance
Sustained performance that doesn’t throttle
Credit: Qualcomm Technologies, Inc.
Delivering on promises
Chart: Cortex-A75 based system performance (relative to Cortex-A73 system)
Benchmarks: Speedometer, Geekbench single-core, Geekbench multi-core, WebXPRT 3, TouchXPRT16, MotionMark 1.0
Labeled values: 1.4x, 1.3x, 1.2x, 1.3x, 1.4x, 1.4x
Improvement across all benchmarks
Over 25% minimum performance uplift
Source: Shrout Research, measured on Lenovo C630 and HP Envy x2 devices
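A common way to summarize several benchmark ratios in one number is the geometric mean. The sketch below applies it to the six labeled values from the chart. The labels appear to be rounded to one decimal place, which is an assumption on my part and may explain why the smallest label sits below the "over 25%" claim. The result is only as precise as those labels, and this summary is not part of the Shrout Research data.

```python
from statistics import geometric_mean

# Labeled chart values, Cortex-A75 based system relative to Cortex-A73 system.
# Assumption: labels are rounded to one decimal place.
UPLIFTS = [1.4, 1.3, 1.2, 1.3, 1.4, 1.4]

# The geometric mean is the usual average for ratios, because it treats
# "1.4x faster" and "1.4x slower" symmetrically.
overall = geometric_mean(UPLIFTS)

print(f"Geometric mean uplift: {overall:.2f}x")
print(f"Smallest labeled uplift: {min(UPLIFTS):.1f}x")
```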
Extended battery life and thermal constraints on real-system
Running your apps longer
Multi-day battery life
1.3x web browsing battery life improvement (relative to Cortex-A73 system)
1.3x time improvement between charges (relative to Cortex-A73 system)
Source: Shrout Research, measured on Lenovo C630 and HP Envy x2 devices
The evolution of the always on, always connected PC
Image created by Arm based off Shrout Research data: download the full whitepaper at www.shroutresearch.com
The journey ahead
Client Compute CPU roadmap
Timeline: 2017, 2018, 2019, 2020
Cortex-A73: 16nm
Cortex-A75: 10nm
Cortex-A76: 7nm
‘Deimos’: 7nm
‘Hercules’: 7nm and 5nm
Path to compute performance leadership with efficiency
Chart: performance over time from 2013; single-core performance estimates based on SPECINT2k6
Arm Compute: Cortex-A15 (28nm), Cortex-A57 (20nm), Cortex-A72 (16nm), Cortex-A73 (16nm), Cortex-A75 (10nm), Cortex-A76 (7nm), Deimos (7nm), Hercules (5nm)
Intel Core i5 U-series: Core i5-4300u (22nm), Core i5-6300u (14nm), Core i5-7300u (14nm)
2.5x increase
A performance trajectory surpassing Moore’s law
Unmatched year-over-year Arm CPU performance gains
Expanding the mobile experience
• Innovation on mobile from small screens to large is changing the user experience and continues to push growth
• The new premium IP delivers Laptop-class performance
• Arm with its ecosystem is aligning itself to meet customer needs and get ready for 5G evolution for truly connected experiences
Thank You, Danke, Merci, 谢谢, ありがとう, Gracias, Kiitos, 감사합니다, धन्यवाद, תודה