an ising computing to solve combinatorial optimization …€¦ · · 2017-11-07optimization...
TRANSCRIPT
Research & Development group, Hitachi, Ltd., Tokyo, Japan
10/19/2017
5th Berkeley Symposium on Energy Efficient Electronic Systems & Steep Transistors Workshop
Masanao Yamaoka
An Ising Computing
to Solve Combinatorial Optimization Problems
“For Internal E3S Use Only. These Slides May Contain Prepublication Data and/or Confidential Information.”
Difference between NN and Ising model
Nodes connected by bi-directional edges
Optimization problem is main target
1
Neural network
Connection
Ising model
(Boltzman machine)
Directional
(feedfoward NN)Bi-directional
Purpose LearningCombinatorial optimization
problems
Outline
Background
New-paradigm computing
CMOS Ising computer
Measurement results
For practical application
Importance of optimization
Optimization processing used in various industries
In IoT era, data volume large in both edge and cloud
Logisticsoperation
VLSI design Medical diagnosiswith images
Managementstrategy
Route selection Learning plancustomizing
Smart-gridcontrol
Roboticscontrol
Scheduling atwide-scale disaster 3
Example areas that needs optimization processing
Combinatorial optimization problem (1)
City
Pattern of solution candidates
Dis
tan
ce
Route1
Route2Route1
Route2Route3
Solution
Example: travelling salesman problem
Searching the shortest route that visits each city
exactly once and returns to the origin city with list of
cities and distances between cities
4
Combinatorial optimization problem (2)
Problem to explore an optimum solution for
minimum or maximum KPI in given conditions
Difficult to solve by conventional computers due to
enormous candidates of solution with large number
of parameters
Pattern of solutions (2n)
KP
I
: Solution candidates
Optimum solution
KPI: Key Performance Indicator
Exploring
5
Ising model
Ising model: expressing behavior of magnetic spins,
upper or lower directions
Spin status updated by interaction between spins to
minimize system energy
H: Energy of Ising model
si: Spin status (+1/-1)
Jij: Interaction coefficient
hj: External magnetic coefficient
J12 J23
J14 J25J36J45 J56
J47 J58J69
J78 J89
s1s2
s3
s4s5
s6
s7s8
s9j
j
j
ji
jiij hJH sss ,
6
Computing with Ising model
Shape of landscape of Ising model energy same as
KPI plot of combinatorial optimization problem
By mapping original problems to Ising model,
optimum solution acquired as ground state of model
En
erg
yo
f sy
stem
H(K
PI)
Spin status (2n patterns)
Ground state
n: number of spins
7
j
j
j
ji
jiij hJH sss ,
Ising modelOptimization
problems
Energy H KPI
Spin status si Control parameters
Interaction coefficient Jij
Input data(sensor data, etc)
Correspondence of parameters
CMOS Ising computing
Mimicking Ising model with CMOS circuits
Spin status updated by interaction with adjacent
spinsCMOS circuitsIsing model
J12 J23
J14 J25J36J45 J56
J47 J58J69
J78 J89
s1s2
s3
s4s5
s6
s7s8
s9
Ising model
Spin status si , +1/-1
Interaction Jij
s1 J12 J23
J45 J56
J78 J89
J14J25
J36
J47J58
J69
s2 s3
s4 s5 s6
s7 s8 s9
CMOS
Memory "1"/"0"
Interaction coefficient: memoryInteraction operation: digital circuits
CMOS circuit mimicking
Ising model
8
Rule of spin status update
Spin status updated to lower Ising model energy
Coefficient +: same direction
Coefficient - : opposite direction
Majority of adjacent spins effect accepted
Next spin status:in case a>b, s5=+1in case a<b, s5 =-1in case a=b, s5 =+1 or -1
a=number of (+1, +1) or (-1, -1)b=number of (+1, -1) or (-1, +1)(value from adjacent spin, coefficient)
Spin update rulesj
j
j
ji
jiij hJH sss ,
Spin status:
Jij > 0: same direction
Jij < 0: opposite direction
9
Operation of CMOS Ising computer
Spin-update rule achieved by majority voter
Each spin has update circuit and updated in parallel
Short calculation time even when number of spins
is large
SRAM Digital circuits
CMOS Ising computer
s1 J12 J23
J45 J56
J78 J89
J14J25
J36
J47J58
J69
s2 s3
s4 s5 s6
s7 s8 s9
Coefficient J25
Spin s5
Maj
ori
ty v
ote
r
Coefficient J45
Coefficient J58
Coefficient J56
s8
s4
s6
s2
Spin unit 10
CMOS annealing
Only using adjacent spin interaction, spin status
stuck at local minimum status
To avoid local minimum sticking, random status
transition used
Optimum solution not always acquired
Transition to lower energy
(adjacent spin interaction)
Avoidance of local minimum
(random transition)En
erg
yo
f sy
stem
H(K
PI)
Spin status (2n patterns)
Solution
n: number of spins11
Two sequences of 1-bit random numbers input,
propagated, and evaluated at spin interaction
Only two PRNGs provide randomness to whole chip
1-bit PRNG
1-b
it P
RN
G
PRNG: Pseudo Random Number Generator
Zigzag paths Spin inversion using random pulse
Output of the interaction circuit will be
inverted if both of the random pulses
are ‘1’
Majorityvoter
Spin unit
ANDXORSpin
Random pulse 1
Ran
do
m p
ulse 2
M. Hayashi et al., "An Accelerator Chip for Ground-state Searches of the Ising Model with Asynchronous Random Pulse Distribution,"
6th International Workshop on Advances in Networking and Computing
Random transition: random numbers
12
Random pulse control
Spin flip rate gradually lowered for annealing
Spin flip rate controlled by mark ratio of 1bit PRNGs
Mark ratio during ground-state search
Decreasing mark ratio has similar
effect to cooling schedule in
Simulated Annealing
PRNG
Controlling mark ratio of 1bit PRNG
Comparator
Thresholde.g. LFSR
0 (r < Threshold)1 (Otherwise)
1bit PRNG output @High threshold (0.75)
1bit PRNG output @Low threshold (0.25)
r
1bit PRNG
0
1
0
1
0
0.2
0.4
0.6
0.8
1
Mark ratio
Expected spin flip rate Time
Aggressively escape from local minima
Settle to nearby low energy solution
Time
Time
13
Fabrication results: Ising chip
4 mm3 m
m
1k-spin
1k-spin sub-array780 x 380 mm2
SRAM I/F
Items Values
Number of spins 20k (80 x 128 x 2)
Process 65 nm
Chip area 4x3=12 mm2
Area of spin 11.27 x 23.94 =270 mm2
Number ofSRAM cells
260k bits
Spin value: 20k bits
Interaction coefficient: 240k bits
Memory IF 100 MHz
Interaction speed 100 MHz
Operating currentof core circuits
(1.1 V)
Write: 2.0 mA
Read: 6.0 mA
Interaction: 44.6 mA
14
1st generation prototype
Computing node
Configuration 2U rack mount
Operation frequency
100 MHz
Number of spins
40k (2chips)
OS Linux
Ising chips
Ising chips installed on computing node
FPGA installed to control Ising chip
Accessed via LAN cable from PCs/servers
15
Measurement results with random numbers
MAX-cut problem, NP-complete problem, with 20k-
spin solved
Coefficient values set "ABC" appeared at optimum
Optimum solution not always acquired
Time (ms)
0 2 4 6 8 10
0
En
erg
y (x
10,0
00)
123
-1-2-3-4-5
Initial state
5 ms (500,000 steps)
10 ms (1,000,000 steps)
(a)
(b)
(c)
(a)
(b)(c)
16
Operating energy
1,800 times higher energy efficiency for 20k spin
problem than approximation algorithm on CPU
Number of spins (problem size)
En
erg
y ef
fici
ency
vs
app
roxi
mat
e al
go
rith
m
1
10
100
1000
10000
8 64 512 4096 32768
x 1800
Conditions:
Randomly generated Maximum-cut problems, energy for same accuracy solution
Ising chip: VDD=1.1 V, 100-MHz interaction, best solution among 10-times trial is selected.
Approximation algorithm: SG3(*) is operated on Core i5, 1.87 GHz, 10 W/core.
(*): Sera Kahruman et al., "On Greedy Construction Heuristics for the Max-Cut Problem,”
International Journal on Computational Science and Engineering, Volume 3, Number 3/2007, pp. 211-218, 2007. 17
Comparison of Ising computers
Room-temperature operation for easy use
Higher scalability by CMOS process
CMOS annealing Quantum annealing [1]
ApproachIsing computing
Semiconductor (CMOS) Superconductor
Coefficient value
2 bit (+/- 1 or 0) 4 bit
Accuracy Sufficient for application Better
Temperature Room temperature 15 mK
Power 0.05 W 15,000 W with cooling
Scalability(Num. of spins)
20k (65nm)Scalable with CMOS
2048
[1] http://www.dwavesys.com/18
19
How to solve "real" problems
Development of application and software
techniques required for Ising computer
Reduction of traffic jamReduction of logistics
costStable power supply
Systems Traffic Supply chain Power grid
Application
Software
Problem
Mapping
Graph
embedding
Shortest path problem
H(s) = ...
Ising model
on computer
Traveling salesman Max-flow problem
H(s) = ...
J12 J23
J14 J25 J36J45 J56J47 J58 J69
J78 J89
s1 s2 s3
s4 s5 s6
s7 s8 s9
N
i
N
j
N
a
NajiaijDH1 1 1
mod1,sss
2
1 1
2
1 1
11N
i
N
a
ia
N
a
N
i
ia ss
Mapping to Ising-model energy functions
J12=x, J23=y, ...
Hardware
Embedding to Ising hardware
Ising computer
20
Control PC- FPGA control- Software
Reconfigurable FPGA used to trial various structures (King's graph)
2nd generation prototype with FPGA
For application and software development, 2nd
generation FPGA prototype used
Various structures for trials (various topology,
various bit number of coefficient)
21
Demo of CMOS Ising computer
2. Communicationorder allocation
1. Frequency allocationfor wireless radio
3. Server security
4. Image inpainting 6. Noise reduction
8. Community coredetection
7. Facility allocation 9. Machin learning(Boosting)
5. Exploring explosion material
Detected
Detected
22
Ising model (physics model)
Express by Ising model
Ising computer
Transformation algorithm
Transform to HW topology
Problem
mapping
Graph
embedding
Processing
Graph embedding (software technique)
Complex graph embedded to simple graph
Graph embedding to use hardware efficiently
Summary
New computing is necessary for system
optimization with large amount of data.
CMOS Ising computing for combinatorial
optimization problem is proposed.
20k-spin prototype chip achieves 1,800 times
higher power efficiency.
Software techniques and related hardware
techniques are necessary for practical application.
“For Internal E3S Use Only. These Slides May Contain Prepublication Data and/or Confidential Information.”