Software and Architectures for Large-Scale Quantum Computing

Fred Chong
Seymour Goodman Professor of Computer Architecture
Department of Computer Science, University of Chicago

with Ken Brown, Margaret Martonosi, Diana Franklin, Ravi Chugh, John Reppy, Ali Javadi Abhari, Jeff Heckey, Daniel Kudrow, Shruti Patil, Adam Holmes, Alexey Lvov, Sergey Bravyi (GATech, Princeton, UChicago, IBM)



Page 2: Software and Architectures for Large-Scale Quantum Computing

F. Chong -- QC 2

Why Quantum Computing?

• Factorization (Shor’s Algorithm)
  ◦ $n^3$ instead of exponential
• Search (Grover’s Algorithm)
  ◦ Cost measured in function evaluations
  ◦ $\sqrt{n}$ instead of $n$

(CACM 2010)

Page 3: Software and Architectures for Large-Scale Quantum Computing

Progress in QC Algorithms

[Figure: plot with x-axis 1980 to 2020 and y-axis 0 to 300. Source: http://math.nist.gov/quantum/zoo/]

Page 4: Software and Architectures for Large-Scale Quantum Computing

Recent excitement for Quantum Computing

• 5- and 7-qubit machines were built 15 years ago [Vandersypen00, Laflamme99]
• D-Wave: 2048-qubit quantum annealing
• Substantial investments by Google, Microsoft, IBM, USA, UK
• Large-scale machine in 10-20 years

(D-Wave)

Page 5: Software and Architectures for Large-Scale Quantum Computing

This Talk

• A systems perspective to accelerate QC development
  ◦ Scalable architectures to guide device development
  ◦ Software systems to enable pre-machine, large-scale applications work
• E.g., a $10^6$ increase in efficiency in quantum chemistry (Microsoft) [arXiv:1403.1539v2]

(Chris Monroe, UMD)

Page 6: Software and Architectures for Large-Scale Quantum Computing

Outline

• Introduction to Quantum Computing
• Lessons Learned
  ◦ Specialization for reliability, parallelism, and performance
  ◦ Managing compiler resources for deep optimization
  ◦ Dynamic code generation for arbitrary rotations
• Future research
  ◦ Validation of large quantum programs
  ◦ Network routing to schedule surface code operations

Page 7: Software and Architectures for Large-Scale Quantum Computing

INTRODUCTION TO QUANTUM COMPUTING


Page 8: Software and Architectures for Large-Scale Quantum Computing

Quantum Bits (qubit)

• 1 qubit probabilistically represents 2 states: $|a\rangle = C_0|0\rangle + C_1|1\rangle$
• Every additional qubit doubles the number of states: $|ab\rangle = C_{00}|00\rangle + C_{01}|01\rangle + C_{10}|10\rangle + C_{11}|11\rangle$
• Quantum parallelism on an exponential number of states
• But measurement collapses qubits to single classical values
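The qubit bullets on this slide can be made concrete with a small state-vector sketch. This is an illustration only, using plain Python lists; the amplitudes and the helper names `kron` and `measure` are assumptions for the example, not part of the talk.

```python
import math
import random

def kron(u, v):
    """Tensor product of two state vectors given as lists of amplitudes."""
    return [x * y for x in u for y in v]

# One qubit: |a> = C0|0> + C1|1>, with |C0|^2 + |C1|^2 = 1.
a = [3 / 5, 4j / 5]                          # illustrative amplitudes (assumed)
b = [1 / math.sqrt(2), 1 / math.sqrt(2)]     # equal superposition

# Two qubits: 2^2 = 4 amplitudes, ordered C00, C01, C10, C11.
ab = kron(a, b)

# Measurement probabilities are the squared magnitudes of the amplitudes.
probs = [abs(c) ** 2 for c in ab]

def measure(state, rng):
    """Collapse to a single classical basis state, returned as a bit string."""
    n = len(state).bit_length() - 1          # number of qubits
    weights = [abs(c) ** 2 for c in state]
    outcome = rng.choices(range(len(state)), weights=weights)[0]
    return format(outcome, f"0{n}b")

sample = measure(ab, random.Random(0))
```

Each added qubit doubles the length of the list, while `measure` returns only one classical bit string per run, which is the tension the last two bullets describe.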

Page 9: Software and Architectures for Large-Scale Quantum Computing

7-qubit Quantum Computer

(Vandersypen, Steffen, Breyta, Yannoni, Sherwood, and Chuang, 2001)

• pentafluorobutadienyl cyclopentadienyldicarbonyliron complex
• Bulk spin NMR: nuclear spin qubits
• Decoherence in 1 sec; operations at 1 kHz
• Failure probability = $10^{-3}$ per operation
• Potentially 100 sec @ 10 kHz = $10^{-6}$ per op
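The failure probabilities quoted on this slide are consistent with a simple rule of thumb: one over the number of operations that fit in one decoherence time. The sketch below assumes that reading (the slide does not state the formula), and `failure_per_op` is a hypothetical helper name.

```python
def failure_per_op(decoherence_time_s, op_rate_hz):
    """Rough per-operation failure probability, assumed here to be the
    reciprocal of the number of operations per decoherence time."""
    ops_per_lifetime = decoherence_time_s * op_rate_hz
    return 1.0 / ops_per_lifetime

demonstrated = failure_per_op(1, 1e3)     # 1 s at 1 kHz    -> about 1e-3
projected = failure_per_op(100, 10e3)     # 100 s at 10 kHz -> about 1e-6
```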

Page 10: Software and Architectures for Large-Scale Quantum Computing

Not Gate

• Flips probabilities for |0> and |1>
• Conservation of energy
• Reversibility => unitary matrix

X Gate (Bit-flip, Not):

$$\alpha|0\rangle + \beta|1\rangle \;\xrightarrow{\;X\;}\; \beta|0\rangle + \alpha|1\rangle$$

$$\sum_i |C_i|^2 = |\alpha|^2 + |\beta|^2 = 1$$

$$X\,(X^{*})^{T} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = I$$

(* means complex conjugate)

Page 11: Software and Architectures for Large-Scale Quantum Computing

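Controlled Not

• Control bit determines whether X operates
• Control bit is affected by operation

Controlled Not (Controlled X, CNot):

$$a|00\rangle + b|01\rangle + c|10\rangle + d|11\rangle \;\xrightarrow{\;\text{CNot}\;}\; a|00\rangle + b|01\rangle + d|10\rangle + c|11\rangle$$

$$\text{CNot}\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} a \\ b \\ d \\ c \end{bmatrix}$$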
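A short sketch of the controlled-NOT acting on the four amplitudes (a, b, c, d), ordered |00>, |01>, |10>, |11> with the first qubit as control. The numeric amplitudes and the `apply` helper are assumptions chosen for illustration.

```python
import math

# CNOT in the basis |00>, |01>, |10>, |11> (first qubit is the control).
CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

def apply(matrix, state):
    """Matrix-vector product on plain lists."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]

# The amplitudes of |10> and |11> are exchanged: (a, b, c, d) -> (a, b, d, c).
swapped = apply(CNOT, [0.5, 0.5, 0.1, 0.7])

# With the control in superposition, (|0> + |1>)/sqrt(2) on control and |0>
# on target, the output is (|00> + |11>)/sqrt(2): the control is now
# correlated with the target and can no longer be described on its own.
s = 1 / math.sqrt(2)
bell = apply(CNOT, [s, 0, s, 0])
```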

Page 12: Software and Architectures for Large-Scale Quantum Computing

Universal Quantum Operations

H Gate (Hadamard):

$$\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \frac{1}{\sqrt{2}}\big((\alpha+\beta)|0\rangle + (\alpha-\beta)|1\rangle\big)$$

Z Gate (Phase-flip):

$$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \alpha|0\rangle - \beta|1\rangle$$

Controlled Not (Controlled X, CNot):

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = a|00\rangle + b|01\rangle + d|10\rangle + c|11\rangle$$

T Gate:

$$\begin{bmatrix} e^{-i\pi/8} & 0 \\ 0 & e^{i\pi/8} \end{bmatrix}\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = e^{-i\pi/8}\alpha|0\rangle + e^{i\pi/8}\beta|1\rangle$$
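As a sanity check on the gates named on this slide (X, Z, H, T, CNot), the sketch below writes each one as a plain Python matrix and verifies that it is unitary, which is the reversibility property the Not Gate slide asks for. The helper names are assumptions for illustration.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

s = 1 / math.sqrt(2)
gates = {
    "X": [[0, 1], [1, 0]],
    "Z": [[1, 0], [0, -1]],
    "H": [[s, s], [s, -s]],
    "T": [[cmath.exp(-1j * math.pi / 8), 0], [0, cmath.exp(1j * math.pi / 8)]],
    "CNOT": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]],
}

# A gate U is unitary when U times its conjugate transpose is the identity.
unitary = {name: close(matmul(g, dagger(g)), identity(len(g)))
           for name, g in gates.items()}

# Conjugating a phase flip by Hadamards gives a bit flip: H Z H = X.
hzh_is_x = close(matmul(gates["H"], matmul(gates["Z"], gates["H"])), gates["X"])
```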

Page 13: Software and Architectures for Large-Scale Quantum Computing

Reliability is hard

• Quantum computing should be hard
  ◦ Short lived
  ◦ Small systems
• Can’t copy data (no-cloning theorem)
  ◦ Need to protect it
• Can’t measure data
  ◦ How do we detect errors?

(Credit: Harald Risch)

Page 14: Software and Architectures for Large-Scale Quantum Computing

Quantum Error Correction

X12    X23    Error Type       Action
+1     +1     no error         no action
+1     -1     bit 3 flipped    flip bit 3
-1     +1     bit 1 flipped    flip bit 1
-1     -1     bit 2 flipped    flip bit 2

(3-qubit code)
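The syndrome table can be exercised with a purely classical stand-in. This sketch reads the three bits directly to compute the two parities, which a real quantum code must not do (it measures only the parities through ancillas); it is meant only to show that the table's lookup recovers from any single bit flip. Names such as `CORRECTION` and `correct` are hypothetical.

```python
# Syndrome table from the slide: (X12, X23) -> bit to flip (1-based) or None.
CORRECTION = {
    (+1, +1): None,
    (+1, -1): 3,
    (-1, +1): 1,
    (-1, -1): 2,
}

def parity(b1, b2):
    """+1 if the two bits agree, -1 if they differ."""
    return +1 if b1 == b2 else -1

def correct(bits):
    """Apply the table's action to a 3-bit codeword with at most one flip."""
    b = list(bits)
    syndrome = (parity(b[0], b[1]), parity(b[1], b[2]))
    flip = CORRECTION[syndrome]
    if flip is not None:
        b[flip - 1] ^= 1
    return b

# Try every single-bit error on both codewords, 000 and 111.
results = []
for logical in (0, 1):
    for error_pos in (None, 0, 1, 2):
        word = [logical] * 3
        if error_pos is not None:
            word[error_pos] ^= 1
        results.append(correct(word) == [logical] * 3)
```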

Page 15: Software and Architectures for Large-Scale Quantum Computing

Syndrome Measurement

[Figure: syndrome-measurement circuit. Labels: ancilla |0>; data qubits |Ψ1>, |Ψ2> with outputs |Ψ1'>, |Ψ2'>; X gates; measured syndrome X12.]

Page 16: Software and Architectures for Large-Scale Quantum Computing

3-bit Error Correction

[Figure: 3-qubit error-correction circuit. Labels: data qubits |Ψ0>, |Ψ1>, |Ψ2> with outputs |Ψ0'>, |Ψ1'>, |Ψ2'>; ancillas A0, A1; X gates; measured syndromes X01 and X12.]

Page 17: Software and Architectures for Large-Scale Quantum Computing

UCSB 4/04 F. Chong -- QC 17

Error Correction is Crucial

• Need continuous error correction
  ◦ Can operate on encoded data [Shor96, Steane96, Gottesman99]
• Threshold Theorem [Aharonov 97]
  ◦ Failure rate of $10^{-4}$ per op can be tolerated
• Practical error rates are $10^{-6}$ to $10^{-9}$

Page 18: Software and Architectures for Large-Scale Quantum Computing

Concatenated Codes

Concatenated Steane Code:

• 1 logical qubit
• Level 1: 7 physical qubits
• Level 2: 49 physical qubits

• Reliability increases doubly exponentially.
• Exponentially slower.
• Exponentially greater resources.

Page 19: Software and Architectures for Large-Scale Quantum Computing

Error Correction Overhead

Recursion (k)    Storage (7^k)    Operations (153^k)    Min. time (5^k)
0                1                1                     1
1                7                153                   5
2                49               23,409                25
3                343              3,581,577             125
4                2,401            547,981,281           625
5                16,807           83,841,135,993        3,125

• 7-qubit code [Steane96], applied recursively
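The overhead table follows directly from the three per-level factors in its header (7 for storage, 153 for operations, 5 for minimum time). A minimal sketch that regenerates the rows, with `overhead` as a hypothetical helper name:

```python
def overhead(k):
    """Overhead of the 7-qubit code applied recursively k times,
    using the per-level factors from the table header."""
    return {"storage": 7 ** k, "operations": 153 ** k, "min_time": 5 ** k}

rows = [overhead(k) for k in range(6)]
```

Going from level 2 to level 3 multiplies storage by only 7 but operations by 153, which is why the operation count dominates the cost of deeper recursion.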

Page 20: Software and Architectures for Large-Scale Quantum Computing

Trapped Ions for Quantum Computation

• Trapping electrodes are attached on aluminum substrate
• Qubits are stored in the internal electronic states of each ion
• Lasers (or microwaves) implement logic gates and measurement
• Sympathetic recooling ions reduce vibrational heating

[Figure: ion trap, with labels Electrode, Substrate, Ion, Space.]

Cirac and Zoller, PRL, v74, 1995; Kielpinski et al., Nature, v417, 2002

Page 21: Software and Architectures for Large-Scale Quantum Computing

Quantum Teleportation?

• Two ions are “entangled” in an inseparable state.
• One is sent to Alice and one to Bob.
• With some error, Alice can force Bob’s ion into resembling her data ion.
• By sending Bob two bits of classical information describing the error, Bob can accurately recreate the data ion.

[Figure: Alice holds the data ion and one half of the entangled ions (aka EPR pair); Bob holds the other half; a “reliable” classical information channel connects them.]

Bennett et al., PRL, v70, 1993; Barrett et al., Nature 429, 2004
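The teleportation steps can be traced with a small three-qubit state-vector simulation. This is a sketch of the textbook protocol (CNOT, Hadamard, two measurements, then X and Z corrections chosen by the two classical bits), not a model of the ion-trap experiments cited on the slide; the qubit ordering and helper names are assumptions.

```python
import math
import random

N = 3  # qubit 0: Alice's data; qubits 1, 2: EPR pair (1 with Alice, 2 with Bob)

def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q (qubit 0 is the most significant bit)."""
    shift = N - 1 - q
    out = [0j] * len(state)
    for i, amp in enumerate(state):
        bit = (i >> shift) & 1
        for new_bit in (0, 1):
            j = (i & ~(1 << shift)) | (new_bit << shift)
            out[j] += gate[new_bit][bit] * amp
    return out

def apply_cnot(state, control, target):
    out = [0j] * len(state)
    for i, amp in enumerate(state):
        if (i >> (N - 1 - control)) & 1:
            i ^= 1 << (N - 1 - target)
        out[i] += amp
    return out

def measure(state, q, rng):
    """Measure qubit q; return (bit, collapsed and renormalized state)."""
    shift = N - 1 - q
    p1 = sum(abs(a) ** 2 for i, a in enumerate(state) if (i >> shift) & 1)
    bit = 1 if rng.random() < p1 else 0
    norm = math.sqrt(p1 if bit else 1 - p1)
    return bit, [a / norm if ((i >> shift) & 1) == bit else 0j
                 for i, a in enumerate(state)]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def teleport(alpha, beta, rng):
    """Send alpha|0> + beta|1> from qubit 0 to qubit 2; return Bob's amplitudes."""
    state = [0j] * 8
    # (alpha|0> + beta|1>) tensor (|00> + |11>)/sqrt(2)
    state[0b000] = alpha * s
    state[0b011] = alpha * s
    state[0b100] = beta * s
    state[0b111] = beta * s
    state = apply_cnot(state, 0, 1)
    state = apply_1q(state, H, 0)
    m0, state = measure(state, 0, rng)
    m1, state = measure(state, 1, rng)
    # Two classical bits (m0, m1) tell Bob which correction to apply.
    if m1:
        state = apply_1q(state, X, 2)
    if m0:
        state = apply_1q(state, Z, 2)
    base = (m0 << 2) | (m1 << 1)
    return state[base], state[base | 1]

received = [teleport(0.6, 0.8j, random.Random(seed)) for seed in range(8)]
```

Whatever the measurement outcomes, Bob's corrected qubit carries the original amplitudes, and Alice's copy is destroyed by her measurements, so nothing is cloned.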

Page 22: Software and Architectures for Large-Scale Quantum Computing

LESSON 1: SPECIALIZATION


Page 23: Software and Architectures for Large-Scale Quantum Computing

Classical Control Processors

[Figure: “Quantum FPGA” layout: an array of Logical Qubit tiles and blocks labeled R, with Classical Control Processors alongside.]

[Metodi et al, Micro05]

Page 24: Software and Architectures for Large-Scale Quantum Computing

Factoring an Integer

• 128-bit: 63,730 Toffoli gates with 21 ECC steps per Toffoli for modular exponentiation. Thus we have 21(63,730) + QFT = $1.34 \times 10^6$ time steps = ~16 hours → 16 × 1/0.75 → ~21 hours
• 512-bit: 397,910 Toffoli gates + QFT → ~5.5 days
• 1024-bit: 964,919 Toffoli gates + QFT → ~13.4 days
• 2048-bit: 2,301,767 Toffoli gates + QFT → ~32 days

[Figure: pipeline: Modular Exponentiation, $f(x) = a^x \bmod M$ → QFT → Period of $f(x)$ → Classical post-processing]
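The 128-bit estimate can be re-derived from the numbers on the slide. The sketch below leaves out the QFT term (its size is not given) and does not interpret the 0.75 factor, which the slide applies without explanation; the per-step time is simply what the slide's figures imply.

```python
toffolis = 63_730
ecc_steps_per_toffoli = 21

# Modular exponentiation only; the slide adds an unspecified QFT term.
time_steps = toffolis * ecc_steps_per_toffoli

hours = 16                                     # slide's estimate for ~1.34e6 steps
seconds_per_step = hours * 3600 / time_steps   # implied duration of one step

adjusted_hours = hours / 0.75                  # slide's 16 * 1/.75 adjustment
```

The product is 1,338,330 steps, matching the quoted 1.34 x 10^6, and the implied step time is roughly 43 ms.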

Page 25: Software and Architectures for Large-Scale Quantum Computing


Area Problem

Solution: Specialized Architecture Elements?

90cm x 90cm

Page 26: Software and Architectures for Large-Scale Quantum Computing

Design Pyramid

[Figure: design pyramid with corners Speed, Reliability, and Area; annotation: Allowed Physical Component Reliability.]

Page 27: Software and Architectures for Large-Scale Quantum Computing

Limited Parallelism

• Modular Exponentiation Component: The Draper Carry-Lookahead Adder (64-qubit Adder)

Page 28: Software and Architectures for Large-Scale Quantum Computing

Specialization

• Compute Block: Ancilla : Data = 2 : 1
• Memory Block: Ancilla : Data = 1 : 8

[Figure legend: Logical Data Qubits, Logical Ancilla Qubits]

Page 29: Software and Architectures for Large-Scale Quantum Computing

Area Reduced

[Figure: “Factor of” area reduction (y-axis) versus Shor’s Alg. Adder Input Size (x-axis).]

Page 30: Software and Architectures for Large-Scale Quantum Computing

Area Reduced

90cm x 90cm → 28cm x 28cm

Page 31: Software and Architectures for Large-Scale Quantum Computing

Design Pyramid - Specialization

[Figure: design pyramid with corners Speed, Reliability, and Area, comparing the original design (Orig) with the Specialized design.]

Page 32: Software and Architectures for Large-Scale Quantum Computing

Concatenated Codes

Concatenated Steane Code:

• 1 logical qubit
• Level 1: 7 physical qubits
• Level 2: 49 physical qubits

• Reliability increases doubly exponentially.
• Exponentially slower.
• Exponentially greater resources.

Page 33: Software and Architectures for Large-Scale Quantum Computing

Faster Hierarchy

[Figure: Memory Block and Compute Block, with Cache @ Level 1 and Compute @ Level 1.]

[Thaker et al, ISCA 2006]

Page 34: Software and Architectures for Large-Scale Quantum Computing

Performance Benefits

[Figure: chart with y-axis “Factor of” (-2 to 10) and x-axis Shor’s Alg. Adder Input Size (256-bit, 512-bit, 1024-bit). Series: Area Reduced; Perf. Change; Hierarchy: Area Reduced; Hierarchy: Perf. Change.]

Page 35: Software and Architectures for Large-Scale Quantum Computing

Design Pyramid – Memory Hierarchy

[Figure: design pyramid with corners Area, Speed, and Reliability, comparing the original design (Orig) with the Hierarchy design.]

Page 36: Software and Architectures for Large-Scale Quantum Computing

LESSON 2: MANAGING COMPILER RESOURCES

Page 37: Software and Architectures for Large-Scale Quantum Computing

The Scaffold Language and Compiler

    #include "gates.h"
    module main ( ) {
      int i=0;
      qreg extarget[4];
      qreg excontrol[4];
      forall(i=0; i<4; i++) {
        CNOT(extarget[i],excontrol[i]);
      }
    }

• Extended C
  ◦ No pointers
  ◦ Quantum datatypes
  ◦ Extensible gates
  ◦ Parallel loops
  ◦ Reversible logic synthesis for classical functions (includes fixed point arithmetic)

[Heckey et al, ASPLOS 2015]

Page 38: Software and Architectures for Large-Scale Quantum Computing

Scalable Tailored QC Compilation

Quantum circuits are often specialized to one problem input or size:

• Benefits of customization: efficient circuits; deeply and statically analyzable.
• Versus lack of scalability: code explosion, with > $10^{12}$ ops for some applications!

• Need better balance of optimization and scalability
• QASM format changes: QASM-H, QASM-HL: 200,000X or more code size savings
• Modular analysis

Page 39: Software and Architectures for Large-Scale Quantum Computing

Effect of Remodularization

• Based on resource analysis, flatten modules with size less than a threshold
• Limited by memory on compilation machine

[Figure: Analysis Time (s), 0 to 6, and Critical Path Length Estimate (# gates), 4.8E+08 to 6.4E+08, versus Flattening Threshold for Remodularization (5k, 10k, 50k, 100k, 150k, 2M). Benchmark: Binary Welded Tree, n=300, s=1000.]
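The first bullet, flattening modules smaller than a threshold, can be sketched on a toy call tree. Everything here is hypothetical: the `Module` class, the gate counts, and the rule that a module's size is its total gate count including callees are assumptions for illustration, not the ScaffCC implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    name: str
    own_gates: int                               # gates written directly in the module
    calls: list = field(default_factory=list)    # one callee Module per call site

def total_gates(m):
    return m.own_gates + sum(total_gates(c) for c in m.calls)

def flatten(m, threshold):
    """Inline every callee whose total gate count is below the threshold;
    larger callees stay as separate modules (assumed policy)."""
    own = m.own_gates
    kept = []
    for callee in m.calls:
        callee = flatten(callee, threshold)
        if total_gates(callee) < threshold:
            own += total_gates(callee)           # body copied into the caller
        else:
            kept.append(callee)
    return Module(m.name, own, kept)

# Toy program: main calls mult 4 times; mult calls adder 10 times.
adder = Module("adder", 100)
mult = Module("mult", 500, [adder] * 10)         # 1,500 gates in total
main = Module("main", 10, [mult] * 4)            # 6,010 gates in total

small = flatten(main, 1_000)    # adder is inlined into mult; mult stays separate
large = flatten(main, 2_000)    # everything is inlined into main
```

A higher threshold exposes more gates to whole-module analysis at once, at the cost of a larger flattened body, which is the memory limit named in the second bullet.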

Page 40: Software and Architectures for Large-Scale Quantum Computing

LESSON 3: DYNAMIC CODE GENERATION


Page 41: Software and Architectures for Large-Scale Quantum Computing

Quantum Code Generation for Arbitrary Rotations

• Arbitrary rotations are important, difficult to compile for, and expensive to execute
• Unique sequence for every distinct rotation
  ◦ Can be 4 TB of code!
• Sometimes need dynamic code generation
  ◦ Rotation angles determined at runtime
  ◦ Large code size

[Kudrow et al, ISCA 2013]

Page 42: Software and Architectures for Large-Scale Quantum Computing

Rotation Decomposition

H gate T gate X gate H gate T† gate ...

Page 43: Software and Architectures for Large-Scale Quantum Computing

Rotation Decomposition

Scaffold QPL (rotation gate):

    module RotatePhi(qbit q) {
      Rz(q, Phi);
    }

QASM (decomposition):

    module RotatePhi(qbit q) {
      T q
      H q
      Z q
      H q
      T q
      Z q
      ...
    }

Page 44: Software and Architectures for Large-Scale Quantum Computing

Precomputed Library

● Example: binary construction
  ◦ Generate library
  ◦ Concatenate appropriate sequences to approximate desired angle:

    T, H, T, Z, T, Z, H, ...
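One plausible reading of “binary construction” is sketched below: precompute one gate sequence per angle π/2^k, then approximate a requested angle by concatenating the sequences selected by its binary expansion. This reading, the angle set, and every name here are assumptions; the library entries are placeholder strings rather than real H/T sequences.

```python
import math

def build_library(n_bits):
    """Map k -> placeholder for a precomputed gate sequence approximating a
    rotation by pi / 2**k. Real entries would be strings of H, T, ... gates."""
    return {k: f"SEQ[pi/2^{k}]" for k in range(1, n_bits + 1)}

def compile_rotation(angle, library):
    """Greedy binary expansion of an angle in [0, pi) over the library angles.
    Returns the concatenated sequence list and the angle actually achieved."""
    remaining = angle
    program = []
    for k in sorted(library):
        step = math.pi / 2 ** k
        if remaining >= step:
            program.append(library[k])
            remaining -= step
    return program, angle - remaining

library = build_library(16)
program, achieved = compile_rotation(1.0, library)
error = 1.0 - achieved
```

Under this reading the library is computed once, and compiling a rotation at runtime is a table lookup plus concatenation, at the price of longer sequences than a per-angle optimal decomposition.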

Page 45: Software and Architectures for Large-Scale Quantum Computing

Results – Compilation Time

Page 46: Software and Architectures for Large-Scale Quantum Computing

Results – Compilation Time

[Figure: compilation-time results in four panels: Ion Trap, Neutral Atom, Superconductor, Photons.]

Page 47: Software and Architectures for Large-Scale Quantum Computing

Dynamic Compilation Summary

●  Up to 100,000X speedup for dynamic compilation with 5X increase in sequence length

Page 48: Software and Architectures for Large-Scale Quantum Computing

FUTURE WORK 1: PROGRAM CORRECTNESS


Page 49: Software and Architectures for Large-Scale Quantum Computing

How do I know if my QC program is correct?

• Need: Specification language for QC algorithms
• Check implementation against the specification
  ◦ Simulation for small problem sizes (~30 qubits)
  ◦ Symbolic execution for larger
  ◦ Type systems
  ◦ Model checking
  ◦ Certified compilation passes
• Compiler checks general quantum properties
  ◦ No-cloning, entanglement, uncomputation
• Checks or compiles based on programmer assertions too, where possible

Page 50: Software and Architectures for Large-Scale Quantum Computing

Programmer Assertion Example

    b_eig_U = Eigenvalue(b,U)
    CascadeU(a,b,U)
    if not(b_eig_U)
      assert(Entangled(a,b))

Page 51: Software and Architectures for Large-Scale Quantum Computing

FUTURE WORK 2: SCALABLE SURFACE CODE SCHEDULING


Page 52: Software and Architectures for Large-Scale Quantum Computing

Surface Codes vs Network Routing


Page 53: Software and Architectures for Large-Scale Quantum Computing

Summary

• QC is at an exciting time
• Software and architecture can generate key insights and accelerate progress
• With the right models and abstractions, classical techniques can have significant impact

https://github.com/ajavadia/ScaffCC.git
http://people.cs.uchicago.edu/~ftchong

Page 54: Software and Architectures for Large-Scale Quantum Computing

BACKUP SLIDES


Page 55: Software and Architectures for Large-Scale Quantum Computing

Longest Path First Scheduling

Strategy: Minimize qubit motion by assigning long dependence chains to a single compute region, where they can compute locally with little communication.

[Figure: example gate schedule on qubits a, b, c over time steps 1 to 12, with gates labeled H, X, Z, T, T†, S, and C.]
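The first step of such a strategy, finding a longest dependence chain, can be sketched as follows. The gate-list format and the dependence rule (a gate depends on the most recent earlier gate touching any of its qubits) are assumptions for illustration; assigning the chain to a compute region is not modeled.

```python
def longest_chain(gates):
    """gates: list of (name, qubits) in program order. Returns the indices of
    one longest dependence chain, earliest gate first."""
    last_on_qubit = {}
    depth = []     # length of the longest chain ending at each gate
    parent = []    # predecessor on that chain, or None
    for i, (_, qubits) in enumerate(gates):
        preds = {last_on_qubit[q] for q in qubits if q in last_on_qubit}
        best = max(preds, key=lambda p: depth[p], default=None)
        depth.append(1 + (depth[best] if best is not None else 0))
        parent.append(best)
        for q in qubits:
            last_on_qubit[q] = i
    end = max(range(len(gates)), key=lambda i: depth[i])
    chain = []
    while end is not None:
        chain.append(end)
        end = parent[end]
    return chain[::-1]

# Hypothetical six-gate program on qubits a, b, c.
program = [
    ("H", ["a"]),
    ("H", ["b"]),
    ("CNOT", ["a", "b"]),
    ("T", ["b"]),
    ("X", ["c"]),
    ("CNOT", ["b", "c"]),
]
chain = longest_chain(program)
```

Here the chain ends with gates 2, 3, and 5; a scheduler following the slide's strategy would keep those gates in one region and schedule the remaining gates around them.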

Page 56: Software and Architectures for Large-Scale Quantum Computing

Algorithms in Scaffold

Algorithm                   Lines of Code
Boolean Formula             479
Linear Systems              1741
Binary Welded Tree          608
Class Number                226
Triangle Finding            1231
Shortest Vector Problem     539
Ground State Estimation     554

Page 57: Software and Architectures for Large-Scale Quantum Computing

Popular Terms in Scaffold Code


Page 58: Software and Architectures for Large-Scale Quantum Computing

Scaffold Design Goals

• Completeness and Expressiveness
• Integration with toolchain
• Leveraging existing compiler infrastructures
• Familiarity and Ease of use

Page 59: Software and Architectures for Large-Scale Quantum Computing

Scaffold Design Decisions

• Imperative Programming Model
• Variant of C and Verilog
• C2QG modules
• Control primitives
• Leveraging Open-Source Compilation
• Modular Design
• Aggressive Syntax Checking

Page 60: Software and Architectures for Large-Scale Quantum Computing

Scaffold Safety

• No pointer arithmetic or arbitrary dereferencing
• Will use aggressive syntax and out-of-bounds checking at compile time
• Type casting only for classical variables

Page 61: Software and Architectures for Large-Scale Quantum Computing

Scaffold Overview

    #include "gates.h"
    module main ( ) {
      int i=0;
      qreg extarget[4];
      qreg excontrol[4];
      forall(i=0; i<4; i++) {
        CNOT(extarget[i],excontrol[i]);
      }
    }

Page 62: Software and Architectures for Large-Scale Quantum Computing

Data Types: Quantum Registers

• Quantum Registers
  ◦ Basic use:
      qreg example_reg[n];
      example_reg[m]
      example_reg[a..b]
      length (example_reg)
  ◦ Concatenation:
      qreg reg1[p];
      qreg reg2[q];
      qreg reg = {reg1,reg2};
  ◦ Individual qubits are also treated as qregs of size 1:
      qreg example_qubit[1];

Page 63: Software and Architectures for Large-Scale Quantum Computing

Expressing Parallelism

• Parallelism will be important to runtime (and resource estimates)
• Initial coding of algorithms involved many for loops that need not be sequential
• Borrowing from data-parallel programming languages, we introduce a forall construct:
  ◦ forall is a for loop in which all iterations are independent and can be executed in parallel

Page 64: Software and Architectures for Large-Scale Quantum Computing

Forall Examples

• Linear Systems

    // Apply H ^ tensor N to g
    for(i = 0; i < n_0; i++) {
      H(g[i]);
    }

• Boolean Formula

    forall (i=0; i<(length(topregister)/2); i++) {
      Swap(topregister[i], topregister[topregister.length-1-i]);
    }
    …
    forall (index0=0; index0<w+m; index0++) {
      TOFFOLI(register[index0], register[index0-m], ctrl[0]);
      TOFFOLI(register[index0-m], register[index0], ctrl[0]);
    }

Page 65: Software and Architectures for Large-Scale Quantum Computing

Classical-code-to-Quantum-Gate-sequence (C2QG)

• Similar to C-to-HDL tools such as SystemC
• Behavioral versus structural description
• Called with quantum variables as parameters
• Classical code within C2QG modules is synthesized using reversible logic synthesis (BBRL or RMDDS)
• C2QG modules are statically synthesized
  ◦ C2QG modules can only call C2QG modules internally

Page 66: Software and Architectures for Large-Scale Quantum Computing

C2QG Code Example

    // Normal module representing a Toffoli gate
    module Toffoli(qreg target, qreg control1, qreg control2) {
      H(target);
      CNOT(target,control2);
      T(control2);
      Tdg(target);
      CNOT(target,control1);
      CNOT(control2,control1);
      Tdg(control2);
      T(target);
      CNOT(control2,control1);
      CNOT(target,control2);
      Tdg(target);
      CNOT(target,control1);
      T(target);
      T(control1);
      H(target);
    }

    // C2QG module representing a Toffoli gate
    c2qg Toffoli(qint<1> target, qint<1> control1, qint<1> control2) {
      if (control1 == 1 && control2 == 1) {
        target = ~target;
      } else {
        target = target;
      }
    }
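The behavioral (C2QG) description can be synthesized into reversible logic because its classical function is a bijection on the three bits. The sketch below checks that property for the Toffoli body in plain Python; it illustrates the precondition only and is not the synthesis tool's algorithm. The function name mirrors the slide, and the rest is assumed.

```python
from itertools import product

def toffoli(target, control1, control2):
    """Classical body of the C2QG module: flip target iff both controls are 1."""
    if control1 == 1 and control2 == 1:
        target ^= 1
    return target, control1, control2

inputs = list(product((0, 1), repeat=3))      # all 8 bit patterns, in sorted order
outputs = [toffoli(*bits) for bits in inputs]

# Reversible: the outputs are a permutation of the inputs (no two collide).
is_permutation = sorted(outputs) == inputs

# Applying the function twice returns every input unchanged.
is_self_inverse = all(toffoli(*toffoli(*bits)) == bits for bits in inputs)
```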

Page 67: Software and Architectures for Large-Scale Quantum Computing

C2QG for Oracle Code

• C2QG body sent to reversible gate synthesis tool

Page 68: Software and Architectures for Large-Scale Quantum Computing

C2QG content

• Variable definitions
• Operators: Normal C operators may be used between variables, but mixing qubits and classical variables will cause a compiler error.
• Conditional and loop constructs: C conditional and iteration constructs are allowed, but they must be analyzable at compile time to allow the generation of reversible logic.
• Calls to other C2QG modules: Within a C2QG module, only calls to other C2QG modules are allowed. Classical variables may be passed by reference.

Page 69: Software and Architectures for Large-Scale Quantum Computing

Scaffold Control Flow

• Runtime conditions: compiler passes the conditions in classical assembly
• Static classical conditions: evaluated by the compiler and circuit generated
• Qubit state conditions: compiler generates code when using quantum bits as control lines

Page 70: Software and Architectures for Large-Scale Quantum Computing

Quantum Control Primitive

• Compiler generates controlled versions of all gates within control conditionals
• Restrictions:
  ◦ Control lines cannot be input parameters
  ◦ Only unitary operations in the conditional body

    // Module prototypes. They are defined elsewhere
    module U (qreg input[4], int n);
    module V (qreg input[4]);
    module W (qreg input[4], float p);

    // Quantum Control Primitive
    module control_example (qreg input[4]) {
      if (control_1[0]==1 && control_2[0]==1) {
        U(input);
      } else if (control_1[0]==1 && control_2[0]==0) {
        V(input);
      } else {
        W(input);
      }
    }