scaffcc: a framework for compilation and analysis of
TRANSCRIPT
1
ScaffCC: A Framework for Compilation and Analysis of
Quantum Computing ProgramsAliJavadiAbhari,ShrutiPatil,
DanielKudrow,JeffHeckey,AlexeyLvov,FredericT.Chong,MargaretMartonosiPrincetonUniversity,UCSantaBarbara,IBM
2
QuantumComputingAdvantage• Acertainclassofproblemscanbesolvedsignificantlyfaster
bychangingtheparadigmofcomputing:usequantummechanicalsystemstostoreandmanipulateinformation.
• Example:Factoringalargeb-bitnumber
2
AsymptoticComplexity 232-digitnumberfactoring
Bestclassicalalgorithm(GNFS)[Buhler 1994]
2000yearsonasingle-coreAMDOpteron[Kelinjung et
al.2010]
Shor'squantumalgorithm (technologydependent–theoreticallylargespeedup)
O(exp( 649 b)13 (logb)
23 )
O(b3)
3
BackgroundonQuantumComputers• Aquantumbit(qubit) canexistin asuperpositionofstates:
• Quantumoperations(gates)transformthestateofqubits.• Measurement (observation) collapses ittoeither or .• Quantumcomputationisreversible.
3
ψ =α 0 +β 1
0 1
Quantum Assemblyqbit a[1], b[5];H(b[0]);H(b[1]);H(b[2]);
H(b[3]);H(b[4]);Z(a[0]);CNOT(a[0],b[1]);
4
CompilingQuantumCodes• Datatypesandinstructionsin
quantumcomputers:– Qubits,quantumgates
• DecoherencerequiresQECC– Logicalvs.PhysicalLevels
• Efficiencycrucial– Inefficienciesatlogicallevelare
amplified intogreaterphysicallevelQECCrequirements.
4
QuantumPrograminHigh-LevelDescriptionLanguage
Compiler&High-levelAnalyzer
ErrorCorrection
Mapper(Scheduling,PlacementandRouting)
QuantumPhysicalOperationLanguage
QuantumErrorCorrecting
Codes (QECC)
PhysicalMachineDescription
Scaffold
QASM
QASMwithQECC
Algorithm
ScaffCC
Logical
Physical
5
GoalsandContributions
• 1)Identifyingdifferencesincompiling forquantumvs.classicalcomputers
• 2)Providinggoodscalability topracticalalgorithmsizes
• 3)Automaticallysynthesizingreversiblecomputation(e.g.formathfunctions)
• 4)Developingimportantprogramanalysispasses
5
6
Benchmarks
Benchmark ClassicalTimeComplexity
Quantum TimeComplexity
Grover’sSearch
BinaryWeldedTree(BWT)
GroundStateEstimation(GSE)
TriangleFinding Problem(TFP)
BooleanFormula (BF)
ClassNumber(CN)
6
O(n) O( n )O( 14 2
n3 ) O( 14 k
4n9 )O(2n ) O(n5 )
O(n) O( n )O(n2 ) O(n1.3)
O((n lnn)0.5 ) O(log(n)log*(n))
7
ScaffoldProgramsandQuantumCircuits// main modulemodule main() {
// allocated qubits in mainqbit a[n], b[n], t[1];
// classical bits : measurement outcomecbit ma[n]; // iteration bound
int nstep = floor((pi/4)*sqrt(N))..// Grover iteration: Repeat O(N^0.5) times
for (istep=1; istep<=nstep; istep++) {Sqr(a,b);EQxMark(b, t, 0);
Sqr(a,b);diffuse(a);
}.
.// measure a to find outcomefor(i=0; i<n; i++)ma[i] = measZ(a[i]);
}
#include <math.h>#define n 5#define N pow(2,n)
// module prototypesmodule Sqr(qbit a[n], qbit b[n]);module EQxMark(qbit b[n], qbit t[1], int tF);
// diffusion modulemodule diffuse(qbit q[n]) {
// allocate qubits local to module
qbit x[n-1];
// Hadamard applied to q
for(j = 0; j < n; j++)H(q[j]);
...}
|0i H • · · ·
F�1n
|0i H • · · ·.
.
.
.
.
.
.
.
.
|0i H · · · •
| i /m U U2 · · · U2n�1
|0i /n|1i
|0i /n H⌦n
|1i H
|0i /n H⌦n
U!|1i H
|0i /n H⌦n
U!
H⌦n2|0nih0n|� In H⌦n
|1i H
Grover di↵usion operator
|0i /n H⌦n
U!
H⌦n2|0nih0n|� In H⌦n · · ·
|1i H · · ·
Repeat O(
pN) times
z }| {
| {z }
|0i H • · · ·
F�1n
|0i H • · · ·.
.
.
.
.
.
.
.
.
|0i H · · · •
| i /m U U2 · · · U2n�1
|0i /n|1i
|0i /n H⌦n
|1i H
|0i /n H⌦n
U!|1i H
|0i /n H⌦n
U!
H⌦n2|0nih0n|� In H⌦n
|1i H
Grover di↵usion operator
|0i /n H⌦n
U!
H⌦n2|0nih0n|� In H⌦n · · ·
|1i H · · ·
Repeat O(
pN) times
z }| {
| {z }
|0i H • · · ·
F�1n
|0i H • · · ·.
.
.
.
.
.
.
.
.
|0i H · · · •
| i /m U U2 · · · U2n�1
|0i /n|1i
|0i /n H⌦n
|1i H
|0i /n H⌦n
U!|1i H
|0i /n H⌦n
U!
H⌦n2|0nih0n|� In H⌦n
|1i H
Grover di↵usion operator
|0i /n H⌦n
U!
H⌦n2|0nih0n|� In H⌦n · · ·
|1i H · · ·
Repeat O(
pN) times
z }| {
| {z }
di↵usion module
a|0i /n H⌦n
U!
H⌦n2|0nih0n|� In H⌦n · · ·
b|0i /n H⌦n · · ·
Repeat O(
pN) times
z }| {
| {z }
2
di↵usion module
a|0i /n H⌦n
U!
H⌦n2|0nih0n|� In H⌦n · · ·
b|0i /n H⌦n · · ·
Repeat O(
pN) times
z }| {
| {z }
2
8
FromScaffoldtoQASM:DeepOptimizationthroughLLVM
8
• ScaffCCtranslatesfromScaffold ProgrammingLanguagetoQASM assemblylanguage.• ImplementedwithLLVM,arichandmaturecompilerframework.
• ModifiedClangfront-endparsesandconvertsScaffCCtoLLVMIntermediateRepresentation.
QAS
M
Gene
ratio
n ClassicalControl
Resolution
ScaffoldProgram
QASMClangFront-end
LLVM-IR
9
ScalabilityinCompilationandAnalysis(1)
• Quantumcircuitsaretypicallyspecializedtooneproblemsize,hencetheyaredeeplyandstaticallyanalyzable.– Classicalcontrolresolution
• StaticclassicalcontrolresolutionusingLLVMpasses– Maycausecodeexplosionduringcodetransformationoflargerproblems
9
10
ResolvingClassical ControlsintheCode
• Classicalcontrolsurroundingquantumcodemustberesolvedtodisambiguateforthehardwarethequbitsandtheexactsetofgates
10
.
.
#define s_ 3000module Oracle(qbit a[1], int j) {
double theta=(-1)*pow(2,j)/100;Rz(a[0],theta)
}module main() {
qbit a[1];int i,j;for (i=0;i<=s_;i++)
for (j=0;j<=3;j++)Oracle(a,j);
}
module Oracle_0(qbit a[1]) {Rz(a[0],-0.01);
}
module Oracle_3(qbit a[1]) {Rz(a[0],-0.08);
}
11
module EQxMark (qbit b[n], qbit t[1], int tF){
.
.if(tF==1)
CNOT(t[0], x[n-2]);else
Z(x[n-2]);..
}
module EQxMark_0 (qbit b[n], qbit t[1]){
.
.Z(x[n-2]); ..
}
module EQxMark_1 (qbit b[n], qbit t[1]){
.
.CNOT(t[0], x[n-2]);..
}
clone
module main (qbit b[n], qbit t[1]){
.for (i=0; i<2; i++)
EQxMark(b,t,i);.
}
module main (qbit b[n], qbit t[1]){
.EQxMark(b,t,0);EQxMark(b,t,1);.
}
unroll
inter-proceduralconstant
propagation
EQxMark(b,t,0); EQxMark_0(b,t);EQxMark(b,t,1); EQxMark_1(b,t);
ClassicalControlResolution
12
Pass-DrivenVs.Instrumentation-Driven
• Pass-Driven:– Loopunrolling– ProcedureCloning– Inter-proceduralConstantPropagation
• Instrumentation-Driven:– Leveragingthedualnatureofquantumprograms– Instrumentcodesuchthatafastclassicalprocessorexecutesthroughtheclassicalportion,collectinginformationregardingthequantumportion
– Furtherspeed-upbymemoizingsamemodulecalls
12
13
TheInstrumentation-DrivenApproachScalesBetter
13
300302304306308310312314316318
012345678
n=20
n=40
n=60
n=10
0,s=10
00
n=30
0,s=30
00
n=50
0,s=50
00
M=20
M=30
M=40
n=5
n=10
x=2,y=
2
x=3,y=
2
p=6
p=8
Grovers BWT GSE TFP BF CN
Pass-driven
Instrumentation-driven
Compilatio
ntim
e(s)
(Normalize
dpe
rben
chmark)
Better
14
ScalabilityinCompilationandAnalysis(2)
• TraditionalQASM:– Noloopsormodules:onlysequencesofqubitsandgates
– Usedforsmallprogramrepresentations• Programsthatweexaminedcontainedbetween107 to1012 gates
• Weneedamorescalableoutputformat:– QASMwithHierarchy(QASM-H)
• 200,000Xsmallercode– QASMwithHierarchyandLoops(QASM-HL)
14
15
ManagingScalabilitywithQASMFormat
15
Scaffold#define n 1000module foo(qbit q[n]) {
for(int i = 0; i < n; i++)
H(q[i]);CNOT(q[n-1],q[0]);
}module main() {
qbit b[n];foo(b);
}
QASM-Hmodule foo(qbit* q) {
H(q[0]);H(q[1]);
..H(q[999]);CNOT(q[999],q[0]);
}
module main() {qbit b[1000];foo(b);
}
Flat QASMqbit b[1000];H(b[0]);H(b[1]);
.
.H(b[999]);CNOT(b[999],b[0]);
QASM-HLmodule foo(qbit* q) {
H(q[0:999]);CNOT(q[999],q[0]);
}module main() {
qbit b[1000];foo(b);
}
16
ComparisonofQASM-HandQASM-HL
16
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
n=20
n=30
n=40
n=100,s=1000
n=300,s=3000
n=500,s=5000
M=20
M=30
M=40
n=5
n=10
x=2,y=2
x=3,y=2
p=6
p=8
Grovers BWT GSE TFP BF CN
Code
Size
Ratioof
QAS
M-HLtoQAS
M-H
• AlargereductionisalreadyobtainedfromQASM-HoverflatQASM.
Better
17
QAS
M
Gene
ratio
n ClassicalControl
Resolution
ScaffoldProgram
QASMClangFront-end
LLVM-IR
QAS
M
Gene
ratio
n ClassicalControl
Resolution
ScaffoldProgram
QASMClangFront-end
LLVM-IRCTQGSeparation
ScaffoldQuantumModules
QASMLinker
CTQGCompilation
CTQGClassicalModulesCT
QG
Tran
slatio
n
SynthesizingReversibleComputation
17
• Classical-To-Quantum-Gate(CTQG):AScaffCCfeatureforefficientlytranslatingclassicalmodulestoquantummodules.
18
CTQG:Classical-To-Quantum-Gate
• Facilitatesthesynthesisofquantumcircuitsfromclassicalmathematicalexpressions:– Basicinteger arithmetic(a=a+b,a=a+bc,...)– Fixed-point arithmetic(1/x,sinx,...)– Bit-wisemanipulations(shiftoperators,...)
• State-of-the-artinreversiblelogicsynthesis,minimizingtheuseofextra(ancilla)qubits
• Producesoutputgate-by-gateonthefly– Notlimitedbymemory
18
19
ProgramAnalysis
19
• Analysispasses:• Programcorrectnesschecks• Programestimates
CTQG
Tran
slatio
nQAS
M
Gene
ratio
n ClassicalControl
Resolution
ScaffoldProgram
QASMClangFront-end
LLVM-IR
Timing/ResourceEstimates
Correctness CheckAnalysisOutput
CTQGSeparation
ScaffoldQuantumModules
QASMLinker
CTQGCompilation
CTQGClassicalModules
Prog
ram
Analysis
20
ProgramAnalysis
• ScaffCCsupportsarangeofcodeanalysistechniques:– Programcorrectnesschecks:• No-cloningchecks• Entanglementandun-computationchecks
– Programestimates:• Resourceestimation• Timinganalysis(Parallelscheduling)
20
21
ProgramCorrectnessChecks
• No-Cloning:- Theorem:Thestateofonequbitcannotbe
copiedintoanother(nofan-out)- Checkthatmulti-qubitgatesdonotsharequbits
• Entanglement:- Thejointstateoftwoqubitscannotbeseparated- Data-flowanalysestoautomatethetrackingof
entanglementanddisentanglement
21
22
QuantumProgramAnalysis:ResourceAnalysis
• Obtainingestimatesforthesizeofthecircuit:– Qubitsareexpensive– Moregatesrequiremoreoverallerrorcorrectionandhence
morecost• Thesamepass-drivenandinstrumentation-driven
approachesapply• Dynamicmemoizationtablerecordsnumberofresources
22
23
TimingEstimate
• Estimatesthecriticalpathlengthoftheprogram- Assumingunlimitedhardwarecapabilityfor
parallelization• Schedulingbasedonqubit data dependenciesbetweenoperations
• Hierarchicalschedulingfortractability:– Obtainmodulecriticalpathsseparatelyandthentreatthemasblackboxes.
23
24
Remodularization• Analysismakesuseofmodularity– Avoidrepetitiveanalysis– Reduceanalysistime
• Resultsinlossofparallelismatmoduleboundaries– Decreasedscheduleoptimality
• Idea:– Inlinesmallmodulesatcallsites– largerflattenedmodules
– Definethresholdfor“small”modules– Resultsinbettercriticalpathestimates
24
25
HierarchicalApproachTradeoff
25
• Closenesstoactualcriticalpathisdependentonthelevelofmodularity
• Flatteroverallprogrammeansmoreopportunityfordiscoveringparallelism
a
H
T
C
T
S
b c
C
C
C
TT
T
T
T
C
H
C
1
2
3
4
5
6
7
8
9
10
11
12
PrepZ(s0)PrepZ(s1)X(s1)Toffoli(a0,s1,s0)X(s1)...
s0s1a0
Pz
Pz
X
X
H
T
C
T
S
C
C
C
TT
T
T
T
C
H
C
1
2
3
4
5
6
7
8
9
10
11
12
X
Pz
s0s1a0
X
13
14
1
2
3-14
15
ModuleToffoli(a,b,c) ModularAnalysis FlattenedAnalysis
Pz
26
EffectofRemodularization
26
• Basedonresourceanalysis,flattenmoduleswithsizelessthanathreshold
• Tradeoffbetweenspeedofanalysisanditsaccuracy
0
1
2
3
4
5
6
4.8E+085.0E+085.2E+085.4E+085.6E+085.8E+086.0E+086.2E+086.4E+08
5k 10k 50k 100k 150k 2M
AnalysisTime(s)
CriticalPathLeng
thEs
timate
(#gates)
FlatteningThresholdforRemodularizationBinaryWeldedTreen=300,s=1000
27
Demo
27
28
Conclusion• ExtendedLLVM’sclassicalframeworkforquantumcompilationatthelogicallevel
• Managedscalabilitythrough:– Outputformat:
• 200,000Xonaverage+upto90%forsomebenchmarks– Codegenerationapproach:
• Upto%70forlargeproblems• CTQG:Automaticgenerationofefficientquantumprogramsfromclassicaldescriptions
• Developedascalableprogramanalysistoolbox• ScaffCCcanbeusedasafutureresearchtool
28
29
Thank You