
Synthesis and Optimization of Quantum Circuits

for Nearest Neighbor Architectures

Master Thesis

Aaron Frederick Lye

Faculty 3 - Computer Science and Mathematics
University of Bremen

January 12, 2016


A Master Thesis Submitted in Partial Fulfillment
of the Requirement for the Degree of
Master of Science (M.Sc.)


Supervisory Committee

Prof. Dr. Robert Wille
Institute for Integrated Circuits
Johannes Kepler University Linz
4040 Linz, Austria

and

Cyber Physical Systems
DFKI GmbH
28359 Bremen, Germany

Prof. Dr. Hans-Jörg Kreowski
Research Group Theoretical Computer Science
Faculty 3 - Computer Science and Mathematics
University of Bremen
28359 Bremen, Germany

Contents

1 Introduction

2 Preliminaries
    2.1 Reversible Logic and Circuits
    2.2 Permutations and Inversions
    2.3 Quantum Information and Quantum Circuits
    2.4 Complexity Theory and Combinatorial Optimization

3 Design of Nearest Neighbor Compliant Quantum Circuits
    3.1 NCV-based Synthesis of Quantum Circuits
    3.2 Nearest Neighbor Constraints
    3.3 Nearest Neighbor Optimization
    3.4 Considered Problem and Contributions

4 Complexity Analysis of Reordering Schemes
    4.1 One-Dimensional Nearest Neighbor Problems
    4.2 Multi-Dimensional Nearest Neighbor Problems
    4.3 Conclusion

5 Optimization for Linear Nearest Neighbor Architectures
    5.1 Global Reordering Scheme
    5.2 Local Reordering Scheme
    5.3 Experimental Evaluation
    5.4 Conclusions

6 Optimization for Multi-Dimensional Nearest Neighbor Architectures
    6.1 A Heuristic Approach
    6.2 An Exact Approach
    6.3 Implementation
    6.4 Experimental Evaluation
    6.5 Conclusions

7 Considering Nearest Neighbor Constraints on the Reversible Circuit Level
    7.1 NCV-|v1〉 Library
    7.2 Mapping to Gates from the NCV-|v1〉 Library
    7.3 Consideration of Nearest Neighbor Constraints
    7.4 Nearest Neighbor-aware Optimization of Reversible Circuits
    7.5 Experimental Evaluation
    7.6 Conclusions

8 Conclusion

List of Tables

2.1 Quantum Gates
3.1 Number of quantum gates
5.1 Evaluation of the exact approaches
5.2 Comparison to heuristic approaches
6.1 Costs of establishing an arbitrary permutation
6.2 Resulting Optimal SWAP Gate Insertion
7.1 Experimental Evaluation

List of Figures

2.1 Reversible gates
2.2 Reversible circuit
2.3 Permuting the line order (0,1,2,3) to (2,3,1,0)
2.4 Quantum circuit
2.5 State transitions for NOT, CNOT, V, and V† operations
2.6 Quantum circuit using the NCV gate library
3.1 Toffoli gate decomposition
3.2 Establishing nearest neighbor compliance
4.1 NLR to LR
5.1 Global Reordering
5.2 Resulting PBO encoding for circuit from Fig. 3.2(a)
6.1 Multi-dimensional quantum circuits
6.2 Determine the precise quantum circuit configuration
6.3 Adjacent transposition graph
7.1 Quantum circuit NCV-|v1〉 gate library
7.2 Nearest neighbor-aware NCV-|v1〉 quantum decomposition for a multi-controlled Toffoli gate
7.3 Reordering a Toffoli circuit

Chapter 1

Introduction

Since the invention of integrated circuits, the size of transistors has decreased due to advances in semiconductor technology, such that the number of transistors in today’s integrated circuits doubles approximately every 18 months. This phenomenon is known as Moore’s Law. Due to this exponential growth, physical boundaries will be reached in the near future. Furthermore, power consumption is one of the major problems today and will continue to be important in the future. Alternatives replacing or at least enhancing conventional solutions are needed. Attention has therefore been paid to computing paradigms such as reversible computing, DNA computing, and quantum computing.

Quantum circuits [56] represent a promising alternative to conventional circuit technologies. Properties such as entanglement can be utilized to solve important problems like factorization [73], database search [30], or several graph and algebraic problems (see e.g. [18]) significantly faster. Information is thereby stored in terms of qubits. In contrast to conventional bits, qubits can represent not only the Boolean values 0 and 1 but also superpositions of both. The states of the respective qubits are modified by quantum operations which can be represented by unitary matrices. That is, each quantum computation is inherently reversible, but manipulates qubits rather than pure logic values. Motivated by these developments, researchers and engineers started to actively consider the logic synthesis of the corresponding circuits.

The first netlists of the respective quantum circuits were developed by hand. But in order to design more complex quantum circuits, automatic methods for computer-aided design and synthesis are required. Accordingly, efficient quantum circuit design became an active field of research.

Synthesis of quantum circuits is often conducted by a two-stage procedure: First, a reversible circuit is designed using established reversible gate libraries (containing e.g. Toffoli gates [80]). To this end, several synthesis approaches have been introduced in the past (see e.g. [50, 72, 84, 91, 77, 62]). Then, the resulting (reversible) circuits are mapped into the respective quantum circuits. Here, schemes as originally introduced by Barenco [6] or its recently optimized versions (see e.g. [46, 52]) are applied. By this, the Boolean parts of a quantum circuit to be realized are synthesized.

Nevertheless, several technological constraints have to be considered during this design process. In recent years, physical accomplishments have led to several physical realizations of quantum computers [49]. At the same time, this research also revealed intrinsic physical limitations on implementing quantum computing technology [60]. Among them, the limited interaction distance between gate qubits is one of the most common in several promising implementations of quantum computation. Here, it is required that computations are only performed between adjacent, i.e. nearest neighbor, qubits. Considering this aspect, the established synthesis approaches mentioned above lead to non-optimal results with respect to nearest neighbor architectures.

In the recent past, algorithms addressing the realization of quantum circuits for nearest neighbor architectures have been introduced. Because naive implementations produce large circuits, several heuristics and exact algorithms have been presented in order to reduce the circuit size for nearest neighbor compliant quantum circuits [53, 13, 39, 34, 64, 69, 90, 70, 43]. A major problem is that methods for nearest neighbor optimization usually work at the quantum circuit level only. Indeed, some previous work (e.g. [12]) considered nearest neighbor constraints already at the reversible circuit level. But due to the lack of proper metrics, only the adjacency of Toffoli gates has been considered so far, while the mapping to the corresponding quantum circuit realization may introduce further non-adjacent gates. In fact, nearest neighbor conditions cannot efficiently be addressed at the reversible circuit level with the established scheme.

Moreover, the first approaches assumed quantum circuit realizations in so-called one-dimensional architectures, originally motivated by physical realizations based e.g. on trapped ions (see e.g. [32]), liquid nuclear magnetic resonance (see e.g. [41]), and architectures based on the original Kane model [38]. Recently, physical implementations based on multi-dimensional architectures have gained interest and are seen as the more suitable solution [35, 40, 55]. Here, qubits are not aligned next to each other in a line, but e.g. in a 2D structure. First physical realizations based on photonics [8], superconductors [8], quantum dots [79], and neutral atoms [65] have shown promising results. Nevertheless, the development of respective synthesis solutions ensuring nearest neighbor compliance for these architectures is just at the beginning. Again, initially only hand-made solutions such as [14] existed.

In this thesis, we discuss the synthesis and optimization of quantum circuits for nearest neighbor architectures. We consider one-dimensional as well as multi-dimensional architectures for the design of nearest neighbor compliant quantum circuits. Further, we consider this design problem from several different perspectives:

1. In the recent past, a cost metric for nearest neighbor compliance was introduced. In this context, two reordering schemes for reducing the cost, called local reordering and global reordering, have been presented. Here, we show that the problems underlying these two schemes are NP-complete.

2. Due to this complexity, no efficient solving algorithms are known. Hence, heuristics are usually applied to solve this kind of problem. We review existing approaches that have been proposed in the past for addressing the two schemes. After this, we present new exact alternatives which make use of the deductive power of constraint solvers. This enables us to perform a qualitative evaluation of the performance of existing (heuristic) solutions.

3. So far, only post-synthesis optimization has been discussed. A better approach would be to address nearest neighbor constraints already on the reversible circuit level (if the quantum circuit is synthesized by the two-stage procedure mentioned above). Nevertheless, as also mentioned above, this is not possible with the gate libraries presented so far. In order to achieve this, we present a synthesis scheme which makes use of a recently introduced different gate library.

The thesis is structured as follows: Preliminaries are introduced in the next chapter. Afterwards, in Chapter 3, the design flow of nearest neighbor compliant quantum circuits is presented. The difficulties which occur during the design flow are discussed and the issues covered by this thesis are substantiated. These issues are addressed in the remaining chapters. Chapter 4 considers the topic theoretically, whereas the others are more practical. Chapter 5 considers one-dimensional architectures and Chapter 6 generalizes these ideas to multi-dimensional architectures. In Chapter 7, nearest neighbor constraints are considered on the reversible circuit level. Finally, a conclusion is drawn in Chapter 8.


Chapter 2

Preliminaries

In order to keep this work self-contained, preliminaries defining the basics and establishing the notation are briefly introduced in this chapter. First, reversible logic and circuits are discussed. Afterwards, permutations and their realization are reviewed. After this, quantum information and quantum circuits are introduced. Finally, basic definitions and notions from complexity theory and combinatorial optimization are outlined.

2.1 Reversible Logic and Circuits

In recent years, reversible computing has been established as a promising research area and an emerging technology with application areas in, e.g., low power design and quantum computing. In this section, the concept of reversible logic is discussed in order to establish a basis for the sections covering quantum computing.

Reversible logic can be used for realizing reversible functions. Reversible functions are special multi-output functions and are defined as follows.

Definition 2.1. Let $\mathbb{B} := \mathbb{F}_2$ be the Galois field of two elements $\{0, 1\}$ (truth values) with the negations $\overline{0} = 1$ and $\overline{1} = 0$, and let $ID$ be a set of identifiers serving as a reservoir of Boolean variables. Let $\mathbb{B}^X$ be the set of all mappings $a \colon X \to \mathbb{B}$ for some $X \subseteq ID$, where the elements of $\mathbb{B}^X$ are called assignments. If the set of variables is ordered, each assignment corresponds to a Boolean vector. Then a bijective Boolean (multi-output) function $f \colon \mathbb{B}^X \to \mathbb{B}^X$ is called reversible.

Definition 2.2. A reversible circuit realizes a reversible function by a cascade of reversible gates where the result of a prior gate is used as the input of the succeeding gate.

Reversible circuits differ from conventional circuits: while conventional circuits rely on the basic binary operations¹ and fan-outs are applied in order to use signal values on several gate inputs, in reversible logic fan-outs and feedback are not directly allowed because they would destroy the reversibility of the computation. The logic operators AND and OR cannot be used since they are irreversible. Instead, a reversible gate library is applied. Since the Boolean operator NOT is reversible, the NOT gate is part of this reversible library. To increase the expressiveness, the universal Toffoli gate has been introduced, which is a multiple-controlled NOT gate. Since the Toffoli gate is universal, all reversible functions can be realized by cascades of this gate type alone (cf. [80]).

¹ In logic and complexity theory, AND, OR and NOT are assumed to be the basic operations. Sometimes the NAND gate is assumed to be a basic binary operation because this operation is universal and its functionality corresponds to the behavior of a transistor.

Definition 2.3. A (multiple-controlled) Toffoli gate consists of a target line $t \in ID$ and a set $C \subseteq ID \setminus \{t\}$ of control lines and is denoted by $T := (t, C)$. The gate defines the function $f_T = f_{t,C} \colon \mathbb{B}^X \to \mathbb{B}^X$ for each $X \subseteq ID$ with $\{t\} \cup C \subseteq X$, which maps an assignment $a \colon X \to \mathbb{B}$ to $f_T(a) \colon X \to \mathbb{B}$ given by $f_T(a)(t) = \overline{a(t)}$ if $a(c) = 1$ for all $c \in C$. In all other cases, $f_T(a)$ is equal to $a$. Hence, $f_T(a)$ inverts the value of the target line if and only if all control lines are set to 1. Otherwise the value of the target line is passed through unchanged. The values of all other lines always pass through a gate unchanged. Consequently, $f_T$ is a mapping on $\mathbb{B}^X$ which is inverse to itself and, therefore, reversible in particular.

A multiple-controlled Toffoli gate can be realized by a sequence of Toffoli gates with two control lines.

In addition to positive control lines, negative-control and mixed-control Toffoli gates have also been considered in the recent past [76]. This yields smaller circuits in general. Nevertheless, the expressiveness remains the same, since each negative control can be replaced by a positive one with a negation before and after the control. For this reason, in this work we focus on positive-control Toffoli gates.

Two special cases of the Toffoli gate have their own names: the NOT gate, with $C = \emptyset$, which always inverts the value of the target line $t$, and the controlled NOT gate (CNOT), with $|C| = 1$, which is also known as the Feynman gate.
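The behavior described in Definition 2.3 can be illustrated with a minimal Python sketch. The function name `apply_toffoli` and the dictionary representation of assignments are illustrative assumptions, not part of the thesis; the sketch only mirrors the rule that the target is inverted if and only if all control lines carry 1.

```python
from itertools import product


def apply_toffoli(assignment, target, controls):
    """Return a new assignment in which `target` is inverted iff all controls are 1.

    `assignment` maps line names to 0/1 (hypothetical representation).
    """
    result = dict(assignment)
    if all(assignment[c] == 1 for c in controls):
        result[target] = 1 - assignment[target]
    return result


# Truth table of the two-control Toffoli gate with target x3.
lines = ("x1", "x2", "x3")
truth_table = {}
for bits in product((0, 1), repeat=3):
    a = dict(zip(lines, bits))
    out = apply_toffoli(a, target="x3", controls={"x1", "x2"})
    truth_table[bits] = tuple(out[name] for name in lines)

# Special cases: NOT has no control line, CNOT has exactly one.
not_result = apply_toffoli({"x1": 0}, target="x1", controls=set())
cnot_result = apply_toffoli({"x1": 1, "x2": 0}, target="x2", controls={"x1"})
```

Applying the same gate twice returns the original assignment, which reflects the statement that the gate function is inverse to itself.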

Definition 2.4. In general, a reversible gate g is denoted by the tuple $(C, T, k, i)$ where k indicates the kind of gate and $i \in \mathbb{N}$ indicates the position of the gate in the circuit. The tuple contains a set of control lines $C \subset X$, which may be empty, and a set of target lines $T \subset X$ with $T \neq \emptyset$ and $C \cap T = \emptyset$. $X$ represents the set of line variables in the circuit.

The semantics of the operation depend on the kind of gate. If all control lines are set to 1, the operation of the gate is applied to the target lines; otherwise the values of the target lines are passed through unchanged. The control lines and unconnected lines always pass through a gate unchanged.

As a convention, we often write $g_i$ for a gate $(C, T, k, i)$. If the position is irrelevant and all gates in the circuit are Toffoli gates, we abbreviate g as $(C, T)$.

Besides the Toffoli gate, the Fredkin gate [24] and the Peres gate [59] are also of special interest, even though, unlike the Toffoli gate, both are not universal. Fig. 2.1 shows these three gates.


[Gate diagrams on the lines x1, x2, x3 with outputs x′1, x′2, x′3, together with the following truth tables.]

Input     Output x′1x′2x′3
x1x2x3    (a) Toffoli gate    (b) Peres gate    (c) Fredkin gate
000       000                 000               000
001       001                 001               001
010       010                 010               010
011       011                 011               011
100       100                 110               100
101       101                 111               110
110       111                 101               101
111       110                 100               111

Figure 2.1: Reversible gates

For drawing gates, the established convention is used in this work. A Toffoli gate is depicted by using solid black circles to indicate control connections for the gate and the symbol ⊕ to denote the target line. A Fredkin gate is depicted by using solid black circles to indicate control connections for the gate and the symbol × to denote the two target lines.

Example 2.1. In Fig. 2.1(a), a Toffoli gate and its corresponding function in the form of a truth table are shown. It inverts the value of the target line x3 if the control lines x1 and x2 are set to 1. In Fig. 2.1(b), the Toffoli representation and the performed reversible function of the Peres gate are shown. In Fig. 2.1(c), a 1-controlled Fredkin gate is shown. It interchanges the values of x2 and x3 if x1 is set to 1.
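The truth tables of Fig. 2.1(b) and Fig. 2.1(c) can be reproduced with a short Python sketch. It assumes that the Toffoli representation of the Peres gate is a two-control Toffoli gate (target x3) followed by a CNOT (control x1, target x2); this decomposition is an assumption on our side, chosen because it matches the truth table in the figure. The function names are illustrative.

```python
def toffoli(bits, target, controls):
    """Invert bits[target] iff all bits at the control indices are 1."""
    bits = list(bits)
    if all(bits[c] for c in controls):
        bits[target] ^= 1
    return tuple(bits)


def peres(bits):
    # Assumed decomposition: Toffoli (controls x1, x2; target x3),
    # followed by CNOT (control x1; target x2). Indices are 0-based.
    return toffoli(toffoli(bits, 2, (0, 1)), 1, (0,))


def fredkin(bits):
    # 1-controlled Fredkin gate: exchange x2 and x3 iff x1 = 1.
    x1, x2, x3 = bits
    return (x1, x3, x2) if x1 else (x1, x2, x3)


peres_rows = {b: peres(b) for b in [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]}
fredkin_rows = {b: fredkin(b) for b in [(1, 0, 1), (1, 1, 0)]}
```

Rows with x1 = 0 are mapped to themselves by both gates, as in the figure.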

A special case of the Fredkin gate is the one without control lines where the two target lines are applied to adjacent variables, which is also known as the SWAP gate. This gate always interchanges the values of t1 and t2.

Definition 2.5 (SWAP gate). A SWAP gate operates on the target lines i and i + 1 and realizes an exchange of the values of the two lines.

Since the SWAP gate realizes an interchange of two lines in the circuit, it is of special interest and crucial for this work. For this reason, the connection between SWAP gates and permutations is considered in the following section.

Example 2.2. Fig. 2.2 shows a reversible circuit composed of 4 circuit lines and 4 Toffoli gates. This circuit maps e.g. the input pattern 1111 to the output pattern 1000. Inherently, every computation can be performed in both directions (i.e. computations towards the outputs and towards the inputs can be performed).


[Circuit diagram: four lines with inputs x1 = x2 = x3 = x4 = 1 and outputs f1 = 1, f2 = 0, f3 = 0, f4 = 0; intermediate signal values are annotated along the lines.]

Figure 2.2: Reversible circuit

2.2 Permutations and Inversions

In the main part of this work, the specific order of the elements of a vector or of the rows of a matrix is of special interest. Matrices changing the order, so-called permutation matrices, are therefore of great significance. This section reviews the basics of permutations and their connection to inversion vectors.

Definition 2.6 (Permutation). A permutation p is a reordering of n elements. It can be represented by an n × n square matrix containing exactly one 1 in each row and column; all other entries are 0, i.e.

$$\bigwedge_{i=0}^{n-1} \Bigl( \sum_{j=0}^{n-1} a_{ij} = 1 \Bigr) \wedge \bigwedge_{i=0}^{n-1} \Bigl( \sum_{j=0}^{n-1} a_{ji} = 1 \Bigr).$$

The matrix defines a bijective mapping. We denote the set of all permutations over n elements by $P_n$.

The identity (I) is the permutation mapping each element to itself. Hence, it is represented by the diagonal matrix satisfying $\bigwedge_{j=0}^{n-1} a_{jj} = 1$. It is the neutral element of the matrix multiplication.

Every permutation p has an inverse p′ satisfying p′p = I and pp′ = I.

Observation 2.1. Let $p_1, \dots, p_x$ be a sequence of permutations. Let $\overline{p}_i$ denote the inverse permutation of $p_i$. Then the following holds: $p_1 \overline{p}_1 p_2 \overline{p}_2 \dots p_{x-1} \overline{p}_{x-1} p_x = p_x$.

Proof. Follows directly from Definition 2.6, because $p_i \overline{p}_i = I$ and $Ip = p$.
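As a small illustration of Definition 2.6 and of the inverse permutation, the following Python sketch builds permutation matrices, checks the row and column sum condition, and verifies that a permutation multiplied by its inverse gives the identity. The helper names and the choice that row i carries its 1 in column p[i] are assumptions made for this sketch.

```python
def perm_matrix(p):
    """Permutation matrix in which row i has its single 1 in column p[i]."""
    n = len(p)
    return [[1 if p[i] == j else 0 for j in range(n)] for i in range(n)]


def is_permutation_matrix(m):
    """Check that all entries are 0/1 and every row and column sums to 1."""
    n = len(m)
    rows_ok = all(sum(m[i][j] for j in range(n)) == 1 for i in range(n))
    cols_ok = all(sum(m[j][i] for j in range(n)) == 1 for i in range(n))
    entries_ok = all(x in (0, 1) for row in m for x in row)
    return rows_ok and cols_ok and entries_ok


def inverse(p):
    """Inverse permutation: inverse(p)[p[i]] == i."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return inv


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


p = [2, 3, 1, 0]
identity = perm_matrix([0, 1, 2, 3])
product_left = matmul(perm_matrix(inverse(p)), perm_matrix(p))
product_right = matmul(perm_matrix(p), perm_matrix(inverse(p)))
```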

The permutation defines the complete order and can be realized by inversions (see e.g. [58] where inversions and inversion vectors are discussed).

Definition 2.7 (Inversion). An inversion is a pair of elements (π(i), π(j)) in a permutation defined by π if i > j, usually i = j + 1, and π(i) < π(j).

In other words, an inversion is an operation on a tuple of two elements which are mapped next to each other. The operation interchanges the two elements. Note that an inversion is also a permutation and every permutation can be expressed by a sequence of inversions.

Inversion vectors connect the information of the permutation and the inversions performed.

Definition 2.8 (Inversion Vector). For any permutation π of n elements, an inversion vector $v = (v_0, v_1, \dots, v_{n-1})$ is defined by $v_i = |\{\pi(j) \mid \pi(j) > i \wedge j < \pi^{-1}(i)\}|$, $0 \le i < n$, where $\pi^{-1}(i)$ denotes the position of element i in π, such that the ith element of v is the number of elements in π to the left of i which are greater than i.


[Circuit diagram: SWAP gates map the lines x0, x1, x2, x3 on the left (top to bottom) to x2, x3, x1, x0 on the right.]

Figure 2.3: Permuting the line order (0,1,2,3) to (2,3,1,0)

The cost function for permutations $c \colon P_{|Q|} \to \mathbb{N}$ is defined by the number of inversions needed to establish the permutation. In the one-dimensional alignment, the inversions for a specific permutation, and hence also the permutation cost, can be computed in polynomial time using inversion vectors.

As mentioned above, it is the order of the elements within a vector, or within a matrix, which is of special interest. This is because the line order of a circuit can be seen as a permutation matrix where each row represents a circuit line. Because a SWAP gate performs an inversion², the total number of SWAP gates needed for creating a particular permutation of circuit lines can be calculated by summing up the respective entries in the inversion vector. The same principle was used in [34] to show that, in order to construct an arbitrary permutation, a bubble sort algorithm (i.e. sorting by exchange) generates the minimum number of inversions, i.e. SWAP gates, for transforming one permutation into another. Hence, calculating the inversion vector can be done in $O(n^2)$. Nevertheless, with a divide-and-conquer algorithm it can also be done in $O(n \log n)$.
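The divide-and-conquer idea mentioned above can be sketched with a merge sort that counts inversions while merging. This is a standard technique and not necessarily the algorithm the cited works have in mind; it returns only the total number of inversions (the sum of the inversion vector entries), and the function name is illustrative.

```python
def count_inversions(seq):
    """Return (sorted copy of seq, number of inversions) in O(n log n)."""
    if len(seq) <= 1:
        return list(seq), 0
    mid = len(seq) // 2
    left, left_count = count_inversions(seq[:mid])
    right, right_count = count_inversions(seq[mid:])
    merged = []
    count = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than all remaining elements of left,
            # so it forms an inversion with each of them.
            merged.append(right[j])
            j += 1
            count += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, count


_, swaps_needed = count_inversions([2, 3, 1, 0])
```

For the line order (2, 3, 1, 0) the sketch counts five inversions, which agrees with the number of SWAP gates stated in Example 2.3.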

Example 2.3. Assume a given circuit line order (0, 1, 2, 3) shall be permuted to (2, 3, 1, 0). The corresponding inversion vector is v = (3, 2, 0, 0). Hence, 3 + 2 + 0 + 0 = 5 SWAP gates are required in order to create this permutation.
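The inversion vector of Example 2.3 can be recomputed with a direct quadratic-time Python sketch that follows the verbal description in Definition 2.8: for each element i, count the elements to its left that are greater than i. The function name is an illustrative choice.

```python
def inversion_vector(perm):
    """v[i] = number of elements greater than i that appear left of i in perm."""
    position = {element: index for index, element in enumerate(perm)}
    return [
        sum(1 for other in perm[:position[element]] if other > element)
        for element in range(len(perm))
    ]


v = inversion_vector((2, 3, 1, 0))
swap_count = sum(v)
```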

Even the positions of the SWAP gates can be extracted from the permutation and the inversion vector. Interpreting $\prod$ as composition of two gates to the right, we obtain a circuit

$$G = \prod_{p=0}^{n-1} \prod_{i=1}^{v_p} \mathrm{Swap}(i-1, i, p).$$

The resulting circuit represents a permutation of the line order. Therefore, G contains all inversions represented in the inversion vector in the right order.

This circuit may contain SWAP gates with the same target lines multiple times: some elements may already be in the right position, but since only neighboring elements can be interchanged, they have to be moved again in order to move other elements past them.

Example 2.4. Assume a given inversion vector is v = (3, 2, 0, 0). As shown in Example 2.3, 5 SWAP gates are required in order to create this permutation. After constructing the circuit with the given formula, G contains {(0, 1, 0), (1, 2, 1), (2, 3, 2), (0, 1, 3), (1, 2, 4)} as shown in Fig. 2.3.
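A sequence of adjacent SWAP gates realizing a given line order can also be obtained with plain bubble sort, as the following sketch shows. Note that this swaps in bubble sort in place of the closed-form product above: the target order is sorted while recording the exchanges, and the recorded sequence is then reversed. The emitted sequence may differ from the one listed in Example 2.4, but it uses the same number of SWAP gates. All names are illustrative.

```python
def swap_sequence(target):
    """Adjacent swaps (i, i + 1) that turn the sorted order into `target`."""
    work = list(target)
    swaps = []
    for end in range(len(work) - 1, 0, -1):
        for i in range(end):
            if work[i] > work[i + 1]:
                work[i], work[i + 1] = work[i + 1], work[i]
                swaps.append((i, i + 1))
    # The recorded swaps sort `target`; reversed, they build it from the sorted order.
    return swaps[::-1]


def apply_swaps(order, swaps):
    """Apply adjacent swaps to a line order and return the resulting order."""
    order = list(order)
    for i, j in swaps:
        order[i], order[j] = order[j], order[i]
    return order


swaps = swap_sequence((2, 3, 1, 0))
result = apply_swaps([0, 1, 2, 3], swaps)
```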

² Inversions can be realized by SWAP gates. Hence, every permutation can be expressed by a circuit of SWAP gates.


2.3 Quantum Information and Quantum Circuits

Quantum computing is a promising application of reversible logic. In contrast to conventional computing, quantum computing operates on qubits (quantum bits) [85] instead of bits. While bits are restricted to the two binary values, qubits can be set to any superposition of both.

Definition 2.9 (Qubit). A qubit is a two-level quantum system with a two-dimensional complex Hilbert space, i.e. a complete complex inner product space (cf. [93]). This Hilbert space has its basis in the orthogonal quantum states $|0\rangle \equiv \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle \equiv \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ representing the Boolean values 0 and 1.

Any superposition, i.e. state of a qubit, is formally written in Dirac notation:

$$|\phi\rangle = \alpha |0\rangle + \beta |1\rangle \quad \text{with } |\alpha|^2 + |\beta|^2 = 1,\ \alpha, \beta \in \mathbb{C}. \tag{2.1}$$

The variables α and β are called amplitudes. They can be used to denote the quantum state of a single qubit by the vector $\begin{pmatrix} \alpha \\ \beta \end{pmatrix}$.

A quantum system which contains n qubits is called a quantum register of size n. The state of a quantum system with n > 1 qubits is given by an element of the tensor product³ of the respective state spaces. The state can be represented by a state vector which is a normalized vector of length $2^n$. This quantum register reveals the power of quantum computing. Although a classical computer can represent $2^n$ states with n bits, it can only hold one state at any one time. In contrast, a quantum computer can hold $2^n$ states in parallel with n qubits and can modify all of them simultaneously by applying quantum operations. These quantum operations on the state vector are performed through multiplication by appropriate $2^n \times 2^n$ unitary matrices [56]. A matrix U is unitary if and only if $UU^\dagger = I$, where $U^\dagger$ is the complex conjugate transpose of U and I is the identity matrix.
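The two notions in this paragraph, the $2^n$-dimensional state vector and the unitarity condition, can be made concrete with a small pure-Python sketch. The helper names are illustrative assumptions; matrices are nested lists and the check uses a numerical tolerance.

```python
from math import sqrt


def kron_vec(u, v):
    """Tensor (Kronecker) product of two state vectors."""
    return [a * b for a in u for b in v]


def dagger(m):
    """Complex conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_unitary(m, tol=1e-9):
    """Check U U† = I up to a numerical tolerance."""
    prod = matmul(m, dagger(m))
    n = len(m)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


zero, one = [1, 0], [0, 1]
register = kron_vec(kron_vec(one, zero), one)  # basis state |101> of a 3-qubit register
hadamard = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]
```

The register has $2^3 = 8$ entries with a single 1 at index 5 (binary 101), and the Hadamard matrix passes the unitarity check while a non-unitary matrix such as [[1, 1], [0, 1]] does not.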

To retrieve the qubits’ values, they must be measured after the computation. It is impossible to obtain a qubit’s current state, because the measurement destroys the state of the qubit with the effect that the measurement returns either 0 or 1 with a specific probability. Depending on the current state, the qubit returns the value 0 with probability $|\alpha|^2$ and 1 with probability $|\beta|^2$.
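As a worked instance of the measurement rule, the following sketch computes the two outcome probabilities from the amplitudes and rejects amplitudes that violate the normalization condition of Eq. (2.1). The function name and the tolerance are assumptions of this sketch.

```python
from math import isclose, sqrt


def measurement_probabilities(alpha, beta):
    """Return (P(0), P(1)) = (|alpha|^2, |beta|^2) for a normalized qubit state."""
    p0, p1 = abs(alpha) ** 2, abs(beta) ** 2
    if not isclose(p0 + p1, 1.0, abs_tol=1e-9):
        raise ValueError("amplitudes are not normalized")
    return p0, p1


# Equal superposition with a complex phase on the second amplitude.
p0, p1 = measurement_probabilities(1 / sqrt(2), 1j / sqrt(2))
```

The complex phase of an amplitude does not influence the probabilities, since only the absolute values enter.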

There are several quantum computational models. Quantum Turing machines (QTMs) are the quantum physical analogue of probabilistic TMs. They have an infinite tape and a transition function, and actions are local and completely specified by the transition function. They are the quantum physical analogue because they are branching with complex probability amplitudes [7]. Further, “QTMs are discrete devices: the transformation amplitudes need only be accurate to O(log T) bits of precision to support T steps of computation” [7]. The transition function is a linear operator, also called the time evolution operator of the QTM. The QTM is called well-formed if its time operator preserves Euclidean length. In [7] it is shown that a QTM is well-formed iff its time operator is unitary.

³ Tensor product: Let the tensors A and B with the components $a_{ij\dots}$ and $b_{rs\dots}$ be of grade m and n. The resulting $3^{m+n}$ scalars $c_{ij\dots rs\dots} = a_{ij\dots} b_{rs\dots}$ build the components of a tensor C of grade m + n on which associativity and distributivity are defined (cf. [10]).

Another way to model quantum computations is adiabatic quantum computing, which is also a generalized, universal method of implementing quantum computations. Here, “the time complexity of the algorithms is related to the separation between the energy eigenvalues of the time-changing Hamiltonian that evolves the system into a solution-encoding ground state”⁴ [21]. With this model, all QTM computations can be simulated with (large) polynomial overhead [1]. Further, it can be shown that this model does not provide additional computational power compared to established quantum computation [82].

The most common way is by applying unitary transformations in a space of complex superpositions of configurations. A quantum algorithm is the decomposition of a unitary transformation into a product of unitary transformations, each making only simple local changes. These simple unitary transformations form the quantum circuit. Qubits are denoted by circuit lines and the quantum operations are represented by a cascade of quantum gates realizing the unitary matrices of the quantum algorithms in the indicated order. This enables an easy, intuitive and meaningful description of quantum algorithms and their evaluation. More precisely, a quantum circuit is defined as follows.

Definition 2.10. A quantum circuit C = (Q, G) is a sequence of gates G operating on a set of qubits $Q \neq \emptyset$. We denote the total number of quantum gates by |G|. The number of qubits is denoted by |Q|.

The circuit has the following semantics. The unitary matrix implemented by a cascade of quantum gates acting on different qubits independently can be calculated by the tensor product of their matrices. For a set of gates $\{g_1, g_2, \dots, g_{|G|}\}$ cascaded in a quantum circuit G, the unitary matrix of G can be calculated as $M_{|G|} M_{|G|-1} \cdots M_1$, where $M_i$ represents the matrix of gate $g_i$ with $1 \le i \le |G|$.
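The order of the matrix product can be illustrated with a single-qubit sketch: the gate applied first appears rightmost in the product. The helper names and the example gates (Pauli X followed by Hadamard) are choices made for this illustration only.

```python
from math import sqrt


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(m, state):
    """Multiply a matrix with a state vector."""
    return [sum(m[i][k] * state[k] for k in range(len(state))) for i in range(len(m))]


def circuit_matrix(gates):
    """Matrix M_|G| ... M_1 of a cascade whose first gate is gates[0]."""
    n = len(gates[0])
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for gate in gates:
        result = matmul(gate, result)  # later gates are multiplied from the left
    return result


X = [[0, 1], [1, 0]]
H = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]

state_xh = apply(circuit_matrix([X, H]), [1, 0])  # X first, then H: H X |0>
state_hx = apply(circuit_matrix([H, X]), [1, 0])  # H first, then X: X H |0>
```

The two results differ in the sign of the second amplitude, so the order of the gates in the cascade matters.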

Similar to reversible circuits, operations can be performed on more than one qubit using controlled quantum gates. Here, the complex-valued vector $(\alpha_{|c\rangle}\alpha_{|t\rangle}, \alpha_{|c\rangle}\beta_{|t\rangle}, \beta_{|c\rangle}\alpha_{|t\rangle}, \beta_{|c\rangle}\beta_{|t\rangle})$ is applied to the controlled quantum gate represented by the matrix M realizing the unitary operation. In M, the four entries belonging to U denote the specific operation applied to the target qubit $|t\rangle$ if $|c\rangle = |1\rangle$. In the case $|c\rangle = |0\rangle$ the states remain unchanged.

$$M = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & U_{00} & U_{01} \\ 0 & 0 & U_{10} & U_{11} \end{pmatrix}, \tag{2.2}$$

where $U = (U_{jk})$ itself is also a unitary matrix. The tensor product is usually applied several times to the target matrix, the control matrix and the resulting composition, respectively, in order to bring the matrix to the size $2^n \times 2^n$ with n = |Q|.
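The block structure of Eq. (2.2) can be sketched as follows for a two-qubit register with the control as the first and the target as the second qubit, so that the basis order is |00〉, |01〉, |10〉, |11〉. The function names are illustrative; with U chosen as the Pauli X matrix the construction yields the CNOT gate.

```python
def controlled(u):
    """4x4 matrix applying the 2x2 matrix `u` to the target iff the control is |1>."""
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, u[0][0], u[0][1]],
        [0, 0, u[1][0], u[1][1]],
    ]


def apply(m, state):
    """Multiply a matrix with a state vector."""
    return [sum(m[i][k] * state[k] for k in range(len(state))) for i in range(len(m))]


X = [[0, 1], [1, 0]]
cnot = controlled(X)

flipped = apply(cnot, [0, 0, 1, 0])    # |10> is mapped to |11>
untouched = apply(cnot, [0, 1, 0, 0])  # |01> stays |01> because the control is |0>
```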

⁴ Therefore, the Schrödinger equation $i \frac{d}{dt} |\phi(t)\rangle = H(t) |\phi(t)\rangle$ with the unitary time evolution $|\phi(t)\rangle = U(t, t_0) |\phi(t_0)\rangle$ given by $U(t, t_0) = e^{-i \int_{t_0}^{t} H(t)\,dt}$ is used [20].


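[Circuit diagram with two qubit lines. First line: the input $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ passes a Z gate, giving $-|1\rangle = \begin{pmatrix} 0 \\ -1 \end{pmatrix}$, and then an H gate, giving $\frac{1}{\sqrt{2}}(|1\rangle - |0\rangle) = \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 1 \end{pmatrix}$. Second line: the input $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ first stays unchanged and then passes an H gate, giving $\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}$.]

Figure 2.4: Quantum circuit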

The $2^n \times 2^n$ matrix of a gate depends on whether the gate is controlled or uncontrolled:

• Consider the uncontrolled case first. Let t be the target line on which the unitary matrix U should be applied. Then $M = I^{\otimes (n-t-1)} \otimes U \otimes I^{\otimes (t-1)}$, where I is the 2 × 2 identity matrix and U is a 2 × 2 unitary matrix. Here, $I^{\otimes (n-t-1)}$ and $I^{\otimes (t-1)}$ denote the (n − t − 1)-fold and, respectively, the (t − 1)-fold application of the tensor product on the identity matrix.

• If the gate is controlled, the matrix is more complicated. Let c be the control line and t be the target line on which the unitary matrix U should be applied. If c < t, then $M = I^{\otimes (n-c-1)} \otimes U'$ where $U' = \begin{pmatrix} I' & 0 \\ 0 & U'' \end{pmatrix}$ with $I' = I^{\otimes t}$ and $U'' = I^{\otimes (c-t-1)} \otimes U \otimes I^{\otimes t}$; otherwise, i.e. c > t, $M = I^{\otimes (n-t-1)} \otimes U'$ where $U' = \begin{pmatrix} U'' & 0 \\ 0 & I' \end{pmatrix}$ with $I' = I^{\otimes (t-1)}$ and $U'' = U^{\otimes (t-1)}$.

In the following, we use an abstract representation of gates. Let $(t, c, U, i) \in G$ denote a gate which applies U to the target qubit, with $t \in Q$, $c \in Q \cup \{\lambda\}$ and $t \neq c$, where t and c represent the target and control qubit, respectively. If the gate is uncontrolled, we denote this by $c = \lambda$. If U is irrelevant in the context, we omit it in the tuple. The position of g in C is given by i with $1 \le i \le |G|$. Sometimes we denote the gate by $g_i$. If the position is irrelevant, we denote a gate by (t, c).

Further, we partition the set of gates into the following subsets of uncontrolled and controlled gates. Let $G_1 = \{(t, \lambda) \in G\}$ and $G_2 = G \setminus G_1$.

Example 2.5. Fig. 2.4 shows a quantum circuit composed of |Q| = 2 circuit lines and |G| = 3 gates. This circuit gets |11〉 as input and transforms the qubits as indicated by the signal annotations in the circuit.
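The signal values annotated in Fig. 2.4 can be recomputed with a few lines of Python. The sketch below assumes that one line carries a Z gate followed by an H gate and the other line a single H gate; the names `line_a` and `line_b` are placeholders, since the assignment to the upper or lower circuit line is not fixed here.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]

S = 1 / math.sqrt(2)
H: Matrix = [[S, S], [S, -S]]   # Hadamard gate
Z: Matrix = [[1, 0], [0, -1]]   # Pauli Z gate
ONE: Vector = [0, 1]            # basis state |1>


def apply(m: Matrix, v: Vector) -> Vector:
    """Apply a 2x2 matrix to a single-qubit state vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


# Line with Z followed by H: |1> -> -|1> -> 1/sqrt(2) * (|1> - |0>)
line_a = apply(H, apply(Z, ONE))
# Line with a single H: |1> -> 1/sqrt(2) * (|0> - |1>)
line_b = apply(H, ONE)

print(line_a)
print(line_b)
```

Both results have the same magnitudes and differ only in the signs of the amplitudes, i.e. in a global phase of −1.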

Table 2.1 shows common elementary quantum gates, each of which can be physically realized by quantum computing technology as a single physical operation. They can be arranged in two groups: the Hadamard gate, and the Pauli gates and their roots. The roots of the Y gate are omitted because they are not used in the literature.

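The Hadamard gate is a 1-qubit rotation which maps the qubit basis states into equally weighted superposition states. The Hadamard operation is a 90° rotation about the y axis of the three-dimensional Bloch sphere⁵, followed by a rotation about the x axis by 180° [56].

Table 2.1: Quantum Gates

Gate        Unitary matrix
Hadamard    $\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$
Pauli X     $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
Pauli Y     $\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$
Pauli Z     $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
V           $\frac{1+i}{2} \begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}$
W           $\frac{1}{2} \begin{pmatrix} 1+\sqrt{i} & 1-\sqrt{i} \\ 1-\sqrt{i} & 1+\sqrt{i} \end{pmatrix}$
S           $\begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/2} \end{pmatrix}$
T           $\begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$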

Each Pauli gate specifies a 180° rotation around one of the three axes of the Bloch sphere (cf. [75]). The most commonly used Pauli gate is the Pauli X gate, which specifies a rotation around the x axis. It is also known as the NOT gate because it inverts the state when applied. The Pauli Y gate is similar to the Pauli X gate because it also inverts the state, but by rotating around the y axis. The result is an inversion of the state multiplied by an imaginary scalar. The third inversion is performed by the Pauli Z gate, which rotates around the z axis. When it is applied to a state, the sign of the second amplitude is inverted.

Derived from the operations of the Pauli gates, different gates have been introduced which realize a rotation around a particular axis with different angles. One example is the V gate.

Lemma 2.1 (V gate). The V gate realizes a 90° rotation around the x axis. As a result, and according to Table 2.1, a cascade of two equal V operations is equivalent to an inversion performed by the Pauli X gate since they perform the same quantum operation, i.e.
\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
= \frac{1+i}{2} \begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}
\cdot \frac{1+i}{2} \begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}.
\]

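⁵ Equation 2.1 can be rewritten as $|\phi\rangle = e^{i\gamma}\left(\cos\frac{\theta}{2}\,|0\rangle + e^{i\psi}\sin\frac{\theta}{2}\,|1\rangle\right)$ with γ, ψ, θ ∈ R. This can be simplified to $|\phi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\psi}\sin\frac{\theta}{2}\,|1\rangle$. The numbers θ and ψ define a coordinate on the unit three-dimensional sphere, called Bloch sphere [56].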


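Proof.
\[
\left(\frac{1+i}{2}\right)^{2} \begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}^{2}
= \frac{i}{2} \begin{pmatrix} 0 & -2i \\ -2i & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]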

For that reason, the V operation is also known as the square root of NOT. The V† gate performs the inverse operation of the V gate.
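The statement of Lemma 2.1 and the role of V† can also be checked numerically. The following Python sketch multiplies the matrices from Table 2.1 using plain complex arithmetic; the helper names `matmul`, `dagger`, and `close` are illustrative assumptions.

```python
from typing import List

Matrix = List[List[complex]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def dagger(a: Matrix) -> Matrix:
    """Conjugate transpose (adjoint) of a square matrix."""
    n = len(a)
    return [[a[c][r].conjugate() for c in range(n)] for r in range(n)]


def close(a: Matrix, b: Matrix, tol: float = 1e-12) -> bool:
    """Entry-wise comparison up to a numerical tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


F = (1 + 1j) / 2
V: Matrix = [[F, F * -1j], [F * -1j, F]]   # V gate from Table 2.1
X: Matrix = [[0, 1], [1, 0]]               # Pauli X (NOT)
I2: Matrix = [[1, 0], [0, 1]]

print(close(matmul(V, V), X))            # V is a square root of NOT
print(close(matmul(V, dagger(V)), I2))   # V-dagger undoes V
```

Both comparisons evaluate to True, which matches the proof above and the remark that V† performs the inverse operation of V.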

Although many different quantum gates exist, usually only a subset is used in the different quantum gate libraries. NCV is such a gate library. It was introduced by Barenco et al. [6] and contains the following set of quantum gates: the Pauli X gate, the controlled Pauli X gate, the controlled V gate, and its adjoint, the controlled V† gate. If the input signals and all control lines are restricted to Boolean values, a 4-valued logic results where each qubit may represent one value of {0, 1, v0, v1} with $v_0 = \frac{1+i}{2} \begin{pmatrix} 1 \\ -i \end{pmatrix}$ and $v_1 = \frac{1+i}{2} \begin{pmatrix} -i \\ 1 \end{pmatrix}$. Fig. 2.5 shows the resulting transitions with respect to the possible NOT, V, and V† operations. This is sufficient to realize every reversible function as a quantum circuit [6], i.e. the NCV library is universal for Boolean functions. Furthermore, this keeps the gate library simple enough to be efficiently realizable in quantum computer technologies [42]. For that reason the elements of the NCV library are usually considered as elementary gates and are mostly applied in the development of design methods for quantum circuits. The NCV library has been used in several works on synthesis and optimization of quantum circuits [36, 29, 52].⁶ ⁷
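The transitions between the four values {0, 1, v0, v1} follow directly from the matrices in Table 2.1. The Python sketch below computes them; the dictionary `VALUES` and the functions `apply`, `name_of`, and `transition` are hypothetical helpers introduced only for this illustration.

```python
from typing import Dict, List

Vector = List[complex]
Matrix = List[List[complex]]

F = (1 + 1j) / 2
FC = F.conjugate()
NOT: Matrix = [[0, 1], [1, 0]]
V: Matrix = [[F, F * -1j], [F * -1j, F]]
V_DAG: Matrix = [[FC, FC * 1j], [FC * 1j, FC]]   # adjoint of V

# The four values of the logic as state vectors.
VALUES: Dict[str, Vector] = {
    "0": [1, 0],
    "1": [0, 1],
    "v0": [F, F * -1j],
    "v1": [F * -1j, F],
}


def apply(m: Matrix, v: Vector) -> Vector:
    """Apply a 2x2 matrix to a single-qubit state vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def name_of(v: Vector) -> str:
    """Map a state vector back to its name in the four-valued logic."""
    for name, ref in VALUES.items():
        if all(abs(a - b) < 1e-9 for a, b in zip(v, ref)):
            return name
    raise ValueError("state is outside the four-valued logic")


def transition(op: Matrix, value: str) -> str:
    """Value reached when op is applied to the given value."""
    return name_of(apply(op, VALUES[value]))


for label, op in (("NOT", NOT), ("V", V), ("V+", V_DAG)):
    print(label, {x: transition(op, x) for x in VALUES})
```

Under these matrices V cycles through 0, v0, 1, v1, the adjoint V† runs through the same cycle in the opposite direction, and NOT exchanges 0 with 1 as well as v0 with v1.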

Example 2.6. Fig. 2.6 shows a quantum circuit composed of 4 circuit lines and 6 quantum gates. This circuit again maps e.g. the input pattern 1111 to the output pattern 1000, but in contrast to the circuit from Fig. 2.2, quantum values and quantum operations are utilized for this purpose.

There are two metrics for measuring the performance of a quantum circuit: the number of gates and the depth of the circuit. Usually only the number of gates is considered.

⁶ The authors of [66] extended the library to include controlled-W and controlled-W† gates, which are fourth-root-of-NOT gates.

⁷ A totally different gate library is the Clifford library together with its extension, the Clifford+T library. The Clifford quantum gate library contains the Hadamard gate H, the CNOT gate, and the phase gate S with its adjoint gate S†. Compared to the NCV gate library, the Clifford library has the advantage of fault-tolerant computing and the possibility of realizing a greater space of states [11]. The Clifford group on n qubits, denoted by C_n and defined as the normalizer of {I, X, Y, Z} in U(2^n), contains all unitary matrices which are computable by circuits composed of {I, X, Y, Z, H, S, CNOT}. It is well known that C_n together with any one unitary matrix U ∉ C_n forms a dense set in U(2^n) [56]. By extending the Clifford library with the fourth root of the Pauli Z gate, called the T gate, the gate library containing {H, S, S†, CNOT, T, T†}, called Clifford+T, becomes universal for general matrices [4], which is a great advantage over the NCV and the Clifford library. The set of circuits composed from this new library forms a subset of the 2^n × 2^n unitary matrices over the ring $\mathbb{Z}[\frac{i}{\sqrt{2}}, i]$, defined as
\[
\left\{ \frac{a + b\,e^{i\pi/4} + c\,e^{i\pi/2} + d\,e^{i 3\pi/4}}{\sqrt{2}^{\,n}} \right\} \quad\text{with } a, b, c, d, n \in \mathbb{Z},\ n \ge 0 \ [4].
\]

Corresponding synthesis approaches relying on this set have been presented in [9, 4, 68].


Figure 2.5: State transitions for NOT, CNOT, V, and V† operations
(Diagram not reproduced. The four values 0, 1, v0, and v1 are connected by edges labeled NOT, V, and V†.)

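Figure 2.6: Quantum circuit using the NCV gate library
(Diagram not reproduced. Four circuit lines with inputs x1 = x2 = x3 = x4 = 1 and outputs f1 = 1, f2 = 0, f3 = 0, f4 = 0. The circuit contains V, V, and V† gates, and the annotated intermediate signals include the value v1.)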

Definition 2.11 (Quantum costs). Usually the cost of an elementary quantum gate is assumed to be 1, irrespective of whether the gate is controlled or not. Further, since quantum circuits only contain elementary gates, the quantum costs of a quantum circuit are defined by |G|, i.e. the number of gates.

Definition 2.12 (Quantum circuit depth). The depth of a quantum circuit is defined by the number of steps needed to perform the quantum operations.

However, this metric is inconsistent because some physical implementations can perform multiple quantum operations in parallel if they are independent. Usually such multiple parallel operations are considered as one step. The total number of time steps required to perform all gates is defined as the latency. This includes the time required to move qubits between gate operations and the time required to apply all gates (cf. [5]).
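To make the difference between the gate count and the depth concrete, the following Python sketch counts time steps under the assumption that gates acting on disjoint qubits may share a step, while gates on the same qubit keep their order. This greedy scheduling and the function name `circuit_depth` are assumptions for illustration; the time for moving qubits is ignored.

```python
from typing import Dict, List, Optional, Tuple

# A gate is (target, control); control None marks an uncontrolled gate.
Gate = Tuple[str, Optional[str]]


def circuit_depth(gates: List[Gate]) -> int:
    """Number of time steps if independent gates are executed in parallel."""
    busy_until: Dict[str, int] = {}   # last step in which a qubit is used
    depth = 0
    for target, control in gates:
        qubits = [target] if control is None else [target, control]
        # The gate runs in the first step after all of its qubits are free.
        step = 1 + max(busy_until.get(q, 0) for q in qubits)
        for q in qubits:
            busy_until[q] = step
        depth = max(depth, step)
    return depth


gates: List[Gate] = [("q1", None), ("q2", None), ("q2", "q1"), ("q3", None)]
print(len(gates), circuit_depth(gates))   # 4 gates, but only 2 time steps
```

In this toy circuit the two uncontrolled gates on q1 and q2 and the gate on q3 fit into the first step, so only the controlled gate needs a second one.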

In the following, often the complete circuit does not have to be considered. Instead, only subsets are used. The relevant sub-circuits are defined in this subsection.

Since any possible combination of quantum gates can occur in a quantum circuit, it is possible, even if unlikely, to have multiple independent sub-circuits in a quantum circuit. Respecting Definition 2.10, a sub-circuit is defined as follows.

Definition 2.13 (Sub-circuit). For a quantum circuit C a sub-circuit is a cascade of quantum gates G_{ij} = {g_i, . . . , g_j} with 1 ≤ i ≤ j < |G| and g_i, . . . , g_j ∈ G. Further, let Q_{ij} ⊆ Q be the subset of qubits used in the sub-circuit, i.e.
\[
Q_{ij} := \{\, q \in Q \mid (q, t, x) \in G \lor (c, q, x) \in G,\ i \le x \le j \,\}.
\]
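The set Q_{ij} can be collected by a single pass over the gates at positions i to j. The sketch below assumes that a circuit is stored as a Python list of (target, control) pairs in gate order, with `None` standing for λ; this representation and the name `subcircuit_qubits` are illustrative choices.

```python
from typing import List, Optional, Set, Tuple

# A gate is (target, control); control None marks an uncontrolled gate.
Gate = Tuple[str, Optional[str]]


def subcircuit_qubits(gates: List[Gate], i: int, j: int) -> Set[str]:
    """Qubits used as target or control by the gates g_i, ..., g_j (1-based)."""
    used: Set[str] = set()
    for target, control in gates[i - 1:j]:
        used.add(target)
        if control is not None:
            used.add(control)
    return used


gates: List[Gate] = [("a", None), ("b", "a"), ("c", "b"), ("d", None)]
print(sorted(subcircuit_qubits(gates, 2, 3)))   # ['a', 'b', 'c']
```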


Sub-circuits find several applications, for example when grouping gates to consider the quantum operation performed by a cascade of quantum gates [47], in template matching algorithms [64], or in enumerative algorithms where the impact of a selected choice is evaluated [34].

One possibility for determining independent sub-circuits is to apply the Warshall algorithm [83]⁸ to the circuit's adjacency matrix. With the Warshall algorithm the connectivity of the vertices of a given graph can be calculated. This connectivity information allows for the calculation of the number of independent partitions of the graph. Assume a set of controlled gates G_2 of a quantum circuit. Then the adjacency matrix A of the quantum circuit is the |Q| × |Q| matrix where the non-diagonal entry a_{ij} is the number of gates connecting the two qubits i and j; since only controlled gates are considered, the diagonal entries a_{ii} are empty.
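The procedure described above can be sketched in Python as follows. The code builds the adjacency matrix from the controlled gates, computes reachability with Warshall's transitive-closure scheme, and counts the independent partitions. The gate representation as (target, control) pairs and all function names are assumptions made for this illustration.

```python
from typing import List, Optional, Tuple

# A gate is (target, control); control None marks an uncontrolled gate.
Gate = Tuple[str, Optional[str]]


def adjacency_matrix(qubits: List[str], gates: List[Gate]) -> List[List[int]]:
    """a[i][j] = number of controlled gates connecting qubits i and j."""
    index = {q: k for k, q in enumerate(qubits)}
    a = [[0] * len(qubits) for _ in qubits]
    for target, control in gates:
        if control is None:
            continue   # only controlled gates (the set G2) are considered
        i, j = index[target], index[control]
        a[i][j] += 1
        a[j][i] += 1
    return a


def warshall(a: List[List[int]]) -> List[List[bool]]:
    """Reachability between qubits (transitive closure of the graph)."""
    n = len(a)
    reach = [[i == j or a[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return reach


def count_partitions(reach: List[List[bool]]) -> int:
    """Count qubits that cannot be reached from any qubit with smaller index."""
    return sum(1 for i in range(len(reach)) if not any(reach[i][:i]))


qubits = ["a", "b", "c", "d"]
two_parts: List[Gate] = [("b", "a"), ("d", "c")]
print(count_partitions(warshall(adjacency_matrix(qubits, two_parts))))   # 2
```

Adding a gate between b and c connects the two partitions, so the same call then returns 1.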

The algorithms can be applied to each independent sub-circuit. Nevertheless, to keep the focus, in the following it is assumed w.l.o.g. that the circuit does not contain multiple independent sub-circuits, i.e. it consists of only one partition.

2.4 Complexity Theory and Combinatorial Optimization

Complexity theory covers the discussion of the complexity of decision and computational problems, their complexity classes, and their relation to each other. In this section we recall common definitions and notions.

Definition 2.14. A decision problem L is a set of words over an arbitrary non-empty alphabet. An instance I is an input for which it should be decided whether I ∈ L. An instance with I ∈ L is called a yes-instance; all others are called no-instances.

One example of a decision problem is the Boolean Satisfiability Problem (SAT).

Definition 2.15 (Boolean Satisfiability Problem). SAT is the decision problem of whether a given propositional formula ϕ : 𝔹^n → 𝔹, n ∈ N, usually given in Conjunctive Normal Form (CNF), is satisfiable, i.e.
\[
\mathrm{SAT}(\varphi) =
\begin{cases}
1, & \exists X \in \mathbb{B}^n : \varphi(X) = 1 \\
0, & \text{else},
\end{cases}
\tag{2.3}
\]
where X ∈ 𝔹^n is an assignment for the variables in ϕ and ϕ(X) is the application of the assignment to the formula.

This means that SAT represents the problem of determining an assignment to a propositional logic formula ϕ : 𝔹^n → 𝔹 such that ϕ evaluates to 1, or of proving that no such assignment exists. Verifying such an assignment can be done efficiently, but finding such an assignment is no trivial task.

The complexity classes P and NP are central in complexity theory, and the question whether P = NP is one of the important problems of mathematics and computer science. P contains all problems which can be computed in polynomial time by a deterministic algorithm with computational power equivalent to a Turing machine [81]. Correspondingly, NP contains the problems which can be computed by a non-deterministic algorithm in polynomial time. An equivalent characterization of NP which does not refer to Turing machines is given by verification systems. Let L be a decision problem in NP. Then a polynomial verification system R for yes-instances is a relation between instances I of L and a certificate (sometimes called witness) w such that (I, w) ∈ R ⇔ I ∈ L, with a polynomial p such that |w| ≤ p(|I|), where |w| and |I| denote the length of the respective word, i.e. w is polynomial in size, and it can be verified in polynomial time whether (I, w) ∈ R. A decision problem L is in NP if there exists a polynomial verification system for yes-instances. The characterizations are equivalent because the certificate passed to the verification system is one possible witness the non-deterministic Turing machine (NTM) could guess in order to verify the instance.

⁸ Usually the Warshall algorithm is used in graphs for finding the shortest path from one vertex to another.

Example 2.7. A polynomial verification system for yes-instances of SAT is the relation R = {(ϕ, X) | ϕ(X) = 1} with a given propositional formula ϕ as instance I and the polynomially verifiable certificate X which verifies that ϕ is satisfied. Hence, SAT ∈ NP.
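The relation R from Example 2.7 corresponds to a simple check that runs in time linear in the size of the formula. The Python sketch below assumes a CNF encoding in which every clause is a list of non-zero integers, where k stands for the variable x_k and −k for its negation; this encoding and the name `verify` are illustrative assumptions.

```python
from typing import Dict, List

Clause = List[int]   # k denotes x_k, -k denotes the negation of x_k
CNF = List[Clause]


def verify(cnf: CNF, assignment: Dict[int, bool]) -> bool:
    """Check in linear time whether the assignment (certificate) satisfies cnf."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


# (x1 or not x2) and (x2 or x3)
cnf: CNF = [[1, -2], [2, 3]]
print(verify(cnf, {1: True, 2: False, 3: True}))    # True
print(verify(cnf, {1: False, 2: True, 3: False}))   # False
```

Checking a given certificate is cheap, whereas no comparably cheap procedure for finding one is known.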

Since SAT is in NP, the complexity of this problem is dominated by the number of different assignments to the function ϕ : 𝔹^n → 𝔹. The number of different assignments depends on the number of literals used in the formula. Hence, in order to reduce the number of literals and clauses to a minimum, ϕ is encoded in CNF.

For classifying decision problems and for describing their relation to each other, polynomial-time reductions are usually used [15].

Definition 2.16 (Polynomial-time reduction). Let L and L′ be decision problems. A polynomial-time reduction from L to L′ is a function r : L → L′ with the following properties:

• I ∈ L⇔ r(I) ∈ L′

• r can be calculated in polynomial time.

If there exists a polynomial-time reduction from L to L′, then the notation L ≤_p L′ is used.

Some decision problems are more similar to each other than to others. To express a deeper similarity between two problems, the term p-isomorphic is used.

Definition 2.17 (P-Isomorphism). A polynomial-time reduction r : L′ → L is a P-isomorphism if r is a bijection and r and r^{−1} are computable in polynomial time.

If such an r exists, L and L′ are called p-isomorphic.


For separating decision problems within a complexity class, the notions of hardness and completeness are used.

Definition 2.18. Let L be a decision problem and let C be the complexity class of L. Then

• L is hard in C iff for all L′ ∈ C : L′ ≤p L.

• L is complete in C iff L ∈ C and L is C hard.

Example 2.8. If P ≠ NP, then no NP hard problem is in P.

In [15], the term NP complete was introduced and it was shown that SAT is NP complete. In the following we denote the set of decision problems which are NP complete by NPC.

An algorithm which solves SAT instances is called a SAT solver. Nowadays, SAT is a well-investigated problem and efficient solving algorithms, i.e. SAT solvers, have been proposed [19, 27]. Instead of simply traversing the complete space of assignments, powerful optimization techniques are applied, such as intelligent decision heuristics, conflict-based learning schemes, and efficient implication methods by Boolean Constraint Propagation (BCP), conflict-driven no-good learning, conflict analysis via the firstUIP scheme, no-good recording and deletion, back-jumping and restarts, conflict-driven decision heuristics, progress saving, unit propagation via watched literals, dedicated propagation of binary and ternary no-goods, dedicated propagation of extended rules (over cardinality and weight constraints), as well as equivalence reasoning and resolution-based preprocessing. These techniques lead to effective search procedures.

Combinatorial optimization is part of discrete mathematics and is relevant in many areas of computer science, artificial intelligence, and engineering. Usually NP hard problems are considered.

Definition 2.19. An instance of a combinatorial optimization problem is a tuple (L, f) with a countable set L of possible solutions and a function f : L → D, where D is a suitable co-domain, e.g. R, assigning a value to each solution in L such that a cost, gain, etc. can be calculated. Hence, a globally optimal solution can be found, i.e. ∃i ∈ L, ∀u ∈ L : f(i) ≤ f(u) for a minimization problem or f(i) ≥ f(u) for a maximization problem.

There exist several algorithms for solving these optimization problems. Widely used are Integer Linear Programming (ILP), where the instance and the optimization function are expressed as a system of linear equations, or branch-and-bound algorithms, where the problem is divided into sub-problems which are solved (independently).

Pseudo-Boolean optimization (PBO) is a special case of integer linear programming. Here the formula is expressed as a pseudo-Boolean function, i.e. the variables used in the formula are restricted to binary values.


Definition 2.20 (Pseudo-Boolean optimization). PBO determines a satisfying solution for a pseudo-Boolean function (usually in CNF) ψ : 𝔹^n → 𝔹 such that
\[
\sum_{i=1}^{n} c_i x_i^{b_i} \ge c_m,
\]
where c_1, . . . , c_n, c_m ∈ Z and x_i^{b_i} is either a positive or a negative literal, i.e. b_i ∈ 𝔹. In addition to the problem encoding, an objective function F is defined by
\[
\mathcal{F}(x_1, \dots, x_n) = \sum_{i=1}^{n} m_i x_i^{b_i}
\]
with m_1, . . . , m_n ∈ Z. This objective function is used by the solver to minimize/maximize F.

Lemma 2.2. PBO ∈ NPC.

Proof. The membership of PBO in NP can be shown by guessing the assignment to the variables. This takes linear time in the number of variables. Afterwards, the CNF and the objective function must be evaluated, which is also possible in polynomial time.

The NP hardness of PBO is shown by reducing SAT to PBO. The reduction r is straightforward. The set of Boolean variables for the PBO instance is the same as for the SAT instance. Every clause in the CNF is transformed: let x1 ∨ x2 ∨ · · · ∨ xn be a clause in the SAT-CNF, then 1 · x1 + 1 · x2 + · · · + 1 · xn ≥ 1 is the respective clause for the PBO-CNF. Finally, an arbitrary tautology is generated as objective function, e.g. F_max = 1 · x1 + 1 · x̄1.

It is easy to see that for an instance x the equivalence SAT(x) = true ⇐⇒ PBO(r(x)) = true holds.
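The clause transformation used in the reduction can be sketched in Python. Clauses are assumed to be lists of non-zero integers (k for x_k, −k for its negation), and a negative literal is evaluated as 1 − x_k; this treatment of negative literals and all function names are assumptions for illustration, since the proof above only spells out positive literals.

```python
from itertools import product
from typing import Dict, List, Tuple

Clause = List[int]                               # k denotes x_k, -k its negation
Constraint = Tuple[List[Tuple[int, int]], int]   # ([(coefficient, literal)], bound)


def clause_to_constraint(clause: Clause) -> Constraint:
    """Map l1 or ... or ln to the constraint 1*l1 + ... + 1*ln >= 1."""
    return ([(1, lit) for lit in clause], 1)


def literal_value(lit: int, assignment: Dict[int, bool]) -> int:
    """0/1 value of a literal; a negative literal evaluates to 1 - x."""
    value = int(assignment[abs(lit)])
    return value if lit > 0 else 1 - value


def constraint_holds(constraint: Constraint, assignment: Dict[int, bool]) -> bool:
    terms, bound = constraint
    return sum(c * literal_value(lit, assignment) for c, lit in terms) >= bound


def clause_holds(clause: Clause, assignment: Dict[int, bool]) -> bool:
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


# Exhaustive check on a small clause: both encodings agree on every assignment.
clause: Clause = [1, -2, 3]
for bits in product([False, True], repeat=3):
    assignment = dict(zip((1, 2, 3), bits))
    assert clause_holds(clause, assignment) == constraint_holds(
        clause_to_constraint(clause), assignment)
```

Since a clause is satisfied exactly when at least one of its literals has the value 1, the sum of the literal values reaches the bound 1 in precisely the same cases.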

Example 2.9. Let Φ = (x1 + x2 + x3)(x1 + x3)(x2 + x3). Then, x1 = 1, x2 = 1, and x3 = 1 is a satisfying assignment solving the SAT problem. Accordingly, let Ψ = (2x1 + 3x2 + x3 ≥ 3)(2x1 + x2 ≥ 2) and F_min = x1 + x2 + x3. Then, x1 = 1, x2 = 0, and x3 = 0 is a solution to the PBO problem, satisfying Ψ and, at the same time, minimizing F.

Since PBO is an extension of the SAT problem, improvements investigated for SAT solvers can be applied to PBO solvers as well.

A different common procedure for determining the minimum (maximum) is to translate the respective instance into a sequence of SAT instances, where the respective decision problem with a fixed k, depending on the iteration step, is solved in order to efficiently determine a solution.
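Such an iteration can be sketched in Python as follows. The function `decide` stands for a hypothetical decision procedure (e.g. a call to a SAT solver) which answers whether a solution with cost at most k exists; starting at k = 0 and increasing k step by step is one possible strategy and an assumption of this sketch, since the order of the iteration is not fixed above.

```python
from typing import Callable, Optional


def minimize(decide: Callable[[int], bool], upper: int) -> Optional[int]:
    """Smallest k in 0..upper for which the decision problem is answered yes."""
    for k in range(upper + 1):
        if decide(k):   # one decision instance per iteration step
            return k
    return None         # no solution within the given bound


# Toy oracle: a solution exists as soon as the bound k reaches 3.
print(minimize(lambda k: k >= 3, upper=10))   # 3
```

If the decision problem is monotone in k, a binary search over k would be an alternative which needs fewer solver calls.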


Chapter 3

Design of Nearest Neighbor Compliant Quantum Circuits

In this chapter the design flow of nearest neighbor compliant quantum circuits is discussed. To this end, the synthesis of quantum circuits is reviewed first. Afterwards, the physical realizations of quantum circuits are discussed. Their intrinsic restrictions lead to a metric for nearest neighbor compliance from which optimization schemes can be derived.

3.1 NCV-based Synthesis of Quantum Circuits

Prominent examples of quantum algorithms which exploit the advantages of quantum computing in a special way are Grover's search [30] and Shor's algorithm [73, 22]. Other examples of specialized quantum circuit realizations for specific quantum algorithms are error correction circuits [23], quantum addition [49], or the quantum Fourier transformation [78]. They stand out in that they are realized by specialized quantum circuits whose netlists have been derived manually. In contrast to these specialized functions, the automatic synthesis of quantum circuits for general functions has also been the subject of intensive research.

Since any quantum operation can be represented by a unitary matrix [56], each quantum circuit is inherently reversible. Accordingly, every reversible circuit can be transformed (synthesized) into a quantum circuit containing elementary quantum gates only [56]. The synthesis of a quantum circuit from a reversible circuit is called quantum decomposition.

Hence, the synthesis of Boolean components of general quantum circuits is usually conducted in two steps: First, the desired logic is synthesized as a reversible circuit using multiple-control Toffoli gates [31, 71, 45, 62, 28, 84, 63]. Afterwards, each reversible gate of the resulting circuit is decomposed by mapping it to a corresponding cascade of elementary quantum gates.

Depending on the addressed gate library, different mapping schemes have been proposed and intensely studied in the past [6, 44]. This includes the mapping of reversible gates to NCV gates, which was originally proposed in [6] by Barenco et al. Afterwards, further improvements were introduced in [46] and more recently in [52], which introduced the current state-of-the-art NCV mapping scheme.

Figure 3.1: Toffoli gate decomposition
(Diagrams not reproduced. (a) Reversible Toffoli gate on the circuit lines x1, x2, x3. (b) Toffoli gate quantum decomposition on the same lines, containing V, V, and V† gates.)

Example 3.1. Fig. 3.1 shows the quantum decomposition of a Toffoli gate into a quantum circuit for the NCV library. Consider a Toffoli gate with two control lines as shown in Fig. 3.1(a). A functionally equivalent realization in terms of gates from the NCV library is depicted in Fig. 3.1(b). After the decomposition, the quantum circuit contains only elementary gates of the NCV library.

Since different Toffoli gate configurations exist, similar mappings for multi-controlled Toffoli gates have been proposed. However, with an increasing number of control lines, the resulting quantum circuits become more expensive, i.e. they require exponentially more quantum gates and ancillary lines. An ancillary line is a line which is neither the target nor a control of a multi-controlled Toffoli gate but is used in implementing it as a cascade of simpler gates, i.e. as a cascade of Toffoli gates with fewer control lines than the original (cf. [52]). To provide some numbers, Table 3.1 lists the respective number of quantum gates for different Toffoli gate configurations according to the current state-of-the-art NCV mapping scheme introduced in [52].

In contrast to the 2-stage synthesis, there are also automatic synthesis approaches which apply elementary gates directly during the synthesis process [29, 71, 36, 61].

3.2 Nearest Neighbor Constraints

In recent years, physical accomplishments have led to several physical realizations of quantum computers [49]. At the same time, these realizations also revealed intrinsic physical limitations on implementing quantum computing technology [60]. Among them, the limited interaction distance between gate qubits is one of the most common in several promising implementations of quantum computation. Here, it is required that computations are only performed between adjacent, i.e. nearest neighbor, qubits. More precisely, only adjacent qubits can be manipulated by a quantum operation. This constraint is called the nearest neighbor constraint, and the implementations are, respectively, called nearest neighbor architectures.


Table 3.1: Number of quantum gates

Number of         Number of Ancillary Lines
Control Lines       1     2     3     4     5     6
 2                  5
 3                 14
 4                 20
 5                 32
 6                 44
 7                 64    56
 8                 76    68
 9                 96    88    80
10                108   100    92
11                132   120   112   104
12                156   132   124   116
13                180   156   148   136   128
14                204   180   172   148   140
15                228   204   198   172   160   152

This was originally motivated by physical realizations based e.g. on trapped ions (see e.g. [32]), liquid nuclear magnetic resonance (see e.g. [41]), and architectures based on the original Kane model [38]. Nowadays, ion traps are no longer appropriately described as universally nearest neighbor architectures, liquid nuclear magnetic resonance is acknowledged as not scalable, and the original Kane proposal has been superseded by [35]. Nevertheless, nearest neighbor architectures are still an issue in recently proposed technologies, including proposals for ion traps [3, 40, 55], nitrogen-vacancy centers in diamonds [16, 94], quantum dots emitting linear cluster states linked by linear optics [33], laser-manipulated quantum dots in a cavity [37], and superconducting qubits [57, 17]. Also recently, physical implementations based on multi-dimensional architectures have gained interest and are seen as the more suitable physical solution [35, 40, 55]. Here, qubits are not aligned next to each other in a line, but e.g. in a 2D structure. First physical realizations based on photonics [8], superconductors [8], quantum dots [79], and neutral atoms [65] have been shown to be very promising.

In order to formalize this restriction for computer-aided design methods, a corresponding metric representing the costs of a quantum circuit to become nearest neighbor compliant in one-dimensional architectures, termed Nearest Neighbor Cost (NNC), has been informally introduced in [64]. Formally it can be defined as follows:

Definition 3.1. Let C = (Q, G) be a quantum circuit and let l be an alignment of qubits defined by the bijection l : Q → {1, 2, . . . , |Q|} which assigns each qubit to a circuit line such that a specific permutation is realized.


1. The NNC for a gate g = (t, c) with t ∈ Q, c ∈ Q ∪ {λ} and λ ∉ Q are defined as
\[
\mathrm{NNC}(g, l) =
\begin{cases}
0, & \text{if } c = \lambda \\
|l(t) - l(c)| - 1, & \text{if } c \in Q,
\end{cases}
\]
where c = λ indicates that the gate is uncontrolled.

2. The NNC for a circuit C are defined by
\[
\mathrm{NNC}(C, l) = \sum_{g \in G} \mathrm{NNC}(g, l).
\]
If G = ∅, then NNC(C, l) = 0.

Example 3.2. Consider the circuit depicted in Fig. 3.1(b) (realizing the Toffoli gate from Fig. 3.1(a) as discussed in Example 3.1). Gates are denoted by G = {g1, . . . , g5} from left to right. As can be seen, gate g1 is non-adjacent. Since gate g1 has NNC of 1 and the others have NNC of 0, the resulting NNC of the quantum circuit is 1 in total.
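Definition 3.1 translates almost literally into code. The following Python sketch represents a gate as a (target, control) pair with `None` standing for λ and an alignment l as a dictionary from qubits to line numbers. The five-gate list is an assumed stand-in with one non-adjacent gate in the spirit of Example 3.2; it is not read off Fig. 3.1(b).

```python
from typing import Dict, List, Optional, Tuple

# A gate is (target, control); control None stands for lambda (uncontrolled).
Gate = Tuple[str, Optional[str]]
Alignment = Dict[str, int]   # the bijection l : Q -> {1, ..., |Q|}


def nnc_gate(gate: Gate, l: Alignment) -> int:
    """NNC(g, l): 0 if uncontrolled, otherwise |l(t) - l(c)| - 1."""
    target, control = gate
    if control is None:
        return 0
    return abs(l[target] - l[control]) - 1


def nnc_circuit(gates: List[Gate], l: Alignment) -> int:
    """NNC(C, l): sum of the NNC of all gates (0 for an empty gate list)."""
    return sum(nnc_gate(g, l) for g in gates)


l: Alignment = {"x1": 1, "x2": 2, "x3": 3}
# Assumed gate list: only the first gate spans the lines x1 and x3.
gates: List[Gate] = [("x3", "x1"), ("x2", "x1"), ("x3", "x2"),
                     ("x2", "x1"), ("x3", "x2")]
print([nnc_gate(g, l) for g in gates], nnc_circuit(gates, l))
```

For this gate list the per-gate costs are 1, 0, 0, 0, 0, which gives a total NNC of 1.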

The fact that only controlled gates (i.e. c ≠ λ) increase the NNC of a circuit is formalized by the following observation.

Observation 3.1. Let C = (Q, G) be a quantum circuit. Independent of the mapping l the following identity holds:

NNC(C, l) = NNC((Q,G2), l)

where G2 = G \G1 as defined in Definition 2.10.

Proof. The proof follows directly from the first case in Definition 3.1: for all g = (t, c) ∈ G_1 we have c = λ and thus NNC(g, l) = 0. Hence,
\[
\mathrm{NNC}(C, l) = \sum_{g \in G} \mathrm{NNC}(g, l) = \sum_{g \in G_2} \mathrm{NNC}(g, l) = \mathrm{NNC}((Q, G_2), l).
\]

As a result only controlled gates must be considered during optimization.

3.3 Nearest Neighbor Optimization

Usually, the synthesis of quantum circuits does not consider nearest neighbor constraints. But with the dawn of the recent physical realizations discussed before, this issue becomes more and more relevant. Accordingly, researchers started considering the synthesis of nearest neighbor compliant quantum circuits.

The state of the art is to apply a post-synthesis optimization to the synthesized quantum circuit in order to make it nearest neighbor compliant. If a given circuit does not satisfy the nearest neighbor constraints, additional SWAP gates are added. These SWAP gates allow for making all control lines and target lines adjacent and, by this, help to satisfy the nearest neighbor constraint. More precisely, a cascade of adjacent SWAP gates can be inserted in front of each gate g with non-adjacent circuit lines in order to shift the control line of g towards the target line, or vice versa, until they are adjacent. The following example illustrates the idea.

Figure 3.2: Establishing nearest neighbor compliance
(Diagrams not reproduced. (a) Given circuit on the lines x0, x1, x2, x3. (b) Nearest neighbor compliant circuit on the lines x0, x1, x2, x3. (c) Optimal nearest neighbor compliant circuit with the line order x3, x2, x0, x1 at the inputs and x3, x0, x2, x1 at the outputs.)

Example 3.3. Consider the circuit depicted in Fig. 3.2(a). As can be seen, gates g1, g4, and g5 are non-adjacent. Thus, in order to make this circuit nearest neighbor compliant, SWAP gates are inserted in front of and after each of these gates as shown in Fig. 3.2(b).

According to [64], a naive synthesis for Linear Nearest Neighbor architectures inserts a cascade of SWAP gates in front of each non-adjacent gate in order to change the line order until the target and control line of the considered gate are adjacent.

While such a naive approach is able to transform any given circuit into a nearest neighbor compliant version in linear time, the insertion of SWAP gates obviously increases the quantum cost of the resulting circuit. In fact, for each non-adjacent controlled gate, 2 · (|t − c| − 1) SWAP gates are additionally inserted into the circuit, where t and c denote the position of the target and control line, respectively.
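The number of SWAP gates inserted by the naive approach can be computed directly from this formula. The Python sketch below assumes that the original line order is restored after every gate, so that each non-adjacent gate contributes 2 · (|t − c| − 1) SWAP gates; the gate list and the name `naive_swap_count` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A gate is (target, control); control None marks an uncontrolled gate.
Gate = Tuple[str, Optional[str]]


def naive_swap_count(gates: List[Gate], position: Dict[str, int]) -> int:
    """SWAP gates inserted when every gate is made adjacent on its own."""
    swaps = 0
    for target, control in gates:
        if control is None:
            continue   # uncontrolled gates never need SWAP gates
        distance = abs(position[target] - position[control]) - 1
        swaps += 2 * distance   # move the lines together and back again
    return swaps


position = {"x0": 1, "x1": 2, "x2": 3, "x3": 4}
gates: List[Gate] = [("x3", "x0"), ("x1", "x0"), ("x3", "x1")]
print(naive_swap_count(gates, position))   # 2*2 + 0 + 2*1 = 6
```

The function runs in time linear in the number of gates, in line with the linear-time claim above.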

Depending on the considered model, each SWAP gate increases the quantum costs either by 1 or by 3. This is because SWAP gates themselves are sometimes assumed to be elementary gates (as is the case in certain quantum technologies), but they can also be composed of three elementary gates. In order to avoid confusion, in the following we simply count the number of additionally inserted SWAP gates rather than providing the quantum costs.

The fashion in which SWAP gates are inserted has a significant effect: Reordering the circuit lines, or considering SWAP gate insertion not only locally for each single gate but for the whole cascade, may reduce the costs significantly. As an example, the nearest neighbor compliant circuit from Fig. 3.2(b) can actually be reduced to the circuit shown in Fig. 3.2(c), which reduces the number of SWAP gates from 8 to 1.


Motivated by this, researchers started to intensely investigate how to reduce the number of SWAP gate insertions needed to make a given quantum circuit nearest neighbor compliant. The resulting approaches can roughly be divided into two schemes:

• Using a Global Reordering Scheme
This scheme globally considers possible permutations of circuit lines in order to reduce the number of SWAP gates. The permutation of the circuit lines changes the NNC if the distance between control and target line changes. As can clearly be seen, this significantly affects the number of SWAP gates to be inserted. The challenge is how to determine a good permutation among all |Q|! possible ones.

• Using a Local Reordering Scheme
The second scheme does not only consider circuit line permutations globally, but also in a local fashion. That is, SWAP gates may be applied before each gate in order to change the order of circuit lines. These changes may not only be applied in order to make a gate nearest neighbor compliant, but additionally to establish a gate order which is beneficial for making the remaining gates nearest neighbor compliant. As an example, Fig. 3.2(c) shows a circuit where different circuit line permutations are established locally. Compared to the circuit from Fig. 3.2(b), this also reduces the number of SWAP gate insertions. On the other hand, this makes determining good permutations more challenging since now all |Q|! possible permutations need to be considered for each gate.
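For very small circuits, the global reordering scheme described above can be illustrated by simply enumerating all |Q|! alignments and keeping one with minimal NNC. The following Python sketch is such an exhaustive search; it only serves as an illustration, is not one of the methods discussed in this thesis, and uses a hypothetical gate list.

```python
from itertools import permutations
from typing import Dict, List, Optional, Tuple

# A gate is (target, control); control None marks an uncontrolled gate.
Gate = Tuple[str, Optional[str]]
Alignment = Dict[str, int]


def nnc(gates: List[Gate], l: Alignment) -> int:
    """Nearest neighbor cost of the circuit under the alignment l."""
    return sum(abs(l[t] - l[c]) - 1 for t, c in gates if c is not None)


def best_global_order(qubits: List[str], gates: List[Gate]) -> Tuple[int, Alignment]:
    """Try all |Q|! alignments and return one with minimal NNC."""
    best_cost, best_l = -1, {}
    for perm in permutations(qubits):
        l = {q: line for line, q in enumerate(perm, start=1)}
        cost = nnc(gates, l)
        if best_cost < 0 or cost < best_cost:
            best_cost, best_l = cost, l
    return best_cost, best_l


qubits = ["x0", "x1", "x2", "x3"]
gates: List[Gate] = [("x3", "x0"), ("x1", "x0"), ("x3", "x1")]
initial = {q: line for line, q in enumerate(qubits, start=1)}
cost, alignment = best_global_order(qubits, gates)
print(nnc(gates, initial), cost)   # NNC drops from 3 to 1
```

Since the three gates connect x0, x1, and x3 pairwise, no alignment on a line can make all of them adjacent, so an NNC of 1 is the best possible here. The factorial growth of the search space makes this enumeration infeasible beyond a handful of qubits.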

3.4 Considered Problem and Contributions

In the remaining chapters of this thesis the two presented schemes are evaluated in detail. For this purpose, one-dimensional, so-called Linear Nearest Neighbor (LNN), architectures as well as multi-dimensional architectures are considered. Several achievements have been obtained. We outline the core messages here.

1. Determining a good ordering is difficult. In Chapter 4 it is shown that the decision problems for the global reordering scheme and for the local reordering scheme are NP complete. But note that even though the problems are in the same complexity class, the real complexity of the local reordering algorithm is far beyond the complexity of the global reordering algorithm.

2. Several approaches for addressing the problems have been proposed. In Chapter 5 and Chapter 6 we review these approaches, which are almost all of heuristic nature, and propose exact alternatives guaranteeing minimality with respect to the number of inserted SWAP gates. Our approaches make use of the deductive power of constraint solvers. This enables us to perform a qualitative evaluation of the performance of existing (heuristic) solutions.

The presented algorithms have been implemented and evaluated. As the experiments show, this does not only lead to less expensive realizations, but also allows for an evaluation of the power of previously proposed approaches.

3. Many quantum algorithms include a Boolean component, and quantum circuits can be derived from reversible circuits. Therefore, in Chapter 7 we present a synthesis scheme for addressing nearest neighbor constraints on the reversible circuit level. For this purpose, a recently introduced gate library is applied.

This research has been published in the following conference proceedings and journals:

1. Exact local reordering (LNN) [90]

2. Exact global and local reordering (LNN) [89]

3. Exact local reordering (multi-dimensional) [43]

4. Considering nearest neighbor constraints on the reversible circuit level [88]


Chapter 4

Complexity Analysis of Reordering Schemes

A complexity-theoretic analysis of the presented reordering schemes is important in order to prove that the considered problems are of a certain complexity, i.e. belong to a respective complexity class. Since problems in some classes are efficiently solvable whereas for others, e.g. the NP-complete problems, no efficient algorithm is known, showing that a problem is NP-complete indicates that it is indeed hard to solve and requires elaborate solving techniques or heuristics to find solutions.

4.1 One-Dimensional Nearest Neighbor Problems

Global Reordering describes permuting the qubits such that the NNC of a circuit is reduced. Local Reordering, in contrast, can be described as follows: Given a quantum circuit C, permute the qubits locally in front of each gate such that all gates are adjacent.

Definition 4.1 (Global Reordering (GR)).
Instance: Quantum circuit C = (Q, G) and a k ∈ N.
Question: Does there exist an alignment l : Q → {1, 2, ..., |Q|} such that NNC(C, l) ≤ k?
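The decision problem GR can be illustrated by a minimal brute-force sketch. It assumes that a circuit is given as a list of two-qubit gates (t, c) and that the NNC of a gate under an alignment l is |l(t) − l(c)| − 1, as used in the proofs of this chapter. The function names `nnc` and `decide_gr` are illustrative only and not part of the thesis.

```python
from itertools import permutations

def nnc(gates, l):
    """Nearest neighbor cost of all two-qubit gates (t, c) under alignment l."""
    return sum(abs(l[t] - l[c]) - 1 for t, c in gates)

def decide_gr(qubits, gates, k):
    """Brute-force GR: try all |Q|! alignments, return a witness or None."""
    for order in permutations(qubits):
        # positions are numbered 1..|Q| as in Definition 4.1
        l = {q: pos for pos, q in enumerate(order, start=1)}
        if nnc(gates, l) <= k:
            return l
    return None
```

Since all |Q|! alignments are tried, such a sketch is only practical for very small circuits.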

Definition 4.2 (Local Reordering (LR)).
Instance: Quantum circuit C = (Q, G), a k ∈ N and a cost function for permutations c : P|Q| → N.
Question: Does there exist an initial alignment l : Q → {1, 2, ..., |Q|} and a sequence of permutations p1, p2, ..., p|G| ∈ P|Q| such that

\[ \sum_{i=1}^{|G|} c(p_i) \le k \quad\text{and}\quad NNC(g_i, \pi_i) = 0 \;\; \forall g_i \in G \quad\text{with}\quad \pi_i = \prod_{j=1}^{i} p_j \circ l \,? \]

Often it is assumed that the first permutation can be realized without any SWAP gates by assuming an arbitrary initial order (cf. [90]). Note that in a sequence of permutations every permutation manipulates the previous ordering. Hence, in the equation the product of all these permutations must be considered for each gate. Usually, the current order is not permuted in front of an uncontrolled gate.

Theorem 4.1. GR ∈ NPC and LR ∈ NPC.

GR is very similar to the NP-complete Optimal Linear Arrangement Problem (OLA). In [25] this decision problem is defined as follows:

Definition 4.3 (Optimal Linear Arrangement (OLA)).
Instance: Graph (V, E) with a set of vertices V and a set of edges E and a k ∈ N.
Question: Does there exist a bijection f : V → {1, 2, ..., |V|} such that

\[ \sum_{(u,v)\in E} |f(u) - f(v)| \le k \,? \]

Theorem 4.2. GR is p-isomorphic to OLA.

Theorem 4.2 follows directly from Lemma 4.2, Lemma 4.3 and Observation 4.1.

Lemma 4.1. GR ∈ NP.

Proof. Let (C, k) be a GR instance. A bijection f which satisfies NNC(C, f) ≤ k would be a witness. The size of f is polynomial in |Q| and evaluating the statement can be done in time polynomial in |G|.

Lemma 4.2. OLA ≤p GR.

Proof. Let (V, E) be a graph, let k ∈ N and let f : V → {1, 2, ..., |V|} for the statement $\sum_{(u,v)\in E} |f(u) - f(v)| \le k$. Let g : V → Q be a bijection from vertices to qubits and let h : E → {1, 2, ..., |E|} be a bijection for enumerating the edges. Then a mapping m : E → G can be constructed using g and h. It is defined as follows: m((u, v)) = (g(u), g(v), h((u, v))). Obviously, this is also a bijection.

Let E2 := {(u, v) ∈ E | u ≠ v}, d = k − |E2|, Q = {g(v) | v ∈ V}, G2 = {m(e) | e ∈ E2} and C = (Q, G2). Then (C, d) is a GR instance, which is valid for a bijection n : Q → {1, 2, ..., |Q|} defined by g⁻¹ ◦ f iff the OLA instance (V, E, k) is valid for f and f is a bijection. This reduction is obviously linear in |V| and |E|.

Let $S := \sum_{(u,v)\in E} |f(u) - f(v)|$ and $S_2 := \sum_{(u,v)\in E_2} |f(u) - f(v)|$ in the following.

Correctness: Let ((V, E, k), f) ∈ OLA where f is a bijection. Then |f(u) − f(v)| > 0 if u ≠ v and |f(u) − f(v)| = 0 otherwise. Hence, S ≥ |E2|. As a result, if a bijection f exists such that S ≤ k, i.e. (V, E, k) ∈ OLA, then k ≥ |E2|. Thus, we can reason that there exists a d ∈ N such that S ≤ k = |E2| + d or, in other terms, S − |E2| ≤ d. From this we can derive

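\[
\begin{aligned}
S - |E_2| &= \sum_{(u,v)\in E_2} \bigl(|f(u) - f(v)| - 1\bigr) = \sum_{(t,c,x)\in G_2} \bigl(|f(g(t)) - f(g(c))| - 1\bigr) \\
&= \sum_{(t,c,x)\in G_2} \bigl(|n(t) - n(c)| - 1\bigr) = \sum_{g_2\in G_2} NNC(g_2, n) = NNC(C, n)
\end{aligned}
\]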

such that NNC(C, n) ≤ d.

Completeness: Let ((V, E, k), f) ∉ OLA, i.e. there is no f such that S ≤ k, i.e. for all bijections f we have S > k. This is equivalent to S2 > k. Then, using S2 ≥ |E2|, we get two cases: either S2 > k ≥ |E2| or |E2| > k.

Considering S2 > k ≥ |E2|: Using k = |E2| + d and S − |E2| = NNC(C, n) we have NNC(C, n) > d such that ((C, d), n) ∉ GR.

Considering |E2| > k: Using k = |E2| + d we obtain d < 0, such that d ∉ N. Hence, ((C, d), n) ∉ GR.
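The construction used in the proof of Lemma 4.2 can be sketched as follows. This is a minimal illustration under assumptions: qubits are modelled as tagged vertices ("q", v), gates as triples (t, c, x), and the helper names `ola_to_gr` and `reduction_preserves_cost` are hypothetical. The check confirms, for every bijection f on a small graph, the identity S − |E2| = NNC(C, n) derived above.

```python
from itertools import permutations

def ola_cost(edges, f):
    """S: sum of |f(u) - f(v)| over all edges."""
    return sum(abs(f[u] - f[v]) for u, v in edges)

def nnc(gates, n):
    """NNC of a circuit whose gates are triples (t, c, x)."""
    return sum(abs(n[t] - n[c]) - 1 for t, c, _ in gates)

def ola_to_gr(vertices, edges, k):
    """Map an OLA instance (V, E, k) to a GR instance (Q, G2, d)."""
    e2 = [(u, v) for u, v in edges if u != v]          # drop self-loops
    qubits = [("q", v) for v in vertices]              # bijection g: V -> Q
    gates = [(("q", u), ("q", v), x)                   # m(e) = (g(u), g(v), h(e))
             for x, (u, v) in enumerate(e2, start=1)]
    return qubits, gates, k - len(e2)                  # d = k - |E2|

def reduction_preserves_cost(vertices, edges):
    """Check S - |E2| == NNC(C, n) for every bijection f."""
    _, gates, _ = ola_to_gr(vertices, edges, 0)
    for perm in permutations(range(1, len(vertices) + 1)):
        f = dict(zip(vertices, perm))
        n = {("q", v): f[v] for v in vertices}          # alignment induced by f
        if ola_cost(edges, f) - len(gates) != nnc(gates, n):
            return False
    return True
```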

Lemma 4.3. GR ≤p OLA.

Proof. Let C := (Q, G) be a circuit and let l : Q → {1, 2, ..., |Q|} be a bijection for the statement NNC(C, l) ≤ d for a d ∈ N. Let f, g, h, m, n and d be as in Lemma 4.2. Then, using the same arguments, (V, E, k) is an OLA instance which is valid for f iff (C, d) is valid for n. In this case, however, we derive V = {g⁻¹(q) | q ∈ Q} and E2 = {m⁻¹(e) | e ∈ G2}. The proofs of correctness and completeness are analogous to the proof of Lemma 4.2.

Observation 4.1. Let r be the polynomial reduction such that r(OLA) = GR.Then for the inverse reduction we have r−1(GR) = OLA.

Lemma 4.4. Every circuit C = (Q, G) with a mapping l can be expressed by a functionally equivalent circuit C′ = (Q, G′) with |G| ≤ |G′| and NNC(C′, l) = 0.

Proof. Apply the naive approach presented in [64]: in front of each gate (t, c, i) ∈ G2, a permutation pi realized by a sequence of SWAP gates is inserted in order to make the gate adjacent; afterwards, the inverse permutation is applied in order to reestablish the initial ordering. All SWAP gates are in G′, i.e. $|G'| = |G| + 2\sum_{i=1}^{|G|} |p_i|$.
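The naive construction behind Lemma 4.4 can be sketched in a few lines. This is an illustrative sketch only: gates are assumed to be pairs (t, c) of qubits, the target is moved next to the control by adjacent SWAPs, and the same SWAPs are undone after the gate. The operation encoding ("SWAP"/"GATE" tuples over positions) and the name `make_nn_naive` are assumptions made for this example.

```python
def make_nn_naive(gates, l):
    """Return a nearest neighbor compliant operation list for alignment l.

    gates: list of (t, c) qubit pairs; l: dict qubit -> position.
    Output entries are ("SWAP", p, p') or ("GATE", pos_t, pos_c) on positions.
    """
    out = []
    for t, c in gates:
        pt, pc = l[t], l[c]
        step = 1 if pc > pt else -1
        # move the target towards the control until both are adjacent
        swaps = [("SWAP", p, p + step) for p in range(pt, pc - step, step)]
        out.extend(swaps)
        out.append(("GATE", pc - step, pc))
        # inverse permutation: undo the SWAPs to restore the initial order
        out.extend(reversed(swaps))
    return out
```

Because the initial order is restored after every gate, the alignment l stays valid throughout, and the number of operations equals |G| plus twice the summed NNC of all gates.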

Lemma 4.5. LR ∈ NP.

Proof. Let C = (Q, G) be a quantum circuit, k ∈ N and c : P|Q| → N a cost function. The certificate is an initial assignment l and a sequence of permutations p1, p2, ..., p|G| = P. The certificate is of polynomial size in |Q| and |G| and can be verified in polynomial time as follows. Calculate in polynomial time $\sum_{p_i \in P} c(p_i)$ and assure that it is less than or equal to k.

Let T be the time necessary for applying a permutation to the current assignment of the qubits, where T is polynomial in |Q|. Then the total time needed for applying $\prod_{j=1}^{i} p_j \circ l$ for all gi ∈ G is T · |G|, since after applying one permutation the adjacency of the qubits of the gate can be verified and the calculated permutation can be reused for the next gate.
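The verification procedure from the proof of Lemma 4.5 can be sketched as follows. The sketch assumes that every gate is a two-qubit gate (t, c), that a permutation p is a tuple over positions 1..|Q| where p[pos − 1] is the new position of the qubit currently at pos, and that the number of inversions serves as an example cost function. The names `inversions` and `verify_lr` are illustrative.

```python
def inversions(p):
    """Number of inversions of p; an example cost (adjacent SWAPs needed)."""
    return sum(1 for i in range(len(p))
               for j in range(i + 1, len(p)) if p[i] > p[j])

def verify_lr(gates, k, cost, l, perms):
    """Check an LR certificate (l, perms) in polynomial time."""
    if len(perms) != len(gates):
        return False
    if sum(cost(p) for p in perms) > k:
        return False
    pi = dict(l)
    for (t, c), p in zip(gates, perms):
        # reuse the previous alignment: pi_i = p_i applied to pi_{i-1}
        pi = {q: p[pos - 1] for q, pos in pi.items()}
        if abs(pi[t] - pi[c]) != 1:
            return False
    return True
```

Each gate triggers one application of a permutation and one adjacency check, which mirrors the T · |G| bound stated in the proof.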

To prove the completeness, a reduction to a special case called Naive Local Reordering is used.


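NLR: $p_1 \; g_1 \; \underbrace{\bar{p}_1 p_2} \; g_2 \; \underbrace{\bar{p}_2 p_3} \; g_3 \; \bar{p}_3 \; \dots \; \underbrace{\bar{p}_{|G|-1} p_{|G|}} \; g_{|G|} \; \bar{p}_{|G|}$

LR: $p'_1 \; g_1 \; p'_2 \; g_2 \; p'_3 \; g_3 \; \dots \; p'_{|G|} \; g_{|G|}$

Figure 4.1: NLR to LR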

Definition 4.4 (Naive Local Reordering (NLR)).
Instance: A quantum circuit C = (Q, G), a k ∈ N and a cost function c : P|Q| → N.
Question: Does there exist an alignment l : Q → {1, 2, ..., |Q|} and a sequence of permutations of qubits p1, p2, ..., p|G| such that $\sum_{i=1}^{|G|} c(p_i) \le k$ and $NNC(g_i, p_i \circ l) = 0$ for all gi ∈ G?

Lemma 4.6. GR ≤p NLR.

Proof. We use the idea from Lemma 4.4. Let X = (C, k) be a GR instance and l a certificate for this instance. Then, using the same X and l together with a c : P|Q| → N, we construct a second certificate P such that (X, l) ∈ GR ⇐⇒ ((C, k, c), (l, P)) ∈ NLR.

Let p1, p2, ..., p|G| = P. Then each pi can be constructed from (t, c, i) ∈ G using Definition 2.6(9) where c(pi) = |l(t) − l(c)| − 1.

\[
\begin{aligned}
(X, l) \in GR &\iff NNC(C, l) \le k \\
&\iff \sum_{(t,c,i)\in G} \bigl(|l(t) - l(c)| - 1\bigr) \le k \\
&\iff \sum_{i=1}^{|G|} c(p_i) \le k \;\wedge\; \bigwedge_{i=1}^{|G|} NNC(g_i, p_i \circ l) = 0 \\
&\iff ((C, k, c), (l, P)) \in NLR
\end{aligned}
\]

Lemma 4.7. NLR ⊂ LR.

Proof. Let X = (C, k, c) be an NLR instance and (l, P) a certificate for this instance. Then an LR instance Y = (C, k′, c) with a certificate (l′, P′) can be constructed as follows: l′ = l, k′ = 2k − k_{|G|} with k_{|G|} = |f(t) − f(c)| − 1 for (t, c, |G|) ∈ G, and P′ = p′1, ..., p′|G|.

Claim: (X, (l, P)) ∈ NLR =⇒ (Y, (l′, P′)) ∈ LR.

Let P = p1, ..., p|G|. Then derive the inverse permutations $\bar{p}_i$ for all pi. As depicted in Fig. 4.1, map the permutations onto each other by $p'_1 = p_1$ and $p'_i = \bar{p}_{i-1} p_i$, 2 ≤ i ≤ |G|.

The reduction is obviously a polynomial-time reduction.

Let (X, (l, P)) ∈ NLR. Let $\pi_i = \prod_{j=1}^{i} p_j \circ l$. Using Observation 2.1, for all (c, t, i) = gi ∈ G2 the equation $|\pi_1 \dots \pi_j(c) - \pi_1 \dots \pi_j(t)| = |\pi'_j(c) - \pi'_j(t)| = 1$ holds, i.e. NNC(gi, πi) = 0 for all gi ∈ G.


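Since (X, (l, P)) ∈ NLR, we have $\sum_{i=1}^{|G|} c(p_i) \le k$. Now we have to show that $\sum_{i=1}^{|G|} c(p'_i) \le k'$ holds as well.

By definition, $\sum_{i=1}^{|G|} c(p_i) \le k \implies \sum_{i=1}^{|G|} c(\bar{p}_i) \le k$. Using the construction $p'_1 = p_1$ and $p'_i = \bar{p}_{i-1} p_i$, 2 ≤ i ≤ |G|, we get the following equation:

\[
\begin{aligned}
\sum_{i=1}^{|G|} c(p'_i) &= c(p_1) + \sum_{i=2}^{|G|} c(\bar{p}_{i-1} p_i) \\
&= c(p_1 \bar{p}_1 p_2 \dots \bar{p}_{|G|-1} p_{|G|}) \\
&= \Bigl(\sum_{i=1}^{|G|-1} c(p_i \bar{p}_i)\Bigr) + c(p_{|G|}) \\
&= \Bigl(\sum_{i=1}^{|G|} c(p_i \bar{p}_i)\Bigr) - c(\bar{p}_{|G|}) \\
&= 2\Bigl(\sum_{i=1}^{|G|} c(p_i)\Bigr) - c(p_{|G|}) \le 2k - k_{|G|} = k'.
\end{aligned}
\]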

Lemma 4.6 shows that NLR is NP-complete. Lemma 4.7 shows that NLR is a special case of LR. Hence, using Lemma 4.5 we can conclude that LR ∈ NPC. Together with Theorem 4.2, this proves Theorem 4.1.

4.2 Multi-Dimensional Nearest Neighbor Problems

As mentioned earlier, multi-dimensional nearest neighbor problems can also be considered. For this purpose, instead of a one-dimensional discrete space bounded by a natural number, an n-dimensional space bounded by an n-orthotope, i.e. a hyperrectangle, is used as the geometric space in which the qubits are placed. More formally:

Definition 4.5. Let Q be a set of qubits. Define an n-dimensional circuit grid $(d_1, \dots, d_n) = D_n \in \mathbb{N}^n$, $d_i > 0$ for all i, satisfying $\prod_{i=1}^{n} d_i = |Q|$. As a result, every q ∈ Q can be assigned to a specific position in the grid. More specifically, we can define a bijection m : Q → Dn.

Since there exist |Q|! different bijections on this n-orthotope, permutations can be defined. We denote the set of these permutations by $P^n_{|Q|}$.

Analogously, a cost function using these n-orthotope permutations can be defined, which represents the number of inversions along the dimensions and thus measures the difference between two permutations. From this, a distance function c : Dn × Dn → N for two positions can be derived.


Definition 4.6 (n-dimensional Global Reordering (GRn)).
Instance: Quantum circuit C = (Q, G), an n-dimensional circuit grid Dn, a k ∈ N and a distance function for two positions c.
Question: Does there exist a mapping f : Q → Dn such that

\[ \sum_{(t,c,x)\in G_2} c(f(t), f(c)) \le k \,? \]

Analogously, we can define the n-dimensional Local Reordering problem.

Definition 4.7 (n-dimensional Local Reordering (LRn)).
Instance: Same as GRn.
Question: Does there exist a sequence of permutations $p_1, p_2, \dots, p_{|G_2|} \in P^n_{|Q|}$ such that

\[ \sum_{i=1}^{|G|} c(p_i) \le k \quad\text{and}\quad \sum_{(t,c,i)\in G_2} c(\pi_i(t), \pi_i(c)) = 0 \quad\text{with}\quad \pi_i = \prod_{j=1}^{i} p_j \circ f \,? \]

Theorem 4.3. GRn ∈ NPC and LRn ∈ NPC.

Lemma 4.8. Let X stand for GR or LR. Let (C, D, k, c) be an Xn instance. Then there exists an equivalent Xn+1 instance.

Proof. Let (C, D, k, c) be an Xn instance. Construct C′, D′, k′ and c′ as follows: C′ := C and k′ := k, but the grid D′ and the respective cost function c′ are (n + 1)-dimensional such that we define D′ := (d1, ..., dn, 1) and c′((a1, ..., an, an+1), (b1, ..., bn, bn+1)) := c((a1, ..., an), (b1, ..., bn)).

We have $|Q| = \prod_{i=1}^{n} d_i = \prod_{i=1}^{n+1} d_i$, and if An = (a1, ..., an) ∈ Dn then An+1 = (a1, ..., an, 1) ∈ Dn+1. Thus, there is a bijection mapping An to An+1. From the cost function it obviously follows that (C′, D′, k′, c′) is an instance of Xn+1 equivalent to (C, D, k, c).
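The lifting step of Lemma 4.8 can be sketched as follows. The concrete distance function is an assumption made purely for illustration (Manhattan distance minus one, so that grid neighbors have distance 0); the thesis only requires some distance function c on grid positions. The names `example_dist` and `lift` are hypothetical.

```python
def example_dist(a, b):
    """Assumed example distance: Manhattan distance minus one."""
    return sum(abs(x - y) for x, y in zip(a, b)) - 1

def lift(grid, dist):
    """Lift an n-dimensional instance (grid, dist) to n + 1 dimensions."""
    grid2 = tuple(grid) + (1,)            # D' := (d_1, ..., d_n, 1)

    def dist2(a, b):
        # c'((a_1..a_{n+1}), (b_1..b_{n+1})) := c((a_1..a_n), (b_1..b_n))
        return dist(a[:-1], b[:-1])

    return grid2, dist2

def embed(position):
    """Bijection A_n -> A_{n+1}: append coordinate 1."""
    return tuple(position) + (1,)
```

Since the added dimension has extent 1, every lifted position is the embedding of exactly one original position and all distances are preserved.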

Proof of Theorem 4.3. We prove Xn ∈ NPC where X stands for GR or LR. Guess the permutations such that the conditions hold. The check can be performed in polynomial time as in the one-dimensional case. Hence, Xn ∈ NP.

To show the completeness, we reduce X = X1 to Xn. For this purpose, Lemma 4.8 is applied n times. Each new dimension has an extent of 1, such that there is no permutation which satisfies the conditions for a lower k. The construction in the lemma is a polynomial-time reduction. Hence, the reduction from X to Xn is also a polynomial-time reduction.

4.3 Conclusion

In this chapter, the complexity of decision problems for nearest neighbor architectures has been discussed. We have shown that the decision problems for global reordering and local reordering for one-dimensional grids are NP-complete. Further, we have shown that the global reordering decision problem is p-isomorphic to the well-known NP-complete Optimal Linear Arrangement problem. Finally, we have proven that the problems for multi-dimensional grids remain in the same complexity class as the one-dimensional ones.


Chapter 5

Optimization for Linear Nearest Neighbor Architectures

Optimization for Linear Nearest Neighbor architectures applies the global and local reordering schemes to one-dimensional architectures. In Chapter 4 it is shown that the two schemes are NP-complete. Hence, no efficient algorithm is known to solve these problems. As a result, heuristics can be applied in order to find good solutions, albeit without any guarantee of optimality.

In this chapter we review existing (heuristic) approaches recently proposed for both schemes. Further, we present new exact algorithms that determine minimal solutions. Instead of simply enumerating the search space, these algorithms utilize efficient solving technologies.

5.1 Global Reordering Scheme

In general, a global reordering algorithm determines an initial permutation which is applied throughout the circuit. Usually, it is assumed that no additional SWAP gates are needed in order to establish this initial permutation. However, gates might remain non-adjacent, i.e. SWAP gates might still be necessary to make gates nearest neighbor compliant.

5.1.1 Existing Approach

One algorithm introduced in [64] proposes a (heuristic) global reordering scheme which determines a good permutation by calculating the contribution of each circuit line of a given quantum circuit C = (Q, G). For this purpose, for each controlled gate g of G with control line at position c and target line at position t, the NNC value (see Definition 3.1) is calculated. Afterwards, this value⁹ is added to variables imp_c and imp_t which are used to store the “impacts” of the circuit lines c and t on the total NNC, respectively. More precisely, the impact imp_i of the i-th circuit line is calculated by

\[ imp_i = \sum_{g=(c,t)\in G \,\mid\, c=i \,\vee\, t=i} NNC(g). \]

⁹ Originally, in [64] half of this value is added to variables imp_c and imp_t. Simplifying this sum does not alter the results, but makes the sum clearer and more convenient for further comparison in this work.

[Figure 5.1: Global Reordering. (a) Given circuit, line order x0, x1, x2, x3, x4; (b) Heuristic global reordering, line order x3, x1, x0, x2, x4; (c) Exact global reordering, line order x4, x0, x1, x2, x3.]

Using these impacts, the algorithm selects the circuit line with the greatestvalue and permutes it with the middle circuit line. If the selected line already is themiddle line, the one with the next greatest impact is selected. This whole procedureis repeated until no further improvements are achieved. Eventually, the resultingcircuit is made nearest neighbor compliant by inserting SWAP gates as describedin Section 3.3.
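The impact computation and the middle-line exchange described above can be sketched as follows. This is a loose interpretation for illustration, not the implementation of [64]: gates are assumed to be pairs (c, t) of line names, the "middle" line is taken as index |Q| // 2, and the loop stops as soon as an exchange no longer lowers the total NNC. All function names are hypothetical.

```python
def nnc_total(gates, order):
    """Total NNC of all gates for a given top-to-bottom line order."""
    pos = {line: i for i, line in enumerate(order)}
    return sum(abs(pos[c] - pos[t]) - 1 for c, t in gates)

def impacts(gates, order):
    """imp_i: sum of NNC(g) over all gates g touching line i."""
    pos = {line: i for i, line in enumerate(order)}
    imp = {line: 0 for line in order}
    for c, t in gates:
        value = abs(pos[c] - pos[t]) - 1
        imp[c] += value
        imp[t] += value
    return imp

def heuristic_global(gates, order):
    """Repeatedly exchange the highest-impact line with the middle line."""
    order = list(order)
    best = nnc_total(gates, order)
    improved = True
    while improved:
        improved = False
        mid = len(order) // 2
        imp = impacts(gates, order)
        for line in sorted(order, key=lambda x: -imp[x]):
            i = order.index(line)
            if i == mid:
                continue                      # already in the middle: take next line
            cand = order[:]
            cand[i], cand[mid] = cand[mid], cand[i]
            cost = nnc_total(gates, cand)
            if cost < best:                   # keep the exchange only if it helps
                order, best, improved = cand, cost, True
            break
    return order, best
```

The returned cost is the remaining NNC; the SWAP gates needed to make the circuit compliant would still have to be inserted afterwards.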

Example 5.1. Consider the quantum circuit C = (Q, G) with G = {g1, ..., g7} depicted in Fig. 5.1(a). This circuit is not nearest neighbor compliant since the gates g2, g6 and g7 have an NNC greater than 0. Applying the heuristic of [64], the resulting impacts of the circuit lines are imp_x0 = 7, imp_x1 = 0, imp_x2 = 1, imp_x3 = 0, and imp_x4 = 6, respectively. Permuting the line order such that the lines with high impact are located in the middle (descending towards the outer lines) results in the circuit depicted in Fig. 5.1(b). Compared to the naive method from Section 3.3 (without global reordering), this reduces the number of required SWAP gates from 14 to 6.

5.1.2 Exact Approach

While the approach reviewed above provides a simple and efficient method for making quantum circuits nearest neighbor compliant, the resulting global permutation is rarely the best possible one. In order to determine the best permutation, all possible permutations need to be considered.

In contrast to the heuristic from above, the exact approach does not consider impacts of lines anymore, but re-calculates the NNC values after a new permutation has been established. Then, an objective function can be utilized in which the NNC over all gates is considered.

Taking all that into consideration, the costs for a given permutation π ∈ Π can be calculated by

\[ cost_\pi = \sum_{(c,t)\in G_2} 2 \cdot NNC(\pi(c), \pi(t)), \]

where π(i) relates to the element at index i in the permutation π and Π denotes the set of all permutations¹⁰.

In order to select a permutation and change the line order accordingly, free Boolean variables denoted s_π for each π ∈ Π are introduced which, since only one permutation can be applied, need to satisfy the constraint

\[ \sum_{\pi\in\Pi} s_\pi = 1. \]

Based on that, the considered problem can be formulated by combining the two formulas from above with the objective function

\[ \min\Bigl(\sum_{\pi\in\Pi} cost_\pi \, s_\pi\Bigr). \]

Note that the cost function cost_π can be simplified further. Gates acting on the same circuit lines, regardless of which of the two lines holds the control and which the target, have the same NNC and can be grouped due to the associativity and commutativity of addition, such that reordering the terms (i.e. reordering the gates) does not change the NNC.

An adjacency matrix A for the interaction graph (cf. [69]) can be constructed holding the number of gates between each two lines in its entries, i.e. a_ij = |(i, j)| + |(j, i)|, 0 ≤ i, j < |Q|, such that a_ij holds the number of gates between the lines i and j. Thus, the cost function can be simplified to

\[ cost_\pi = \sum_{a_{ij}\in A} a_{ij} \cdot 2 \cdot \bigl(|\pi(i) - \pi(j)| - 1\bigr). \]

The adjacency matrix A is symmetric with respect to the main diagonal, which itself contains zeros only. As a result, at most |Q|(|Q| − 1)/2 terms must be considered in the cost function.

Nevertheless, the overall complexity of solving this formulation is O(|Q|! · |G|), since for every possible permutation π the respective costs (i.e. cost_π) have to be considered. While solve engines, e.g. for PBO or ILP, might be utilized to determine solutions, our evaluations showed that enumerative methods perform better in this case. Again, this is caused by the need to calculate the costs cost_π for each considered permutation π. While those costs either need to be completely available in advance or have to be formulated exhaustively, e.g. in a respective PBO or ILP instance, an enumerative method can easily derive these values on a case-by-case basis (and abort the calculation when the currently best known value has already been exceeded). Experiments summarized in Section 7.5 confirm that, despite the complexity, this solution is able to determine the best global permutations in acceptable run-time.

¹⁰ We apply twice the number since two SWAP gates are required in order to reduce the NNC by 1 with a symmetric mapping which restores the initial line order after each gate.
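The enumerative method with early abort can be sketched as follows. This is a minimal sketch under assumptions: lines are numbered 0..|Q|−1, gates are pairs (c, t), pi[i] denotes the position assigned to line i, and only the upper triangle of the interaction matrix is summed so that each pair of lines is counted once. The names `interaction_matrix` and `exact_global` are illustrative.

```python
from itertools import permutations

def interaction_matrix(n, gates):
    """a[i][j]: number of gates acting on lines i and j (symmetric)."""
    a = [[0] * n for _ in range(n)]
    for c, t in gates:
        a[c][t] += 1
        a[t][c] += 1
    return a

def exact_global(n, gates):
    """Enumerate all permutations; abort a candidate once it cannot win."""
    a = interaction_matrix(n, gates)
    pairs = [(i, j, a[i][j]) for i in range(n)
             for j in range(i + 1, n) if a[i][j]]
    best_cost, best_pi = None, None
    for pi in permutations(range(n)):
        cost = 0
        for i, j, weight in pairs:
            cost += weight * 2 * (abs(pi[i] - pi[j]) - 1)
            if best_cost is not None and cost >= best_cost:
                break                      # early abort: best known value reached
        else:
            best_cost, best_pi = cost, pi  # strictly better than the best so far
    return best_pi, best_cost
```

The early abort does not change the worst-case number of permutations visited, but it avoids evaluating the full cost of candidates that are already no better than the best known one.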

Example 5.2. Consider again the quantum circuit depicted in Fig. 5.1(a) and itsheuristic improvement from Fig. 5.1(b). Determining the exact permutation indeedallows for a further reduction of the required SWAP gates as shown in Fig. 5.1(c)depicting a circuit with the best possible global circuit line permutation.

5.2 Local Reordering Scheme

In general, a local reordering scheme may apply different permutations before each gate in order to interchange circuit lines and, by this, enable nearest neighbor compliance. While SWAP gates are needed in order to establish these different permutations, no further gates are usually required in order to make the remaining gates adjacent.

5.2.1 Existing Approaches

In the past, various approaches following a local reordering scheme have beenproposed. Most of them relied on a heuristic scheme, i.e. were able to efficientlyproduce proper results but did not guarantee a minimal number of SWAP gateinsertions. In the following, this related work is briefly discussed and summarized.

Exhaustive Enumeration of Permutations In [34], an approach for determining the best possible local permutations has been discussed. This represents one of the few exact solutions which are available thus far. However, only a rather simple enumerative algorithm has been applied for this purpose. Consequently, this approach is clearly infeasible and, therefore, was only used as motivation for the application of heuristics.

Greedy Approaches In [34, 64], greedy algorithms have been introduced for de-termining good solutions in a smaller search space. In contrast to the exhaustivesearch, which explores all possibilities, the method proposed in [34] considers only|t−c|! possible permutations for each quantum gate, where t and c are the index ofthe target and control line of the gate. For this purpose, a recursive traversal is ap-plied. Instead, [64] traverses through a given circuit and, for each gate, adds SWAPgates to make it nearest neighbor compliant. The resulting circuit line permutationis then applied to the succeeding gates. The addition of these SWAP gates and, bythis, the creation of a new circuit line permutation is performed in a local fashion,i.e. without considering possible beneficial or harmful effects to the succeedinggates.


Applying Window-based Schemes to the Greedy Search The greedy search introduced in [34] has also been improved by applying a window-based scheme to the recursive traversal. For each branch, the respectively considered permutation is applied to a certain window only. Afterwards, the sub-circuit defined by this window is optimized using the greedy search, and the approximate number of SWAP gates needed in the window is determined. Finally, the permutation which creates the window with the smallest cost is selected.

Determining Local Minima using a Window-Based Scheme In [69], estab-lishing nearest neighbor compliance has been addressed by representing the inter-actions between circuit lines by an interaction graph. There, the authors are alsoconsidering non-trivial cases where the number of gates interacting with a circuitline is greater than 2 or the circuit contains cycles, i.e. a set of gates greater than 2where two gates are acting on the same line.

Therefore, a window of w consecutive controlled gates with s cycles is consid-ered. Then, all controlled gates are divided into sets with only local, i.e. nearestneighbor compliant, gates with at most s SWAP gates. If it is not possible to makethe current w gates local with s SWAP gates, w is decremented by one, and arecheck is done. For at least w = 2 a local window can be found where no SWAPgates are needed for making the gates of the window nearest neighbor compliant,i.e. s = 0.

Each set i with wi gates needs a new line reordering for the involved ni lines. Therefore, SWAP gates are inserted between the sets i and i + 1 for 1 ≤ i < w to change the line ordering in set i to the one in set i + 1. To find the best possible reordering for each set, the minimum linear arrangement problem is used. This is done by constructing an interaction graph for the gates in each set and applying the minimum linear arrangement problem to each set accordingly. The number of SWAP gates needed to perform the permutation can be derived from the inversion vector of the permutation. To improve the result, the last gate of set i can be moved to set i + 1, which constructs two different windows with different permutations of the lines involved in the sets.

Additionally Changing the Gate Order In [47], the positions of the gates are additionally manipulated in order to determine local permutations which may require fewer SWAP gate insertions. This is motivated by the fact that, if some constraints are satisfied, the positions of neighboring gates can be exchanged. This is the case if the target line of one gate does not modify the control line of another one and the respective operations are commutative. Another constraint considers the operations performed by the gates and states that gates can only be interchanged when their unitary matrices are commutative. To meet this constraint, groups of quantum gates (windows) can also be considered.

With this knowledge, a gate dependency graph which considers both conditions may be constructed for a quantum circuit. A gate dependency graph is a directed graph that shows the dependence of gates in a given circuit where each vertex corresponds to a specific gate in the circuit. An edge between two vertices represents a dependency and means that the gate represented by the source vertex should be applied before the one represented by the target vertex.

For this approach, an adjacent transposition graph may also be constructed. This holds all permutations as vertices connected by edges representing the inversion for changing one permutation into another. Each vertex has (n − 1) edges. The number of vertices in an adjacent transposition graph is n! with (n − 1)n!/2 edges.

The main idea in this approach is to formulate the problem as determining a shortest path in an adjacent transposition graph with minimal cost, i.e. minimal distance, since every step represents an inversion.

In this way, all the gates in the quantum circuit are considered. As long asnearest neighbor gates are realizable in the current qubit order, gates are inserted. Ifno realizable gates remain, SWAP gates are inserted so that the changed qubit ordermakes some gates realizable. Inserting SWAP gates corresponds to a move fromone vertex to another in the adjacent transposition graph. The problem remains todetermine the best path. This is approached with a breadth first search algorithmwhich utilizes the adjacent transposition graph.
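The adjacent transposition graph and a breadth-first search on it can be sketched as follows. This sketch only illustrates the graph structure and the shortest-path idea; it is not the algorithm of [47], which additionally tracks which gates are realizable. Permutations are assumed to be tuples, and the names `neighbors` and `bfs_distance` are hypothetical.

```python
from collections import deque

def neighbors(perm):
    """All permutations reachable by one adjacent transposition (one SWAP)."""
    for i in range(len(perm) - 1):
        q = list(perm)
        q[i], q[i + 1] = q[i + 1], q[i]
        yield tuple(q)

def bfs_distance(src, dst):
    """Minimal number of adjacent SWAPs transforming src into dst."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            return dist[cur]
        for nxt in neighbors(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return None
```

Each vertex has n − 1 neighbors, which yields the (n − 1)n!/2 edges stated above; the graph grows factorially, so an explicit search is only practical for few qubits.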

5.2.2 Exact Approach

Most of the existing approaches following the local reordering scheme are of heuristic nature. Hence, how to efficiently determine a minimal solution remains an open question here as well. In the following, a corresponding solution to this problem is proposed. For this purpose, the deductive power of solvers for pseudo-Boolean optimization is exploited. First, the general idea of the proposed solution is sketched. Afterwards, details on the precise implementation are presented.

5.2.2.1 General Idea

Given a quantum circuit C = (Q,G), we are looking for local permutations (tobe established before each gate using a total minimum of SWAP gates) so that allgates g of G can adjacently be executed. In order to determine those, one has toconsider

• all possible permutations of circuit lines that, in principle, can be establishedbefore each gate g of G and

• the costs (in terms of adjacent SWAP gates) that would be needed in order tocreate these particular permutations.

The precise cascade of adjacent SWAP gates and, by this, the costs for creatinga particular permutation of circuit lines can thereby be calculated using inversionvectors.

Note thereby that this does not apply in order to permute the circuit lines beforethe first gate, i.e. before g1. Here, in accordance with previous work, e.g. [64, 69],we assume the circuit lines can arbitrarily be permuted with no additional costs justby re-arranging the primary inputs as necessary.
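The relation between a permutation, its inversions, and a cascade of adjacent SWAP gates can be sketched as follows. The concrete convention for the inversion vector (entry i counts the earlier elements that are greater than element i) is an assumption made for this example and may differ from the definition in Section 2.2; the function names are illustrative.

```python
def inversion_vector(perm):
    """Entry i counts the elements before position i that exceed perm[i]."""
    return [sum(1 for j in range(i) if perm[j] > perm[i])
            for i in range(len(perm))]

def swap_cascade(perm):
    """Adjacent SWAPs (as index pairs) that sort perm, via bubble sort."""
    p = list(perm)
    swaps = []
    for end in range(len(p) - 1, 0, -1):
        for i in range(end):
            if p[i] > p[i + 1]:
                p[i], p[i + 1] = p[i + 1], p[i]
                swaps.append((i, i + 1))
    return swaps
```

Bubble sort removes exactly one inversion per exchange, so the length of the cascade equals the sum of the inversion vector. For the permutation (2, 3, 1, 0), both give 5.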


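[Figure 5.2: Resulting PBO encoding for circuit from Fig. 3.2(a)]

(a) Variables: circuit lines l0, ..., l3 with the initial mapping li ↔ qi to qubits. For each circuit line li, the sketch shows the variable vectors $\vec{x}^0_i = (x^0_{i0}\, x^0_{i1}\, x^0_{i2}\, x^0_{i3})$ and $\vec{x}^1_i = (x^1_{i0}\, x^1_{i1}\, x^1_{i2}\, x^1_{i3})$, together with the permutation blocks π and the gate block g1 acting on q2 and q0.

(b) Constraints:

Consistency-constraints:

\[
\begin{aligned}
& x^0_{00} + x^0_{01} + x^0_{02} + x^0_{03} = 1 \;\wedge\; x^0_{10} + x^0_{11} + x^0_{12} + x^0_{13} = 1 \\
\wedge\; & x^0_{20} + x^0_{21} + x^0_{22} + x^0_{23} = 1 \;\wedge\; x^0_{30} + x^0_{31} + x^0_{32} + x^0_{33} = 1 \\
\wedge\; & x^0_{00} + x^0_{10} + x^0_{20} + x^0_{30} = 1 \;\wedge\; x^0_{01} + x^0_{11} + x^0_{21} + x^0_{31} = 1 \\
\wedge\; & x^0_{02} + x^0_{12} + x^0_{22} + x^0_{32} = 1 \;\wedge\; x^0_{03} + x^0_{13} + x^0_{23} + x^0_{33} = 1 \;\dots
\end{aligned}
\]

Adjacency-constraints (for g1 with q0 and q2):

\[
(x^1_{00} \wedge x^1_{12}) \vee (x^1_{10} \wedge x^1_{22}) \vee (x^1_{20} \wedge x^1_{32}) \vee (x^1_{02} \wedge x^1_{10}) \vee (x^1_{12} \wedge x^1_{20}) \vee (x^1_{22} \wedge x^1_{30})
\]

Permutation-constraint (for π = (2310) and k = 1):

\[
(\vec{x}^0_0 = \vec{x}^1_2 \;\wedge\; \vec{x}^0_1 = \vec{x}^1_3 \;\wedge\; \vec{x}^0_2 = \vec{x}^1_1 \;\wedge\; \vec{x}^0_3 = \vec{x}^1_0) \Leftrightarrow s^1_{2310}
\]

Objective function:

\[
\begin{aligned}
\min\bigl((& 0 \cdot s^2_{0123} + 1 \cdot s^2_{0132} + 1 \cdot s^2_{0213} + 2 \cdot s^2_{0231} + 2 \cdot s^2_{0312} + 3 \cdot s^2_{0321} + 1 \cdot s^2_{1023} + 2 \cdot s^2_{1032} \\
+\; & 2 \cdot s^2_{1203} + 3 \cdot s^2_{1230} + 3 \cdot s^2_{1302} + 4 \cdot s^2_{1320} + 2 \cdot s^2_{2013} + 3 \cdot s^2_{2031} + 3 \cdot s^2_{2103} + 4 \cdot s^2_{2130} \\
+\; & 4 \cdot s^2_{2301} + 5 \cdot s^2_{2310} + 3 \cdot s^2_{3012} + 4 \cdot s^2_{3021} + 4 \cdot s^2_{3102} + 5 \cdot s^2_{3120} + 5 \cdot s^2_{3201} + 6 \cdot s^2_{3210}) + \dots \bigr)
\end{aligned}
\]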

Taking all that into account, a naive approach that ensures minimality of SWAPgate insertion would work as follows:

1. Enumerate all possible permutations of circuit lines for all gates of the given circuit C.

2. For each set of permutations which lead to a circuit satisfying the nearestneighbor condition, calculate the costs according to the algorithm describedabove.

3. After all permutations have been considered, take the one with the smallestcosts.
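The three steps above can be sketched for very small circuits as follows. The sketch assumes that every gate is a two-qubit gate (c, t) over qubits 0..n−1, that a line order is a tuple listing the qubit on each line, that the first order is free of charge, and that the cost of changing from one order to the next is the number of adjacent SWAPs (the Kendall tau distance). The names `kendall_tau` and `naive_local` are illustrative.

```python
from itertools import permutations, product

def kendall_tau(a, b):
    """Adjacent SWAPs needed to turn line order a into line order b."""
    rank = {q: i for i, q in enumerate(b)}
    seq = [rank[q] for q in a]
    return sum(1 for i in range(len(seq))
               for j in range(i + 1, len(seq)) if seq[i] > seq[j])

def naive_local(n, gates):
    """Enumerate one line order per gate; return (min cost, orders)."""
    orders = list(permutations(range(n)))
    best = None
    for choice in product(orders, repeat=len(gates)):       # step 1
        # step 2: keep only choices making every gate adjacent
        if any(abs(o.index(c) - o.index(t)) != 1
               for o, (c, t) in zip(choice, gates)):
            continue
        cost = sum(kendall_tau(a, b) for a, b in zip(choice, choice[1:]))
        if best is None or cost < best[0]:                   # step 3
            best = (cost, choice)
    return best
```

With (|Q|!)^|G| candidate combinations, this enumeration is only meant to make the size of the search space tangible.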

This requires checking all possible permutations for all gates of the circuit, i.e. (|Q|!)^|G| different combinations in total. Note that (|Q|!)^|G2| combinations are sufficient, since the remaining permutations are the identity. Although it was shown in [34] that it is sufficient to only consider the permutations between the respective control and target lines of each gate, the complexity remains exponential. Naive schemes as sketched here or discussed in [34] are infeasible due to their enumerative nature and the complexity of the problem. Instead of naively enumerating all possible permutations, we propose an alternative approach which formulates the considered question as a problem of Boolean satisfiability. In addition to that, the costs of the respective permutations are incorporated in an objective function to be minimized. This results in a formulation that can be passed to a solver for pseudo-Boolean optimization, i.e. an efficient solving algorithm which, instead of simply traversing the complete space of assignments, applies intelligent decision heuristics, powerful learning schemes, and efficient implication methods (see e.g. [19, 27]). By exploiting the deductive power of state-of-the-art PBO solvers, a solution can be determined efficiently. From the solution of this PBO instance, the minimal SWAP insertions can be derived.

5.2.2.2 Implementation

In order to encode the considered problem, we distinguish between the lines of acircuit C (denoted by l0, . . . , ln−1) and their corresponding qubits (denoted by q0,. . . , qn−1). Initially, each qubit corresponds to the circuit line with the same index,i.e. qi corresponds to li for all 0 ≤ i < n. Then, before each gate, we allow anarbitrary permutation (including the identity) which may lead to different mappingsof qubits to circuit lines. Following this, Boolean variables are introduced to thePBO encoding representing which qubit currently corresponds to which line.

Definition 5.1. Let C = (Q, G) be a quantum circuit. Then, variables $\vec{x}^k_i = (x^k_{i0}\, x^k_{i1} \dots x^k_{i\,n-1})$, 0 ≤ k < |G|, 0 ≤ i < |Q|, are introduced representing which qubit corresponds to circuit line li initially (for k = 0) and before gate gk (for 1 ≤ k < |G|). More precisely, a variable $x^k_{ij}$ states whether qubit qj corresponds to the circuit line li ($x^k_{ij} = 1$) or not ($x^k_{ij} = 0$).

Example 5.3. Consider the circuit shown in Fig. 3.2(a) which serves as the running example throughout the remainder of this section. Fig. 5.2 sketches the resulting PBO encoding. The π-blocks denote the positions at which we allow an arbitrary permutation of circuit lines. This leads to new qubit mappings which are represented by the corresponding $\vec{x}^k_i$-variables in Fig. 5.2. Here, e.g. the assignment $x^2_{21} = 1$ states that, before gate g2, the qubit q1 corresponds to circuit line l2.

Obviously, these mappings cannot be chosen arbitrarily. In fact, each circuit line must correspond to exactly one qubit and each qubit must correspond to exactly one circuit line. In order to ensure this, the following consistency-constraint is added to the PBO instance:

\[
\bigwedge_{k=0}^{|G|-1} \Bigl( \bigwedge_{i=0}^{n-1} \Bigl( \sum_{j=0}^{n-1} x^k_{ij} = 1 \Bigr) \;\wedge\; \bigwedge_{i=0}^{n-1} \Bigl( \sum_{j=0}^{n-1} x^k_{ji} = 1 \Bigr) \Bigr)
\]

The left part of this constraint states that, for each permutation position k ($0 \le k < |G|$) and for each circuit line $l_i$, the sum $x^k_{i0} + x^k_{i1} + \dots + x^k_{i\,n-1}$ is fixed to 1, i.e. exactly one qubit corresponds to each circuit line. The right part of this constraint states that, for each permutation position k ($0 \le k < |G|$) and for each qubit $q_i$, the sum $x^k_{0i} + x^k_{1i} + \dots + x^k_{n-1\,i}$ is fixed to 1, i.e. exactly one circuit line corresponds to each qubit.

Example 5.4. The bottom left of Fig. 5.2 sketches the consistency constraint for the example from Fig. 3.2(a).
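The consistency-constraint can be illustrated for a single permutation position k. The following Python sketch is only an illustration under assumptions: the function name `is_consistent` and the nested-list representation of the $x^k_{ij}$-variables are hypothetical and not part of the implementation on top of RevKit. It checks that every row sum and every column sum of the 0/1 matrix equals 1.

```python
def is_consistent(x):
    """x[i][j] == 1 iff qubit q_j corresponds to circuit line l_i (one fixed k)."""
    n = len(x)
    # Left part of the constraint: exactly one qubit per circuit line.
    rows_ok = all(sum(x[i][j] for j in range(n)) == 1 for i in range(n))
    # Right part of the constraint: exactly one circuit line per qubit.
    cols_ok = all(sum(x[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok


# Hypothetical assignments for a circuit with three lines.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
broken = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]  # q0 on two lines, q1 on none

print(is_consistent(identity), is_consistent(broken))
```

In the PBO instance, these checks are not evaluated after the fact but imposed as constraints, so that the solver only considers assignments for which they hold.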

Next, we want to ensure that only permutations are applied which satisfy the nearest neighbor condition on all functional gates. As we know the control and target qubits of each gate, this can be enforced through the $x^k_i$-variables and the following adjacency-constraint:

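\[
\bigwedge_{g_k(q_c,q_t) \in G} \left( \bigvee_{m=0}^{n-2} \bigl( x^k_{mc} \wedge x^k_{(m+1)t} \bigr) \;\vee\; \bigvee_{m=0}^{n-2} \bigl( x^k_{mt} \wedge x^k_{(m+1)c} \bigr) \right)
\]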

This constraint considers all gates $g_k(q_c, q_t)$ from the gate set G of a given circuit with control qubit $q_c$ and target qubit $q_t$. For each of these gates, a mapping of qubits to circuit lines is required so that either

• the control qubit $q_c$ corresponds to a circuit line $l_m$ and the target qubit $q_t$ corresponds to the directly succeeding circuit line $l_{m+1}$ (left part of the constraint) or

• the target qubit $q_t$ corresponds to a circuit line $l_m$ and the control qubit $q_c$ corresponds to the directly succeeding circuit line $l_{m+1}$ (right part of the constraint).

That is, one of the possible adjacencies between the respective qubits of these gates has to be established.

Example 5.5. Consider again the sketch of the encoding shown in Fig. 5.2. The gate blocks represent the qubits which must be adjacent (derived from Fig. 3.2(a)). Based on that, Fig. 5.2 shows, as an example, the resulting adjacency-constraint for gate $g_1$ with control qubit $q_2$ and target qubit $q_0$.
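As a minimal sketch of the adjacency-constraint for one gate, the following Python fragment evaluates the disjunction over all pairs of neighboring lines $(l_m, l_{m+1})$. The helper names `mapping_matrix` and `is_adjacent` and the two line orders are assumptions made for illustration only; they are not taken from Fig. 5.2.

```python
def mapping_matrix(order):
    """Build x with x[i][j] == 1 iff qubit q_j sits on line l_i (order[i] == j)."""
    n = len(order)
    return [[1 if order[i] == j else 0 for j in range(n)] for i in range(n)]


def is_adjacent(x, c, t):
    """True iff control qubit q_c and target qubit q_t occupy neighboring lines."""
    n = len(x)
    return any(
        (x[m][c] and x[m + 1][t]) or (x[m][t] and x[m + 1][c])
        for m in range(n - 1)  # m + 1 must remain a valid line index
    )


# Gate with control q2 and target q0 on a circuit with four lines.
print(is_adjacent(mapping_matrix([0, 1, 2, 3]), 2, 0))  # q1 lies between them
print(is_adjacent(mapping_matrix([0, 2, 1, 3]), 2, 0))  # q0 and q2 are neighbors
```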

Finally, the permutation of circuit lines chosen at each position has to be extracted and the corresponding costs for creating it have to be linked to the objective function of the PBO instance. Again, the $x^k_i$-variables can be exploited for this purpose. Based on them, it can be derived which permutation is applied before gate $g_k$ in order to change the previous circuit line order. Further free Boolean variables (denoted by $s^k_\pi$) are utilized to store whether a corresponding permutation π is applied. This is expressed by a permutation-constraint as follows:

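\[
\bigwedge_{k=1}^{|G|-1} \left( \bigwedge_{\pi \in \Pi} \Bigl( \bigwedge_{i=0}^{n-1} x^{k-1}_i = x^k_{\pi(i)} \Bigr) \Leftrightarrow s^k_\pi \right)
\]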


This constraint considers all possible permutations (denoted by Π) for each position k. If the assignments of $x^{k-1}_i$ and $x^k_i$ establish a particular permutation π ∈ Π, then the respective variable $s^k_\pi$ is set to 1 (encoded through ⇔). This states that this particular permutation π has been chosen before gate $g_k$ and, hence, the corresponding costs for it have to be considered. This is eventually incorporated in the objective function

\[
\min \Bigl( \sum_{k=1}^{|G|-1} \sum_{\pi \in \Pi} c_\pi \, s^k_\pi \Bigr),
\]

where $c_\pi$ denotes the costs (in terms of adjacent SWAP gates) for creating a permutation π using the methods described in Section 5.2.2.1.

Example 5.6. The permutation-constraint and the objective function for the running example from Fig. 3.2(a) are sketched at the bottom right of Fig. 5.2. In particular, the constraints for permutation π = (2, 3, 1, 0) and k = 2 are shown. As already discussed in Example 2.3, creating this permutation requires 5 SWAP gates. Accordingly, costs of 5 are assumed in the objective function for this particular permutation. Furthermore, as it is assumed that circuit lines before gate $g_1$ can arbitrarily be permuted with no additional costs, all variables $s^1_\pi$ (π ∈ Π) are not part of the objective function.
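The cost of 5 quoted in Example 5.6 can be reproduced by counting inversions, since on a linear architecture the minimal number of adjacent SWAP gates for a permutation equals its number of inversions. The Python sketch below uses a simple quadratic-time count; the function name `swap_cost` is hypothetical and the routine is an illustration, not the cost procedure of Section 5.2.2.1.

```python
def swap_cost(pi):
    """Number of inversions of pi, i.e. pairs (a, b) with a < b and pi[a] > pi[b]."""
    n = len(pi)
    return sum(1 for a in range(n) for b in range(a + 1, n) if pi[a] > pi[b])


print(swap_cost((2, 3, 1, 0)))  # permutation from Example 5.6
print(swap_cost((0, 1, 2, 3)))  # identity permutation
```

Each coefficient $c_\pi$ of the objective function can be obtained in this way before the PBO instance is generated.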

Combining all these constraints, a PBO instance results which is satisfiable for all permutations of circuit lines that lead to a nearest neighbor compliant circuit. The precise permutation to be created at position k can thereby be derived from the assignment to the $s^k_\pi$-variables. If $s^k_\pi$ has been assigned 1 by the PBO solver, the permutation π has to be created before gate $g_k$. By additionally optimizing the objective function, the PBO solver ensures a minimal number of SWAP gates.

Example 5.7. Passing the PBO encoding presented above to a PBO solver, an optimal assignment with $s^0_{\pi_e}$, $s^1_{\pi_e}$, $s^2_{(0213)}$, $s^3_{\pi_e}$, $s^4_{\pi_e}$ set to 1 results ($\pi_e$ represents the identity permutation). From that, the SWAP insertion depicted in Fig. 3.2(c) results. This represents an optimal solution to the SWAP insertion problem for the circuit given in Fig. 3.2(a).
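For very small circuits, the optimum defined by the encoding can be cross-checked without a PBO solver. The following Python sketch is explicitly not the PBO approach of this section: it swaps in a plain dynamic program over all line orders, which is only feasible for a handful of lines. The names `kendall_tau` and `min_swaps` and the three-gate example are assumptions for illustration. Gates are given as (control, target) pairs of qubit indices, and the initial line order is free of charge, as assumed above.

```python
from itertools import permutations


def kendall_tau(a, b):
    """Minimal number of adjacent SWAPs turning line order a into line order b."""
    pos = {q: i for i, q in enumerate(b)}
    seq = [pos[q] for q in a]
    return sum(1 for i in range(len(seq))
               for j in range(i + 1, len(seq)) if seq[i] > seq[j])


def min_swaps(n, gates):
    """Exhaustive optimum of the local reordering scheme for n lines."""
    orders = list(permutations(range(n)))

    def compliant(order, gate):
        c, t = gate
        return abs(order.index(c) - order.index(t)) == 1

    # Any line order that makes the first gate adjacent costs nothing.
    best = {o: 0 for o in orders if compliant(o, gates[0])}
    for gate in gates[1:]:
        best = {
            o: min(cost + kendall_tau(p, o) for p, cost in best.items())
            for o in orders if compliant(o, gate)
        }
    return min(best.values())


# Hypothetical 3-qubit circuit: no single line order makes all gates adjacent.
print(min_swaps(3, [(0, 1), (1, 2), (0, 2)]))
```

The table `best` grows with n!, which mirrors the search space discussed in Section 5.2.2.1 and is the reason why the PBO formulation is used instead.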

5.3 Experimental Evaluation

The exact approaches introduced in Sections 5.1.2 and 5.2.2 for the global reordering and the local reordering scheme, respectively, have been implemented in C++ on top of RevKit [74] and evaluated against their heuristic counterparts. For the latter approach (based on the PBO formulation), clasp [27] has been utilized as solving engine. As benchmarks, quantum circuits from RevLib [86] as well as circuits previously considered in [69] have been used. All evaluations have been conducted on an Intel E6700 Core2 CPU with 2.7 GHz and 4 GB of memory.

In this section, the results of our evaluations are summarized and discussed. The discussion is divided into two parts. First, the exact results of both schemes are compared to each other. Afterwards, the obtained minimal solutions are checked against results obtained by heuristic approaches. While the first part allows for an evaluation of the general performance of the global and local reordering scheme, the second part provides insight into the quality of existing heuristic solutions.

5.3.1 Evaluation of the Exact Approaches

Following the global reordering scheme obviously considers just a fraction of the possible optimization potential and, because of this, only yields sub-optimal results. On the other hand, it is significantly less complex than the local scheme. Hence, an evaluation of the trade-off between both schemes with respect to the resulting quality as well as the required run-time is of high interest. Exact approaches as introduced in Section 5.1.2 allow such an evaluation as they conduct a complete consideration of the respective search spaces and provide the best possible results that can be achieved with each scheme.

Corresponding results are summarized in Table 5.1. The first column denotes the name of the considered benchmarks followed by the number n of circuit lines and the number |G| of controlled gates. As unary gates are inherently nearest neighbor compliant, they do not have to be considered for SWAP gate insertion and, thus, are ignored. Afterwards, numbers obtained by the two schemes are listed. More precisely, the number of determined SWAP gates as well as the required run-time (in CPU seconds) are provided for both exact reordering schemes (for the local reordering scheme, the complexity as discussed in Section 5.2.2.1 is additionally provided). The last column provides the percentage improvement of the local reordering scheme compared to the global reordering scheme.
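The two derived columns of Table 5.1 can be recomputed from the raw numbers. The Python sketch below assumes that the improvement is the relative reduction of SWAP gates with respect to the global scheme and that the complexity is $n!^{|G|}$; both readings are consistent with the rows of the table, and the function names are hypothetical.

```python
from math import factorial


def improvement(global_swaps, local_swaps):
    """Relative reduction (in percent) of the local over the global scheme."""
    return 100.0 * (global_swaps - local_swaps) / global_swaps


def search_space(n, num_gates):
    """Number of permutation sequences, one of n! permutations per gate."""
    return factorial(n) ** num_gates


print(improvement(8, 3))               # fredkin_6
print(improvement(20, 6))              # QFT_QFT5
print(improvement(80, 100))            # mod10_176 (aborted run, negative)
print(search_space(3, 4))              # peres_8
print(f"{search_space(5, 16):.1e}")    # 4gt13-v1_93
```

A negative value indicates that the aborted local run returned more SWAP gates than the exact global scheme.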

Note that the upper part of Table 5.1 lists all benchmarks for which exact results have indeed been determined within the time limit of 14000 CPU seconds. In contrast, the lower part of Table 5.1 additionally lists benchmarks for which the local reordering scheme did not terminate within this time limit. In these cases, the utilized PBO solver simply returns the best result which has been determined thus far. While an actual minimal result is not guaranteed for these cases, often rather small values are obtained, which is why these numbers are additionally considered in this evaluation.

Having these results, several conclusions can be drawn:

• The results confirm the deductive power of the applied solving engine. In fact, circuits composed of up to 5 circuit lines or up to 17 non-unary quantum gates can be handled even with the local scheme. While this might sound small at first glance, it leads to a significant complexity to be tackled. In the most complex case (i.e. for 4gt13-v1_93), $1.9 \cdot 10^{33}$ possible permutations have been considered. This scalability is comparable to exact approaches aiming for other objectives, e.g. exact Toffoli gate synthesis as proposed in [28].

• It can be observed that the performance differs depending on the respective circuit. For example, 4gt11_84 and 4mod5-v1_25 have about the same complexity. However, an optimal SWAP gate insertion can be determined for 4gt11_84 two orders of magnitude faster than for 4mod5-v1_25.

• The differences between the global and the local reordering scheme are clearly unveiled. As shown by the results in Table 5.1, the number of SWAP gates can be decreased significantly if the local scheme is applied. In the best case, a quality difference of 70% is observed; for many cases the number of SWAP gates can be reduced by half. In contrast, the run-time of the local approach is considerably higher. While the global scheme terminates in less than a second for the majority of the benchmarks, local reordering scales rather poorly. In the cases listed at the bottom of Table 5.1, only heuristic results were obtained using the local scheme within the time limit. Nevertheless, even the non-minimal results obtained here are often quite close to the minimal value of the global scheme; in some cases, e.g. hwb4_52, 4gt5_76, one-two-three-v0_98, or QFT_6, even better results can be achieved.

For the first time, these observations experimentally confirm the respective properties of both schemes with respect to efficiency and quality.

Table 5.1: Evaluation of the exact approaches

                                 Global reordering Local reordering
Benchmark            n   |G|     Swaps    Time     n!^|G|        Swaps  Time      Impr.

3_17_13              3   13      4        0.1      1.4·10^10     2      0.1       50%
3_17_14              3   13      4        0.1      1.4·10^10     4      0.1       0%
3_17_15              3   9       2        0.1      1.1·10^7      2      630.2     0%
ex-1_166             3   7       2        0.1      2.8·10^5      2      0.1       0%
fredkin_5            3   7       2        0.1      2.8·10^5      1      0.1       50%
fredkin_6            3   15      8        0.1      4.7·10^11     3      4.6       62.5%
ham3_102             3   9       2        0.1      1.1·10^7      2      0.1       0%
miller_11            3   17      8        0.1      1.7·10^13     3      0.1       62.5%
miller_12            3   8       4        0.1      1.7·10^6      2      745.6     50%
peres_8              3   4       2        0.1      1.3·10^3      1      0.1       50%
peres_9              3   6       2        0.1      4.7·10^4      1      2463.3    50%
peres_10             3   4       2        0.1      1.3·10^3      1      0.1       50%
toffoli_1            3   5       2        0.1      7.8·10^3      1      0.1       50%
toffoli_2            3   5       2        0.1      7.8·10^3      1      0.2       50%
decod24-v0_38        4   17      6        0.1      3·10^23       4      19.2      33.3%
decod24-v0_39        4   15      6        0.1      5.1·10^20     5      0.5       16.6%
decod24-v1_42        4   8       2        0.1      1.1·10^11     2      7.7       0%
decod24-v2_43        4   16      6        0.1      1.3·10^22     3      0.5       50%
decod24-v3_46        4   9       4        0.1      2.7·10^12     3      0.1       25%
rd32-v0_66           4   12      8        0.1      3.7·10^16     4      0.4       50%
rd32-v0_67           4   8       4        1.1      1.1·10^11     2      1.6       50%
rd32-v1_68           4   12      6        0.1      3.7·10^16     4      0.4       33.3%
rd32-v1_69           4   8       2        0.1      1.1·10^11     2      0.1       0%
toffoli_double_3     4   7       2        0.1      4.6·10^9      1      0.9       50%
toffoli_double_4     4   10      8        0.1      6.4·10^13     3      200       62.5%
4gt11_83             5   12      10       0.1      9·10^24       6      9         40%
4gt11_84             5   7       2        0.1      3.6·10^14     1      16.6      50%
4gt13-v1_93          5   16      6        0.1      1.9·10^33     6      489.3     0%
4mod5-v0_19          5   12      6        0.1      9·10^24       6      55.3      0%
4mod5-v0_20          5   8       2        0.1      4.3·10^16     2      45.5      0%
4mod5-v1_22          5   9       4        0.1      5.2·10^18     3      548.8     25%
4mod5-v1_25          5   7       2        0.1      3.6·10^14     1      11705.3   50%
QFT_QFT5             5   10      20       0.1      6.2·10^20     6      1.6       70%

Aborted after timeout (local scheme only; i.e. suboptimal results for the local reordering scheme)

hwb4_52              4   23      18       0.1      5.6·10^31     12     -         33.3%
mod10_176            5   56      80       0.1      2.8·10^116    100    -         -25%
hwb5_55              5   106     114      0.1      2.5·10^220    140    -         -22.8%
aj-e11_168           5   36      50       0.1      7.1·10^74     61     -         -22%
mod10_171            5   78      116      0.1      1.5·10^162    140    -         -20.6%
one-two-three-v0_97  5   92      120      0.1      2·10^191      141    -         -17.5%
hwb4_49              5   79      120      0.1      1.8·10^164    133    -         -10.8%
hwb4_50              5   77      122      0.1      1.3·10^160    128    -         -4.9%
4mod5-v1_23          5   24      30       0.1      8·10^49       31     -         -3.3%
sf_275               5   60      68       0.1      5.7·10^124    69     -         -1.4%
one-two-three-v0_98  5   47      66       0.1      5.3·10^97     59     -         10.6%
4gt5_76              5   36      40       0.1      7.1·10^74     35     -         12.5%
4gt4-v0_78           6   81      124      0.2      2.8·10^231    154    -         -24.1%
4gt4-v1_74           6   85      132      0.2      7.5·10^242    163    -         -23.4%
alu-v2_30            6   161     218      0.2      1.1·10^460    242    -         -11%
mod8-10_177          6   142     222      0.1      5.5·10^405    236    -         -6.3%
hwb6_58              6   146     290      0.1      (illegible)   280    -         3.4%
4gt4-v0_72           6   81      130      0.2      2.8·10^231    123    -         5.3%
4gt4-v0_73           6   131     164      0.2      2.1·10^374    146    -         10.9%
hwb5_53              6   434     800      0.2
ham7_104             7   87      140      1.9
rd53_135             7   78      136      1.8
QFT_QFT8             8   28      112      20
urf2_152             8   25150   71280    22
QFT_QFT9             9   36      168      236.5
urf1_149             9   57770   179832   241.3
urf5_158             9   51380   176284   247
QFT_QFT10            10  45      240      2936.8
rd73_140             10  76      150      1579.4
Shor3                10  2076    4802     1846.2
sym9_148             10  4452    10984    2415.12
sys6-v0_144          10  62      114      1586.4
urf3_155             10  132340  453368   3023.6

Benchmark: Name of the benchmark

n: Number of lines

|G|: Number of controlled gates

n!|G|: Search space complexity of the local reordering scheme

Swaps: Determined number of Swap gates

Time: Run-time (in CPU seconds)

Impr: Improvement of the local scheme compared to the global scheme


Table 5.2: Comparison to heuristic approaches

                Global reordering            Local reordering
                Swaps              Impr.     Swaps                    Impr.
Benchmark       [64]     Exact     (%)       [64]    [69]    Exact    wrt. [64]  wrt. [69]

3_17_13         6        6         0.00      5       4       2        60%        50%
4_49_17         32       28        12.50     16      12      -        -          -
4gt10-v1_81     74       32        56.76     29      20      -        -          -
4gt11_84        6        2         66.67     5       1       1        80%        0%
4gt13-v1_93     20       6         70.00     10      6       6        40%        0%
4gt5_75         40       22        45.00     20      12      -        -          -
4mod5-v1_23     30       30        0.00      17      9       -        -          -
4mod7-v0_95     68       44        35.29     30      21      -        -          -
alu-v4_36       70       32        54.29     20      18      -        -          -
decod24-v3_46   8        4         50.00     4       3       3        25%        0%
ham7_104        162      140       13.58     86      68      -        -          -
hwb4_52         18       18        0.00      13      10      -        -          -
hwb5_55         146      114       21.92     86      63      -        -          -
hwb6_58         316      290       8.23      140     118     -        -          -
mod5adder_128   148      108       27.03     79      51      -        -          -
QFT5            -        -         -         8       6       6        25%        0%
rd32-v0_67      4        4         0.00      4       2       2        50%        0%
rd53_135        194      136       29.90     87      66      -        -          -
rd73_140        190      150       21.05     61      56      -        -          -
sym9_148        13656    10984     19.57     5353    3415    -        -          -
sys6-v0_144     116      114       1.72      60      59      -        -          -
urf1_149        200804   179832    10.44     62019   44072   -        -          -
urf2_152        83152    71280     14.28     23607   17670   -        -          -
urf3_155        491356   453368    7.73      -       -       -        -          -
urf5_158        206288   176284    14.54     54038   39309   -        -          -

Benchmark: Name of the benchmark

Swaps: Determined number of Swap gates using the respective approaches

Impr: Improvement of the exact approach compared to the respective heuristic approach(es)

5.3.2 Comparison to Previous Work

As discussed in the sections above, most of the existing approaches for nearest neighbor optimization relied on heuristic schemes. While they usually perform significantly better with respect to run-time and scalability, they do not guarantee minimality. Moreover, due to the absence of exact approaches, the quality of these approaches with respect to the number of SWAP gates remained unclear. Using the exact approaches presented in this work, a corresponding evaluation becomes possible. Accordingly, respective comparisons have been conducted in our experimental evaluation. For this purpose, recently proposed (heuristic) optimization approaches have been considered, namely re-implementations of the global reordering method from [64] and the local reordering method from [64] as well as the state-of-the-art local reordering method from [69] (all these methods are briefly reviewed in Sections 5.1.1 and 5.2.1, respectively).

The respective comparisons are summarized in Table 5.2. Again, the first column denotes the name of the benchmark while columns headed by Swaps provide the number of SWAP gates determined using the respective heuristic and exact approaches. The improvement of the exact approach compared to the respective heuristic approaches is given in the columns headed by Impr.

The results unveil that existing heuristics following the global reordering scheme do not perform particularly well. In fact, the number of SWAP gates can be improved significantly in almost all cases. For some benchmarks, absolute reductions of up to several thousand SWAP gates can be achieved. Here, further room for improving heuristic solutions exists. In contrast, heuristics following the local reordering scheme already seem to be very good with respect to their quality. Although the number of available exact results is significantly smaller (due to the larger complexity), we were able to prove that previously proposed (heuristic) approaches already determined the minimal number of SWAP gates in a couple of cases. In particular, the approach recently presented in [69] is convincing. Hence, existing solutions following the local reordering scheme seem to already exploit their potential.

5.4 Conclusions

In this chapter, we discussed algorithms for post-synthesis optimization for nearest neighbor architectures, addressing global reordering as well as local reordering. Since local reordering is the more challenging problem, we reviewed several algorithms presented in the past, none of which constitutes a satisfactory exact approach. We then presented an approach that utilizes the power of solvers for pseudo-Boolean optimization to obtain optimal results with acceptable run-time.


Chapter 6

Optimization for Multi-Dimensional Nearest Neighbor Architectures

So far, quantum circuits have been considered whose qubits are arranged in a one-dimensional (i.e. linear) fashion where the qubits are placed next to each other in a row. Fig. 6.1(a) shows an example of such a circuit. However, recent technological developments (e.g. [35, 40, 55]) also lead to the consideration of 2-dimensional arrangements where qubits are placed according to a grid structure. In this case, the circuit from Fig. 6.1(a) would be realized as sketched in Fig. 6.1(b). Such arrangements can accordingly be extended to higher dimensions, eventually leading to multi-dimensional quantum circuits.

In a similar fashion to the one-dimensional case, the reordering strategies can be applied to 2D circuits as well. So far, the majority of algorithms for the two schemes focused on 1D quantum circuits and applied strategies such as the reordering of qubit positions [64], window-based heuristics [69], or mapping the problem to a corresponding graph arrangement problem [34]. In contrast, nearest neighbor optimization for multi-dimensional quantum circuits is just at the beginning: Although manually derived nearest neighbor compliant 2D realizations for certain building blocks such as an adder have been presented, e.g. in [14], automatic approaches for SWAP gate insertion are hardly available yet. To the best of our knowledge, only the approach recently proposed in [70] exists. This, however, only generates heuristic solutions. The approach is reviewed in the following section. In [43], we presented an exact approach, because no exact approach and, hence, no exact results on 2D and multi-dimensional nearest neighbor quantum circuits existed so far. This is crucial since clear results, e.g. on the number of needed SWAP gates or the best possible qubit arrangement, cannot be derived from heuristic results.

Motivated by this, we consider the following problem in this chapter: How to determine the minimal number of SWAP gates to be inserted to make a 2D or even a generic, i.e. multi-dimensional, quantum circuit nearest neighbor-compliant.

Figure 6.1: Multi-dimensional quantum circuits ((a) 1D circuit, (b) 2D circuit)

6.1 A Heuristic Approach

Here, we review the approach presented in [70]. The approach consists of two steps: First, an exact global reordering approach is applied using a Mixed Integer Programming (MIP) formulation. Afterwards, the gates are made nearest neighbor-compliant using a heuristic.

1. MIP 2D Global Reordering:

The problem is formulated as the following MIP instance. For this purpose, an interaction graph is used. Binary variables $x_{ij}$ are used to represent the assignment of $qubit_i$ to $position_j$ on the 2D grid. The weight between $qubit_i$ and $qubit_k$ in the interaction graph is denoted by $w_{ik}$. Further, $dist_{jl}$ represents the Manhattan distance between two positions, i.e. $position_j$ and $position_l$. The resulting cost of assigning $qubit_i$ to $position_j$ and $qubit_k$ to $position_l$ (i.e. $x_{ij}$ and $x_{kl}$) is expressed as $c_{ijkl} = w_{ik} \times dist_{jl}$. The objective is

\[
\min \sum_{i=1}^{|Q|} \sum_{j=1}^{|Q|} \sum_{k=1}^{|Q|} \sum_{l=1}^{|Q|} c_{ijkl} \, x_{ij} \, x_{kl}.
\]

In addition, consistency constraints and a restriction to Boolean values (because MIP allows integer values for the variables) are formulated:

\[
\sum_{j=1}^{|Q|} x_{ij} = 1, \quad 1 \le i \le |Q|
\]
\[
\sum_{i=1}^{|Q|} x_{ij} = 1, \quad 1 \le j \le |Q|
\]
\[
x_{ij} \in \mathbb{B}, \quad 1 \le i, j \le |Q|
\]

In the implementation, the objective function is optimized such that $|Q|^2$ Boolean variables (x's), $|Q|^2$ internal real variables and $|Q|^2 + 2|Q|$ constraints are used.


2. Heuristic 2D Local Reordering:

After the global reordering, the corresponding control qubits are routed towards the target qubit (first along the x-axis, then along the y-axis) by inserting SWAP gates.
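The objective of the first step (global reordering) can be illustrated on a tiny instance. The Python sketch below does not reproduce the MIP formulation of [70]; it swaps in a brute-force enumeration of all placements and evaluates the same quadratic cost $\sum w_{ik} \cdot dist$ for each of them. The function names, the row-major numbering of grid positions, and the example weights are assumptions for illustration, and the sketch assumes as many qubits as grid positions.

```python
from itertools import permutations


def manhattan(a, b):
    """Manhattan distance between two grid positions given as (row, column)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def best_placement(weights, rows, cols):
    """Enumerate all assignments of qubits to positions and keep the cheapest."""
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    n = len(weights)
    if n != len(positions):
        raise ValueError("sketch assumes one qubit per grid position")
    best_cost, best_assign = None, None
    for assign in permutations(range(n)):  # assign[i]: position index of qubit i
        cost = sum(
            weights[i][k] * manhattan(positions[assign[i]], positions[assign[k]])
            for i in range(n) for k in range(n)
        )
        if best_cost is None or cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


# Hypothetical interaction graph: q0 interacts with q1, and q2 with q3.
weights = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
cost, placement = best_placement(weights, 2, 2)
print(cost, placement)
```

As in the double sum of the objective, every interacting pair is counted in both directions. The enumeration visits |Q|! placements, which is why a MIP solver is used in [70] instead.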

6.2 An Exact Approach

Determining the (minimal) number of SWAP gate insertions in 1D quantum circuits essentially amounts to deciding where and how many SWAP gates have to be added to an existing circuit structure. Considering multi-dimensional quantum circuits, further issues need to be addressed. For example, the precise configuration of the circuit is not fixed: A 1D circuit with e.g. 7 qubits always has those seven qubits arranged next to each other; in a 2D quantum circuit, they may be arranged on a 2 × 4 or even a 3 × 3 grid. Multi-dimensional quantum circuits allow for many further possibilities. This obviously has an effect on which qubits are adjacent and which are not. Moreover, even the number of SWAP gates needed to establish a certain qubit permutation on the circuit (e.g. in order to make a non-adjacent gate nearest neighbor-compliant) significantly depends on this. Determining this number is a non-trivial task.

In order to address all these issues, we propose a solution composed of the following steps:

1. Determine the precise configuration of the considered quantum circuit, i.e. its dimensions for a given number of qubits.

2. Determine a cost function providing the minimal number of SWAP gates needed to realize an arbitrary permutation on the given circuit configuration.

3. Determine the minimal SWAP gate insertions based on the given circuit configuration and its cost function.

6.3 Implementation

This section describes the proposed solution and the procedures applied in each of the steps mentioned above in detail. For the sake of clarity, all issues are mostly discussed and illustrated by means of 2D quantum circuits. However, all concepts can accordingly be extended to multi-dimensional quantum circuits.

6.3.1 Determine the Precise Quantum Circuit Configuration

For a given number of qubits, multi-dimensional quantum circuits allow for several possible configurations to be considered. The respective choice has a significant effect on the number of needed SWAP gates but also on the number of garbage, i.e. unused, qubit positions.

Example 6.1. Consider a quantum circuit over five qubits $q_0, \dots, q_4$ to be realized in a 2D architecture and with interactions between $q_0$ and each of $q_1, \dots, q_4$. The smallest possible 2D configuration would require a 2 × 3 grid¹¹. As sketched in Fig. 6.2(a), the best possible qubit placement would indeed keep the number of garbage positions minimal (just one), but requires an additional SWAP gate for $q_0$ and $q_4$. In contrast, a 3 × 3 grid would allow for a qubit placement with no need for any additional SWAP gates but eventually leads to four garbage qubit positions (as sketched in Fig. 6.2(b)).

Figure 6.2: Determine the precise quantum circuit configuration ((a) 2 × 3 grid, (b) 3 × 3 grid)

In general, researchers and designers aim for keeping the number of garbage qubits as small as possible [87]. At the same time, efforts have also been made to explicitly exploit those [51, 92]. Multi-dimensional nearest neighbor quantum circuits provide another argument for being more flexible with the “keep the number of garbage qubits as small as possible” design rule. Eventually, the designer has to trade off the respective criteria. In the following, we aim for determining the minimal number of SWAP gates for quantum circuit configurations with the minimal number of garbage qubit positions. However, the approach is also applicable to configurations which are larger than necessary.

6.3.2 Costs of Establishing an Arbitrary Permutation

In order to globally determine the minimal number of SWAP gates needed to make an arbitrary quantum circuit nearest neighbor-compliant, it has to be known how many SWAP gates are required to establish an arbitrary permutation of qubit positions in that circuit. For 1D quantum circuits, this can be obtained in linear time using inversion vectors [90]. For multi-dimensional circuits, this constitutes a more complex problem. The problem can be formulated by means of adjacent transposition graphs¹².

Definition 6.1. Let C = (Q, G) be a quantum circuit. An adjacent transposition graph A = (V, E) is a representation of all transpositions which can be realized by nearest neighbor-compliant SWAP gates. The set V of nodes represents all |V| = |Q|! possible permutations while edges represent valid transpositions from one permutation to another.

¹¹ In principle, a 1 × 5 grid may be considered the smallest possible configuration; however, it is considered a 1D configuration.

¹² Adjacent transposition graphs have previously also been applied in [48] for nearest neighbor optimization of 1D circuits. However, as said above, inversion vectors are the more efficient solution here.

Figure 6.3: Adjacent transposition graph

Example 6.2. Fig. 6.3 sketches a part of the transposition graph for a 2D quantum circuit over n = 4 qubits. Transposition graphs for multi-dimensional quantum circuits can be created by accordingly considering the possible transpositions and the resulting permutations.

Having such a representation, the minimal number of SWAP gates needed to establish an arbitrary permutation can be obtained by determining a minimal path from the node representing the identity permutation (denoted as $v_{src}$) to the node representing the desired permutation (denoted as $v_{dest}$). Inspired by [2], this can be formulated as a Pseudo-Boolean Optimization problem (PBO problem).

More precisely, for each node v ∈ V a Boolean variable $x_v$ is introduced representing whether v is included in the optimal path from the source to the destination node, i.e. $x_v = 1$ iff v is in the optimal path. In the same fashion, a Boolean variable $x_e$ is introduced for each edge e ∈ E representing whether e is included in the optimal path from the source to the destination node, i.e. $x_e = 1$ iff e is in the optimal path.

Then, it has to be constrained that (1) $v_{src}$ and $v_{dest}$ are part of the path, (2) one edge including $v_{src}$ and one including $v_{dest}$ have to be part of the path, and (3) if any other node $v \in V \setminus \{v_{src}, v_{dest}\}$ is part of the path (i.e. iff $x_v = 1$), then two edges incident to v (an incoming one and an outgoing one) have to be part of the path as well. Minimality of the path is ensured by enforcing the optimization function (4), i.e.

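\[
x_{v_{src}} \wedge x_{v_{dest}} \;\wedge \sum_{e \in \{(v_{src},\bullet) \in E\}} x_e = 1 \;\wedge \sum_{e \in \{(v_{dest},\bullet) \in E\}} x_e = 1 \;\wedge \bigwedge_{v \in V \setminus \{v_{src}, v_{dest}\}} \Bigl( x_v \Leftrightarrow \sum_{e \in \{(v,\bullet) \in E\}} x_e = 2 \Bigr),
\]
\[
\min : \sum_{e \in E} x_e.
\]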

Passing this formulation to a state-of-the-art PBO solver [26], a minimal assignment to all $x_v$- and $x_e$-variables is derived. From the $x_e$-variables, the minimal transpositions and, by this, the minimal SWAP gate cascade realizing the desired permutation can be obtained. Using state-of-the-art PBO solvers enables the exploitation of intelligent decision heuristics, powerful learning schemes, and efficient implication methods and, hence, is much more efficient than simply traversing the complete space of assignments.
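For small grids, the shortest-path view of Definition 6.1 can be checked directly. The following Python sketch is not the PBO formulation above; it swaps in a breadth-first search over the adjacent transposition graph, which is generated on the fly. The function names and the row-major numbering of positions are assumptions for illustration. A permutation is given as a tuple whose entry at index p is the qubit placed at position p.

```python
from collections import deque


def grid_edges(rows, cols):
    """Pairs of neighboring positions in a rows x cols grid (row-major indices)."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            p = r * cols + c
            if c + 1 < cols:
                edges.append((p, p + 1))      # right neighbor
            if r + 1 < rows:
                edges.append((p, p + cols))   # lower neighbor
    return edges


def min_swaps_on_grid(target, edges):
    """Length of a shortest path from the identity permutation to target."""
    start = tuple(range(len(target)))
    target = tuple(target)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        for a, b in edges:  # each edge is one nearest neighbor SWAP
            nxt = list(cur)
            nxt[a], nxt[b] = nxt[b], nxt[a]
            nxt = tuple(nxt)
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return None


print(min_swaps_on_grid((2, 3, 1, 0), grid_edges(1, 4)))  # 1D line of 4 qubits
print(min_swaps_on_grid((3, 1, 2, 0), grid_edges(2, 2)))  # diagonal exchange, 2 x 2
```

On the 1 × 4 line, the result coincides with the inversion count used for 1D circuits. The search visits up to |Q|! nodes and therefore only serves as a cross-check for very small configurations.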

6.3.3 Optimal SWAP Gate Insertion

Finally, the actual determination of SWAP gates needed in order to make an arbitrary quantum circuit nearest neighbor-compliant is considered. For this purpose, (1) all possible permutations of qubit positions before each gate g ∈ G of the circuit and (2) the costs (in terms of adjacent SWAP gates) that would be needed in order to create these particular permutations have to be considered. How to calculate the second issue has already been covered in the previous subsection. For the first issue, again a PBO formulation is proposed.

Again, Boolean variables are introduced for this purpose. We distinguish between the positions in a circuit and the corresponding qubits. Before each gate, we allow an arbitrary permutation (including the identity) which may lead to different mappings of qubits to the respective positions.

Recall Definition 4.5: Let Q be a set of qubits. Define an n-dimensional circuit grid $(d_1, \dots, d_n) = D^n \in \mathbb{N}^n$, $d_i > 0$ for all i, satisfying $\prod_{i=1}^{n} d_i = |Q|$. As a result, every q ∈ Q can be assigned to a specific position in the grid. More specifically, we can define a bijection $m : Q \to D^n$.

Since there exist |Q|! different bijections on this n-orthotope, permutations can be defined. We denote the set of these permutations by $P^n_{|Q|}$.

Analogously, a cost function using these n-orthotope permutations can be defined which represents the number of inversions along the dimensions in order to calculate the difference between two permutations. From this, a distance function $c : D^n \times D^n \to \mathbb{N}$ for two positions can be derived.

Hence, Boolean variables $x^k_{ij}$ with $1 \le k \le |G|$, i enumerating all positions $p_i \in D^n$, and $1 \le j \le |Q|$ are introduced representing whether a qubit $q_j$ is assigned a position $p_i$ before gate $g_k$ ($x^k_{ij} = 1$) or not ($x^k_{ij} = 0$)¹³.

Example 6.3. Consider again the quantum circuit over five qubits $q_0, \dots, q_4$ to be realized in a 3 × 3 2D architecture as sketched in Fig. 6.2(b). Additionally assume that the positions $p_i \in P^2$ are enumerated from left to right and top to bottom, i.e. $p_0$ ($p_8$) represents the position at the top-left (bottom-right). This permutation to be established before a gate $g_k$ would be represented by the assignment $x^k_{40} = 1$ (qubit $q_0$ at position $p_4$), $x^k_{11} = 1$ (qubit $q_1$ at position $p_1$), $x^k_{32} = 1$ (qubit $q_2$ at position $p_3$), $x^k_{53} = 1$ (qubit $q_3$ at position $p_5$), and $x^k_{74} = 1$ (qubit $q_4$ at position $p_7$). All remaining $x^k_{ij}$-variables are assigned zero.

Obviously, these mappings cannot be chosen arbitrarily. In fact, each position must correspond to exactly one qubit and each qubit must correspond to exactly one position. In order to ensure this, the following constraint is added to the PBO instance:

¹³ Note that, in accordance with previous work (e.g. [64, 69, 90, 43]), we assume the primary input qubits can arbitrarily be permuted with no additional costs.

\[
\bigwedge_{k=1}^{|G|} \bigwedge_{p \in P^d} \Bigl( \sum_{p' \in P^d} x^k_{p'p} = 1 \Bigr) \wedge \Bigl( \sum_{p' \in P^d} x^k_{pp'} = 1 \Bigr)
\]

Next, it has to be ensured that only permutations are applied which satisfy the nearest neighbor condition on all functional gates. Since the control and target qubits of each elementary controlled gate (denoted by (c, t) with $c, t \in P^d$) are known, this can be enforced through the $x^k_i$-variables and the following constraint:

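\[
\bigwedge_{(c,t,k) \in G} \; \bigvee_{i=0}^{d-1} \; \bigvee_{p \in P^d \,\mid\, p_i < b_i - 1} \Bigl( \bigl( x^k_{pc} \wedge x^k_{(p+u_i)t} \bigr) \vee \bigl( x^k_{pt} \wedge x^k_{(p+u_i)c} \bigr) \Bigr)
\]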

where $u_i$ is the ith unit vector¹⁴. More precisely, this constraint enumerates all possible adjacent positions for the qubits c and t and eventually ORs them. By this, only assignments are valid which make qubit c and qubit t adjacent.
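The enumeration of adjacent positions via unit vectors can be sketched as follows. The Python fragment is an illustration under assumptions: positions are modeled as integer tuples, `bounds` plays the role of the sizes $b_i$ of the dimensions, and the function name `adjacent_pairs` is hypothetical.

```python
from itertools import product


def adjacent_pairs(bounds):
    """All pairs (p, p + u_i) with p_i < b_i - 1, for every dimension i."""
    pairs = []
    for p in product(*(range(b) for b in bounds)):
        for i, b in enumerate(bounds):
            if p[i] < b - 1:  # the neighbor in direction i is still on the grid
                q = p[:i] + (p[i] + 1,) + p[i + 1:]
                pairs.append((p, q))
    return pairs


print(len(adjacent_pairs((3, 3))))  # 3 x 3 grid
print(len(adjacent_pairs((2, 3))))  # 2 x 3 grid
print(adjacent_pairs((4,)))         # 1D line of four positions
```

Each pair contributes one disjunct of the constraint for a gate, once with c at p and t at p + u_i and once with the roles exchanged.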

Finally, the possible permutations of qubits at each position and the corresponding costs for creating such a permutation have to be formulated in the PBO instance. Again, the $x^k_i$-variables can be exploited for this purpose. Based on them, it can be derived which permutation is applied before gate $g_k$ in order to change the previous positioning. Further free Boolean variables (denoted by $s^k_\pi$) are utilized to store whether a corresponding permutation π is applied. This is expressed by the following constraint:

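\[
\bigwedge_{k=2}^{|G|} \bigwedge_{\pi \in \Pi} \Bigl( \bigwedge_{p \in P^d} \bigl( x^{k-1}_p = x^k_{\pi(p)} \bigr) \Leftrightarrow s^k_\pi \Bigr)
\]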

This constraint considers all possible permutations (denoted by Π) established before each gate $g_k$. If the assignments of $x^{k-1}_i$ and $x^k_i$ establish a particular permutation π ∈ Π, then the respective variable $s^k_\pi$ is set to 1 (encoded through ⇔). This states that this particular permutation π has been chosen before gate $g_k$ and, hence, the corresponding costs for it have to be considered. This is eventually incorporated in the objective function

\[
\min : \; \sum_{k=2}^{|G|} \sum_{\pi \in \Pi} c_{\pi} \, s^{k}_{\pi}
\]

where $c_\pi$ denotes the costs (in terms of adjacent SWAP gates) for creating a permutation π. These costs have been determined before as described in Section 6.3.2.

14 Adding the unit vector to a position, i.e. $p + u_i$, means incrementing the $i$th element of the tuple $p \in P^d$. The result is the next position in the respective direction of the selected dimension and is valid due to $p_i < b_i - 1$.

Combining all these constraints, a PBO instance results which is satisfiable for all permutations of qubits that lead to a nearest neighbor compliant circuit. The precise permutation to be created at position k can thereby be derived from the assignment to the $s^k_\pi$ variables. If $s^k_\pi$ has been assigned 1 by the PBO solver, the permutation π has to be created before gate $g_k$. By additionally optimizing the objective function, the PBO solver ensures a minimal number of SWAP gates.
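To make the objective concrete, the following minimal sketch computes the same optimum for very small instances by exhaustive dynamic programming rather than by a PBO solver. It is only an illustration, not the encoding used in this chapter: the function names, the row-major numbering of grid positions, and the representation of gates as (control, target) pairs are assumptions made for the example. As in footnote 13, the initial arrangement is free of charge.

```python
from collections import deque
from itertools import permutations


def grid_edges(rows, cols):
    """Pairs of adjacent positions on a rows x cols grid (row-major numbering)."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            p = r * cols + c
            if c + 1 < cols:
                edges.append((p, p + 1))
            if r + 1 < rows:
                edges.append((p, p + cols))
    return edges


def swap_distances(start, edges):
    """BFS over arrangements; arrangement[pos] is the qubit stored at pos.

    Returns the minimal number of adjacent SWAPs from `start` to every
    reachable arrangement (this plays the role of the costs c_pi).
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for a, b in edges:
            nxt = list(cur)
            nxt[a], nxt[b] = nxt[b], nxt[a]
            nxt = tuple(nxt)
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist


def min_swaps(gates, rows, cols):
    """Minimal total number of adjacent SWAPs making every gate adjacent.

    gates: non-empty list of (control, target) qubit pairs in circuit order.
    """
    n = rows * cols
    edges = grid_edges(rows, cols)
    adjacent = {frozenset(e) for e in edges}
    arrangements = list(permutations(range(n)))
    cost = {a: swap_distances(a, edges) for a in arrangements}

    def compliant(arr, gate):
        control, target = gate
        return frozenset((arr.index(control), arr.index(target))) in adjacent

    # The arrangement in front of the first gate costs nothing.
    best = {a: 0 for a in arrangements if compliant(a, gates[0])}
    for gate in gates[1:]:
        best = {
            a: min(best[prev] + cost[prev][a] for prev in best)
            for a in arrangements
            if compliant(a, gate)
        }
    return min(best.values())
```

For example, `min_swaps([(0, 1), (1, 2), (0, 2)], 1, 3)` considers three gates on a line of three qubits; since no line of three positions makes all three pairs adjacent at once, at least one SWAP is required. The enumeration grows factorially with the number of qubits, which is exactly why a solver-based formulation is used for anything beyond toy sizes.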

6.3.4 Costs of Establishing an Arbitrary Permutation

SWAP gate insertion basically is about establishing a new permutation of qubits to the respective positions within a multi-dimensional quantum circuit. Hence, before it comes to an actual SWAP gate insertion, the general question is how costly (in terms of SWAP gates) it is to realize a certain permutation. While linear solutions to this question exist for 1D circuits (with the help of inversion vectors [90, 43]), the approach presented in Subsection IV.B represents the first exact solution for multi-dimensional circuits. This not only provides the designer with crucial information on the (exact) costs of establishing a certain operation, but also allows an analysis of the suitability of different configurations with respect to nearest neighbor constraints.

As a representative, Table 6.1 shows the obtained costs needed to establish all 4! = 24 possible permutations over 4 qubits in both a 1D quantum circuit and a 2 × 2, i.e. 2D, quantum circuit. The first column thereby denotes all possible permutations π, while the remaining two columns provide the number of SWAP gates needed in order to realize the respective π in the 1D circuit and the 2D circuit.

Obviously, establishing the identity permutation (i.e. q0q1q2q3) does not require a SWAP gate in either the 1D or the 2D configuration. But for all other permutations, significant differences can be observed. In fact, 2D architectures never require more than 4 SWAP gates, and they require 4 in a single case only. In contrast, 1D circuits require 4 SWAP gates or more (i.e. up to 6) in a total of 9 cases. Hence, with respect to nearest neighbor constraints, the higher dimension certainly pays off. But this does not necessarily hold for all permutations. For example, the permutation q0q2q1q3 can be realized in a 1D architecture with a single SWAP gate only, while 3 SWAP gates are needed in a 2D circuit. Without a scheme which enables logic designers to determine those exact values, precise conclusions as discussed here would not be possible.
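For four qubits, the costs discussed above can be cross-checked with a plain breadth-first search over all arrangements. This is a different and much simpler technique than the exact approach of this chapter and is given only as a sketch. It assumes that the 2 × 2 grid positions are numbered row by row (positions 0 and 1 in the top row, 2 and 3 in the bottom row) and that a tuple such as (0, 2, 1, 3) stands for q0q2q1q3; both conventions are assumptions of the example, chosen to be consistent with the entries of Table 6.1.

```python
from collections import deque


def swap_costs(edges, n):
    """Minimal number of adjacent SWAPs needed to reach each arrangement.

    edges: pairs of positions that are physically adjacent.
    Returns a dict mapping arrangement tuples to their SWAP cost,
    computed by BFS starting from the identity arrangement.
    """
    identity = tuple(range(n))
    cost = {identity: 0}
    queue = deque([identity])
    while queue:
        cur = queue.popleft()
        for a, b in edges:
            nxt = list(cur)
            nxt[a], nxt[b] = nxt[b], nxt[a]
            nxt = tuple(nxt)
            if nxt not in cost:
                cost[nxt] = cost[cur] + 1
                queue.append(nxt)
    return cost


LINE = [(0, 1), (1, 2), (2, 3)]              # 1D: positions 0-1-2-3
GRID_2X2 = [(0, 1), (2, 3), (0, 2), (1, 3)]  # 2D: rows (0, 1) and (2, 3)

cost_1d = swap_costs(LINE, 4)
cost_2d = swap_costs(GRID_2X2, 4)

for arrangement in sorted(cost_1d):
    name = "".join(f"q{q}" for q in arrangement)
    print(name, cost_1d[arrangement], cost_2d[arrangement])
```

The search visits all 24 arrangements, so it is only practical for very small configurations; it serves here as a sanity check of individual table entries such as q0q2q1q3.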

6.4 Experimental Evaluation

Considering the actual SWAP gate insertion for a particular given circuit, very few results exist yet. All of them are of heuristic nature. Here, the approach proposed in Section 6.3 advances the state of the art by, for the first time, providing exact solutions. This allows for an evaluation of how far the previously obtained heuristic results are from the optimum.

Table 6.2 provides a selection of results confirming this statement. Here, the heuristically determined number of SWAP gates as reported in [70] is compared

Table 6.1: Costs of establishing an arbitrary permutation

            #SWAPs                  #SWAPs
π           1D   2D     π           1D   2D
q0q1q2q3    0    0      q2q0q1q3    2    2
q0q1q3q2    1    1      q2q0q3q1    3    3
q0q2q1q3    1    3      q2q1q0q3    3    1
q0q2q3q1    2    2      q2q1q3q0    4    2
q0q3q1q2    2    2      q2q3q0q1    4    2
q0q3q2q1    3    1      q2q3q1q0    5    3
q1q0q2q3    1    1      q3q0q1q2    3    3
q1q0q3q2    2    2      q3q0q2q1    4    2
q1q2q0q3    2    2      q3q1q0q2    4    2
q1q2q3q0    3    3      q3q1q2q0    5    3
q1q3q0q2    3    3      q3q2q0q1    5    3
q1q3q2q0    4    2      q3q2q1q0    6    4

Table 6.2: Resulting Optimal SWAP Gate Insertion

                          #SWAPs
Circuit          Conf.    [70]    Sect. 6.3    Time
3_17_13          2x2      6       4            17.6s
decod24-v3_46    2x2      3       2            0.5s
hwb4_52          2x2      9       7            42752.9s
rd32-v0_67       2x2      2       2            0.2s
4gt11_84         2x3      1       1            1644.5s

against the exactly determined number of SWAP gates obtained by the approach proposed in Section 6.3 (the columns Circuit and Conf. provide the name and the configuration as used in [70], respectively).

As expected, the approach presented in [70] (to the best of our knowledge the only solution addressing SWAP gate determination for 2D architectures thus far) does not guarantee minimality. In fact, (exact) solutions with fewer SWAP gates are possible, as shown by the first three representatives in Table 6.2. On the other hand, this does not mean that the approach from [70] never realizes minimal solutions. In fact, as shown in the last two rows of Table 6.2, respective representatives exist. However, without the approach presented in Section 6.3 and accordingly in [43], it would still be unknown whether these results are indeed minimal or not.

In all these evaluations, the computation time remains the limiting factor. That was expected and is a well-known characteristic of exact synthesis schemes in general (regardless of whether conventional or emerging technologies are considered). In our evaluations, we were able to determine exact results for configurations with up to 6 qubits (using an AMD Athlon X2 CPU with 3 GHz and 4 GB of memory). This is in accordance with exact synthesis schemes for other purposes. The right-most column of Table 6.2 gives the detailed run-times for the selected representatives. Despite this limitation, the proposed approach nevertheless advances the field of nearest neighbor optimization for multi-dimensional quantum circuits by providing the minimal number of SWAP gates needed in order to establish an arbitrary permutation for several circuit configurations and a methodology to realize minimal solutions to be used for comparison to (much more scalable) heuristic results.

6.5 Conclusions

In this chapter, we considered the problem of how to exactly determine the minimal number of SWAP gates to be inserted in order to make a generic, i.e. multi-dimensional, quantum circuit nearest neighbor-compliant. We observed that a solution for this problem requires several steps, each of them with its own complexity. To cope with the respective complexity, again PBO solvers have been utilized. This allowed, for the first time, for a qualitative evaluation of the respective optimization steps and enabled an exact comparison to heuristic results.

Chapter 7

Considering Nearest Neighbor Constraints on the Reversible Circuit Level

So far, post-synthesis optimization schemes have been considered in this thesis. Now we want to discuss possibilities to address nearest neighbor constraints on the reversible circuit level. This is important due to the fact that many quantum algorithms include a Boolean component such that quantum circuits can be derived from reversible circuits. When these circuits already satisfy the nearest neighbor constraints, no post-synthesis optimization needs to be applied.

The presented synthesis scheme makes use of a recently introduced gate library, called the NCV-|v1〉 library.

7.1 NCV-|v1〉 Library

Although the NCV library is universal for Boolean functions, i.e. every Boolean function can be represented by it [6], extensions of it have been introduced recently (see e.g. [66, 67]). In [88], we additionally consider the quantum gate library introduced in [67], which is based on the theoretical discussion on physical realizations from [54]. Here, qudits instead of qubits are assumed, i.e. the basic building block is not a two-level quantum system but a (multiple-valued) d-level quantum system. Any state of a qudit may be written as

\[
|\Psi\rangle = \sum_{i=0}^{d-1} c_i |i\rangle
\quad \text{where } c_i \in \mathbb{C} \text{ such that } \sum_{i=0}^{d-1} |c_i|^2 = 1.
\]

Hence, qubits are a special case of qudits with d = 2. Similar to qubits, the states of a qudit are represented by a state vector. The state vector is changed through multiplication with appropriate unitary matrices. In the case of an uncontrolled transformation, the dimension of the unitary matrices is d × d; in the case of a controlled transformation, the dimension is d² × d². However, in contrast to a qubit, the controlled gates for qudits perform the respective operation not when the control line is |1〉, but rather when the control line is set to the value |d − 1〉.

Figure 7.1: Quantum circuit NCV-|v1〉 gate library

In [54], a corresponding gate library for qudits has been presented and discussed. In [67], this model is adopted with a 4-level logic, i.e. d = 4. The basic states are, in that order, 0, v0, 1, and v1. As already explained, the controlled gates only transform the target qudit if the value of the control line is set to |d − 1〉 ≡ v1. We emphasize this fact by labeling the control connections for the respective gates with v1.

The library is composed of the three unitary gates (i.e. gates without a control line) performing the NOT, V, and V† operation as well as single-control versions of these gates. More precisely, these gates transform the target qudit t as specified by the unitary matrices

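\[
\mathrm{NOT} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix},
\quad
\mathrm{V} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix},
\quad
\mathrm{V}^{\dagger} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}.
\]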

Again, restricting the inputs to Boolean values still allows for the realization of arbitrary reversible functions. In the following, this model is called the NCV-|v1〉 gate library.
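The relations between the three matrices can be checked with a few lines of plain Python. The sketch below assumes the usual column-vector convention (a gate maps the basis state with index j to the state given by column j) and the basis order 0, v0, 1, v1 stated above; the helper names are chosen for this example only.

```python
# Basis order: index 0 = 0, index 1 = v0, index 2 = 1, index 3 = v1
NOT = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
V = [[0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
V_DAG = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
IDENTITY = [[int(r == c) for c in range(4)] for r in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transpose(a):
    """Transpose of a 4x4 matrix (equals the adjoint for real entries)."""
    return [[a[c][r] for c in range(4)] for r in range(4)]


def image(gate, index):
    """Index of the basis state that `gate` maps basis state `index` to."""
    column = [gate[r][index] for r in range(4)]
    return column.index(1)


# V cycles through the four levels: 0 -> v0 -> 1 -> v1 -> 0
cycle = [image(V, i) for i in range(4)]
```

Under these assumptions, V is a cyclic shift of the four levels, applying V twice yields NOT, and V† is both the transpose and the inverse of V.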

Example 7.1. Fig. 7.1 shows a quantum circuit composed of 4 circuit lines and 5 quantum gates. This circuit performs the same computation as the circuits from Fig. 2.2 and 2.6, but is composed of gates from the NCV-|v1〉 library introduced in [67].

7.2 Mapping to Gates from the NCV-|v1〉 Library

With the NCV-|v1〉 library, a corresponding mapping scheme has also been introduced in [67]. This scheme fully exploits the |v1〉-sensitivity of the control lines, which enables a more efficient mapping of reversible gates than the mapping to gates from the NCV library. The general principle of this mapping is illustrated by means of the following example.

Example 7.2. Consider a Toffoli gate with an arbitrary number of control lines as shown in the left-hand side of Fig. 7.2. A functionally equivalent realization in terms of gates from the NCV-|v1〉 library is depicted in the right-hand side of Fig. 7.2. First, all control lines are |v1〉-sensitized, i.e. (v1-controlled) V-gates are applied setting the values of the control lines to v1 iff they all have initially been set to 1. By this, a v1-controlled NOT gate ensures that the value of the target line is only flipped iff all control lines have been set to 1. Afterwards, (v1-controlled) V†-gates are applied to de-sensitize the control lines.

Figure 7.2: Nearest neighbor-aware NCV-|v1〉 quantum decomposition for a multi-controlled Toffoli gate. Panels: (a) Target below, (b) Target in between (simple construction), (c) Target in between (Barenco construction).

Using this scheme, every multiple-controlled Toffoli gate T = (C, xj) with xj ∈ X and C ⊂ X\{xj} can be mapped to an equivalent cascade of 2 · |C| + 1 NCV-|v1〉 quantum gates [67]. In comparison to the mapping to gates from the NCV library, this (1) is significantly more compact and (2) does not even require ancillary lines. Table 3.1 provides a more precise comparison between these two mappings. Note that the physical costs of the respective gates may differ between the libraries. However, comparing the number of elementary gates seems to be an acceptable abstraction until the physical realizations eventually advance.
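The sensitize, flip, and de-sensitize principle can be simulated on basis states with a few lines of Python. The sketch below is an assumption about the gate order, read off the description in Example 7.2 rather than taken from Fig. 7.2: an uncontrolled V on the first control line, a chain of v1-controlled V gates on the remaining control lines, a v1-controlled NOT on the target, and the mirrored V† gates. Levels are encoded as integers 0, 1, 2, 3 for 0, v0, 1, v1, and all names are chosen for this example.

```python
from itertools import product

ZERO, V0, ONE, V1 = 0, 1, 2, 3  # levels in the order 0, v0, 1, v1

# On basis states, each operation is a cyclic shift of the four levels.
SHIFT = {"V": 1, "NOT": 2, "V+": 3}  # "V+" stands for V-dagger


def toffoli_cascade(n):
    """Gate list (op, control, target) for a Toffoli gate on n lines.

    Lines 0..n-2 are the controls, line n-1 is the target ("target below").
    control=None marks an uncontrolled gate.
    """
    sensitize = [("V", None, 0)] + [("V", i - 1, i) for i in range(1, n - 1)]
    flip = [("NOT", n - 2, n - 1)]
    desensitize = [("V+", c, t) for (_, c, t) in reversed(sensitize)]
    return sensitize + flip + desensitize


def simulate(gates, state):
    """Apply the cascade to a tuple of levels and return the resulting tuple."""
    state = list(state)
    for op, control, target in gates:
        # Controlled gates only fire if the control line carries v1.
        if control is None or state[control] == V1:
            state[target] = (state[target] + SHIFT[op]) % 4
    return tuple(state)
```

For a Toffoli gate with two controls, `toffoli_cascade(3)` yields 2 · 2 + 1 = 5 gates, and every controlled gate acts on neighboring lines, which matches the case in which the target is below all controls. The simulation only covers Boolean inputs, i.e. it checks the classical behavior and not arbitrary superpositions.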

7.3 Consideration of Nearest Neighbor Constraints

Using the NCV-|v1〉 library, nearest neighbor constraints can be satisfied much more easily. In fact, the mapping shown in Fig. 7.2(a) already satisfies them since only adjacent gates are included here. However, this does not hold for all configurations: applying the same scheme to a Toffoli gate as shown in the left-hand side of Fig. 7.2(b), the quantum circuit shown in the right-hand side of Fig. 7.2(b) results, which still includes two non-adjacent gates. In contrast to the NCV library, the number of SWAP gates to be applied is linear and can always be derived from the configuration of the considered reversible gate.

More precisely, let T = (C, xj) be a multiple-controlled Toffoli gate with xj ∈ X and C ⊂ X\{xj}. Assume that all control and target lines in T = (C, xj) are already adjacent. Furthermore, w.l.o.g. assume that x1, x2, · · · ∈ X are the top lines and . . . , xn−1, xn ∈ X are the bottom lines of the circuit, respectively. Then, this Toffoli gate can be mapped to a cascade of adjacent quantum gates with respect to the following cases (which is an improved version of the two cases presented in [88]):

1. The target line xj is “either on the top or the bottom”, i.e. either ∀xi ∈ C : i < j or ∀xi ∈ C : i > j holds. In this case, the mapping already introduced in Fig. 7.2(a) can be used. This already satisfies the nearest neighbor condition, i.e. no additional SWAPs are needed and a total of 2 · |C| + 1 gates are required.

2. The target line xj is “between the control lines”, i.e. ∃xi ∈ C : i < j and ∃xk ∈ C : k > j holds. In this case, the multiple-controlled Toffoli gate can be represented by a quantum gate cascade as shown in Fig. 7.2(b). This requires min(2 · (j − 1 + 1), 2(k − j + 1)) SWAP gates in order to satisfy the nearest neighbor condition (assuming that xi and xk are the control lines at the top and the bottom, respectively, i.e. ∄xl ∈ C : l < i and ∄xm ∈ C : m > k). That is, a total of (2 · |C| + 1) gates and 6 · min(j, (k − j + 1)) SWAP gates are required.

   Another way to represent the multiple-controlled Toffoli gate is the cascade shown in Fig. 7.2(c), where the two-controlled Toffoli gate is decomposed using the Barenco mapping (cf. Fig. 3.1); in this case, however, the target line is between the two control lines, which are also v1-sensitive. For a nearest neighbor compliant decomposition of the two-controlled Toffoli gate, two additional SWAP gates are required. As a result, a total of 2 · |C| + 5 gates and two SWAP gates are required.

Hence, if a Toffoli gate T = (C, xj) is adjacent, a linear mapping to a quantum circuit satisfying nearest neighbor constraints can be derived. In contrast to the mapping for the NCV library, this provides a precise metric for the nearest neighbor constraints that can even be used at the reversible circuit level. By this, existing synthesis procedures for reversible circuits can be extended in order to directly support nearest neighbor architectures right from the beginning of the design process.

Figure 7.3: Reordering a Toffoli circuit. Panels: (a) Toffoli circuit with line order x1, x2, x3, x4, x5; (b) re-ordered Toffoli circuit with line order x3, x2, x1, x4, x5.

7.4 Nearest Neighbor-aware Optimization of Reversible Circuits

Using the new metric provided above, this section introduces an approach that exemplarily illustrates the consideration of nearest neighbor constraints at the reversible circuit level. To this end, an optimization method is proposed that is inspired by the work from [64]. There, a reordering scheme was applied to quantum circuits in order to reduce the number of non-adjacent quantum gates. The application of this scheme to the reversible circuit level is illustrated by means of an example.

Example 7.3. Consider the reversible circuit shown in Fig. 7.3(a). Using the metric from Section 7.3, it can be calculated that a quantum circuit satisfying the nearest neighbor constraints would be composed of 50 gates, i.e. 5 + 5 + 5 + 5 = 20 gates resulting from the mapping plus 3 · 5 = 15 gates to swap the control and target lines of the first, third, and fourth reversible gate plus 3 · 5 = 15 gates to undo this swapping. However, it can be calculated that, by slightly reordering the lines as shown in Fig. 7.3(b), the total number of quantum gates necessary can be reduced to 32. Due to the absence of a precise metric, previous methods were not able to precisely determine such optimizations at the reversible circuit level.

Motivated by this, a reversible circuit can be optimized with respect to nearest neighbor constraints as follows: First, the “contribution” of each line to the total number of needed SWAPs is calculated. More precisely, for each gate g with control lines C and target line xj, the number of SWAPs needed to make C and xj adjacent is determined. This number is added to variables prop_i with i ∈ C ∪ {xj}, which are used to store the share of circuit line i in the total number of SWAPs. Next, the line i with the highest value of prop_i is chosen for reordering and placed at the middle of the circuit (i.e. line i is swapped with the middle line). If the selected line is the middle line itself, the line with the next-highest value is selected. This procedure is repeated until no improvements are achieved.
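The procedure can be sketched in Python as follows. This is an illustration under strong assumptions and not the C++ implementation evaluated later: in particular, the per-gate SWAP count is replaced by a hypothetical proxy (the number of uninvolved lines lying inside the span of a gate), whereas the thesis uses the metric from Section 7.3. Gates are given as tuples of the line indices they touch, and all function names are chosen for this example.

```python
def gate_swaps(lines, order):
    """Hypothetical proxy for the SWAPs a gate needs.

    Counts the uninvolved lines between the topmost and bottommost line
    of the gate; this stands in for the metric of Section 7.3.
    """
    positions = sorted(order.index(line) for line in lines)
    return (positions[-1] - positions[0] + 1) - len(positions)


def total_swaps(gates, order):
    """Sum of the proxy cost over all gates for a given line order."""
    return sum(gate_swaps(gate, order) for gate in gates)


def reorder(gates, n):
    """Greedy line reordering; returns (line order, proxy cost)."""
    order = list(range(n))
    best = total_swaps(gates, order)
    while True:
        # Contribution of each line to the total number of SWAPs.
        prop = [0] * n
        for gate in gates:
            swaps = gate_swaps(gate, order)
            for line in gate:
                prop[line] += swaps
        # Pick the line with the highest contribution, skipping the
        # line that already sits in the middle.
        middle = n // 2
        candidates = [line for line in range(n) if line != order[middle]]
        chosen = max(candidates, key=lambda line: prop[line])
        trial = order[:]
        i = trial.index(chosen)
        trial[i], trial[middle] = trial[middle], trial[i]
        cost = total_swaps(gates, trial)
        if cost >= best:  # no improvement: stop
            return order, best
        order, best = trial, cost
```

For instance, with `gates = [(0, 4), (0, 3), (0, 1)]` on five lines, line 0 contributes most and is moved to the middle, which lowers the proxy cost from 5 to 1. Because the loop only accepts strict improvements, it always terminates.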

Note that this only exemplarily illustrates the consideration of nearest neighbor constraints at the reversible circuit level. In fact, thanks to the metric from Section 7.3, other existing synthesis and optimization approaches for reversible circuits can be adjusted accordingly. However, the experimental evaluation summarized in the next section confirms that even this rather simple reordering scheme already leads to significant improvements.

This reordering is discussed in the next chapter in more detail.

7.5 Experimental Evaluation

The optimization approach presented in the previous section has been implemented in C++ on top of RevKit [74] and applied to all benchmark circuits available at RevLib [86]. Thanks to the heuristic nature of the approach, all circuits have been processed with negligible run-time (i.e. within a couple of minutes) on an Intel Core i5-3210M machine with 2.5 GHz and 4 GB of memory.

The evaluation showed that, over all considered circuits, a reduction in the number of gates of the resulting quantum gate cascades of approx. 8% can be achieved on average. This includes many small circuits for which no improvement at all has been obtained. In contrast, for larger circuits improvements of up to 68% are possible.

Table 7.1 shows the best improvements which have been observed during our evaluation. The first columns thereby give the name, the number n of lines, and the number g of gates of the respective reversible circuits. Afterwards, the number of quantum gates is reported for the original circuits (w/o optimization) and for the improved circuits (w/ optimization). In both cases, of course, the metric from Section 7.3 has been applied, i.e. the nearest neighbor condition was considered. The total improvement is provided by the last column.

7.6 Conclusions

The presented approach allows the consideration of nearest neighbor constraints already at the reversible circuit level. By this, a gap in today’s design flows for corresponding quantum circuits has been closed. So far, only adjacency of Toffoli gates has been achieved at the reversible circuit level, while the mapping to their corresponding quantum cascades may have introduced further non-adjacent gates. Using the gate library proposed in [54, 67], we were able to provide a metric where such cases are covered. By this, nearest neighbor constraints can directly be addressed in the reversible circuit. Exemplarily, this has been illustrated by means of an optimization approach.

[88] just provided a theoretical and conceptual discussion on the applicability of the NCV-|v1〉 library. Hence, physical realizations and concepts on how to realize this library are subject to future work. This includes (1) direct realizations of the qudits, (2) the emulation of qudits, e.g. by existing qubit realizations, and (3) the compatibility with existing fault-tolerant quantum error correction protocols.

Table 7.1: Experimental Evaluation

                                          NO. OF QUANTUM GATES
CIRCUIT                      n      g     W/O OPT.   W/ OPT.   IMPR.
plus127mod8192_308.real      25     54    1936       610       68%
plus63mod8192_310.real       25     53    1864       676       64%
rd53_311.real                13     34    872        344       61%
e64-bdd_295.real             195    387   149492     68174     54%
ham15_298.real               45     153   11706      5820      50%
mini_alu_305.real            10     20    467        245       48%
4mod5-v0_20.real             5      5     26         14        46%
hwb7_302.real                73     281   46853      25487     46%
ham7_299.real                21     61    2545       1429      44%
4mod5-v1_23.real             5      8     98         56        43%
con1_216.real                9      21    359        209       42%
rd73_312.real                25     73    3648       2148      41%
cycle10_293.real             39     78    5435       3275      40%
plus63mod4096_309.real       23     49    1592       974       39%
one-two-three-v2_100.real    5      8     62         38        39%
rd84_313.real                34     104   6589       4057      38%
sym9_317.real                27     62    3564       2214      38%
hwb6_301.real                46     159   16194      10254     37%
hwb5_300.real                28     88    5668       3592      37%
mod5adder_306.real           32     96    7532       4796      36%
4mod5-v0_21.real             5      6     17         11        35%
rd53_136.real                7      15    256        166       35%
bw_291.real                  87     307   57753      37665     35%
urf4_187.real                11     32004 831648     542844    35%
4gt12-v1_89.real             5      5     52         34        35%
4mod5-v1_24.real             5      5     52         34        35%
mod10_171.real               4      10    87         57        34%
4gt11_84.real                5      3     35         23        34%
rd32_270.real                5      9     70         46        34%
rd32_271.real                5      9     70         46        34%
hwb8_303.real                112    449   113585     74951     34%

Chapter 8

Conclusion

In this work, synthesis and optimization of quantum circuits for nearest neighbor architectures have been discussed comprehensively. It was shown that the two reordering strategies, called local and global reordering, are NP-complete. The global reordering algorithm considers all permutations for finding the best line order such that the NNC of the given circuit are minimal, whereas the local reordering considers all permutations for finding the minimal total number of SWAP gates for making a given quantum circuit nearest neighbor compliant. Both strategies have been evaluated on an exact level. This resulted in an approach for the optimal determination of SWAP gate insertions needed to make an arbitrary quantum circuit nearest neighbor-compliant. In order to handle the exponential complexity, the deductive power of PBO solvers has been exploited. That is, the given problem has been encoded as a PBO instance and, afterwards, solved by a proper solving method. Experiments confirmed the applicability of the proposed approach. By this, it was possible to compare results obtained by heuristic methods to the actual optimum.

Since the encoding does not consider the gate type, but only its position, it can be applied to all known quantum libraries that implement a SWAP gate. Further, the encoding of the PBO instance is straightforward. This has the advantage that the model can easily be changed and extended if further physical constraints must be considered.

It is assumed that inserting the minimal number of SWAP gates in a given circuit while keeping the original gate order results in the optimal, i.e. minimal, circuit. Changing the gate order as well was discussed in [47], but the results obtained there are not comparable to this work.

These exact approaches enabled a qualitative evaluation of the performance of existing (heuristic) solutions. We were able to show that heuristics following the global reordering scheme still have room for improvement. In contrast, heuristics based on the local reordering scheme already perform very well. Also, a comparison between both schemes clearly shows that a local consideration behaves significantly better than the global approach, mainly because of the fact that a larger search space is considered. Overall, it can be concluded that existing solutions for the optimization of nearest neighbor quantum circuits seem to satisfactorily exploit their potential.

Further, it was shown that the problem for arbitrary dimensional circuit grids remains NP-complete. Hence, an analogous approach solving this problem with PBO solvers has been presented. Here, several other obstacles must be considered, e.g. permutation cost, grid size, etc., each of them with its own complexity. To cope with the respective complexity, again PBO solvers have been utilized. This allowed, for the first time, for a qualitative evaluation of the respective optimization steps and enabled an exact comparison to heuristic results. Nevertheless, an open question is the complexity of the procedure of determining the number of inversions for multi-dimensional grids.

Third, due to the complexity of the problems, a method for addressing nearest neighbor constraints on the reversible level when synthesizing reversible functions into quantum circuits has also been presented. The presented approach allows the consideration of nearest neighbor constraints already at the reversible circuit level. By this, a gap in today’s design flows for corresponding quantum circuits has been closed. So far, only adjacency of Toffoli gates has been achieved at the reversible circuit level, while the mapping to their corresponding quantum cascades may have introduced further non-adjacent gates. Using the gate library proposed in [54, 67], we were able to provide a metric where such cases are covered. By this, nearest neighbor constraints can directly be addressed in the reversible circuit. Exemplarily, this has been illustrated by means of an optimization approach.

Further, with this approach the global reordering algorithm can be used to reduce the number of SWAP gates. Considering the problem on the reversible level reduces the complexity of the problem significantly. The presented global reordering algorithm scaled well. Hence, we assume an analogue for reversible circuits will scale as well. Note that on the reversible level the cost function is much more complex. Nevertheless, the decision problem remains in NP (which is easy to see). A local reordering analogue for the reversible level, in order to reduce the number of initial and final uncontrolled V gates, would be an interesting evaluation as well. Nevertheless, formalizing this problem remains an open task for further work.

[88] just provides a theoretical and conceptual discussion on the applicability of the NCV-|v1〉 library. Hence, physical realizations and concepts on how to realize this library are subject to future work. This includes (1) direct realizations of the qudits, (2) the emulation of qudits, e.g. by existing qubit realizations, and (3) the compatibility with existing fault-tolerant quantum error correction protocols.

Bibliography

[1] D. Aharonov, W. Van Dam, J. Kempe, Z. Landau, S. Lloyd, and O. Regev. Adiabatic quantum computation is equivalent to standard quantum computation. SIAM review, 50(4):755–787, 2008.

[2] F. Aloul and B. Rawi. Identifying the shortest path in large networks using Boolean satisfiability. In Electrical and Electronics Engineering, pages 1–4, Sept 2006.

[3] J. M. Amini, H. Uys, J. H. Wesenberg, S. Seidelin, J. Britton, J. J. Bollinger, D. Leibfried, C. Ospelkaus, A. P. VanDevender, and D. J. Wineland. Toward scalable ion traps for quantum information processing. New Journal of Physics, 12(3):033031, 2010.

[4] M. Amy, D. Maslov, M. Mosca, and M. Roetteler. A meet-in-the-middle algorithm for fast synthesis of depth-optimal quantum circuits. IEEE Trans. on CAD of Integrated Circuits and Systems, 32(6):818–830, 2013.

[5] M. Arabzadeh, M. S. Zamani, M. Sedighi, and M. Saeedi. Depth-optimized reversible circuit synthesis. Quantum Information Processing, 12(4):1677–1699, 2013.

[6] A. Barenco, C. H. Bennett, R. Cleve, D. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. Smolin, and H. Weinfurter. Elementary gates for quantum computation. The American Physical Society, 52:3457–3467, 1995.

[7] E. Bernstein and U. V. Vazirani. Quantum complexity theory. SIAM J. Comput., 26(5):1411–1473, 1997.

[8] A. Blais, J. Gambetta, A. Wallraff, D. I. Schuster, S. M. Girvin, M. H. Devoret, and R. J. Schoelkopf. Quantum-information processing with circuit quantum electrodynamics. Phys. Rev. A, 75:032329, Mar 2007.

[9] A. Bocharov and K. M. Svore. Resource-optimal single-qubit quantum circuits. Phys. Rev. Lett., 109:190501, Nov 2012.

[10] I. N. Bronstejn and K. A. Semendjaev. Taschenbuch der Mathematik. Deutsch, Thun, 1987.

[11] H. Buhrman, R. Cleve, M. Laurent, N. Linden, A. Schrijver, and F. Unger. New limits on fault-tolerant quantum computation. In FOCS, pages 411–419, 2006.

[12] A. Chakrabarti, S. Sur-Kolay, and A. Chaudhury. Linear nearest neighbor synthesis of reversible circuits by graph partitioning. CoRR, 2011.

[13] A. Chakrabarti and S. Sur-Kolay. Nearest neighbour based synthesis of quantum boolean circuits. Engineering Letters, 15:356–361, 2007.

[14] B.-S. Choi and R. V. Meter. A θ(√n)-depth quantum adder on the 2D NTC quantum computer architecture. Journal of Emerging Technologies in Computing Systems, 8(3):24, 2012.

[15] S. Cook. The complexity of theorem-proving procedures. In ACM Symposium on Theory of Computing, pages 151–158. ACM, 1971.

[16] S. J. Devitt, A. G. Fowler, A. M. Stephens, A. D. Greentree, L. C. L. Hollenberg, W. J. Munro, and K. Nemoto. Architectural design for a topological cluster state quantum computer. New Journal of Physics, 11(8):083032, 2009.

[17] D. P. DiVincenzo and F. Solgun. Multi-qubit parity measurement in circuit quantum electrodynamics. New Journal of Physics, 15(7):075001, 2013.

[18] C. Dürr, M. Heiligman, P. Hoyer, and M. Mhalla. Quantum query complexity of some graph problems. SIAM Journal of Computers, 35:1310–1328, 2006.

[19] N. Eén and N. Sörensson. An extensible SAT solver. In SAT 2003, volume 2919 of LNCS, pages 502–518, 2004.

[20] C. Epstein. Adiabatic quantum computing: An overview. Quantum Complexity Theory, 6:845–851, 2012.

[21] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser. Quantum computation by adiabatic evolution. 2000. arXiv:quant-ph/0001106.

[22] A. G. Fowler, S. J. Devitt, and L. C. L. Hollenberg. Implementation of Shor’s algorithm on a linear nearest neighbour qubit array. Quant. Info. and Comput., 4:237–245, 2004.

[23] A. G. Fowler, C. D. Hill, and L. Hollenberg. Quantum error correction on linear nearest neighbor qubit arrays. Phys. Rev. A, 2004.

[24] E. F. Fredkin and T. Toffoli. Conservative logic. International Journal of Theoretical Physics, 21(3/4):219–253, 1982.

[25] M. Garey and D. Johnson. Computers and Intractability - A Guide to NP-Completeness. Freeman, San Francisco, 1979.

[26] M. Gebser, B. Kaufmann, A. Neumann, and T. Schaub. clasp: A conflict-driven answer set solver. In Logic Programming and Nonmonotonic Reason-ing, pages 260–265, 2007.

[27] M. Gebser, B. Kaufmann, A. Neumann, and T. Schaub. Conflict-driven an-swer set solving. In Int’l Joint Conference on Artificial Intelligence, pages386–392, 2007.

[28] D. Große, R. Wille, G. W. Dueck, and R. Drechsler. Exact multiple con-trol Toffoli network synthesis with SAT techniques. IEEE Transactions onComputer-Aided Design of Integrated Circuits and Systems, 28(5):703–715,2009.

[29] D. Große, R. Wille, G. W. Dueck, and R. Drechsler. Exact synthesis of ele-mentary quantum gate circuits. Multiple-Valued Logic and Soft Computing,15(4):270–275, 2009.

[30] L. K. Grover. A fast quantum mechanical algorithm for database search. InTheory of computing, pages 212–219, 1996.

[31] P. Gupta, A. Agrawal, and N. K. Jha. An algorithm for synthesis of reversiblelogic circuits. IEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems, 25(11):2317–2330, 2006.

[32] H. Häffner, W. Hänsel, C. F. Roos, J. Benhelm, D. C. al kar, M. Chwalla,T. Körber, U. D. Rapol, M. Riebe, P. O. Schmidt, C. Becher, O. Gühne,W. Dür, and R. Blatt. Scalable multiparticle entanglement of trapped ions.Nature, 438:643–646, 2005.

[33] D. A. Herrera-Martí, A. G. Fowler, D. Jennings, and T. Rudolph. Photonicimplementation for the topological cluster-state quantum computer. Phys.Rev. A, 82:032332, 2010.

[34] Y. Hirata, M. Nakanishi, S. Yamashita, and Y. Nakashima. An efficient method to convert arbitrary quantum circuits to ones on a linear nearest neighbor architecture. Third International Conference on Quantum, Nano and Micro Technologies. ICQNM, pages 26–33, 2009.

[35] L. C. L. Hollenberg, A. D. Greentree, A. G. Fowler, and C. J. Wellard. Two-dimensional architectures for donor-based quantum computing. Phys. Rev. B, 74:045311, 2006.

[36] W. Hung, X. Song, G. Yang, J. Yang, and M. Perkowski. Optimal synthesis of multiple output Boolean functions using a set of quantum gates by symbolic reachability analysis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 25(9):1652–1663, 2006.

[37] N. C. Jones, R. Van Meter, A. G. Fowler, P. L. McMahon, J. Kim, T. D. Ladd, and Y. Yamamoto. Layered architecture for quantum computing. Phys. Rev. X, 2:031007, 2012.

[38] B. Kane. A silicon-based nuclear spin quantum computer. Nature, 393:133–137, 1998.

[39] M. H. A. Khan. Cost reduction in nearest neighbour based synthesis of quantum boolean circuits. Engineering Letters, 16:1–5, 2008.

[40] M. Kumph, M. Brownnutt, and R. Blatt. Two-dimensional arrays of radio-frequency ion traps with addressable interactions. New Journal of Physics, 13(7):073043, 2011.

[41] M. Laforest, D. Simon, J.-C. Boileau, J. Baugh, M. Ditty, and R. Laflamme. Using error correction to determine the noise model. Phys. Rev. A, 75:133–137, 2007.

[42] S. Lee, S. Lee, T. Kim, J. Lee, J. Biamonte, and M. Perkowski. The cost of quantum gate primitives. Multiple Value Logic Soft Comput., (12):5–6, 2006.

[43] A. Lye, R. Wille, and R. Drechsler. Determining the minimal number of swap gates for multi-dimensional nearest neighbor quantum circuits. In Asia and South Pacific Design Automation Conference, pages 178–183, 2015.

[44] D. Maslov, G. Dueck, D. Miller, and C. Negrevergne. Quantum circuit simplification and level compaction. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 27(3):436–444, 2008.

[45] D. Maslov, G. W. Dueck, and D. M. Miller. Techniques for the synthesis of reversible Toffoli networks. ACM Trans. on Design Automation of Electronic Systems, 12(4), 2007.

[46] D. Maslov, C. Young, G. W. Dueck, and D. M. Miller. Quantum circuit simplification using templates. In Design, Automation and Test in Europe, pages 1208–1213, 2005.

[47] A. Matsuo and S. Yamashita. Changing the gate order for optimal LNN conversion. In A. Vos and R. Wille, editors, Reversible Computation, volume 7165 of Lecture Notes in Computer Science, pages 89–101. Springer Berlin Heidelberg, 2012.

[48] A. Matsuo and S. Yamashita. Changing the gate order for optimal LNN conversion. In Reversible Computation, volume 7165 of Lecture Notes in Computer Science, pages 89–101. Springer Berlin Heidelberg, 2012.

[49] R. V. Meter and M. Oskin. Architectural implications of quantum computing technologies. Journal of Emerging Technologies in Computing Systems, 2(1):31–63, 2006.

[50] D. M. Miller, D. Maslov, and G. W. Dueck. A transformation based algorithm for reversible logic synthesis. In Design Automation Conf., pages 318–323, 2003.

[51] D. M. Miller, R. Wille, and R. Drechsler. Reducing reversible circuit cost by adding lines. In International Symposium on Multi-Valued Logic, 2010.

[52] D. M. Miller, R. Wille, and Z. Sasanian. Elementary quantum gate realizations for multiple-control Toffoli gates. In International Symposium on Multi-Valued Logic, pages 288–293, 2011.

[53] M. Mottonen and J. J. Vartiainen. Decompositions of general quantum gates. Ch. 7 in Trends in Quantum Computing Research, NOVA Publishers, New York, 2006.

[54] A. Muthukrishnan and C. R. Stroud. Multivalued logic gates for quantum computation. Physical Review A, 62:052309, 2000.

[55] N. H. Nickerson, Y. Li, and S. C. Benjamin. Topological quantum computing with a very noisy network and local error rates approaching one percent. Nat Commun, 4:1756, 2013.

[56] M. Nielsen and I. Chuang. Quantum Computation and Quantum Information. Cambridge Univ. Press, 2000.

[57] M. Ohliger and J. Eisert. Efficient measurement-based quantum computing with continuous-variable systems. Phys. Rev. A, 85:062318, 2012.

[58] S. Pemmaraju and S. S. Skiena. Computational discrete mathematics: combinatorics and graph theory with Mathematica. Cambridge Univ. Press, Cambridge [u.a.], reprinted edition, 2006.

[59] A. Peres. Reversible logic and quantum computers. Phys. Rev. A, (32):3266–3276, 1985.

[60] M. Ross and M. Oskin. Quantum computing. Comm. of the ACM, 51(7):12–13, 2008.

[61] M. Saeedi, M. Arabzadeh, M. S. Zamani, and M. Sedighi. Block-based quantum-logic synthesis. Quantum Information and Computation, 11(3&4):262–277, 2011.

[62] M. Saeedi, M. Sedighi, and M. S. Zamani. A novel synthesis algorithm for reversible circuits. In IEEE/ACM International Conference on Computer-Aided Design, pages 65–68, 2007.

[63] M. Saeedi, M. Sedighi, and M. S. Zamani. A library-based synthesis methodology for reversible logic. Microelectronics Journal, 41(4):185–194, 2010.

[64] M. Saeedi, R. Wille, and R. Drechsler. Synthesis of quantum circuits for linear nearest neighbor architectures. Quantum Information Processing, 10(3):355–377, 2011.

[65] M. Saffman, T. G. Walker, and K. Mølmer. Quantum information with Rydberg atoms. Rev. Mod. Phys., 82:2313–2363, Aug 2010.

[66] Z. Sasanian and D. M. Miller. Transforming MCT circuits to NCVW circuits. In Reversible Computation 2011, volume 7165, pages 77–88, 2012.

[67] Z. Sasanian, R. Wille, and D. M. Miller. Realizing reversible circuits using a new class of quantum gates. In Design Automation Conf., pages 36–41, 2012.

[68] P. Selinger. Quantum circuits of T-depth one. Phys. Rev. A, 87:042302, Apr 2013.

[69] A. Shafaei, M. Saeedi, and M. Pedram. Optimization of quantum circuits for interaction distance in linear nearest neighbor architectures. In Design Automation Conf., pages 41–46, 2013.

[70] A. Shafaei, M. Saeedi, and M. Pedram. Qubit placement to minimize the communication overhead in circuits mapped to 2D quantum architectures. In Asia and South Pacific Design Automation Conference, pages 495–500, 2014.

[71] V. V. Shende, S. S. Bullock, and I. L. Markov. Synthesis of quantum-logic circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 25(6):1000–1010, 2006.

[72] V. V. Shende, A. K. Prasad, I. L. Markov, and J. P. Hayes. Synthesis of reversible logic circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 22(6):710–722, 2003.

[73] P. W. Shor. Algorithms for quantum computation: discrete logarithms and factoring. Foundations of Computer Science, pages 124–134, 1994.

[74] M. Soeken, S. Frehse, R. Wille, and R. Drechsler. RevKit: A toolkit for reversible circuit design. In Workshop on Reversible Computation, pages 69–72, 2010. RevKit is available at http://www.revkit.org.

[75] M. Soeken, M. Miller, and R. Drechsler. Quantum circuits employing roots of the Pauli matrices. Physical Review A, 88(4):042322, 2013.

[76] M. Soeken and M. K. Thomsen. White dots do matter: Rewriting reversible logic circuits. In Proceedings of the 5th International Conference on Reversible Computation, RC’13, pages 196–208, Berlin, Heidelberg, 2013. Springer-Verlag.

[77] M. Soeken, R. Wille, C. Hilken, N. Przigoda, and R. Drechsler. Synthesis of reversible circuits with minimal lines for large functions. In Asia and South Pacific Design Automation Conference, pages 85–92, 2012.

[78] Y. Takahashi, N. Kunihiro, and K. Ohta. The quantum Fourier transform on a linear nearest neighbor architecture. Quantum Information and Computation, 7(4):383–391, 2007.

[79] J. M. Taylor, J. R. Petta, A. C. Johnson, A. Yacoby, C. M. Marcus, and M. D. Lukin. Relaxation, dephasing, and quantum control of electron spins in double quantum dots. Phys. Rev. B, 76:035315, Jul 2007.

[80] T. Toffoli. Reversible computing. In W. de Bakker and J. van Leeuwen, editors, Automata, Languages and Programming, volume 85 of Lecture Notes in Computer Science, pages 632–644. Springer, 1980.

[81] A. M. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, s2-42(1):230–265, 1937.

[82] W. Van Dam, M. Mosca, and U. Vazirani. How powerful is adiabatic quantum computation? In Foundations of Computer Science, pages 279–287. IEEE, 2001.

[83] T. Warshall. A theorem on boolean matrices. Journal of the ACM, 9:11–12, 1962.

[84] R. Wille and R. Drechsler. BDD-based synthesis of reversible logic for large functions. In Design Automation Conf., pages 270–275, 2009.

[85] R. Wille and R. Drechsler. Towards a Design Flow for Reversible Logic.Springer, 2010.

[86] R. Wille, D. Große, L. Teuber, G. W. Dueck, and R. Drechsler. RevLib: an online resource for reversible functions and reversible circuits. In International Symposium on Multi-Valued Logic, pages 220–225, 2008. RevLib is available at http://www.revlib.org.

[87] R. Wille, O. Keszöcze, and R. Drechsler. Determining the minimal number of lines for large reversible circuits. In Design, Automation and Test in Europe, pages 1–4, 2011.

[88] R. Wille, A. Lye, and R. Drechsler. Considering nearest neighbor constraints of quantum circuits at the reversible circuit level. Quantum Information Processing, 13(2):185–199, 2013.

[89] R. Wille, A. Lye, and R. Drechsler. Exact reordering of circuit lines for nearest neighbor quantum architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 33(12):1818–1831, 2014.

[90] R. Wille, A. Lye, and R. Drechsler. Optimal SWAP gate insertion for nearest neighbor quantum circuits. In Asia and South Pacific Design Automation Conference, pages 489–494, 2014.

[91] R. Wille, S. Offermann, and R. Drechsler. SyReC: A programming language for synthesis of reversible circuits. In Forum on Specification and Design Languages, pages 184–189, 2010.

[92] R. Wille, M. Soeken, D. M. Miller, and R. Drechsler. Trading off circuit lines and gate costs in the synthesis of reversible logic. INTEGRATION, the VLSI Jour., 47(2):284–294, 2014.

[93] N. S. Yanofsky and M. A. Mannucci. Quantum computing for computer scientists. Cambridge Univ. Press, Cambridge, 2008.

[94] N. Y. Yao, Z.-X. Gong, C. R. Laumann, S. D. Bennett, L.-M. Duan, M. D. Lukin, L. Jiang, and A. V. Gorshkov. Quantum logic between remote quantum registers. Phys. Rev. A, 87:022306, 2013.
