Dmitri Tymoczko
Princeton University
May 13, 2005
The Geometry of Musical Chords
DRAFT
Musical chords have a geometry that is surprisingly easy to specify. An n-note chord can be represented as a point on the orbifold T^n/S_n (the n-torus modulo the symmetric group S_n) (1). Composers in a wide range of musical styles have exploited the non-Euclidean features of these spaces, typically by utilizing short-distance pathways between structurally-similar chords (Fig. 1). The existence of such pathways depends on a chord’s symmetry, or near-symmetry, under translation, reflection, and permutation. Paradigmatically “consonant” and “dissonant” chords possess different symmetries, thereby suggesting different musical applications.
Western music lies at the intersection of two seemingly independent disciplines:
harmony and counterpoint. Harmony delimits the range of acceptable chords and chord-
sequences. “Chords,” informally, are collections of simultaneously-occurring notes.
Western musical styles typically permit only a limited number of chords (for example,
only the major and minor triads) and successions between them (for example, the triadic
progression C major-F major, but not C major-E♭ major). Counterpoint is the technique
of connecting the individual notes in a series of chords to form simultaneous melodic
lines. To a good first approximation, chords are typically connected so that these lines
(or voices) move independently (not all in the same direction by the same amount) and
efficiently (by short distances, according to some perhaps-implicit notion of musical
“distance”). Such voice-leading simplifies physical performance, engages explicit
aesthetic norms (2-4), and facilitates the auditory streaming necessary for perceiving
music polyphonically (5).
Figure 1 shows independent, efficient voice-leadings in a wide range of musical
styles. In each example, arrows represent individual musical voices. Fig. 1(a) comes
from the classical period and features four major triads forming an archetypal sequence:
I-IV-I-V-I in C major. The voice-leading connects each note in the first chord to its
nearest successor in the second. Fig. 1(b), which is a common contemporary jazz pattern,
is analogous: here the chords are again similar, and the voice-leading connects notes by
short, though not necessarily minimal, paths. The voice-leadings in Fig. 1(c), which are
celebrated examples of nineteenth-century chromaticism, are also efficient, though here
they connect chords traditionally considered to belong to different types. Fig. 1(d)
presents a series of voice-leadings in which voices move independently and efficiently
within an unchanging harmony; such procedures are typical of twentieth-century
“avant-garde” composition.
How is it that Western music can satisfy harmonic and contrapuntal constraints at
once? And what determines whether two chords can be connected by efficient voice-
leading? Composers and music theorists have been investigating these questions for
almost three hundred years. The “circle of fifths” (Fig. S1), first published in 1728 (6),
can be interpreted as depicting maximally efficient voice-leadings among the twelve
familiar major scales. The Tonnetz (Fig. S2), originating with Euler in 1739 and
discussed by nineteenth-century music theorists Oettingen and Riemann, depicts efficient
voice-leadings among the twenty-four major and minor triads (3, 7-8, 12). Recent
theoretical work (9-19) has continued this tradition, investigating efficient voice-leading
among other small collections of interesting chords. However, no comprehensive theory
of voice-leading has yet emerged. In this paper I provide such a theory, showing that
chords that can be connected by efficient voice-leading are close in the space of all
possible n-note chords.
Characterizing the geometry of chord-space requires surprisingly recent
mathematics: chord-space is an “orbifold,” a notion introduced by Satake in 1956 (20)
and developed by William Thurston in the 1970s (21). Understanding the orbifold
structure of chord-space permits a unified perspective on musical practices across a very
wide range of styles and time-periods: in particular, it shows that composers have
frequently (and perhaps unwittingly) exploited the special contrapuntal properties of
nearly-symmetrical chords (Fig. 1). More generally, the geometry of chord-space reveals
how the internal structure of a chord, including its degree of acoustic consonance,
determines the kind of efficient voice-leadings it can participate in. Thus, for the first
time, we can precisely specify the way in which harmony and counterpoint are related.
I. Background and definitions
For maximal generality, we will consider voice-leading in a continuous,
octave-free space of “pitch-classes.” Two pitches are instances of the same pitch-class (or
chroma) when they are one or more octaves apart. Music theorists represent pitches
numerically by associating their fundamental frequencies f with real numbers according
to the equation:
$$p = 69 + 12\log_2(f/440) \qquad (1)$$
This creates a linear pitch space in which middle C is 60, an octave has size 12, and a
semitone (the distance between adjacent keys on a piano keyboard) has size 1. To create
circular pitch-class space, we identify all points p and p+12, forming the quotient space
R/12Z. (Here R refers to the set of real numbers and Z to the group of integers; the
notation R/12Z refers to the circular quotient space, whose points are the orbits of 12Z as
it acts on R.) This creates numerical equivalents for the familiar pitch-class letter names:
C=0, C♯/D♭=1, D=2, “D quarter-tone sharp”=2.5, and so on. Note that although we will
consider the most general case of a continuous pitch-class space, in musical situations
one is typically concerned with a lattice of discrete, equally-spaced points in this space,
corresponding to the familiar pitch-classes of Western equal-temperament.
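As a quick illustration of equation (1) and the passage to pitch-class space, the following Python sketch converts frequencies to pitch numbers and reduces them modulo 12. The function names are hypothetical, and the frequency used for middle C (about 261.63 Hz) is an approximate equal-tempered value assumed for the example.

```python
import math

def pitch_from_frequency(f_hz):
    """Equation (1): p = 69 + 12*log2(f/440), with f in Hz."""
    return 69 + 12 * math.log2(f_hz / 440.0)

def pitch_class(p):
    """Reduce a pitch to its pitch class, a point of R/12Z, as a number in [0, 12)."""
    return p % 12

a4 = pitch_from_frequency(440.0)            # 69.0
a5 = pitch_from_frequency(880.0)            # 81.0, one octave (12 units) higher
middle_c = pitch_from_frequency(261.6256)   # approximately 60 (assumed frequency)

print(a4, a5, round(middle_c, 3))
print(pitch_class(a4) == pitch_class(a5))   # pitches an octave apart share a pitch class
print(pitch_class(62.5))                    # "D quarter-tone sharp" = 2.5
```

Representing a pitch class by its residue in [0, 12) is only one convenient choice of representative; any number congruent to it modulo 12 names the same point of the circle.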
Formally, a chord is a multiset of pitch-classes, i.e. a set in which duplicates are
allowed. We will denote unordered multisets using curly braces: the C major chord is {0,
4, 7}, and the F-major chord is {0, 5, 9}. The musical term transposition is synonymous
with the mathematical term translation, and corresponds to addition in R/12Z. Two
chords are transpositionally equivalent if they are the same up to some translation in
pitch-class space. Thus the C-major chord and F-major chord are transpositionally equivalent,
since {0, 5, 9} = {7 + 5, 0 + 5, 4 + 5}. Symbolically, we write T5({0,4,7}) = {5, 9, 0}.
The musical term inversion is synonymous with the mathematical term reflection, and
corresponds to subtraction from a constant value in R/12Z. Two chords are inversionally
equivalent if they are the same up to some reflection in pitch-class space. Thus the C
major chord {0, 4, 7} is inversionally equivalent to the C minor chord {0, 3, 7} since {0,
3, 7} = {7 – 7, 7 – 4, 7 – 0}. We use Ix to refer to the reflection that sends 0 to x, writing
I7({0, 4, 7}) = {0, 3, 7}. Musically, transposition and inversion are significant because
they preserve an important aspect of the “quality” or “character” of a chord:
transpositionally-related chords sound extremely similar; inversionally-related chords,
somewhat less so.
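The two operations can be sketched in a few lines of Python. This is an illustrative sketch only: chords are stored as sorted lists standing in for unordered multisets, and the function names `transpose` and `invert` are assumptions of the example rather than notation from the text.

```python
def transpose(chord, x):
    """T_x: add x to every pitch class, modulo 12."""
    return sorted((p + x) % 12 for p in chord)

def invert(chord, x):
    """I_x: subtract every pitch class from x, modulo 12 (so 0 is sent to x)."""
    return sorted((x - p) % 12 for p in chord)

c_major = [0, 4, 7]

print(transpose(c_major, 5))  # [0, 5, 9], the F major chord
print(invert(c_major, 7))     # [0, 3, 7], the C minor chord
```

Sorting the result discards the order in which the notes were listed, which matches the treatment of chords as unordered multisets.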
A voice-leading between two chords {a_1, a_2, …, a_m} and {b_1, b_2, …, b_n} is a
multiset of ordered pairs (a_i, b_j), such that every element of each chord is in some pair.
(Regular parentheses denote ordered lists.) A trivial voice-leading contains only pairs of
the form (x, x). We denote voice-leadings using vector notation A→B, indicating that the
ith components of the vectors are associated by the voice-leading. Thus the voice-leading
(0, 0, 4, 7)→(11, 2, 5, 7) associates the root of the C major triad with both the third and
fifth of the G7 chord, while associating the third and fifth of the C major chord with the
seventh and root of the G7, respectively. This voice-leading can be interpreted either as a
non-bijective voice-leading between (0, 4, 7) and (11, 2, 5, 7) or as a bijective voice-
leading between (0, 0, 4, 7) and (11, 2, 5, 7).
Music theorists have proposed numerous ways of measuring the size of a voice-
leading. These measures are closely related to familiar mathematical norms, and include
the “taxicab norm,” the Euclidean norm, and a few exotic quasi-norms indigenous to music
theory (see the Materials and Methods). These proposals are at best approximations,
attempts to make explicit composers’ intuitions as embodied in Western musical practice.
For this reason we will not adopt any one method of measuring voice-leading size.
Instead, we will require only a normlike strict weak ordering of voice-leadings, satisfying
a few constraints that ensure it resembles a mathematical measure of “length” (see the
Materials and Methods). To date, every music-theoretical method of measuring voice-
leading size gives rise to a normlike strict weak ordering. It can be shown that for any
normlike strict weak ordering, there will be a minimal voice-leading between arbitrary
chords A→B that has no “voice-crossings”: that is, it is possible to move the elements of
A continuously to their counterparts in B such that no two paths coincide other than at
the endpoints of the process (see the Materials and Methods). Since Western composers
have traditionally avoided “voice-crossing” (Fig. 1[a-c]), this result suggests that
normlike strict weak orderings are at least consistent with observed features of Western
musical practice (see the Materials and Methods). Furthermore, it enables the use of the
standard computer-science technique of “dynamic programming” to identify, in
polynomial time, a minimal voice-leading between arbitrary chords (see the Materials
and Methods).
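For small chords, a minimal bijective voice-leading can also be found by exhaustive search. The Python sketch below is not the dynamic-programming algorithm referred to above: it simply tries every pairing of notes, and it assumes the taxicab measure (the sum of the pitch-class distances moved) as the method of comparing sizes. The function names are hypothetical.

```python
from itertools import permutations

def pc_distance(a, b):
    """Shortest distance between two pitch classes on the circle R/12Z."""
    d = (b - a) % 12
    return min(d, 12 - d)

def minimal_bijective_voice_leading(source, target):
    """Brute-force search over all bijections from source to target.

    Returns (size, destinations), where destinations[i] is the note that
    source[i] moves to, and size is the taxicab size (an assumed measure).
    """
    if len(source) != len(target):
        raise ValueError("a bijective voice-leading needs chords of equal size")
    best = None
    for perm in permutations(target):
        size = sum(pc_distance(a, b) for a, b in zip(source, perm))
        if best is None or size < best[0]:
            best = (size, tuple(perm))
    return best

print(minimal_bijective_voice_leading((0, 4, 7), (0, 5, 9)))   # C major to F major
print(minimal_bijective_voice_leading((0, 4, 7), (4, 8, 11)))  # C major to E major
```

The search visits n! pairings, so it is only practical for chords of a few notes; it is meant to make the definition concrete, not to be efficient.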
II. The Geometry of Musical Chords
A) the space of all n-note chords
We now describe the geometry of musical chords. The elements of an ordered n-note
chord can be interpreted as the coordinates of a point on the n-torus (R/12Z)^n, or T^n.
Every ordered pair of points in this space corresponds to a bijective voice-leading. (We
can restrict our attention to bijective voice-leadings without loss of generality, since any
voice-leading can be interpreted as a bijection between the appropriate multisets.) If our
method of measuring voice-leading size gives rise to a metric, then we can use it to
measure distances between points in this space. If our method of measuring voice-
leading size is only a normlike strict weak ordering, then we can use it to compare
“distances,” though without quantifying these “distances” as real numbers (see the
Materials and Methods).
To represent voice-leadings between unordered chords, we need to identify all
points representing different orderings of the same chord. Mathematically, we form the
quotient of the n-torus T^n by S_n, identifying all points (x_1, x_2, …, x_n) and (x_σ(1), x_σ(2), …,
x_σ(n)), where σ is some permutation of the integers from 1 to n. This space is what
mathematicians call a global-quotient orbifold (20, 21). A global quotient orbifold (or
“orbit-manifold”) is the space that results from identifying all points lying in the orbits of
a group acting discontinuously on a locally Euclidean space (a manifold). Orbifolds are
more complex than manifolds in that they can have “singularities” at which the space is
not locally Euclidean. Figure 2 shows the orbifold T^2/S_2, drawn using orthogonal
coordinates and Euclidean distance (see the Materials and Methods). We see that the
space of two-note musical chords is a Möbius strip, a “square” whose left “edge” is
identified, modulo a half-twist, with its right. The orbifold is singular at its circular
boundary, which acts as a mirror (21).
Any voice-leading between dyads can be uniquely associated with a path on
Figure 2. A metric allows us to measure the length of these paths, while a normlike strict
weak ordering allows us to compare two paths without quantifying their “length.” The
paths corresponding to voice-leadings are the images of line-segments in the parent space
T^n. They are either line-segments in the orbifold, or “reflected” line-segments that
“bounce off” the orbifold’s mirror boundary. For example, on Figure 2, the voice-leading
(0, 1)→(1, 0) corresponds to the path that begins at (0, 1), moves in a straight line to (.5,
.5), and gets reflected back along the same line-segment to (0, 1) (Fig. 2). (To see why,
imagine each pitch-class moving continuously to its destination, starting and ending at
the same time.) “Reflected” voice-leadings contain voice-crossings, since the edge
contains all and only those chords with duplicate pitch-classes. The “no-crossing”
principle therefore asserts that there will be a minimal voice-leading between any two
points that does not touch the orbifold’s singular boundary.
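The link between reflection and voice-crossing can be checked numerically. The sketch below assumes that every voice moves linearly in time along a signed displacement supplied by the caller (a convention of this example, not of the text), and reports the moments strictly between the start and the end at which two voices share a pitch class, that is, at which the path touches the singular boundary.

```python
import math

def crossing_times(start, displacements):
    """Times 0 < t < 1 at which two voices occupy the same pitch class.

    Voice i is at start[i] + t * displacements[i] (mod 12) at time t.
    """
    times = set()
    n = len(start)
    for i in range(n):
        for j in range(i + 1, n):
            gap = start[i] - start[j]
            rate = displacements[i] - displacements[j]
            if rate == 0:
                continue  # parallel motion: the gap between the voices never changes
            # The two voices coincide when gap + t * rate is a multiple of 12.
            lo, hi = sorted((gap, gap + rate))
            for k in range(math.ceil(lo / 12), math.floor(hi / 12) + 1):
                t = (12 * k - gap) / rate
                if 0 < t < 1:
                    times.add(t)
    return sorted(times)

# (0, 1) -> (1, 0): the voices meet halfway, at the dyad (0.5, 0.5).
print(crossing_times((0, 1), (+1, -1)))
# (4, 11) -> (5, 10): the voices approach each other but never meet.
print(crossing_times((4, 11), (+1, -1)))
```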
Generalizing Figure 2 to higher dimensions is straightforward (see the Materials
and Methods). Given a Euclidean metric, the orbifolds T^n/S_n are simplicial prisms whose
faces are identified, modulo the orthogonal transformation that cyclically permutes the
vertices of one of the faces. (Figure 2 is a 2-dimensional prism [square] whose 1-
dimensional “faces” [left and right “edges”] have been identified modulo the reflection
that exchanges their vertices [a reflection, or 180° “twist” in the third dimension]. Three-
note chords lie on a three-dimensional prism whose faces are equilateral triangles. One
face is rotated 120° before the faces are identified; the resulting figure is the bounded
interior of a twisted triangular 2-torus.) The singular boundary of the orbifold acts as a
mirror and contains chords with duplicate pitch-classes. Chords that divide the octave
into n equal parts are at the center of the orbifold, while chords containing only one pitch-
class lie along its one-dimensional edge (Fig. 2). Voice-leadings represented by line-
segments parallel to the orbifold’s one-dimensional edge are not independent: each voice
moves in the same direction by the same amount. Voice-leadings perpendicular to the
boundary are independent, and preserve the sum of a chord’s pitch-classes. The above
descriptions assume a Euclidean metric; closely analogous statements hold for the
orbifolds with other metrics.
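Under the Euclidean assumption, any voice-leading can be split into a component in which every voice moves by the same amount and a remainder whose displacements sum to zero. The following Python sketch performs that split on signed displacements; the function name and the use of signed displacements as input are assumptions made for illustration.

```python
def decompose(displacements):
    """Split signed displacements into a parallel part and a sum-preserving part.

    parallel:      every voice moves by the mean displacement (pure transposition)
    perpendicular: what remains; its entries sum to zero, so it leaves the
                   sum of the chord's pitch classes unchanged
    """
    n = len(displacements)
    mean = sum(displacements) / n
    parallel = [mean] * n
    perpendicular = [d - mean for d in displacements]
    return parallel, perpendicular

# (0, 4, 7) -> (0, 5, 9): the voices move by 0, +1 and +2 semitones.
print(decompose([0, 1, 2]))
# (4, 11) -> (5, 10): contrary motion, with no parallel component at all.
print(decompose([1, -1]))
```

In the first example the parallel part is a transposition by one semitone and the remainder moves the outer voices in contrary motion; in the second the whole voice-leading is sum-preserving.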
B) consequences of the geometry
Two chords can be connected by efficient voice-leading if they are “near” each
other in the voice-leading orbifold, relative to some method of measuring voice-leading
size and some voice-leading that serves as a standard of “efficiency” or “nearness.”
Standards of “nearness” vary somewhat from musician to musician, but there is in
practice a good degree of agreement about which voice-leadings are efficient. Relative to
these widely-agreed-upon standards, some chords can participate in efficient,
independent voice-leadings to their transpositionally- or inversionally-related forms.
These chords have been the focus of much Western compositional attention, since they
permit the simultaneous satisfaction of harmonic and contrapuntal constraints: they can
be used in progressions that are harmonically consistent (involving chords equivalent
under transposition or inversion) while also permitting good counterpoint (the voice-
leading between successive chords is independent and widely considered to be efficient).
The progressions in Figure 1 are all of this type.
What determines the size of the minimal independent voice-leading between a
chord and one of its transpositions or inversions? The answer, for any normlike strict
weak ordering, is that such voice-leadings are due to the chord’s invariance, or near-
invariance, under permutation, transposition, or inversion (see the Materials and
Methods). Each of these three invariances, or symmetries, corresponds to a different
geometrical property and produces a different type of voice-leading. In addition, the first
two invariances are closely related to the musical notions of consonance and dissonance:
acoustically consonant chords are nearly invariant under transposition, while a number of
paradigmatically dissonant chords are nearly invariant under permutation. We will now
survey these three types of invariance, their geometrical representation in the voice-
leading orbifolds, and the musical applications to which they give rise.
A bijective voice-leading from a chord to itself (a permutational voice-leading)
acts as a permutation of that chord’s elements. A chord that has duplicate pitch-classes
will be permutationally invariant (P-invariant), since there will be some nontrivial
permutation of its elements that yields a trivial voice-leading. P-invariant chords lie on
the boundary (or “singular locus”) of the voice-leading orbifold. Chords that are close to
the orbifold’s boundary can be described as nearly P-invariant, since they will have
efficient permutational voice-leadings that reflect off the nearby boundary (Fig. 2).
These voice-leadings are non-minimal, since they are larger than the trivial voice-leading;
they contain voice-crossings, since they touch the orbifold’s boundary.
Nearly P-invariant chords include “chromatic clusters” such as {0, 1}, {0, 1, 2},
and {4, 5, 6, 7}. Such chords, which are considered to be extremely dissonant, are well-
suited for static music in which voices move by small distances within an unchanging
harmonic context (Fig. 1[d]) (23). This sort of “permutational” voice-leading is
characteristic of much late-twentieth century non-tonal composition, particularly the
works of György Ligeti (24). Efficient “permutational” voice-leadings can also be used to
generate independent, relatively efficient voice-leadings A→Tx(A), where x is close to
zero.
Transposition by x semitones is an automorphism of the voice-leading orbifold
that preserves the “size” of voice-leadings according to any normlike strict weak
ordering: for any normlike strict weak ordering, the voice-leading (a_1, a_2, …, a_n)→(b_1, b_2, …, b_n)
is the same size as (a_1 + x, a_2 + x, …, a_n + x)→(b_1 + x, b_2 + x, …, b_n + x). A
transpositionally invariant (T-invariant) chord is a fixed point of one of these
automorphisms; for n-note chords, such fixed points exist only when nx is congruent to 0,
mod 12Z. Chords lying close to these T-invariant chords can be described as nearly T-
invariant, since there will be multiple transpositions of such chords located near any
single fixed point. These transpositionally-related chords can therefore be connected by
efficient voice-leadings.
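A chord's T-invariance is easy to test directly. The Python sketch below compares a chord with its transposition as multisets, using `collections.Counter`; the function names are hypothetical, and the examples (a quarter-tone tritone, the augmented triad, and the C major triad) are chosen only for illustration.

```python
from collections import Counter

def transpose(chord, x):
    """T_x applied to a chord, returned as a multiset of pitch classes."""
    return Counter((p + x) % 12 for p in chord)

def is_t_invariant(chord, x):
    """True if transposing the chord by x gives back the same multiset."""
    return transpose(chord, x) == Counter(p % 12 for p in chord)

def satisfies_necessary_condition(n, x):
    """The condition stated above: n * x must be congruent to 0 mod 12."""
    return (n * x) % 12 == 0

print(is_t_invariant([4.5, 10.5], 6))   # tritone, fixed by T6
print(is_t_invariant([0, 4, 8], 4))     # augmented triad, fixed by T4
print(is_t_invariant([0, 4, 7], 4))     # major triad: near, but not fixed
print(transpose([4, 11], 6) == Counter([5, 10]))  # the fifths {4, 11} and {5, 10} are T6-related
print(satisfies_necessary_condition(3, 5))        # no three-note chord is fixed by T5
```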
On Figure 2, the T6-related perfect fifths {4, 11} and {5, 10} lie close to the same
T6-invariant tritone, {4.5, 10.5}. This accounts for the small voice-leading (4, 11)→(5,
10). By transposing the second chord down by a semitone, we can obtain a fairly efficient
voice-leading (4, 11)→(4, 9). This voice-leading occurs between the top and lower-
middle voices of the first two chords in Fig. 1(b); analogous voice-leadings connect the
top and lower-middle voices of the remaining chords in the progression. (The other two
voices are linked by the smooth voice-leading between T5-related tritones, which appears
on Figure 2 as rightward motion.) Similarly, on the three-dimensional voice-leading
orbifold the T4-related C and E major triads lie close to the same T4-invariant augmented
triad; for this reason, they are connected by a small voice-leading (0, 4, 7)→(11, 4, 8).
Again, by transposing the second chord up by a semitone, this voice-leading generates a
fairly efficient voice-leading between T5-related major triads; the result is the first voice-
leading shown in Figure 1(a). The remaining voice-leadings in Figure 1(a) can all be
derived from this one by transposition and time-reversal.
T-invariance is due to the evenness with which a chord’s elements are distributed
in pitch-class space. A T-invariant chord either divides the octave into equal parts, and
occupies the center of the orbifold, or is the union of equally-sized chords that themselves
divide the octave evenly (25). (The union of differently-sized chords that evenly divide
the octave is not in general T-invariant.) Likewise, a nearly T-invariant chord divides the
octave into nearly-equal parts, or is the union of n-note chords that do so. In general, the
more evenly-spaced a chord, the closer it will be to the center of the orbifold, and the
smaller will be its bijective voice-leadings to its T-equivalent forms (see the Materials
and Methods). Indeed, it can be shown that the chord which divides pitch-class space
into n equal parts has the smallest possible minimal bijective voice-leading to all of its
transpositions: for all n-note chords A, the minimal bijective voice-leading between A
and Tx(A) can be no smaller than the minimal bijective voice-leading between E and
Tx(E), where E divides pitch-class space into n equal parts (see the Materials and
Methods). A corollary covers the discrete case of a finite evenly-tempered
pitch-class space (see the Materials and Methods).
This fact has a singularly important musical consequence: “acoustically
consonant” chords tend to be nearly T-invariant. Acoustic consonance is incompletely
understood; however, most music theorists agree that chords approximating the first few
consecutive pitch-classes of the harmonic series will be consonant when played with
harmonic tones (26). Remarkably, the structure of the harmonic series ensures that such
chords will divide the octave into nearly-even parts (Table 1). The relation between
acoustic consonance and near-evenness has had an enormous impact on the development
of traditional Western music. The near-evenness of traditional Western harmonic
materials implies that these chords are clustered near the center of the voice-leading
orbifold; for this reason, there exist transpositions of these chords that can be linked by
efficient, independent voice-leadings. This is true whether the chords are
transpositionally equivalent (Fig. 1[a-b]) or transpositionally distinct (Fig. 1[c]).
Traditional tonal counterpoint, in its essence, consists in the exploitation of these efficient
voice-leadings. They exist because of the near-evenness of the underlying sonorities, a
property which is in turn attributable to classical composers’ interest in acoustic
consonance.
Finally, inversions (or reflections) are automorphisms of the voice-leading
orbifold that again preserve the “size” of voice-leadings according to any normlike strict
weak ordering. Inversionally invariant (I-invariant) chords are fixed points of some
reflection; such fixed points exist for any Ix. A chord that lies near an I-invariant chord is
nearly I-invariant, since there will be two I-related chords lying close to the same I-
invariant chord; this again permits small voice-leadings between them. For example, the
F♯ “half-diminished seventh” chord {6, 9, 0, 4} and the F “dominant seventh” chord {5,
9, 0, 3} lie near the same (I-invariant) chord {5.5, 9, 0, 3.5}: this permits the efficient
voice-leading (6, 9, 0, 4)→(5, 9, 0, 3), shown in Fig. 1(c). I-invariant chords can be
highly consonant, like (0, 3, 7, 10), or highly dissonant, like (0, 1, 2, 3). However,
composers in most Western styles have considered I-invariant chord pairs to be “similar.”
Consequently, they have frequently exploited efficient voice-leadings between
inversionally-related chords.
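The half-diminished and dominant seventh example can be verified with a short Python sketch. It assumes that the relevant reflection is I9 (subtraction from 9, modulo 12) and takes the nearby I-invariant chord to be {5.5, 9, 0, 3.5}, the chord lying midway between the two sevenths; both are readings adopted for this illustration, and the function names are hypothetical.

```python
from collections import Counter

def invert(chord, x):
    """I_x applied to a chord, returned as a multiset of pitch classes."""
    return Counter((x - p) % 12 for p in chord)

def is_i_invariant(chord, x):
    """True if the reflection I_x maps the chord onto itself."""
    return invert(chord, x) == Counter(p % 12 for p in chord)

half_diminished = [6, 9, 0, 4]      # F-sharp half-diminished seventh
dominant = [5, 9, 0, 3]             # F dominant seventh
midpoint = [5.5, 9, 0, 3.5]         # assumed I9-invariant chord between them

print(invert(half_diminished, 9) == Counter(dominant))  # the two chords are I9-related
print(is_i_invariant(midpoint, 9))                      # the midpoint chord is fixed by I9
print(is_i_invariant(half_diminished, 9))               # the seventh chord itself is not
```

Each seventh chord differs from the midpoint chord by a quarter-tone in two voices, which is why the voice-leading between them moves only two voices, each by a semitone.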
Thus we see that the geometrical properties of the orbifolds T^n/S_n give rise to a
wide range of related musical practices, each of which exploits different symmetries that
a chord might have. Our discussion suggests multiple avenues of further music-
theoretical inquiry. First, one could investigate in detail the ways in which Western
composers, performers, and improvisers have exploited the three symmetries that can
produce small voice-leading: for example, Schubert was fond of the near T4-invariance of
the major triad (27), Wagner and Debussy exploited the near I-invariance of the
“dominant seventh chord” (Fig. 1[c]), while contemporary jazz harmony frequently
exploits the near T-symmetry of the perfect fifth (Fig. 1[b], top and lower-middle voice).
Second, one could investigate how the mathematical properties described in this paper
have influenced the broader course of music history—examining how the concern for
efficient voice-leading interacted with, and presumably helped motivate, the increasing
“chromaticism” of nineteenth-century music. Third, one could investigate whether
distances in the voice-leading orbifold correlate with perceptual judgments of similarity
among chords—a topic of considerable recent theoretical interest (28). Finally, a clear
understanding of the relation between chord structure and voice-leading may suggest new
techniques to contemporary composers.
NOTES
1. For a glossary of mathematical and musical terms and abbreviations used in this paper, see Tables S1 and S2.
2. C. Masson, Nouveau Traité des Regles pour la Composition de la Musique (Da Capo, New York, 1967 [1694]).
3. O. Hostinsky, Die Lehre von den musikalischen Klängen (H. Dominicus, Prague, 1879).
4. A. Schoenberg, Theory of Harmony (University of California Press, Berkeley, 1978).
5. J. K. Wright, A. S. Bregman, Contemporary Music Review 2, 63 (1987).
6. J. D. Heinichen, Der General-Bass in der Composition (G. Olms Verlag, Dresden, 1728).
7. A. v. Oettingen, Harmoniesystem in dualer Entwicklung (W. Gläser, Leipzig, 1866).
8. H. Riemann, “Die Natur der Harmonik,” Sammlung musikalischer Vorträge (Breitkopf & Härtel, Leipzig, 1882).
9. J. Roeder, A Theory of Voice Leading for Atonal Music, Ph.D. thesis, Yale University (1984).
10. J. Roeder, Perspectives of New Music 25, 362 (1987).
11. R. Cohn, Music Analysis 15, 9 (1996).
12. R. Cohn, Journal of Music Theory 41, 1 (1997).
13. D. Lewin, Journal of Music Theory 42, 15 (1998).
14. R. Morris, Music Theory Spectrum 20, 175 (1998).
15. C. Callender, Journal of Music Theory 42, 219 (1998).
16. A. Childs, Journal of Music Theory 42, 181 (1998).
17. J. Douthett, P. Steinbach, Journal of Music Theory 42, 241 (1998).
18. J. Straus, Music Theory Spectrum 25, 305 (2003).
19. C. Callender, Music Theory Online 10.3 (2004).
20. I. Satake, Proceedings of the National Academy of Sciences 42, 359 (1956).
21. W. Thurston, Three Dimensional Geometry and Topology (Princeton Mathematical Series 35, Princeton University Press, Princeton, 1997).
22. See the “materials and methods” section.
23. C. Callender, personal communication.
24. J. Bernard, Music Theory Spectrum 21, 1 (1999).
25. R. Cohn, Journal of Music Theory 35, 1 (1991).
26. W. Sethares, Tuning, Timbre, Spectrum, Scale (Springer, New York, 1998).
27. R. Cohn, Nineteenth-Century Music 22, 213 (1999).
28. I. Quinn, Perspectives of New Music 39, 108 (2002).
Figure 1. Efficient voice-leading in the Western tradition. Numbers correspond to pitch-classes, with C = 0, C♯ = 1, etc. The voice-leadings in (a)-(c) are minimal voice-leadings containing no “voice-crossings.” That in (d) is non-minimal, and contains crossings. The four examples exploit three different kinds of near-symmetry: translation in (a) and (b), reflection in (c), and permutation in (d).
a) a common classical upper-voice I-IV-I-V-I pattern: (0, 4, 7) → (0, 5, 9) → (0, 4, 7) → (11, 2, 7) → (0, 4, 7)
b) a common jazz-piano “left-hand” voice-leading pattern (D7, G7, C7, F7): (6, 11, 0, 4) → (5, 9, 11, 4) → (4, 9, 10, 2) → (3, 7, 9, 2)
c) Wagner, Parsifal (simplified) and Debussy, Prelude to the Afternoon of a Faun: (6, 9, 0, 4) → (5, 9, 0, 3); (1, 4, 8, 10) → (2, 5, 8, 10)
d) in the style of György Ligeti: (11, 0, 1) → (0, 1, 11) → (11, 1, 0)
[Figure 2 image: lattice of equal-tempered dyads, labelled 00 through ee.]
Figure 2. The orbifold T^2/S_2, drawn using a Euclidean metric. Labelled points in the space correspond to equal-tempered dyads; the symbols “t” and “e” refer to 10 and 11, respectively. The left “edge” is identified, with a half-twist, with the right. The two voice-leadings (0, 1)→(1, 0) and (4, 11)→(5, 10) are shown on the graph; the first of these is reflected off the figure’s mirror boundary.
| Number of notes | The equal-tempered chord providing the best approximation to the lowest pitch-classes of the harmonic series | Other chords providing reasonably good approximations to the lowest pitch-classes of the harmonic series |
| 2 (dyad) | fifth (0, 7) | |
| 3 (triad) | major (0, 4, 7) | diminished (0, 3, 6); minor (0, 3, 7); augmented (0, 4, 8) |
| 4 (seventh chords) | dominant (0, 4, 7, 10) | diminished (0, 3, 6, 9); half-diminished (0, 3, 6, 10); minor (0, 3, 7, 10); major (0, 4, 7, 11) |
| 5 (ninth chords) | dominant ninth (0, 2, 4, 7, 10) | pentatonic (0, 2, 4, 7, 9) |
| 7 (scales) | melodic minor, ascending form (0, 2, 4, 6, 7, 9, 10) | major (0, 2, 4, 5, 7, 9, 11); harmonic minor (0, 2, 3, 6, 7, 9, 10) |
Table 1. Familiar sonorities used in Western music. The sonorities on the left provide the best equal-tempered approximations to the first n pitch-classes of the harmonic series. The commonly-used sonorities on the right also approximate the first n pitch-classes of the harmonic series. All sonorities divide pitch-class space fairly evenly.
MATERIALS AND METHODS
TABLE OF CONTENTS
1. Comparing voice-leadings
2. Minimal voice-leadings and voice-crossings
3. A polynomial-time algorithm for finding a minimum voice-leading between two chords
4. Derivation of the voice-leading orbifolds
5. Efficient voice-leading and symmetry
6. Evenness and transpositional invariance
1. Comparing voice-leadings. Let a be an element of R/12Z. We define the
norm of a, written |a|_{12Z}, as the smallest real number |x| such that x and a are congruent
mod 12Z. (Here |x| refers to the standard absolute-value function.) The distance between
two pitch-classes a and b is |b – a|_{12Z}. We define the displacement multiset associated
with a voice-leading A→B as the multiset of distances |b_j – a_i|_{12Z} for all (a_i, b_j) in A→B.
For example, the displacement multiset associated with the voice-leading (0, 0, 4,
7)→(11, 2, 5, 7) is {1, 2, 1, 0}.
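These definitions translate directly into code. In the Python sketch below, a voice-leading is given as two equal-length sequences whose i-th entries are paired, and the displacement multiset is returned as a sorted list; the function names are hypothetical.

```python
def norm_12z(a):
    """Norm of a pitch-class interval: the smallest |x| with x congruent to a mod 12."""
    r = a % 12
    return min(r, 12 - r)

def displacement_multiset(source, target):
    """Distances moved by each voice, as a sorted list standing in for a multiset."""
    return sorted(norm_12z(b - a) for a, b in zip(source, target))

# The example from the text: (0, 0, 4, 7) -> (11, 2, 5, 7).
print(displacement_multiset((0, 0, 4, 7), (11, 2, 5, 7)))  # [0, 1, 1, 2]
```

Sorting is only a way of forgetting which voice produced which distance, so that two voice-leadings with the same distances in a different order compare as equal.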
We will require that any method of comparing voice-leadings depend only on
their displacement multisets: for any two displacement multisets X and Y, it tells us
which, if any, is larger. More formally, a method of comparing voice-leading size will be
an asymmetric, negatively transitive relation (a strict weak order) over multisets of non-
negative reals. (A relation “>” is “asymmetric” if A > B implies that not B > A. It is
“negatively transitive” if A > B implies that either A > C or C > B, for all C.) A strict
weak order defines equivalence classes consisting of all non-comparable items: A ≡ B iff
neither A > B nor B > A. Strict weak orders are stronger than partial orders, since they
satisfy the trichotomy axiom: for any two elements in a strict, weakly ordered set, either
A > B, A ≡ B, or B > A. However, a strict weak order is weaker than a total order, since
it does not satisfy the “antisymmetry” condition: in a strict weak order, A ≡ B does not
imply that A and B are the same object.
Let > be a strict weak order of multisets of nonnegative reals. We will say that
the relation > is normlike if and only if it satisfies two constraints.
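$$\{x_1, x_2, \ldots, x_m, c\} > \{y_1, y_2, \ldots, y_n, c\} \implies \{x_1, x_2, \ldots, x_m\} > \{y_1, y_2, \ldots, y_n\} \qquad \text{(Recursion)}$$
$$\{x_1 + i, x_2, \ldots, x_n\} \geq \{x_1, x_2 + i, \ldots, x_n\} \geq \{x_1, x_2, \ldots, x_n\}, \quad \text{for } x_1 > x_2,\ i > 0 \qquad \text{(Distribution)}$$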
(NB: since multisets are unordered the numerical subscripts do not have ordinal
significance: x_1 is no more “first” than x_2 or x_n.) The recursion constraint mandates a
predictable relationship between the size of a multiset and the size of its sub-multisets.
The distribution constraint’s first inequality requires that if X is an n-element multiset
whose values sum to x, then {x, 0, 0, …, 0} ≥ X ≥ {x/n, x/n, …, x/n}. Thus, x semitones
of motion in a single “voice” yields at least as large a voice-leading as x semitones of
motion distributed over multiple voices. As we will see below, this constraint is closely
related to the triangle inequality. The distribution constraint’s second inequality requires
that reducing the size of an element in a displacement multiset not make that multiset
larger. If a normlike strict weak order strictly satisfies both of the distribution
constraint’s inequalities, we will say that it strictly satisfies the distribution constraint.
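As a numerical illustration of the chain {x, 0, 0, …, 0} ≥ X ≥ {x/n, x/n, …, x/n}, the sketch below evaluates three familiar size functions (sum, Euclidean length, and maximum) on one sample multiset. The function lp_size and the sample values are assumptions made for the example; they are not part of the formal definition of a normlike order.

```python
import math


def lp_size(displacements, p):
    """Size of a displacement multiset under the analogue of the Lp norm."""
    if math.isinf(p):
        return max(displacements)
    return sum(d ** p for d in displacements) ** (1.0 / p)


X = [3, 1, 2]                              # sample displacement multiset
total, n = sum(X), len(X)
concentrated = [total] + [0] * (n - 1)     # all motion in one voice
spread = [total / n] * n                   # motion shared equally

for p in (1, 2, math.inf):
    print(p, lp_size(concentrated, p), lp_size(X, p), lp_size(spread, p))
```

For p = 1 the three values coincide, which is the sense in which the sum satisfies the constraint only non-strictly.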
At present, every music-theoretical method of measuring voice-leading size
produces a normlike strict weak order of multisets of non-negative reals. All but one
strictly satisfy the distribution constraint.
A. “Smoothness.” The size of a voice-leading is the sum of the elements of the displacement multiset (S1, S2, S3). This is sometimes called the “taxicab norm.” Smoothness satisfies the distribution constraint non-strictly.
B. Smoothness is analogous to the L1 vector norm, though the components of vectors are ordered whereas the elements of displacement multisets are not. The analogues to Lp vector norms strictly satisfy the distribution constraint for finite p > 1. (The L∞ vector norm also satisfies the distribution constraint, but not strictly.) The L2 vector norm, which has been used by Callender (S4), corresponds to the Euclidean norm.
C. “Parsimony.” Parsimony generalizes a notion introduced by Richard Cohn and developed by Jack Douthett and Peter Steinbach (S5, S6). Given two voice-leadings, α and β, α is smaller (or “more parsimonious”) than β iff there exists some real number j such that
1) for all real numbers i > j, i appears the same number of times in the displacement multisets associated with α and β; and
2) j appears fewer times in the displacement multiset of α than β.
D. “Smoothness then parsimony.” This measure represents my own best hypothesis about how classical composers might have thought about voice-leading size. Given two voice-leadings α and β, α is smaller than β iff:
1) α is smoother than β; or
2) α and β are equally smooth, and α is more parsimonious than β.
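Measures A, C, and D can be sketched as comparisons on displacement multisets. The code below is one possible reading of the definitions, not a canonical implementation: multisets are given as plain lists, and the function names are ours. The parsimony test relies on the observation that the number j in the definition, if it exists, must be the largest value whose multiplicity differs between the two multisets.

```python
from collections import Counter


def smoothness(displacements):
    """Measure A: the sum of the displacement multiset."""
    return sum(displacements)


def more_parsimonious(alpha, beta):
    """Measure C: True if alpha is strictly smaller than beta under parsimony."""
    count_a, count_b = Counter(alpha), Counter(beta)
    # Scan values from largest to smallest; the first value whose
    # multiplicities differ plays the role of j in the definition.
    for value in sorted(set(count_a) | set(count_b), reverse=True):
        if count_a[value] != count_b[value]:
            return count_a[value] < count_b[value]
    return False  # identical multisets: neither is smaller


def smaller_smoothness_then_parsimony(alpha, beta):
    """Measure D: compare by smoothness, breaking ties by parsimony."""
    if smoothness(alpha) != smoothness(beta):
        return smoothness(alpha) < smoothness(beta)
    return more_parsimonious(alpha, beta)
```

For example, more_parsimonious([3, 0], [3, 3]) holds even though both multisets have the same largest element.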
Many of these methods of measuring voice-leading size yield mathematical “norms”:
there is some function f from multisets to the real numbers, such that f(X) > f(Y) if and
only if X > Y according to the normlike strict weak order >. Note, however, that neither
“parsimony” nor “smoothness then parsimony” can give rise to such a function f.
Nevertheless, both “parsimony” and “smoothness then parsimony” represent musically
viable ways of thinking about voice-leading size. For this reason, we cannot simply
impose the mathematically-convenient requirement that measurements of voice-leading
size produce “norms” or “metrics.”
However, both “parsimony” and “smoothness then parsimony” are very closely related to traditional norms. “Parsimony” refines the L∞ vector norm, according to which
the size of a voice-leading is given by the largest element in its displacement multiset. Given two voice-leadings α and β, if α < β according to the L∞ norm then α is more
parsimonious than β. However, the converse does not hold: the displacement multisets {3, 3} and
{3, 0} have the same L∞ norm but the first is less parsimonious than the second.
“Parsimony” is therefore closely related to, but slightly more fine-grained than, the L∞
norm. “Smoothness then parsimony” stands in an analogous relation to “smoothness,”
the L1 vector norm. For this reason, we can often reason about “parsimony” and “smoothness then parsimony” using our geometric intuitions about the L∞ and L1 norms.
This point holds more generally. As the name suggests, the notion of a normlike
strict weak order is a weakened analogue to a traditional geometrical “norm.” We can
think of the displacement multiset associated with the voice-leading A→B as a non-real-
valued “norm” of the voice-leading A→B. Likewise, the displacement multiset
associated with the minimal voice-leading A→B is analogous to a non-real-valued
“distance” between A and B. This non-real-valued “distance” has many of the properties
associated with a proper mathematical metric:
1. It is symmetric, since the displacement multiset associated with the minimal voice-leading A→B is the same as the displacement multiset associated with the minimal
voice-leading B→A.
2. The minimal voice-leading A→A has displacement multiset {0, 0, …, 0},
which is at least as small as any other displacement multiset with the same number of
elements. In this sense, the “distance” between A and A is as small as it can be.
3. If the displacement multiset associated with the minimal bijective voice-leading A→B is {0, 0, …, 0} then A = B.
4. Finally, the distribution constraint is closely related to the triangle inequality.
Indeed, as long as we require that the size of a voice-leading depend only on the size of
its displacement multiset, then the two principles are equivalent: any violation of the
distribution constraint generates a violation of the triangle inequality, and vice-versa.
(This is fairly obvious in the case of the metrics associated with the Lp vector norms, and
less than obvious in the general case. I sketch a proof at the end of §2, below.)
Intuitively, both the distribution constraint and the triangle inequality express the
principle that x steps in a single direction take you farther than x total steps in a number
of mutually orthogonal directions.
2. Minimal voice-leadings and voice-crossings. The following theorem shows that
between any two chords there is a minimal voice-leading with no “voice-crossings” in
pitch-class space. Since avoidance of voice-crossings is a feature of traditional Western
musical practice, it helps justify our use of normlike strict weak orders; it furthermore
allows us to generate an efficient algorithm for determining the minimal voice-leading
between two chords.
THEOREM 1. Let A and B be any two chords, and let our measure of voice-leading size be a strict weak order satisfying the distribution constraint. There will exist a minimal voice-leading from A to B, (a1, a2, …, an)→(b1, b2, …, bn), that has no “voice-crossings” in pitch-class space. That is, there will exist a set of continuous functions fi(t) such that fi(0) = ai, fi(1) = bi, and fi(t) ≠ fj(t), for all i ≠ j, and all t such that 0 < t < 1. Furthermore, if our order strictly satisfies the distribution constraint, then every minimal voice-leading between A and B will be crossing-free.
The theorem is proved by a simple examination of cases. Suppose that a voice-leading A→B contains a crossing; we will show that we can
remove the crossing without increasing the size of the voice-leading and without creating
any new crossings. In what follows, we will depict pitch-class space as a circle with
ascending motion in pitch-class space corresponding to clockwise motion around the circumference. It is always assumed that 0 < x, x + m, x + n ≤ 6. Note that although the
following proof is stated in terms of pitch-classes, a precisely analogous result applies to
pitches; here, “chords” are simply multisets of real numbers, and there is always a
minimal voice-leading with no crossings in pitch-space.
Figure S3(a) shows the first geometrical possibility: pitch-class a1 moves n
semitones counterclockwise to b2 while pitch class a2 moves x + m semitones counterclockwise to b1, with 0 ≤ n < m. The uncrossed voice-leading (a1, a2)→(b1, b2) has
displacement multiset {m, x + n}. Since m > n, the distribution constraint implies that {m, x + n} ≤ {x + m, n}. The uncrossed voice-leading is no larger than the voice-leading
with the crossing; if the strict weak order strictly satisfies the distribution constraint, then
the uncrossed voice-leading is smaller.
Figure S3(b) shows a second possibility: a1 moves clockwise by n semitones to b2, while a2 moves counterclockwise by x + m semitones to b1, with m ≥ 0, x > n > 0. The
voice-leading (a1, a2)→(b2, b1) is associated with the displacement multiset {n, x + m};
the voice-leading (a1, a2)→(b1, b2) is associated with {m, x – n}. By the distribution
constraint, {x + m, n} ≥ {x, n + m} ≥ {x – n, m}, so the uncrossed voice-leading is no
larger than the crossed voice-leading. If the strict weak order strictly satisfies the
distribution constraint, the uncrossed voice-leading is smaller.
Figure S3(c) shows a third possibility. Here m + n > x, since otherwise there would be no crossing. This implies x – m < n and x – n < m. Therefore {m, n} ≥ {x – m, x – n},
and the uncrossed voice-leading is no larger than the crossed voice-leading. Again, if the
strict weak order strictly satisfies the distribution constraint then the uncrossed voice-
leading is smaller.
The remaining cases are closely analogous to those already considered, and are
left for the interested reader to verify. It remains to be shown that we can follow the
above procedures without creating any new voice-crossings. This is readily seen from
Figure S4. Without loss of generality, we can choose points b1 and b2 in Figure S4 to be
adjacent. We connect every note in the source chord to its destination by a path that has
no unnecessary crossings, as in Figure S4. Figure S4(a) features the crossing (a1, a2)→(b2, b1), as well as two additional types of voice-crossing: c1→d1, which crosses the
line a1→b2, and c2→d2, which crosses both a1→b2 and a2→b1. Figure S4(b), which
removes the crossing (a1, a2)→(b2, b1), shows that the remaining crossings c1→d1 and
c2→d2 are unaffected. Removing the crossing therefore reduces the total number of
voice-crossings in the voice-leading. The crossings shown in Figure S4, along with those
that can be obtained from this figure by reflection, exhaust the relevant geometrical
possibilities. We conclude that it is possible to remove a voice-leading’s crossings without
making the voice-leading larger. If our normlike strict weak order strictly obeys the
distribution constraint, then removing voice-crossings will always make the voice-leading
smaller.
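The pitch-space analogue of this result is easy to test by brute force for small chords. The sketch below assumes “smoothness” as the measure and represents pitches as numbers (for instance MIDI note numbers, which is our assumption); it compares the uncrossed voice-leading, obtained by pairing both chords in ascending order, with every bijective alternative.

```python
from itertools import permutations


def smoothness(source, target):
    """Sum of the distances moved by each voice in pitch space."""
    return sum(abs(b - a) for a, b in zip(source, target))


def uncrossed_is_minimal(A, B):
    """True if pairing sorted A with sorted B is as small as any bijection."""
    a = sorted(A)
    uncrossed = smoothness(a, sorted(B))
    best = min(smoothness(a, perm) for perm in permutations(B))
    return uncrossed == best


print(uncrossed_is_minimal([60, 64, 67], [59, 62, 67]))
```

The exhaustive search grows factorially with chord size, so this is a check on small cases rather than a practical method.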
Theorem 1 is significant because it ties an important musical notion, “voice-
crossing,” to an important mathematical one: the triangle inequality, as represented by its
close cousin, the distribution constraint. It is widely accepted that avoidance of “voice-
crossings” in pitch space is a feature of traditional Western compositional practice (S7).
Theorem 1, which can easily be adapted to cover the case of voice-leadings in non-
circular pitch space, shows that normlike strict weak orderings are compatible with this
feature of classical practice. Moreover, it is easy to show that if a method of comparing
voice-leading size violates the distribution constraint, then there will be at least one
“crossed” voice-leading (in either pitch or pitch-class space) that is preferred to its
uncrossed alternative. Thus the distribution constraint and the principle of avoiding
voice-crossings are equivalent within the limits of the formalism we have developed.
At the same time, the distribution constraint is closely related to the triangle
inequality. This allows us to use the minimal voice-leading between two chords to define
a “distance” between them, thereby underwriting the geometrical approach of the present
paper. Again, Theorem 1 is interesting precisely because it shows that our reference to
the geometrical concept of “distance” requires that we not prefer crossed voice-leadings
to their uncrossed alternatives. Consequently, were classical composers to have favored
voice-crossings, we would not be able to speak of the “distance” between chords
in the relatively straightforward way that we do here. We would be constrained to talk
only about the affine structure of musical chords—roughly, those non-metric properties
that depend only on the existence of “straight lines” in the space.
We conclude this section with a brief sketch of a proof that the distribution
constraint is equivalent to the triangle inequality. Let A and C be chords. The triangle inequality requires that a bijective voice-leading A→C be no larger than the combined length
of any pair of bijective voice-leadings A→B and B→C that take A to C by way of B in
such a way as to preserve the mappings of the “direct” voice-leading A→C. It is
straightforward to identify the displacement multiset associated with A→B→C when A,
B, and C are collinear: one simply adds the elements of the displacement multisets associated with A→B and B→C so as to be faithful to the musical voices’ motions. The
displacement multiset associated with non-collinear A→B→C, if defined, is simply the
displacement multiset associated with A→B→D, with A, B, and D collinear and B→C
the same size as B→D. A normlike strict weak ordering does not ensure that there is a
displacement multiset associated with all paths A→B→C; but it does ensure that, if there is, it
is no smaller than that associated with the direct voice-leading A→C.
To see why, suppose there is some crossed voice-leading between chords A and C that is preferred to the uncrossed voice-leading A→C. There will be a pair of voice-leadings
A→B→C that has the same combined displacement multiset as the crossed
voice-leading but which preserves the mappings of the “direct” voice-leading A→C.
(Here B is the point where the two voices cross as they move linearly from notes in A to
their counterparts in C.) Since the crossed voice-leading is preferred, the combined voice-leadings A→B→C are smaller than A→C, which violates the triangle inequality.
Conversely, suppose there is a triangle ABC such that the combined voice-leadings A→B→C are smaller than the “direct” voice-leading A→C. There is a voice-leading
A→D with the same displacement multiset as A→B→C. Since A→B→C form two legs
of a triangle, it is easy to show that the preference for A→D over A→C must violate the
distribution constraint.
3. A polynomial-time algorithm for finding a minimum voice-leading between two chords. Given two chords A and B, how do we find a minimal voice-leading between
them? The question is non-trivial, since minimal voice-leadings need not be bijective:
using any of the standard measures of voice-leading size, the minimal voice-leading between {0, 4, 6} and {6, 10, 0} is (0, 0, 4, 6)→(10, 0, 6, 6). The large number of
possibilities here (roughly 2^(mn), where m and n are the cardinalities of the two
chords) makes an exhaustive search impractical, particularly in time-critical applications
such as interactive computer music.
However, Theorem 1 enables us to use the technique of “dynamic programming,”
common in computer science, to provide an efficient, polynomial-time algorithm (order
n^2 m) for determining a minimal voice-leading between arbitrary chords. Define the
ascending distance from pitch-class a to b as the smallest positive real number x such that
a + x is congruent to b, mod 12Z. Let (a1, a2, …, am, am+1 = a1) order the elements of
chord A based on ascending distance from arbitrarily-chosen a1. (Note that we repeat the
first element a1 as the last element of the list.) Similarly, for (b1, b2, …, bn, bn+1 = b1). The notation [a1, …, ai]→[b1, …, bj] will refer to all voice-leadings from {a1, a2, …, ai} to {b1,
b2, …, bj} that can be notated so that both chords’ subscripts are in nondecreasing order. Thus [a1, a2]→[b1, b2, b3] includes (a1, a1, a2)→(b1, b2, b3), (a1, a1, a2, a2)→(b1, b2, b3, b3),
and so on.
If a crossing-free voice-leading contains the pair (ai, bj) then it must contain at
least one of the following: (ai-1, bj), (ai, bj-1), or (ai-1, bj-1) (subscript arithmetic modulo the
cardinality of the chords). By the recursion constraint, the smallest voice-leading of the
form [a1, …, ai]→[b1, … bj] will be the voice-leading that adds the pair (ai, bj) to the
smallest voice-leading of the form [a1, …, ai-1]→[b1 … bj], [a1, …, ai]→[b1 … bj-1], or [a1,
…, ai-1]→[b1 … bj-1].
Thus, once we have fixed the pair (a1, b1) we can recursively compute the minimal
voice-leading between A and B that contains that pair. We do this by creating a matrix whose cells ei,j record the size of the minimal voice-leading of the form [a1, …, ai]→[b1,
… bj]. It is trivial to fill in the first row and column of the matrix; from there, we can
proceed to fill in the rest. At each step, we need only consider the voice-leadings in a
cell’s upper, left, and upper-left neighbors.
Figure S5 illustrates the technique, identifying the smallest voice-leading between
the C and E major-seventh chords, {4, 7, 11, 0} and {4, 8, 11, 3}, such that the voice-
leading contains the pair (4, 4). In constructing this matrix we have used “smoothness”
(or “taxicab norm”) to measure the voice-leading size. The voice-leading in the bottom-
right cell is the minimal voice-leading between the two chords that contains (4, 4). To
remove this last restriction, we would need to repeat the calculation three more times,
each time cyclically permuting the order of one of the chords so as to fix a different
initial pair. As it happens, however, the voice-leading shown in Figure S5 is the
minimum voice-leading between the respective chords. This follows from the fact that the voice-leading in the top-left cell (4→4) contributes nothing to the overall size of the
voice-leading; we can therefore add this mapping to any voice-leading without increasing
its size according to the L1 norm.
Figure S5 includes in each cell both the numerical size of the voice-leading and
the voice-leading itself. With the L1 norm (“smoothness”) this is unnecessary: we need to
keep track of the size, but not the voice-leading. To determine the value of cell ei,j we can
simply add the distance between the pair (ai, bj) to the minimum value in the cells ei-1,j,
ei,j-1, and ei-1,j-1. (With the Euclidean metric we can calculate squared distance in this
way, taking the square-root just before output.) Having filled in the matrix, we can
recover the minimal voice-leading by “tracing back” all paths that move from the bottom-
right cell to the top left, moving only north, west, and northwest, such that the size of the
voice-leading decreases as much as possible with each step. The cells in boldface
indicate the path that such a traceback algorithm would take.
Due to the circular structure of pitch-class space, the voice-leading in the lower
right-hand corner of the matrix counts the pair (a1, b1) = (am+1, bn+1) twice; this can easily
be corrected prior to output.
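The procedure described in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation behind Figure S5: it uses “smoothness” as the measure, drops duplicate notes, stores the list of pairs in every cell (as the figure does) instead of tracing back, and tries each cyclic rotation of B in turn to fix the initial pair. The names pc_distance and minimal_voice_leading are ours.

```python
def pc_distance(a, b, octave=12):
    """Shortest distance between two pitch classes on the circle."""
    d = (b - a) % octave
    return min(d, octave - d)


def minimal_voice_leading(A, B, octave=12):
    """Return (size, pairs) for a smallest voice-leading under smoothness."""
    first = A[0] % octave
    # Order A by ascending distance from its first note, then close the cycle.
    a = sorted({p % octave for p in A}, key=lambda p: (p - first) % octave)
    a_cyc = a + a[:1]
    b_sorted = sorted({p % octave for p in B})

    best = None
    for k in range(len(b_sorted)):      # fix the initial pair (a1, b_k)
        b = b_sorted[k:] + b_sorted[:k]
        b_cyc = b + b[:1]
        rows, cols = len(a_cyc), len(b_cyc)
        # e[i][j] holds (size, pairs) for the best [a1..ai] -> [b1..bj]
        e = [[None] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                neighbors = []
                if i > 0:
                    neighbors.append(e[i - 1][j])        # upper cell
                if j > 0:
                    neighbors.append(e[i][j - 1])        # left cell
                if i > 0 and j > 0:
                    neighbors.append(e[i - 1][j - 1])    # upper-left cell
                if neighbors:
                    size, pairs = min(neighbors, key=lambda cell: cell[0])
                else:
                    size, pairs = 0, []
                step = pc_distance(a_cyc[i], b_cyc[j], octave)
                e[i][j] = (size + step, pairs + [(a_cyc[i], b_cyc[j])])
        size, pairs = e[-1][-1]
        # The closing pair repeats the opening pair; count it only once.
        size -= pc_distance(a_cyc[0], b_cyc[0], octave)
        pairs = pairs[:-1]
        if best is None or size < best[0]:
            best = (size, pairs)
    return best


print(minimal_voice_leading([0, 4, 6], [6, 10, 0]))
```

Storing the pairs in each cell keeps the sketch short at the cost of extra copying; a traceback over a matrix of sizes alone would avoid that overhead.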
Finally, note that we need only consider n distinct possibilities to find a minimal bijective voice-leading A→B. Let (a0, a1, …, an-1) order the elements of chord A based on
ascending distance from arbitrarily-chosen element a0. Similarly for (b0, b1, …, bn-1). By
Theorem 1, there will be a minimal bijective voice-leading between A and B of the form (a0, a1, …, an-1)→(bc, bc+1, …, bc+n-1), where c is an integer and the subscript arithmetic is
reduced mod n.
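The bijective case is simple enough to sketch directly: sort both chords and try each of the n cyclic offsets c. The code below assumes “smoothness” as the measure and equal-sized chords; the function name minimal_bijective is an illustrative choice.

```python
def pc_distance(a, b, octave=12):
    """Shortest distance between two pitch classes on the circle."""
    d = (b - a) % octave
    return min(d, octave - d)


def minimal_bijective(A, B, octave=12):
    """Return (size, pairs) for a smallest bijective voice-leading A -> B."""
    a = sorted(p % octave for p in A)
    b = sorted(p % octave for p in B)
    if len(a) != len(b):
        raise ValueError("a bijective voice-leading needs equal-sized chords")
    n = len(a)
    best = None
    for c in range(n):                  # the n cyclic offsets
        target = [b[(c + i) % n] for i in range(n)]
        size = sum(pc_distance(x, y, octave) for x, y in zip(a, target))
        if best is None or size < best[0]:
            best = (size, list(zip(a, target)))
    return best


# C major to E major: the best offset moves two voices by a semitone each.
print(minimal_bijective([0, 4, 7], [4, 8, 11]))
```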
4. Derivation of the voice-leading orbifolds. We begin by deriving Figure 2 in the main
text. Figure S6 shows the 2-torus T2, drawn using a Euclidean metric, and representing
ordered 2-note chords. To form a graph of unordered chords we need to identify all
points (x, y) and (y, x). As can be seen from Figure S6, this involves “folding” the 2-
torus along the diagonal line AB. The result is a “triangle” whose two sides are
identified, shown in Figure S7. Although it may not be immediately obvious, this figure
is a Möbius strip. To see why, cut Figure S7 along the line CD. This creates two
detached triangles. Then glue line AC on one triangle to CB on the other. (You will
have to turn one piece of paper over to get the chords to line up.) The result is the main
text’s Figure 2.
We now proceed more abstractly, describing the orbifolds Tn/Sn for arbitrary n.
For simplicity of exposition and ease of visualization, we will assume the Euclidean
metric in what follows. Since pitch-class space is represented by the circle R/12Z we are
interested in the n-torus (R/12Z)n. The quotient space (R/12Z)n/Sn can also be written Rn/(Sn × 12Zn), where Zn refers to the group of n-tuples of integers, and the Zn action is
by componentwise addition. (The notation 12Zn indicates that the components of each n-tuple
in Zn are to be multiplied by the scalar 12.) We will proceed by deriving a fundamental domain of Sn × 12Zn in Rn. (A “fundamental domain” of the group Γ in
space S is a region R of S, such that S is the union of the regions gR, for all g ∈ Γ, and such
that the intersection of any two regions gR and hR, for g ≠ h, has no interior.) By
identifying the appropriate boundary points of this fundamental domain, we will obtain the orbifold Rn/(Sn × 12Zn).
We first describe a fundamental domain of Sn in Rn. In this region, no two
distinct points (x1, x2, …, xn) and (y1, y2, …, yn) have coordinates that are equivalent under some permutation: that is, there is no σ(n) such that (x1, x2, …, xn) = (yσ(1), yσ(2), …,
yσ(n)), where σ(n) is some permutation of the integers from 1 to n. We can create such a
region simply by requiring that a point’s coordinates be in nondescending order: i.e. considering all points (x1, x2, …, xn) such that x1 ≤ x2 ≤ … ≤ xn. We can incorporate the 12Zn action by requiring that xn ≤ x1 + 12, and 0 ≤ Σnxn ≤ 12. In Euclidean space, the
resulting fundamental domain is a right hyperprism whose faces are n-1 dimensional
simplexes. To see why, observe that
1. The n inequalities x1 ≤ x2 ≤ … ≤ xn ≤ x1 + 12 define an n-1 simplex in every
plane Σnxn = n.
2. Addition by (c, c, …, c) sends the simplex in the plane Σnxn = n to the simplex in
the plane Σnxn = n + cn.
3. The planes Σnxn = n are perpendicular to the vector (1, 1, …, 1).
The vector (1, 1, …, 1) points in the direction of the “height” coordinate of the prism; the
prism’s “faces” lie in planes perpendicular to the vector (1, 1, …, 1) and therefore contain
chords whose pitch-classes sum to the same value.
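To make the fundamental domain concrete, the following sketch maps an arbitrary ordered tuple of pitches to a representative satisfying x1 ≤ x2 ≤ … ≤ xn ≤ x1 + 12 and 0 ≤ Σ xi ≤ 12. The function name and the particular reduction strategy (reduce mod 12, sort, then move the largest coordinate down an octave until the sum is small enough) are our own assumptions; points on the boundary of the domain may have more than one valid representative, and the sketch simply returns one of them.

```python
def fundamental_domain_point(chord, octave=12):
    """Return a representative of the chord inside the fundamental domain."""
    # Reduce each note mod the octave and sort: now x1 <= ... <= xn < x1 + 12.
    x = sorted(p % octave for p in chord)
    # Each step lowers the largest coordinate by an octave and moves it to the
    # front. This keeps the ordering constraints and lowers the sum by 12.
    while sum(x) > octave:
        x = [x[-1] - octave] + x[:-1]
    return tuple(x)


print(fundamental_domain_point((4, 7, 11)))   # an ordering of E minor
print(fundamental_domain_point((16, 7, 0)))   # an ordering of C major
```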
Our construction of the fundamental domain ensures that no two points on a single plane Σnxn = n can represent the same chord. However, the planes do contain
chords related by transposition: if (x1, x2, …, xn) satisfies the inequalities x1 ≤ x2 ≤ … ≤ xn
≤ x1 + 12, then so does the transpositionally-related (x2 – 12/n, x3 – 12/n, …, xn – 12/n, x1
+ 12 – 12/n), which has the same sum as (x1, x2, …, xn). Let O refer to the function that
sends (x1, x2, …, xn) to (x2 – 12/n, x3 – 12/n, …, xn – 12/n, x1 + 12 – 12/n). By repeatedly
applying O to a chord X, we can obtain n transpositionally equivalent chords X, O(X),
O2(X), … On-1(X) whose pitch-classes all sum to the same value. (If X is invariant under
some transposition, then some of the chords On(X) will be equal.) In Euclidean space, O
is an orthogonal transformation that is an automorphism of the prism: it is a rotation
when the prism has an odd number of dimensions, and a rotation-plus-reflection
otherwise. O acts so as to cyclically permute the vertices of the simplex in each plane Σnxn = n.
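The behavior of O is easy to verify numerically. The sketch below implements the definition given above with exact rational arithmetic (Python's fractions module) so that the shift 12/n introduces no rounding error; the function name O mirrors the text, and the sample chord is our choice.

```python
from fractions import Fraction


def O(x, octave=12):
    """Send (x1, ..., xn) to (x2 - 12/n, ..., xn - 12/n, x1 + 12 - 12/n)."""
    n = len(x)
    shift = Fraction(octave, n)
    return tuple(v - shift for v in x[1:]) + (x[0] + octave - shift,)


X = (0, 4, 7)                     # C major
Y = O(X)                          # a transposition of X with the same sum
print(Y, sum(Y) == sum(X))
print(O(O(O(X))) == X)            # n applications return the original point
```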
It remains to be determined how the two simplicial faces of the prism are to be
identified. We cannot identify them in the obvious way, since this would identify point (x1, x2, …, xn) on the Σnxn = 0 face of the prism with the transpositionally-distinct chord (x1
+ 12/n, x2 + 12/n, …, xn + 12/n) on the Σnxn = 12 face. Notice, however, that
O(x1 + 12/n, x2 + 12/n, …, xn + 12/n) = (x2, x3, …, xn, x1 + 12) (3)
(x2, x3, …, xn, x1 + 12) represents the same chord as (x1, x2, …, xn). We therefore need to
identify (x1, x2, …, xn) with O(x1 + 12/n, x2 + 12/n, …, xn + 12/n) = (x2, x3, …, xn, x1 + 12). Colloquially, we apply the transformation O to the Σnxn = 12 face before “gluing the
two faces together.” This identification transforms the prism’s “height coordinate” into a
circle: in moving parallel to the vector (1, 1, …, 1) we pass through all and only the
transpositions of a given chord, returning eventually to our starting point. Thus we can
describe the orbifold (R/12Z)n/Sn as the product of a n-1 simplex with a circle, modulo
the action that rotates the circle by 360/n degrees while applying the transformation O to
the simplex.
5. Efficient voice-leading and symmetry. Let A be an n-note chord and let (a1, a2, …,
an) be an arbitrary ordering of its elements. The symbol σ(a1, a2, …, an) will refer to the
ordering (aσ(1), aσ(2),…, aσ(n)), where σ(n) is some permutation of the integers from 1 to n.
We will use the notation A→σ(A) to refer to any voice-leading A→A that can be written
(a1, a2, …, an)→(aσ(1), aσ(2),…, aσ(n)) (4)
An arbitrary n-note chord S will be invariant under σ (or σ-invariant) if the chord’s
elements can be labeled so that si = sσ(i) for all i ≤ n.
In what follows, we will use the variable O to refer to a specific permutation σ,
transposition Tx, or inversion Iy. We will say that an n-note chord S is invariant under O if there is some voice-leading S→O(S) that is trivial. We will generally assume that O itself is non-trivial: that is, there is at least one chord that is not invariant under O. Thus
we will not be considering the trivial permutation σ(n) = n or the trivial transposition
T0(x) = x.
It is intuitively obvious that the size of a voice-leading A→S, where S is invariant
under some O, sets an upper bound on the size of the minimal voice-leading A→O(A).
This is because we can express the voice-leading A→O(A) as the composition of two
equally-sized voice-leadings A→S and S→O(A). (For any A→S, we can find an equally
large S→O(A), since S is invariant under O and since a normlike strict weak order is
insensitive to the “direction” of the voice-leading.) Write the displacement multiset
corresponding to A→S as {d1, d2, …, dn}. We can conclude that the minimal voice-
leading A→O(A) can have a displacement multiset no larger than {2d1, 2d2, …, 2dn}.
Thus as the size of the voice-leading A→S goes to zero, the minimal voice-leading
A→O(A) must also go to zero.
The converse, however, is less obvious. Suppose we have some bijective voice-
leading A→O(A). Does the size of A→O(A) set an upper bound on the size of the
minimal voice-leading A→S, where S is O-invariant?
Yes, assuming such an S exists. The following theorem uses the size of A→O(A)
to limit the size of A→S, showing that as A→O(A) vanishes so must A→S. Since the
result is proven for any normlike strict weak order, it does not set a very tight (or
interesting) limit on the voice-leading A→S. However, it does establish the general
theoretical point that the size of A→O(A) is dependent on that of A→S.
Lemma 2.1. Let A be a chord with n elements, let x be some element of R/12Z such that nx is congruent to 0, mod 12Z, and let A→σ(A) be a bijective voice-leading that acts as a cyclical permutation of A’s elements. Label the pitch-classes of A so that the voice-leading A→σ(A) can be written
(a0, a1, …, an-1)→(a1, a2, …, an-1, a0)
There is a voice-leading A→Tx(A) that has the form
(a0, a1, …, an-1)→(x + a1, x + a2, …, x + an-1, x + a0)
with displacement multiset consisting of values di = |x + (ai+1 – ai)|12Z (subscript arithmetic mod n). There must therefore exist a voice-leading A→S, from A to some chord S that is invariant under Tx (or σ, if x = 0) and which has displacement multiset
$$\left\{ \sum_{i=0}^{m-1} d_i,\ \sum_{i=1}^{m-1} d_i,\ \ldots,\ \sum_{i=m-1}^{m-1} d_i,\ \sum_{i=m}^{m} d_i,\ \sum_{i=m}^{m+1} d_i,\ \ldots,\ \sum_{i=m}^{n-2} d_i,\ 0 \right\}$$
Here m = ⌊n/2⌋, the greatest integer ≤ n/2.
Proof. The value di = |x + (ai+1 – ai)|12Ζ measures how close the interval ai+1 – ai is to –x.
As Figure S8 shows, we need to move at most ⌊n/2⌋ pitch-classes by |x + (ai+1 – ai)|12Z
semitones in order to make a given interval ai+1 – ai equal to –x. We can do so,
furthermore, without disturbing any of the other di, for i < n-1. Only the intervals ai+1 – ai
and dn-1 = W need be disturbed. Since the voice-leading acts as a circular permutation,
and since nx is congruent to 0, mod 12Z, we need iterate this procedure only n–1 times in
order to obtain a set that is invariant under Tx or σ (if x = 0): once we set n–1 of the
intervals equal to –x, the final “wraparound” interval—labeled W on Figure S8—will
also be equal to –x. Note that since our choice of a1 is arbitrary, we can choose W so as
to minimize the resulting voice-leading A→S.
Lemma 2.2. Let A→Ix(A) be a crossing-free, bijective voice-leading. Since A→Ix(A) is crossing-free, it can be written in the form
(a0, a1, …, an-1)→(x – ac-0, x – ac-1, …, x – ac-(n-1))
with displacement multiset {d0, d1, …, dn-1} (subscript arithmetic is mod n). We can therefore find an S such that S is invariant under Ix and the voice-leading A→S has displacement multiset no larger than
{d0/2, d1/2, …, dn-1/2}.
Proof. The crossing-free voice-leading A→Ix(A) associates pitch-class ai with x – ac-i
(subscript arithmetic mod n). There are two cases to consider, i = c/2, in which case the
voice-leading associates ac/2 with x – ac/2, and i ≠ c/2, in which case the voice-leading
associates ai with x – ac-i and ac-i with x – ai.
Case 1. i = c/2. Our voice-leading associates ai with x – ai; the distance between
these two points is |x – 2ai|12Ζ. Now consider the two minimal-length linear paths in pitch-
class space: the first from ai to x – ai and the second its retrograde, from x – ai to ai.
These paths are reflection symmetrical under Ix: every point ai + ε along the path ai→(x –
ai) is mapped by Ix to the point x – (ai + ε) along the path (x – ai)→ai. Therefore, the
midpoint af = x – af is fixed by the reflection. Consequently, we can move ai by |x –
2ai|12Ζ/2 semitones to obtain a pitch-class that is invariant under Ix.
Case 2. i ≠ c/2. Let j = c – i. Our voice-leading associates ai with x – aj and aj
with x – ai. Both ai and aj are mapped to pitch-classes |x – (ai + aj)|12Ζ semitones away.
Consider the two minimal linear paths ai→x – aj and aj→x – ai. If we reverse the
direction of the second path, we obtain two equal-length paths ai→x – aj and x – ai→aj
that are reflection-symmetrical under Ix: every point ai + ε along the path from ai→x – aj
is mapped to the point x – (ai + ε) along the path x – ai→aj. The points halfway along
these paths, af and x – af, are related by Ix. Therefore, we can move each pitch-class by
|x – (ai + aj)|12Ζ/2 semitones to obtain a pair that is invariant under Ix.
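The construction in the proof of Lemma 2.2 can be illustrated with a short sketch: each pitch-class moves halfway along the shortest path to its image under the voice-leading A→Ix(A). The code assumes that A is listed in the order (a0, a1, …, an-1) used in the lemma and that the offset c is supplied by the caller; the function names are ours, and exact fractions are used so that half-semitone positions compare cleanly.

```python
from fractions import Fraction


def signed_interval(a, b, octave=12):
    """Shortest signed motion from a to b, in (-octave/2, octave/2]."""
    d = (b - a) % octave
    return d - octave if d > Fraction(octave, 2) else d


def halfway_to_inversion(A, x, c, octave=12):
    """Move each a_i halfway toward x - a_(c-i), as in the proof."""
    n = len(A)
    S = []
    for i, a in enumerate(A):
        target = (x - A[(c - i) % n]) % octave
        half = Fraction(signed_interval(a, target, octave), 2)
        S.append((a + half) % octave)
    return S


def is_inversion_invariant(S, x, octave=12):
    """True if the multiset S is mapped to itself by I_x(p) = x - p."""
    return sorted(s % octave for s in S) == sorted((x - s) % octave for s in S)


# C major (0, 4, 7) and I7, which sends it to C minor (0, 3, 7); here c = 2.
S = halfway_to_inversion([0, 4, 7], x=7, c=2)
print(S, is_inversion_invariant(S, 7))
```

In this example only the middle voice moves, by half a semitone, which matches the bound {d0/2, d1/2, …, dn-1/2} for the displacement multiset {0, 1, 0}.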
THEOREM 2. Let A be an n-note chord, and let O be a non-trivial permutation, transposition, or inversion such that there exists an n-note chord that is invariant under O. Then, if the displacement multiset associated with A→O(A) is smaller than the n-element multiset {d, 0, 0, …, 0}, there will be a voice-leading A→S, with S invariant under O, and displacement multiset less than or equal to
{d, d, 2d, 2d, 3d, 3d, …, ⌊n/2⌋d, 0}.
The term ⌊n/2⌋d appears once for even n, twice for odd n.
Proof. By the distribution constraint, the displacement multiset corresponding to the voice-leading A→O(A) has no terms greater than or equal to d. There are three cases to
consider, depending on whether O is a permutation σ, a transposition T, or an inversion
I.
Case 1. O is a permutation σ. Since any permutation can be decomposed into
cycles, we simply apply Lemma 2.1 to obtain a voice-leading A→S that is no larger than
{d, d, 2d, 2d, 3d, 3d, …, ⌊n/2⌋d, 0}, with S invariant under σ.
Case 2. O is a nonzero transposition Tx. By Theorem 1, there exists a crossing-free voice-leading A→Tx(A) whose displacement multiset consists of values less than d.
Any crossing-free voice-leading can be decomposed into cycles of the form:
(a0, a1, …, an-1)→(x + a1, x + a2, …, x + an-1, x + a0)
Thus we can again apply Lemma 2.1 to obtain the desired voice-leading.
Case 3. O is an inversion Ix. By Theorem 1, there exists a crossing-free voice-leading A→Ix(A) whose displacement multiset consists of values less than d. By Lemma
2.2, there exists a voice-leading A→S, such that S is invariant under Ix, and with
displacement multiset less than or equal to {d/2, d/2, …, d/2}. By the distribution constraint, this multiset is less than or equal to {d, d, 2d, 2d, 3d, 3d, …, ⌊n/2⌋d, 0}.
6. Evenness and transpositional invariance. We begin with an informal argument describing the relation between “evenness” and T-invariance. By Theorem 1, there will be a minimal bijective voice-leading A→T_x(A) of the form (a_0, a_1, …, a_{n-1})→(a_c + x, a_{c+1} + x, …, a_{c+n-1} + x), where c is some integer and the subscript arithmetic is reduced mod n. The displacement multiset associated with this voice-leading will consist of the distances |x + (a_{c+i} – a_i)|_12Z. For a chord that divides the octave nearly evenly, the values a_{c+i} – a_i are nearly constant across i, for every c. (This is simply because the distance between the values a_{c+i} – a_i measures how evenly the chord divides c octaves into n equal parts.) Thus, for every c, there will be an x for which a_{c+i} – a_i ≅ –x, for all i. The “cyclical” component of the voice-leading offsets the “parallel” component. For chords that evenly divide the octave, the quantities a_{c+i} – a_i can be made to approximate –x as closely as is possible for n-note chords.
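The way the cyclical component offsets the parallel component can be checked numerically. The Python sketch below is not from the paper: the function names are hypothetical, and the sum of the displacements is used as an assumed stand-in for the paper's ordering of multisets. It computes the displacement multiset of a_i → a_{c+i} + x for each rotation c.

```python
def norm12(x):
    """Pitch-class norm |x|_12Z: distance from x to the nearest multiple of 12."""
    r = x % 12
    return min(r, 12 - r)


def cyclic_displacements(chord, x, c):
    """Sorted displacements of the voice-leading a_i -> a_(c+i) + x (subscripts mod n)."""
    n = len(chord)
    return sorted(norm12(chord[(c + i) % n] + x - chord[i]) for i in range(n))


def best_rotation(chord, x):
    """Rotation c whose displacement multiset has the smallest sum (assumed measure)."""
    return min(range(len(chord)),
               key=lambda c: sum(cyclic_displacements(chord, x, c)))


# C major triad transposed by 4 semitones: the rotation c = 2 nearly cancels x.
print(cyclic_displacements((0, 4, 7), 4, 0))  # [4, 4, 4]
print(cyclic_displacements((0, 4, 7), 4, 2))  # [0, 1, 1]
# The augmented triad divides the octave evenly, so the cancellation is exact.
print(cyclic_displacements((0, 4, 8), 4, 2))  # [0, 0, 0]
```

The nearly even major triad reaches its transposition by two semitone moves, while the perfectly even augmented triad needs no motion at all.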
We now provide a rigorous proof of this last statement.
THEOREM 3. Let A be any multiset of cardinality n. For all x, the minimal bijective voice-leading between A and T_x(A) can be no smaller than the minimal bijective voice-leading between E and T_x(E), where E divides pitch-class space into n equal parts.
In proving Theorem 3 it is again convenient to work in pitch-space, or R^n. (Note that we do not assume the Euclidean metric on this space.) We will use the symbol ≡_nZ to mean “congruent mod nZ.” The symbol applies to both scalars and ordered n-tuples: thus –2.5 ≡_12Z 9.5, and (0, 4, 7) ≡_12Z (–12, 4, 19). Each chord is represented by an infinite number of points in R^n, all congruent mod 12Z. A voice-leading between two points X, Y ∈ R^n will simply be the ordered n-tuple X – Y = (x_1 – y_1, x_2 – y_2, …, x_n – y_n). The displacement multiset associated with this voice-leading will be the multiset {|x_1 – y_1|, |x_2 – y_2|, …, |x_n – y_n|}. Clearly, for any voice-leading in the orbifold R^n/(S_n × 12Z^n), there will be an infinite number of equivalent, equally-sized voice-leadings in R^n. Conversely, for any voice-leading in R^n with displacement multiset containing only elements ≤ 6, there is a corresponding voice-leading in R^n/(S_n × 12Z^n).
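The congruence relation and the displacement multiset defined above can be made concrete with a short Python sketch. The helper names congruent and displacement_multiset are hypothetical, and the floating-point tolerance is an implementation choice, not something the paper specifies.

```python
def congruent(a, b, modulus=12):
    """True if scalars or equal-length tuples a and b are congruent mod modulus*Z."""
    if isinstance(a, (int, float)):
        a, b = (a,), (b,)
    return len(a) == len(b) and all(
        abs((p - q) / modulus - round((p - q) / modulus)) < 1e-9
        for p, q in zip(a, b)
    )


def displacement_multiset(X, Y):
    """Sorted list of |x_i - y_i| for the voice-leading between points X and Y of R^n."""
    return sorted(abs(p - q) for p, q in zip(X, Y))


print(congruent(-2.5, 9.5))                  # True
print(congruent((0, 4, 7), (-12, 4, 19)))    # True
# Shifting both points by the same element of 12Z^n gives an equally-sized voice-leading.
print(displacement_multiset((0, 4, 7), (-1, 4, 8)))     # [0, 1, 1]
print(displacement_multiset((12, 4, -5), (11, 4, -4)))  # [0, 1, 1]
```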
Let E_n be a chord that divides pitch-class space into n equal parts. Since E_n is invariant under transposition by 12/n semitones, there will be a voice-leading between chords congruent to E_n and T_x(E_n) of the form
(e_1, e_2, …, e_n)→(e_1 + c, e_2 + c, …, e_n + c), where c is any real number ≡_{12Z/n} x (5)
(NB: c is congruent to x mod 12Z/n, not mod 12Z.) Choose c so that |c| is as small as possible. The displacement multiset corresponding to this voice-leading is {|c|, |c|, …, |c|}. The sum of the elements of this multiset is n|c|, where n|c| is the smallest positive real number such that nc ≡_12Z nx. By the distribution constraint, this multiset is as small as any n-note multiset with the same or greater sum.
Now consider any bijective voice-leading between representatives of two n-note transpositionally-equivalent chords A and T_x(A). Let ΣA refer to the sum of the components of A. Therefore,
Σ(T_x(A) – A) ≡_12Z nx (6)
The real number Σ(T_x(A) – A) is the sum of signed quantities; the sum of the absolute values of these quantities must therefore be greater than or equal to n|c|, where n|c| is the smallest positive number such that nc ≡_12Z nx. Thus the elements of the displacement multiset associated with the voice-leading A→T_x(A) sum to at least n|c|. We conclude that this voice-leading can be no smaller than the minimal voice-leading between E_n and T_x(E_n).
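The sum inequality at the heart of this proof can be checked by brute force for small chords. The Python sketch below is an illustration under stated assumptions, not the paper's method: it compares only the sums of the displacement multisets (a weaker statement than the theorem's ordering of multisets), it restricts to integer pitch-classes, and the function names are hypothetical. The bound n|c| is computed as the pitch-class norm of nx.

```python
from itertools import permutations


def norm12(x):
    """Pitch-class norm |x|_12Z: distance from x to the nearest multiple of 12."""
    r = x % 12
    return min(r, 12 - r)


def lower_bound(n, x):
    """n|c|: the smallest magnitude of a number congruent to n*x mod 12."""
    return norm12(n * x)


def minimal_sum(chord, x):
    """Smallest displacement sum over all bijective voice-leadings chord -> T_x(chord)."""
    target = [(p + x) % 12 for p in chord]
    return min(
        sum(norm12(q - p) for p, q in zip(chord, perm))
        for perm in permutations(target)
    )


E = (0, 4, 8)  # divides the octave into three equal parts
for chord in [(0, 4, 7), (0, 1, 2), (0, 3, 6)]:
    for x in range(12):
        # No chord beats the bound, and the even chord E always attains it.
        assert minimal_sum(chord, x) >= lower_bound(3, x)
        assert minimal_sum(E, x) == lower_bound(3, x)

print(minimal_sum((0, 4, 7), 4), minimal_sum(E, 4))  # 2 0
```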
There is a useful corollary to Theorem 3 that applies in the discrete case.
COROLLARY. Let E_k (the “chromatic scale”) divide pitch-class space into k > n equal parts, let A be any n-note subset of E_k, and let M be the “maximally even” n-note subset of E_k (S8). Then, for any integer i, the minimal bijective voice-leading between A and T_{12i/k}(A) can be no smaller than the minimal bijective voice-leading between M and T_{12i/k}(M).
The proof follows the same basic outlines as the proof of Theorem 3. We rely on the fact that M divides any number of octaves into nearly even parts: given M = (m_0, m_1, …, m_{n-1}), and some constant integer c, the distances |m_{c+i} – m_i|_12Z (subscript arithmetic mod n) come in “consecutive integer sizes” when measured in units of 12/k (S8). That is, for every integer c there exists an integer j, such that the distances |m_{c+i} – m_i|_12Z are equal to 12j/k and 12(j+1)/k. This allows us to find a voice-leading M→T_{12i/k}(M) that is as small as possible for n-note subsets of E_k. As before, we use the “cyclical” component of the voice-leading m_i→m_{c+i} to neutralize the “transpositional” component of the voice-leading m_i→m_i + x.
Now for the formalities. By the argument given above, the minimal voice-leading A→T_{12i/k}(A) has a displacement multiset whose sum is at least n|c|, where n|c| is the smallest positive number such that nc ≡_12Z 12in/k. What needs to be shown is that there is a voice-leading M→T_{12i/k}(M), with a displacement multiset summing to n|c|, whose values are as evenly distributed as possible. Since our voice-leadings are required to connect subsets of E_k, we can establish maximally-even distribution by showing that the
values of the displacement multiset take on just two distinct values: 12r/k and 12(r+1)/k,
where r is some nonnegative integer.
Let (m_0, m_1, …, m_{n-1}) order the elements of M in ascending numerical order; form the infinite sequence S = {m_(j mod n) + 12⌊j/n⌋}, with j running over all integers. (Again, “⌊x⌋” refers to the greatest integer ≤ x.) S consists of all of the elements of R congruent mod 12Z to elements of M. This sequence is ordered in ascending numerical order and indexed such that S_{-1} = m_{n-1} – 12, S_0 = m_0, S_1 = m_1, and so on. The voice-leadings
(m_0, m_1, …, m_{n-1})→(S_a + x, S_{a+1} + x, …, S_{a+n-1} + x) (7)
are voice-leadings between chords congruent to M and Tx(M).
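The sequence S and the voice-leadings of equation (7) can be sketched in Python as follows. This assumes the reading S_j = m_(j mod n) + 12⌊j/n⌋, which is the one consistent with the stated values S_{-1} = m_{n-1} – 12, S_0 = m_0, and S_1 = m_1. The function names S and voice_leading are hypothetical.

```python
def S(M, j):
    """j-th term of the ascending sequence built from the ascending chord M."""
    n = len(M)
    # Python's % and // both round toward minus infinity, matching the floor in the text.
    return M[j % n] + 12 * (j // n)


def voice_leading(M, a, x):
    """Target n-tuple (S_a + x, ..., S_(a+n-1) + x) of voice-leading (7)."""
    return tuple(S(M, a + i) + x for i in range(len(M)))


C_MAJOR = (0, 2, 4, 5, 7, 9, 11)
print(S(C_MAJOR, -1), S(C_MAJOR, 0), S(C_MAJOR, 7))  # -1 0 12
# Rotating by a = 4 and transposing by x = -7 moves only one note, 11 -> 10.
print(voice_leading(C_MAJOR, 4, -7))  # (0, 2, 4, 5, 7, 9, 10)
```

In this example the rotation almost exactly cancels the transposition, so the C major scale reaches the F major scale with a single semitone of motion.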
The following music-theoretical facts are well known:
1. The (real-valued) sum of the components of (S_a – m_0, S_{a+1} – m_1, …, S_{a+n-1} – m_{n-1}) is equal to 12a (S9).
2. The elements of this n-tuple will either be constant, or have two distinct values: 12r/k and 12(r+1)/k, where r is some integer (S8).
From these two facts, it follows that we can find a voice-leading M→T_x(M) corresponding to the n-tuple
(S_a + x – m_0, S_{a+1} + x – m_1, …, S_{a+n-1} + x – m_{n-1}) (8)
with elements summing to nc, where n|c| is the smallest positive number such that nc ≡_12Z nx. When x and S_{a+i} – m_i are both integer multiples of 12/k, the values of this n-tuple are either constant or can be expressed in the form 12r/k and 12(r+1)/k, where r is some integer. These values will either be all nonnegative or all nonpositive. The sum of the elements of this voice-leading’s displacement multiset will therefore be n|c|. The displacement multiset will contain just two distinct values, 12|r|/k and 12|r+1|/k. This implies that the displacement multiset is as evenly-distributed as possible, given the hypothesis that the voice-leading connects subsets of E_k.
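The two facts used in this argument can be verified numerically for a particular maximally even set. The Python sketch below works in scale steps (units of 12/k semitones), so that the sum in fact 1 becomes k·a steps. It assumes that a maximally even set can be generated as ⌊k·i/n⌋ for i = 0, …, n – 1, a construction recalled from the literature cited as S8 rather than stated in this text. The function names are hypothetical.

```python
def maximally_even(n, k):
    """An n-note maximally even subset of a k-note chromatic scale, in scale steps."""
    return [(k * i) // n for i in range(n)]


def spans(M, k, a):
    """The n-tuple (S_a - m_0, ..., S_(a+n-1) - m_(n-1)), measured in scale steps."""
    n = len(M)

    def S(j):
        # Extend M upward and downward by octaves of k steps.
        return M[j % n] + k * (j // n)

    return [S(a + i) - M[i] for i in range(n)]


M = maximally_even(7, 12)
print(M)                # [0, 1, 3, 5, 6, 8, 10]
print(spans(M, 12, 1))  # [1, 2, 2, 1, 2, 2, 2]

for a in range(-7, 15):
    sp = spans(M, 12, a)
    assert sum(sp) == 12 * a         # fact 1: the components sum to 12a
    assert max(sp) - min(sp) <= 1    # fact 2: constant, or two consecutive sizes
```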
NOTES
S1. D. Lewin, Journal of Music Theory 42, 15 (1998).
S2. R. Cohn, Journal of Music Theory 42, 283 (1998).
S3. J. Straus, Music Theory Spectrum 25, 305 (2003).
S4. C. Callender, Music Theory Online 10.3 (2004).
S5. R. Cohn, Journal of Music Theory 41, 1 (1997).
S6. J. Douthett, P. Steinbach, Journal of Music Theory 42, 241 (1998).
S7. R. Gauldin, Harmonic Practice in Tonal Music (Norton, New York, 1997).
S8. J. Clough, J. Douthett, Journal of Music Theory 35, 93 (1991).
S9. J. Clough, G. Myerson, Journal of Music Theory 29, 249 (1985).
SYMBOL OR TERM
    DEFINITION
multiset
    A set in which duplications are permitted. Like sets, multisets are unordered.
{a, b, c}
    A multiset with elements a, b, c.
(a, b, c)
    An ordered list. (a, b, c) and (b, c, a) are not the same.
⌊x⌋
    The greatest integer ≤ x.
R
    The real numbers.
Z
    The integers.
nZ, where n is a real number
    The set {ni | i ∈ Z}. Thus 12Z is the set {…, –24, –12, 0, 12, 24, …}, whose elements form a group under addition.
mZ^n, where m is real and n is an integer
    The set of ordered n-tuples (x_1, x_2, …, x_n) such that each x_i ∈ mZ. This set forms a group under vector addition.
A/G, where G is some group of transformations acting on the elements of A
    The quotient space that identifies all points a and ga, where a ∈ A and g ∈ G.
R/12Z
    The circular quotient space in which all real numbers x and x + 12 have been identified. The group 12Z acts by ordinary addition, so that every point x has orbit {…, x – 36, x – 24, x – 12, x, x + 12, x + 24, x + 36, …}.
a ≡_nZ b
    Pitch class a is congruent to b mod nZ. Thus there exists an integer c such that a = b + cn.
|a|_12Z
    The norm of a pitch-class a. The smallest real number |x| such that x ≡_12Z a.
(a_1, a_2, …, a_n) ≡_12Z (b_1, b_2, …, b_n)
    For all i, a_i ≡_12Z b_i.
T^n
    The n-torus, or product of n circles. Since R/12Z is a circle, T^n can also be written (R/12Z)^n.
S_n
    The “symmetric group” consisting of all the distinct permutations of n objects.
Table S1. A glossary of mathematical terms and symbols used in the article.
SYMBOL OR TERM
    DEFINITION
pitch
    Pitch is a fundamental attribute of musical notes. Pitches are typically represented by real numbers such that middle C is 60, the octave has length 12, and semitones have size 1.
pitch-class
    An equivalence class of pitches, consisting of all pitches separated by an integral number of octaves. A220 and A440 are both instances of the same pitch-class A. Pitch-classes can be represented by elements of the quotient space R/12Z.
chord
    A multiset of pitch-classes. It is also possible to consider chords of pitches, which are simply multisets of real numbers.
transposition
    Translation in pitch or pitch-class space. In both pitch and pitch-class space, transposition corresponds to addition by a constant value. If a is a pitch or pitch-class then a + x is the transposition of a by x semitones.
T_x(A)
    The transposition of the chord A by x semitones.
inversion
    Reflection in pitch or pitch-class space. In both pitch and pitch-class space, inversion corresponds to subtraction from a constant value. If a is a pitch or pitch-class, then x – a is an inversion of a. The quantity “x” is called the index number of the inversion.
I_x(A)
    The inversion of chord A with index number x.
voice-leading
    A voice-leading between two multisets {a_1, a_2, …, a_m} and {b_1, b_2, …, b_n} is a multiset of ordered pairs (a_i, b_j), such that every element of each chord is in some pair.
trivial voice-leading
    A trivial voice-leading contains only pairs of the form (x, x).
Table S2. A glossary of musical terms and symbols used in the article.
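The glossary's definitions of transposition and inversion translate directly into code. In the Python sketch below, chords are represented as sorted lists of integer pitch-classes, which is a simplifying assumption (the paper allows real-valued pitch-classes), and the function names are hypothetical.

```python
def transpose(chord, x):
    """T_x(A): add x to every pitch-class of the chord, mod 12."""
    return sorted((a + x) % 12 for a in chord)


def invert(chord, x):
    """I_x(A): subtract every pitch-class of the chord from the index number x, mod 12."""
    return sorted((x - a) % 12 for a in chord)


C_MAJOR_TRIAD = [0, 4, 7]
print(transpose(C_MAJOR_TRIAD, 7))  # [2, 7, 11], the G major triad
print(invert(C_MAJOR_TRIAD, 7))     # [0, 3, 7], the C minor triad
```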
[Figure S1 graphic: the circle of fifths, with nodes C, G, D, A, E, B/Cf, Fs/Gf, Cs/Df, Af, Ef, Bf, F. The nodes carry the diatonic collections {024579e}, {024679e}, {124679e}, {124689e}, {134689e}, {13468te}, {13568te}, {13568t0}, {13578t0}, {23578t0}, {23579t0}, {24579t0}, and the edges between neighbors carry the labels {5↔6}, {0↔1}, {7↔8}, {2↔3}, {9↔10}, {4↔5}, {11↔0}, {6↔7}, {1↔2}, {8↔9}, {3↔4}, {t↔e}.]
Figure S1. The circle of fifths can be interpreted as depicting minimal voice-leadings between diatonic collections (major scales). Each diatonic collection can be transformed into its neighbors by voice-leading in which one pitch-class moves by semitone. For example, the C major scale, containing pitch-classes 0, 2, 4, 5, 7, 9, and 11 (= e), can be transformed into the G major scale (containing pitch-classes 0, 2, 4, 6, 7, 9, and 11) by moving the pitch class 5 (F) to 6 (Fs). Here as elsewhere, the letters “t” and “e” refer to the numbers 10 and 11, respectively.
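The single-semitone relation between neighboring diatonic collections described in the Figure S1 caption can be confirmed with a few lines of Python. This is an illustrative sketch, and the helper names are hypothetical.

```python
C_MAJOR = frozenset({0, 2, 4, 5, 7, 9, 11})


def transpose(pcs, x):
    """Transpose a set of pitch-classes by x semitones, mod 12."""
    return frozenset((p + x) % 12 for p in pcs)


def single_move(src, dst):
    """Return (old, new) if dst differs from src in exactly one pitch-class, else None."""
    gone, added = src - dst, dst - src
    if len(gone) == 1 and len(added) == 1:
        return next(iter(gone)), next(iter(added))
    return None


# Walk once around the circle of fifths, printing the pitch-class that moves.
scale = C_MAJOR
for _ in range(12):
    nxt = transpose(scale, 7)
    print(sorted(scale), "->", sorted(nxt), single_move(scale, nxt))
    scale = nxt
```

The first line printed reports the move (5, 6), which is the F to Fs motion named in the caption.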
[Figure S2 graphic: the Tonnetz lattice of major and minor triads; node labels not reproduced.]
Figure S2. The Tonnetz. Nineteenth-century theorists such as Hostinsky, Oettingen, and Riemann explored a geometrical figure that is the “geometrical dual” of the one shown here. The graph displays efficient voice-leadings among the 24 familiar major and minor triads. Triads connected by horizontal lines share both “root” and “fifth,” and can be connected by voice-leading in which one note moves by one semitone. (For example, the C-major triad can be transformed into a C-minor triad by changing E to Ef.) Triads along the NE/SW diagonal also share two notes and can be connected by single-semitone voice-leading. (For example, the C-major triad can be transformed into an E-minor triad by changing C to B.) Triads along a NW/SE diagonal share two notes and can be connected by voice-leading in which one note moves by two semitones. (For example, the C-major triad can be transformed into an A-minor triad by changing G to A.) Topologically, the figure is a 2-torus.
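The three kinds of connection described in the Figure S2 caption can be checked by enumerating the 24 triads. The Python sketch below is an illustration only; the triad encoding (root plus a quality string) and the function names are assumptions made for this example.

```python
def norm12(x):
    """Pitch-class norm: distance from x to the nearest multiple of 12."""
    r = x % 12
    return min(r, 12 - r)


def triad(root, quality):
    """Pitch-class set of a major or minor triad on the given root."""
    third = 4 if quality == "major" else 3
    return frozenset({root % 12, (root + third) % 12, (root + 7) % 12})


def single_voice_move(a, b):
    """If triads a and b share two notes, return how far the third note moves."""
    gone, added = a - b, b - a
    if len(gone) == 1 and len(added) == 1:
        return norm12(next(iter(added)) - next(iter(gone)))
    return None


TRIADS = {(r, q): triad(r, q) for r in range(12) for q in ("major", "minor")}


def neighbor_moves(name):
    """Sorted move sizes to every other triad sharing two notes with the named triad."""
    moves = []
    for other, pcs in TRIADS.items():
        if other != name:
            move = single_voice_move(TRIADS[name], pcs)
            if move is not None:
                moves.append(move)
    return sorted(moves)


print(single_voice_move(triad(0, "major"), triad(0, "minor")))  # 1 (E to Ef)
print(single_voice_move(triad(0, "major"), triad(4, "minor")))  # 1 (C to B)
print(single_voice_move(triad(0, "major"), triad(9, "minor")))  # 2 (G to A)
print(neighbor_moves((0, "major")))                             # [1, 1, 2]
```

Every triad turns out to have exactly three such neighbors, two reached by a semitone and one by two semitones, matching the horizontal, NE/SW, and NW/SE connections of the caption.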
[Figure S3 graphic, panels a) to c): diagrams with labels a1, (a1), b1, a2, (a2), b2, m, n, and x.]
Figure S3. Three types of voice-crossing
[Figure S4 graphic, panels a) and b): diagrams with labels a1, b1, a2, b2, c1, c2, d1, and d2.]
Figure S4. Removing a crossing does not create new crossings
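Figure S5. Using dynamic programming to find minimal voice-leading
Rows are labeled by the source notes 4, 7, 11, 0, 4; columns are labeled by the target notes 4, 8, 11, 3, 4.

Row 1 (source note 4):
    column 4: (4)→(4), Size: 0
    column 8: (4, 4)→(4, 8), Size: 4
    column 11: (4, 4, 4)→(4, 8, 11), Size: 9
    column 3: (4, 4, 4, 4)→(4, 8, 11, 3), Size: 10
    column 4: (4, 4, 4, 4, 4)→(4, 8, 11, 3, 4), Size: 10
Row 2 (source note 7):
    column 4: (4, 7)→(4, 4), Size: 3
    column 8: (4, 7)→(4, 8), Size: 1
    column 11: (4, 7, 7)→(4, 8, 11), Size: 5
    column 3: (4, 7, 7, 7)→(4, 8, 11, 3), Size: 9
    column 4: (4, 7, 7, 7, 7)→(4, 8, 11, 3, 4), Size: 12
Row 3 (source note 11):
    column 4: (4, 7, 11)→(4, 4, 4), Size: 8
    column 8: (4, 7, 11)→(4, 8, 8), Size: 4
    column 11: (4, 7, 11)→(4, 8, 11), Size: 1
    column 3: (4, 7, 11, 11)→(4, 8, 11, 3), Size: 5
    column 4: (4, 7, 11, 11, 11)→(4, 8, 11, 3, 4), Size: 10
Row 4 (source note 0):
    column 4: (4, 7, 11, 0)→(4, 4, 4, 4), Size: 12
    column 8: (4, 7, 11, 0)→(4, 8, 8, 8), Size: 8
    column 11: (4, 7, 11, 0)→(4, 8, 11, 11), Size: 2
    column 3: (4, 7, 11, 0)→(4, 8, 11, 3), Size: 4
    column 4: (4, 7, 11, 0, 0)→(4, 8, 11, 3, 4), Size: 8
Row 5 (source note 4):
    column 4: (4, 7, 11, 0, 4)→(4, 4, 4, 4, 4), Size: 12
    column 8: (4, 7, 11, 0, 4)→(4, 8, 8, 8, 8), Size: 12
    column 11: (4, 7, 11, 0, 4)→(4, 8, 11, 11, 11), Size: 7
    column 3: (4, 7, 11, 0, 4)→(4, 8, 11, 11, 3), Size: 3
    column 4: (4, 7, 11, 0, 4, 4)→(4, 8, 11, 11, 3, 4), Size: 3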
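The sizes in Figure S5 can be reproduced with a short dynamic-programming routine. The recurrence below is an assumption inferred from the figure rather than quoted from the paper: each cell adds the pitch-class distance between its row note and column note to the smallest of the cells above, to the left, and diagonally above-left. The function names are hypothetical.

```python
def norm12(x):
    """Pitch-class norm: distance from x to the nearest multiple of 12."""
    r = x % 12
    return min(r, 12 - r)


def dp_table(source, target):
    """cost[i][j]: assumed smallest size of a voice-leading from source[:i+1] to target[:j+1]."""
    rows, cols = len(source), len(target)
    cost = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            step = norm12(target[j] - source[i])
            # Neighbors above, to the left, and diagonal, skipping any that fall off the grid.
            prev = [cost[p][q]
                    for p, q in ((i - 1, j), (i, j - 1), (i - 1, j - 1))
                    if p >= 0 and q >= 0]
            cost[i][j] = step + (min(prev) if prev else 0)
    return cost


table = dp_table([4, 7, 11, 0, 4], [4, 8, 11, 3, 4])
for row in table:
    print(row)
print("minimal size:", table[-1][-1])  # minimal size: 3
```

With the source and target notes of the figure, this recurrence yields the same 25 sizes as the table, ending with the value 3 in the bottom-right cell.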
[Figure S6 graphic: a 12 × 12 grid of the ordered dyads 00 through ee (with t = 10 and e = 11). Bracketed labels such as [00] repeat points along the edges of the grid, and the letters A and B mark the ends of the diagonal.]
Figure S6. Ordered dyad-space is a 2-torus. To identify points (a, b) and (b, a), we need to “fold” the torus along the AB diagonal. The result of this operation is shown in Figure S7.
[Figure S7 graphic: the triangular half of the grid in Figure S6, with rows running from 00 01 … 0e [00] down to ee [e0] and [00]; the points A, B, C, and D are marked.]
Fig. S7. The result of “folding” the 2-torus in Figure S6 along its diagonal AB. The resulting figure is a triangle with two of its sides identified, which is a Möbius strip. To transform Figure S7 into a more familiar representation of a Möbius strip, cut the figure along the line CD and glue AC to CB. (To make this identification in Euclidean 3-space, you will need to turn over one of the pieces of paper.) The result is a “square” with opposite sides identified, as in Figure 2 of the main paper.
[Figure S8 graphic: the notes a_0, …, a_6 separated by the intervals d_0, …, d_5, with d_6 = W.]
Figure S8. The cyclical voice-leading (a_0, a_1, a_2, a_3, a_4, a_5, a_6)→(a_1, a_2, a_3, a_4, a_5, a_6, a_0) has displacement multiset {d_0, d_1, d_2, d_3, d_4, d_5, d_6 = W}. By moving at most three notes by |x – d_i| semitones, we can make any of the d_i = x without changing the other d_n ≠ W. That is, to change d_0, we need only move a_0; to change d_5 we need only move a_6; to change d_1 we need only move a_0 and a_1; and so on. In the case of an arbitrary cyclical voice-leading, we never need to move more than half of a chord’s notes by |x – d_i| semitones to “fix” any interval.
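The procedure described in the Figure S8 caption can be sketched in Python. The sketch assumes the chord is given as an ascending list of pitches with d_i = a_{i+1} – a_i for i < n – 1, and that the wraparound interval W is free to absorb the change; the function name fix_interval is hypothetical. It moves whichever block of notes is smaller, either a_0 through a_i or a_{i+1} through a_{n-1}.

```python
def fix_interval(chord, i, x):
    """Set d_i = chord[i+1] - chord[i] to x, moving as few notes as possible.

    Returns (new_chord, moved_indices). Only d_i and the wraparound interval W change.
    """
    n = len(chord)
    delta = x - (chord[i + 1] - chord[i])
    new = list(chord)
    if i + 1 <= n - 1 - i:
        moved = list(range(i + 1))       # shift a_0 .. a_i down by delta
        for j in moved:
            new[j] -= delta
    else:
        moved = list(range(i + 1, n))    # shift a_(i+1) .. a_(n-1) up by delta
        for j in moved:
            new[j] += delta
    return new, moved


scale = [0, 2, 4, 5, 7, 9, 11]
print(fix_interval(scale, 0, 1))  # ([1, 2, 4, 5, 7, 9, 11], [0])
print(fix_interval(scale, 5, 1))  # ([0, 2, 4, 5, 7, 9, 10], [6])
print(fix_interval(scale, 1, 3))  # ([-1, 1, 4, 5, 7, 9, 11], [0, 1])
```

For a seven-note chord no call moves more than three notes, in line with the caption's claim. The sketch does not check whether the moved notes cross the wraparound interval.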