nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfnonlinear...

17
This journal is © The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7119 Cite this: Soft Matter, 2016, 12, 7119 Nonlinear machine learning and design of reconfigurable digital colloidsAndrew W. Long, a Carolyn L. Phillips, b Eric Jankowksi c and Andrew L. Ferguson* ad Digital colloids, a cluster of freely rotating ‘‘halo’’ particles tethered to the surface of a central particle, were recently proposed as ultra-high density memory elements for information storage. Rational design of these digital colloids for memory storage applications requires a quantitative understanding of the thermodynamic and kinetic stability of the configurational states within which information is stored. We apply nonlinear machine learning to Brownian dynamics simulations of these digital colloids to extract the low-dimensional intrinsic manifold governing digital colloid morphology, thermodynamics, and kinetics. By modulating the relative size ratio between halo particles and central particles, we investigate the size-dependent configurational stability and transition kinetics for the 2-state tetrahedral (N = 4) and 30-state octahedral (N = 6) digital colloids. We demonstrate the use of this framework to guide the rational design of a memory storage element to hold a block of text that trades off the competing design criteria of memory addressability and volatility. 1 Introduction Digital colloids, 1 reconfigurable clusters of lock-and-key col- loidal particles, 2 have recently been proposed as a novel soft matter-based substrate for high-density information storage. A digital colloid comprises freely-rotating but tethered ‘‘halo’’ particles that are bound to the surface of a central particle (Fig. 1). Information can be stored within distinguishable con- figurations of the halo particles around the central particle. These structures have been experimentally synthesized via deple- tion mediated binding of dimpled halo particles to the surface of a central spherical particle, 1,2 as well as through programmable DNA interactions. 3 These halo particles may be distinguished through a variety of labeling techniques, 4 such as functionaliza- tion with complementary Fo ¨rster Resonant Energy Transfer (FRET) pairs 5 or through binding of short oligonucleotides, 1 thereby enabling the unique identification of distinct metastable halo particle configurations that mediate information storage and retrieval. Storing information in micron-scale colloids has the potential to advance computing in unconventional environ- ments. A prime example of which is DNA-based computing in solution, wherein arbitrary digital circuitry can be mapped onto reaction networks of DNA strands. 6 One practical limitation of DNA-based computing is the storage of information in the concentrations of DNA strands, which are essentially uni- form throughout the computing volume. Digital colloids have potential as compartmentalized, high-density storage elements that can diffuse throughout a computing volume, and which can in principle be reconfigured during computations. In order to realize the potential of digital colloids as unconventional computing elements, we require a fundamental understanding of the configurational transitions between storage states. The number of distinguishable halo particle configurations in the digital colloid ground state determines the number of unique ‘‘bit states’’ of a cluster, and therefore its information storage capacity. For example, the halo particles of the N =4 digital colloid exist in a tetrahedral ground state, meaning – after the elimination of trivial rotational symmetries it possesses exactly two distinguishable arrangements of the four halo particles (Fig. 1c). These two available states mean that the N = 4 colloid is a binary storage element that can encode exactly one bit of information. In general, the number of storable states is specified by the N! halo particle permutations divided by the symmetry operator for the ground state configurational arrangement, U N . The symmetry operator for an N-particle colloid is determined by the spherical code solution for the a Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA b Neurensic, Inc., Chicago, IL, 60604, USA c Micron School of Materials Science and Engineering, Boise State University, Boise, ID, 83725, USA d Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA. E-mail: [email protected]; Fax: +1 (217) 333-2736; Tel: +1 (217) 300-2354 Electronic supplementary information (ESI) available: Two movies highlighting the structure and topography of the low-dimensional intrinsic manifolds for the N = 4 and N = 6 digital colloids, a figure showing the transition network for the N = 6 digital colloid, and a movie tracking a N = 6 transition event over the intrinsic manifold. See DOI: 10.1039/c6sm01156j Received 19th May 2016, Accepted 1st August 2016 DOI: 10.1039/c6sm01156j www.rsc.org/softmatter Soft Matter PAPER Published on 01 August 2016. Downloaded by University of Illinois - Urbana on 24/08/2016 16:19:15. View Article Online View Journal | View Issue

Upload: ngokiet

Post on 07-May-2018

219 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7119

Cite this: SoftMatter, 2016,

12, 7119

Nonlinear machine learning and design ofreconfigurable digital colloids†

Andrew W. Long,a Carolyn L. Phillips,b Eric Jankowksic and Andrew L. Ferguson*ad

Digital colloids, a cluster of freely rotating ‘‘halo’’ particles tethered to the surface of a central particle,

were recently proposed as ultra-high density memory elements for information storage. Rational design

of these digital colloids for memory storage applications requires a quantitative understanding of the

thermodynamic and kinetic stability of the configurational states within which information is stored. We

apply nonlinear machine learning to Brownian dynamics simulations of these digital colloids to extract

the low-dimensional intrinsic manifold governing digital colloid morphology, thermodynamics, and

kinetics. By modulating the relative size ratio between halo particles and central particles, we investigate

the size-dependent configurational stability and transition kinetics for the 2-state tetrahedral (N = 4) and

30-state octahedral (N = 6) digital colloids. We demonstrate the use of this framework to guide the

rational design of a memory storage element to hold a block of text that trades off the competing

design criteria of memory addressability and volatility.

1 Introduction

Digital colloids,1 reconfigurable clusters of lock-and-key col-loidal particles,2 have recently been proposed as a novel softmatter-based substrate for high-density information storage.A digital colloid comprises freely-rotating but tethered ‘‘halo’’particles that are bound to the surface of a central particle(Fig. 1). Information can be stored within distinguishable con-figurations of the halo particles around the central particle.These structures have been experimentally synthesized via deple-tion mediated binding of dimpled halo particles to the surface ofa central spherical particle,1,2 as well as through programmableDNA interactions.3 These halo particles may be distinguishedthrough a variety of labeling techniques,4 such as functionaliza-tion with complementary Forster Resonant Energy Transfer(FRET) pairs5 or through binding of short oligonucleotides,1

thereby enabling the unique identification of distinct metastable

halo particle configurations that mediate information storageand retrieval.

Storing information in micron-scale colloids has thepotential to advance computing in unconventional environ-ments. A prime example of which is DNA-based computing insolution, wherein arbitrary digital circuitry can be mapped ontoreaction networks of DNA strands.6 One practical limitationof DNA-based computing is the storage of information inthe concentrations of DNA strands, which are essentially uni-form throughout the computing volume. Digital colloids havepotential as compartmentalized, high-density storage elementsthat can diffuse throughout a computing volume, and whichcan in principle be reconfigured during computations. In orderto realize the potential of digital colloids as unconventionalcomputing elements, we require a fundamental understandingof the configurational transitions between storage states.

The number of distinguishable halo particle configurationsin the digital colloid ground state determines the number ofunique ‘‘bit states’’ of a cluster, and therefore its informationstorage capacity. For example, the halo particles of the N = 4digital colloid exist in a tetrahedral ground state, meaning –after the elimination of trivial rotational symmetries – itpossesses exactly two distinguishable arrangements of the fourhalo particles (Fig. 1c). These two available states mean that theN = 4 colloid is a binary storage element that can encode exactlyone bit of information. In general, the number of storablestates is specified by the N! halo particle permutations dividedby the symmetry operator for the ground state configurationalarrangement, UN. The symmetry operator for an N-particlecolloid is determined by the spherical code solution for the

a Department of Materials Science and Engineering, University of Illinois at

Urbana-Champaign, Urbana, IL, 61801, USAb Neurensic, Inc., Chicago, IL, 60604, USAc Micron School of Materials Science and Engineering, Boise State University,

Boise, ID, 83725, USAd Department of Chemical and Biomolecular Engineering,

University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.

E-mail: [email protected]; Fax: +1 (217) 333-2736; Tel: +1 (217) 300-2354

† Electronic supplementary information (ESI) available: Two movies highlightingthe structure and topography of the low-dimensional intrinsic manifolds for theN = 4 and N = 6 digital colloids, a figure showing the transition network for theN = 6 digital colloid, and a movie tracking a N = 6 transition event overthe intrinsic manifold. See DOI: 10.1039/c6sm01156j

Received 19th May 2016,Accepted 1st August 2016

DOI: 10.1039/c6sm01156j

www.rsc.org/softmatter

Soft Matter

PAPER

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

.

View Article OnlineView Journal | View Issue

Page 2: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

7120 | Soft Matter, 2016, 12, 7119--7135 This journal is©The Royal Society of Chemistry 2016

arrangement of the halo particles around the central particle.1,7–10

This arrangement is the solution to the Tammes problem11 – aspecial case of the Thomson problem12 – corresponding to theconfiguration of N halo particles that maximizes the minimumdistance (or equivalently minimum angle) between the haloparticle centers.7–10,13 This configuration determines both thepoint group of the packing structure and the multiplicity of thesymmetry operator. Putatively optimal spherical code solutions upto N = 130 are reported in ref. 14. The spherical code solutionfor N = 4 is a tetrahedron (Fig. 1a) and for N = 6 an octahedron(Fig. 1b) with corresponding point groups, rotational sym-metries, and number of storable states listed in Fig. 1d. Thenumber of bit states that can be written in a digital colloid isgiven by the base-2 logarithm of the number of storable states,log2(N!/UN). The information storage capacity of digital colloidsincreases as O(log2(N!)), making them ideal for high-densityinformation storage.1 For example, a single N = 6 octahedralcluster is capable of storing 4.9 bits (0.613 bytes), and a N = 12icosahedral cluster 22.9 bits (2.87 bytes).

A key feature of these digital colloids is that the halo particles,although strongly tethered to the central particle, are free to moveover the surface of the central particle. Defining the diameter ratiobetween the central and halo particles as L = dcentral/dhalo, thespherical code solution imposes a minimum value LSC for a givensystem of N halo particles, below which the N particles cannot bindto the surface without overlapping. For L above this value butbelow some threshold ratio, LSC r L o LT, each halo particle isconfined within a cage composed of its neighbors, and the digital

colloid is locked into a single bit state. This threshold ratio, LT, isgiven by the minimum value of L needed for collective halo particlerearrangements to access the transition state, and has beenpreviously computed for N = 4–12.1 At L Z LT there is sufficientfree volume for the halo particles to transition between bit statesand the digital colloid becomes unlocked. By actuating halo particletransitions between bit states, information can be written. Byobserving the halo particle bit state, information can be read.

Above the unlocking transition, the digital colloid is nolonger confined to oscillate around the spherical code configu-ration, and can explore a rich and complex morphologicallandscape. The design of information storage elements withtailored free energy barriers sufficiently low to actuate the haloparticle configuration (i.e., a low write free energy) but suffi-ciently high to prevent thermally-activated spontaneous bit stateflips (i.e., low memory volatility) requires an accounting of thethermally accessible configurational state space, transition statefree energy barriers, and mean first passage times as a functionof N and L. Furthermore, designing reliable digital colloidsrequires an understanding of how sensitive bit transitions areto the manufacturing tolerances of their constituent parts.

The configurational state of a digital colloid exists in a2N-dimensional phase space defined by the azimuthal andpolar coordinates of the N halo particles on the surface of thecentral sphere. Excluded volume and any other interactionsbetween the halo particles means that their motions andlocations are correlated, and these cooperative couplings areexpected to define an emergent low-dimensional subspace to

Fig. 1 Schematic of the digital colloid architecture with N distinguishable halo particles with diameter dhalo tethered to the surface of a central particlewith diameter dcentral. The diameters exist in a ratio L = dcentral/dhalo. The spherical code structure defines the densest packing of halo particles around thecentral particle at the minimum diameter ratio LSC for which the halo particles do not overlap,1 and defines the idealized structural arrangement of haloparticles defining a distinguishable bit state. At L = LSC, the halo particles of a (a) N = 4 digital colloid define the vertices of a tetrahedron, and (b) anoctahedron in a N = 6 digital colloid. For LSC o L o LT there is sufficient free volume for the halo particles to rattle around the stable spherical codestructure, but not transition between different structures. Above a critical diameter ratio L 4 LT 4 LSC, there is sufficient free volume for the haloparticles to cooperatively transition between different distinguishable configurations of the halo particles. (c) It is the availability of distinguishable bitstates and transition pathways between them that makes it possible to store information within digital colloids. The N = 4 digital colloid, for example,possesses two distinguishable tetrahedral bit states corresponding to left- and right-handed chiral arrangements. Transitions between these two chiralconfigurations mediated by a structural transition through a square planar arrangement of halo particles. (d) Table of the symmetry UN and informationstorage capability of the N = 4 and N = 6 digital colloids. The more rotationally distinguishable bit states available the higher the information storagecapacity. By using configurations for information storage, these digital colloids are capable of ultra-high information storage density. For example, N = 6digital colloids composed of 3-methacryloxypropyl trimethoxysilane (TPM) colloidal particles with density r = 1.228 g cm�3 36 and particle diameters of1 mm possess an estimated memory density of 1.1 TB g�1.

Paper Soft Matter

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 3: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7121

which the digital colloid dynamics are effectively restrained.15 Thislatent low-dimensional manifold supports the free energy surfacecontaining the thermally-accessible system configurations, stableand metastable states, and transition pathways between them.Determining this landscape provides quantitative understandingof the system morphology, thermodynamics, and kinetics.

Dimensionality reduction techniques have demonstrated goodsuccess in recovering these low-dimensional manifolds for bothsingle molecule16–18 and many-body self-assembling15,19,20 systems.Recently, we (A. W. L. and A. L. F.) developed an extension of thediffusion map nonlinear dimensionality reduction technique21–23

for many-body colloidal systems to infer low-dimensional assemblylandscapes from both simulation15 and experimental particletracking data.19 Building upon these foundations, in this workwe combine Brownian dynamics simulations, nonlinear machinelearning, and first passage time analysis to quantify the stable andtransition state morphologies and determine the transitionmechanisms, free energy barriers, and transition rates for bit stateswitching for N = 4 and N = 6 digital colloids. We use this newquantitative understanding to computationally design addressabledigital colloids with tailored kinetic stability and actuation freeenergies, providing a step towards the engineering of high-densitydigital colloid information storage media.

2 Materials & Methods

We study the morphology, thermodynamics, and kinetics ofdigital colloids comprising N = 4 and N = 6 halo particles usingBrownian dynamics simulations, nonlinear dimensionalityreduction, and first passage time analysis.

2.1 Brownian dynamics simulations of digital colloids

Following the protocol previously detailed in ref. 1 and 24, weperformed Brownian dynamics simulations of N = 4 and N = 6digital colloids using the HOOMD-blue GPU-based molecular simu-lation package (http://glotzerlab.engin.umich.edu/hoomd-blue/).25–27

The spherical halo particles are constrained to glide over the surfaceof the spherical central particle using a radial restoring forceperpendicular to the free particle direction of motion. This restoringforce provides a rigid constraint binding the halo particles tothe central particle surface,1 implemented via the HOOMDconstrain.sphere module (http://hoomd-blue.readthedocs.io/en/stable/module-md-constrain.html). The diameter of thecentral particle dcentral regulates the geometric confinement ofthese halo particles, which is defined by the ratio of central tohalo particle diameters, L = dcentral/dhalo. The central particle istreated as static and inert, while halo particles i and j interactthrough a surface-shifted Weeks–Chandler–Anderson (WCA)pair potential to account for excluded volume interactions,28

u rij� �

¼4e

srij � D

� �12

� srij � D

� �6" #

þ e; rij o 2ð1=6Þsþ D

0; rij � 2ð1=6Þsþ D

8>><>>:

(1)

where rij is the center of mass separation between the haloparticles, e is the interaction strength, s is the size parameter,and the parameter D = 2s shifts the WCA potential to aneffective surface of the halo particles. We perform our simula-tions in a dimensionless gauge such that energy is measured inunits of e and distance in units of s. The reduced temperature is

defined as T* = kBT/e, and reduced time as t� ¼ t. ffiffiffiffiffiffiffiffiffiffiffiffiffi

ms2=ep

where m is the mass of a halo particle.The surface-shifted WCA potential is a convenient computa-

tional approximation for particles possessing hard coreinteractions,29,30 and can be mapped to a hard sphere fluidusing a perturbative generalization of the Rowlinson schemedeveloped by Barker and Henderson.31–33 Employing eqn (12)in ref. 31, we determine an effective hard sphere diameterof the halo particles of dhalo = 3.0785s. The central particlediameter is defined as dcentral = 2dcentral–halo � dhalo, wheredcentral–halo is the constrained center of mass distance betweenthe central and halo particles. This mapping of our WCA haloparticles to effective hard spheres of diameter dhalo = 3.0785senables us to calculate the spherical code diameter ratio LSC

and threshold diameter ratio LT based on purely geometricconsiderations.1,7 These hard sphere values do not hold preciselyfor our soft particles, but the steepness of the WCA potential andmoderate temperatures employed in our calculations mean thatthey do behave as approximate hard spheres. Specifically, weobserve no pair of halo particles i and j to approach closer thanrij = 2.95s corresponding to a pairwise interaction energyu(rij) = 2.96e. The values of LSC and LT reported for each of thedigital colloid architectures should therefore be interpreted asapproximations based on the effective hard sphere mappings.

Simulations were performed at a reduced temperature ofT* = 0.1 and the Brownian dynamics (overdamped Langevindynamics) equations of motion numerically integrated usingvelocity verlet in the NVE ensemble and a Langevinthermostat34 with a time step of dt* = 10�3.35 Each simulationwas initialized with L0 c LT 4 LSC and the halo particlesrandomly distributed and free to move over unencumberedover the central particle surface. The central particle was thengradually shrunk to the desired value of L over the course of1.1 � 105 time steps to allow the system to equilibrate beforeconducting a 109 time step production run.

To make contact between our dimensionless simulationsand real units, we adopt a characteristic interaction strengthof e = 10 kBT at 298 K between micron-sized halo particles withs = 0.915 mm (dhalo = 2.82 mm) fabricated from 3-methacryl-oxypropyl trimethoxysilane (TPM) with density r = 1.228 g cm�3.36

Under this choice the reduced temperature T* = 0.1 maps toT = 298 K, the reduced time step dt* = 10�3 to dt = 0.54 ms, andthe reduced simulation time t* = 109 � 10�3 = 106 to t = 540 s.

2.2 Digital colloid dimensionality reduction

Simulations of the digital colloid dynamics provide all particlecoordinates as a function of time, furnishing, in principle, all ofthe morphological, thermodynamic, and kinetic details governingtransitions between different configurations of the distinguishable

Soft Matter Paper

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 4: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

7122 | Soft Matter, 2016, 12, 7119--7135 This journal is©The Royal Society of Chemistry 2016

halo particles defining the information storage bit states. Inpractice, extracting this information from the trajectories ischallenging due to the high dimensionality of the system: aparticular configuration of N halo particles rolling around thesurface of a central particle exists as a point in a 2N-dimensionalphase space defined by the surface locations of each haloparticle. For sufficiently small values of L the accessible phasespace volume is, however, drastically reduced by excludedvolume interactions that give rise to cooperative halo particlerearrangements.1 These cooperative couplings are expected toyield a separation of time scales, wherein local caged oscillationswithin a stable bit state give rise to high-frequency motions,and transitions between bit states are rare events with slowcharacteristic time scales. We anticipate that these cooperativetransition pathways correspond to a small number of slowcollective modes of the halo particle dynamics.15,19,37 Extractingthese slow modes from the simulation trajectory reveals themulti-body transition pathways governing the digital colloiddynamics, and also presents kinetically meaningful collectivecoordinates in which to construct low-dimensional free energysurface mapping out the accessible morphologies, stable states,and dynamical transition pathways.15,16,18,19,38–41

Nonlinear dimensionality reduction provides a means todetermine the low-dimensional hypersurface – the so-called‘‘intrinsic manifold’’ – within the high-dimensional phasespace to which the multi-body digital colloid dynamics areeffectively restrained.15,19 Geometrically, the intrinsic manifoldis the low-dimensional phase space volume parameterized by asmall number of collective variables that contains the stablecolloidal bit states and the dynamic transition pathwaysbetween them.15,19,38,39 Temporally, the existence of the intrinsicmanifold can be viewed through the Mori–Zwanzig formalism asthe emergence of a small number of slowly-evolving collectivecoordinates to which the remaining system degrees of freedomare slaved as effective noise.15,19,23,38,39,42 Compared to lineardimensionality reduction techniques such as principal compo-nents analysis (PCA)43 or multidimensional scaling (MDS),44

nonlinear methods provide greater flexibility in extractingcollective modes that are nonlinear combinations of the particledegrees of freedom,16,39,45 enabling the recovery of more generaland complex intrinsic manifolds.

Path-based approaches such as transition path sampling,46

finite temperature string,47 and the nudged elastic band48 seekto compute high-probability (low-free energy) reactive pathwaysbetween two metastable system configurations. These pathscontain information on the reaction coordinate, transition stateensemble, isocommittor surfaces, and reaction mechanismfor the structural transition between the metastable basins.49

In contrast, nonlinear dimensionality reduction approaches seekto discover within high-dimensional simulation trajectories –including ensembles of simulations performed under differentconditions18 or biased simulations to which artificial restrainingpotentials are applied to enhance sampling50 – low-dimensionalprojections of the configurational phase space containing themetastable states and transition pathways between them. Thesereduced dimensional maps are valuable in identifying the

metastable states, dynamical pathways and transition mechanisms,and also furnish good order parameters describing the emergentcollective motions governing the long-time dynamics of thesystem.39 Accordingly, the aims and utility of the two classes ofmethods are slightly different. Path methods are preferred wherethere exists an unambiguous definition of the reactant and productstates and the objective is to obtain a precise characterization of thetransition path ensemble, transition mechanisms, or reaction ratesbetween these two basins.49 Nonlinear dimensionality reductionmethods may be preferred where good order parameters dis-tinguishing the metastable states are sought or it is theobjective to obtain a global overview of the available metastablestates, their mutual connectivity, and slow collective structuralrearrangements. Of course only those metastable stateswithin the simulation trajectories to which the dimensionalityreduction methodology is applied can be resolved, andalthough accelerated sampling techniques can facilitate barriercrossing, path-based methods are typically more adept atsurmounting high free energy barriers and may be moreefficient at resolving transition pathways within rugged freeenergy surfaces. Finally, nonlinear dimensionality reductionmay be profitably combined with path sampling as a preprocessingstep to identify metastable states, deliver good collective variableswith which to distinguish them, and determine initial transitionpaths with which to initialize path sampling.38,39

We (A. W. L. and A. L. F.) recently reported a novel approachextending the application of the nonlinear diffusion maptechnique to multi-body systems to extract aggregate morphologyand assembly mechanisms from computer simulations of patchycolloid self-assembly15 and experimental particle tracking dataof Janus colloid self-assembly.19 In this work, we apply thistechnique to Brownian dynamics simulations of digital colloidsto extract the intrinsic manifold governing digital colloidmorphology and transitions. We present a complete descrip-tion of the many-body diffusion map in ref. 15, but we brieflydetail the approach below, including system specific adapta-tions for the digital colloid system.

2.2.1 Microstate pairwise distances. The metastable bitstates of the digital colloids are morphologically identical inthat they correspond to the same configurational arrangementof halo particles around the central sphere.1 Informationstorage is only possible insofar as all (or some) of the haloparticles are distinguishable, thereby permitting unambiguousdiscrimination between all (or some) of the bit state configura-tions. In this work we consider all N halo particles to bedistinguishable such that there are N! relabelings cluster thatare reduced by the rotational symmetry of the bit state UN toyield an information storage capacity of log2(N!/UN) bits.

Our nonlinear dimensionality reduction algorithm requiresas input a measure of similarity distances between differentdigital colloid configurations observed over the course of thesimulation. When we measure the distance between differenthalo particle configurations, we assume the halo particles areindistinguishable, but retain the halo particle label informationfor a posteriori identification of the distinguishable bits stateswithin the low-dimensional projection. By separating morphology

Paper Soft Matter

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 5: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7123

and labeling when sampling the configurational phase spacewe achieve an important practical advantage. Since each of theN! bit states are morphologically identical, treating the haloparticles as indistinguishable allows for comprehensive samplingof the identical metastable basin of all bit states within relativelyshort simulation trajectories. In contrast, treating halo particlesas distinguishable would result in the emergence of (N!/UN)identical metastable free energy wells within the intrinsic mani-fold, many of which would be incompletely sampled, and somenever sampled at all. For small digital colloids containing fewstates, this is unlikely to present difficulties, but for larger colloidscontaining hundreds of thousands of states, sampling the intrinsicmanifold without this simplification presents a practical challenge.

For digital colloids comprising N halo particles we compile alibrary L(N,{L}) of all digital colloid microstates observed inour simulation trajectories over a range of central to haloparticle diameter ratios L. We compute v the N-by-N pairwisedistance matrix between all pairs of halo particles in a particulardigital colloid microstate in L(N,{L}). We normalize pairwisedistances by the minimum distance between vertices in theidealized regular polyhedron with equispaced halo particle verticesfor a given value of L. For a central to halo particle distancedcentral–halo = dhalo(L + 1)/2, the edge length of the idealized

tetrahedron in the N = 4 system is dhalo�halo0 ¼

ffiffiffiffiffiffiffiffi8=3

pdcentral�halo,

and that of the idealized octahedron in the N = 6 system is

dhalo�halo0 ¼

ffiffiffi2p

dcentral�halo. In each case, we define the normal-ized pairwise distance matrix v0 = v/dhalo–halo

0. By normalizingwith respect to the ideal polyhedral edge length, we enable directcomparisons between digital colloid microstates across differentL values.

Using the v0 computed for each digital colloid microstate inL(N,{L}), we can define a fast, rotationally invariant, graph-based measure of structural similarity between pairs of digitalcolloid microstates i and j,

dij ¼ minH

wi0 �HTwj

0H

��� ���; (2)

where H is a permutation matrix rearranging the rows andcolumns of vj

0. That is, our measure is the minimum of thepermuted entrywise L1,1 norm between halo particle distancematrices. Minimizing over permutations is equivalent to mini-mizing over relabelings of halo particles. For the N = 4 and N = 6systems considered in this work, the number of permutationsis quite small, N! = 24 and N! = 720, permitting us to determinethe minimum permuted distance by brute force enumeration.Exhaustive enumeration will become intractable for sufficientlylarge systems, and it will be necessary to resort to approximategraph matching approaches such as those we have previouslyemployed in ref. 15.

2.2.2 Diffusion maps. Given our measure of structuralsimilarity between all pairs of digital colloid microstates inL(N,{L}), we apply the diffusion map nonlinear dimensionalityreduction technique21–23,39,51 to extract the collective orderparameters governing structural transitions between digitalcolloid morphologies. These collective coordinates are associatedwith the slow modes governing the long-time structural evolution

of the system to which the remaining motions are effectivelyslaved. The diffusion map approach was first introduced byCoifmann and coworkers,21–23,51 and has been used to determinethe underlying pathways and mechanisms underpinning bothmolecular folding18,40,45 and colloidal assembly.15,19,20

In the diffusion map approach, a random walk is constructedover the high-dimensional data with transition probabilitiesbased on structural similarity. Then a spectral analysis of therandom walk is performed to identify the dimensionality andcollective coordinates of a low-dimensional intrinsic manifold towhich the system dynamics are effectively restrained.15,19,38,39

The approach proceeds by forming the kernel matrix A byconvoluting our pairwise distance matrix with a Gaussian kernel,Aij = exp(�dij

2/2e), where e is a soft-thresholding bandwidth thatrestricts transitions between structures to those within ane-neighborhood in the high dimensional space. This procedureserves to model structural transitions between microstates as adiffusion process over the discrete point cloud of digital colloidmicrostates in the high-dimensional space.21 The value of e isspecified for each system using an automated procedure detailedin ref. 52.

We next form the right-stochastic Markov matrix M = D�1A,where D is the diagonal matrix Dii ¼

Pk

Aik. The element Mij

can be interpreted as the transition probability from microstatei to microstate j, and therefore M describes a random walk overthe data.21,22 M is positive-semidefinite, right stochastic, andadjoint to a symmetric matrix MS = D1/2MD1/2, so possesses an

orthogonal set of real right eigenvectors {~Ci} and associated

eigenvalues {0 r li r 1} with the trivial top pair ~C1 ¼~1 andl1 = 1.38 These eigenvectors are discrete approximations tothe eigenfunctions of the Fokker–Planck equation describingthe time evolution of the probability density function over thedata,21,22 with eigenvalues corresponding to the frequency ofthese collective harmonic modes. Large eigenvalues correspondto the slow (e.g. long time) relaxation modes of the Markovprocess, and small eigenvalues correspond to fast modes. A gapin the eigenvalue spectrum corresponds to a separation ofrelaxation time scales, with collective modes above the gapcorresponding to the slow subspace modes governing long timestructural evolution to which the remaining fast modes arerestrained.23,38,42 We employ the L-method of Salvador andChan to systematically identify the location of this spectralgap.15,18,53

Identifying a gap in the eigenvalue spectrum after l(k+1)

prompts construction of a low-dimensional landscape in thetop k nontrivial eigenvectors furnished by the diffusion map{~Ci}

k+1i=2 . The low-dimensional projection of the ith microstate in

our library L(N,{L}) is specified by the ‘‘diffusion mapping’’into the ith component of these top k eigenvectors,

microstatei 7! ~C2ðiÞ; ~C3ðiÞ; . . . ; ~Ckþ1ðiÞ�

: (3)

This projection defines the nonlinear dimensionality reductionfrom the 2N-dimensional phase space of halo particle angularcoordinates to the k-dimensional intrinsic manifold.

Soft Matter Paper

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 6: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

7124 | Soft Matter, 2016, 12, 7119--7135 This journal is©The Royal Society of Chemistry 2016

One of the primary limitations of the diffusion map, andnonlinear dimensionality reduction methods in general – withthe exception of autoencoders54 – is that the technique does notfurnish a mapping for these collective variables to the inputfeatures of the original high dimensional data. While auto-mated techniques exist to sieve and approximate these topeigenvectors from a candidate feature pool,55,56 these algorithmicfeatures can be so complex that attributing physical character-istics to these eigenvectors can still be exceedingly difficult.Further, there is no guarantee that a concise physical mappingexists for the nonlinear order parameters describing these many-body systems.15,39 To aid in physical interpretation, we correlatethe diffusion map variables by physical ‘‘bridge’’ variables suchas the polyhedral volume or the distance of a configuration froman idealized polyhedron.

‘‘Diffusion distances’’ in the high-dimensional space –defined precisely in ref. 21 and 51 – may be colloquiallyinterpreted as the ease with which two microstates may inter-convert, and are approximated by Euclidean distances in the low-dimensional manifold. This property of diffusion map embeddingsprovides a direct kinetic interpretation of distances betweenmicrostates in the low-dimensional embedding, which holdsunder the mild assumptions that (i) the dynamics of the systemmay be described by a diffusion process and (ii) that ourmeasure of pairwise similarity effectively captures short timediffusive motions of the system.15,21,38,39,51 Regarding the firstcondition, we expect that the coupled motions of the haloparticles and cooperative motion required for transition eventsbetween bit states will lead to the existence of a small set of slowcollective modes describing the long time dynamical evolution.This hypothesis is supported post hoc by the emergence of aspectral gap in M and the small number of modes existing abovethis gap. This separation of time scales into slow and fastcollective modes suggests that the dynamics of this digitalcolloid system can be modeled under the Mori–Zwanzig formalismas a set of coupled stochastic differential equations in the slowcollective modes to which the remaining fast modes couple asnoise.42,57 Accordingly, the system dynamics are well approximatedas a diffusion process. Regarding the second condition, we haveshown in ref. 15 and 19 that our graph-based distance metric formany-body systems provides a good measure for structuralproximity over short time scales, and therefore captures theshort time diffusive motions.

2.3 Free energy surfaces

We construct free energy landscapes over the k-dimensionalintrinsic landscape defined by the diffusion map embedding(eqn (3)) for each digital colloid. The free energy landscapeis constructed by collecting histograms over all microstatescollected over the course of the simulation projected into thelow-dimensional projection, and employing the statisticalmechanical relationship,58,59

F*(L,~x)/T* = �ln P(L,~x) + C(L), (4)

where L is the diameter ratio, T* = kBT/e is the reducedtemperature at which we conduct our simulations, ~x is a

k-dimensional vector specifying a point on the k-dimensionalintrinsic manifold defined by the diffusion map, P(L,~x) is ahistogram approximation to the probability density of digitalcolloid microstates at ~x for a specific value of L, F*(L,~x) is thereduced free energy of the configurations at location ~x anddiameter ratio L, and C(L) is an arbitrary additive constant thatmay differ for each diameter ratio. At constant temperature andvolume, F* = F/e is identifiable as the reduced Helmholtz freeenergy. In applying eqn (4) to estimate the equilibrium freeenergy surface from the empirical probability distributioncomputed from the Brownian dynamics trajectories, we assumethat our simulations are sufficiently long to return convergedestimates of the probability distribution over the low-dimensionalmanifold. Computing F* = F/e from our simulations at areduced temperature T* = 0.1, we identify bF = eF*/kBT. Byconverting these units to room temperature T = eT*/kB = 298 K,and a characteristic interaction energy of e = 10 kBT, wecan calculate bF = eF*/kBT = F*/T* = 10F* in units of kBT atT = 298 K.

Importantly, although each histogram P(L,~x) is constructedfrom an independent simulation trajectory performed at adifferent value of L, the diffusion map was constructed froma library of microstates L(N,{L}) collected over all values of L.This permits each individual simulation trajectory to be pro-jected into a shared intrinsic manifold spanned by the samenonlinear basis vectors {~Ci}

k+1i=2 , and allowing us to quantify the

impact of L upon the underlying free energy landscape of thedigital colloid of fixed size N. We have previously referred tothis approach as the composite diffusion map approach.18 Wechoose the arbitrary constant C(L) for each value of L so as toshift the global free energy minimum to zero (i.e., C(L):min~xF(L,~x) = 0). Uncertainties in the free energy landscapewere computed by performing 10 bootstrap resamples (randomresampling with replacement) of the data used to compile thehistograms at each L.

2.4 Determining the digital colloid bit state

Up until now, the halo particles in the calculation have beentreated as indistinguishable. To discern one bit state fromanother, we must now perform a post hoc accounting of thedistinguishable states of each of the digital colloid microstruc-tures projected into the intrinsic manifold. For the N = 4 andN = 6 digital colloids considered in this work, we employ asimple procedure to identify the distinguishable bit state of agiven microstate based on the N-by-N halo particle adjacencymatrix G and the halo particle positions. We compute thesymmetric binary adjacency matrix G between the halo particlesof any particular digital colloid configuration from the pairwisedistances matrix v by specifying a threshold separation of 110%of the halo particle separation in the idealized polyhedralconfiguration. Halo particles i and j closer than this distanceare defined as bonded (Gij = 1) and those at larger separationsare not (Gij = 0). Thus the adjacency matrix maps the set of allpossible microstates to a finite set of structures by applying aproximity criterion between halo particles. We have verifiedthat our results are robust to choices of cutoff over the range

Paper Soft Matter

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 7: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7125

105% to 130% of the ideal polyhedron halo particle separation.The symmetric binary adjacency matrix G cannot, however, dis-tinguish between two states that rotationally distinguishable butrelated through a mirror reflection, and distinguishing betweenthese states requires additional analyses described below.

2.4.1 N = 4 bit states. The tetrahedral structure formed bythe N = 4 digital colloid belongs to the Td point group, withrotational symmetry U4 = 12. Given the N! = 24 possiblepermutations of halo particle labelings, we have N!/UN = 2rotationally distinguishable states that can be stored in theconfiguration of these halo particles, corresponding to the twochiral enantiomers of the tetrahedral bit state. We distinguish thetwo bit states through their chirality (i.e., handedness) by com-puting the signed volume, V, of the parallelepiped connecting thehalo particle centers of mass using the triple product. The sign ofthis volume, sgn(V), determines the handedness of the tetra-hedron, allowing us to distinguish between left- and right-handed structures and discriminate between the two bit states.

2.4.2 N = 6 bit states. The octahedral structure defined bythe N = 6 digital colloid belongs to the Oh point group withrotational symmetry U6 = 24, giving a total of N!/UN = 30rotationally distinguishable bit states. These 30 bit statescorrespond to 15 possible unique permutations of the adjacencymatrix of an octahedron, each of which has two distinct chiral-ities owing to a left- or right-handed arrangement of the haloparticles occupying the mid-plane of the octahedron. To identifya digital colloid as belonging to a particular bit state, we firstclassify the structure as containing one of the 15 unique adja-cency matrices. To distinguish the two chiral alternatives withineach of these 15 matrix classes, we tag the particle correspondingto the first row of the adjacency matrix and specify that particleto reside at the top of the digital colloid (e.g., the red halo particlein Fig. 1b). We then identify the three halo particles corres-ponding to the first three non-zero entries of the first row ofthe adjacency matrix as three neighboring particles residing inthe mid-plane of the octahedron (e.g., the green, yellow, andorange particles in Fig. 1b). The signed volume, V, of theparallelepiped connecting these four halo particle centers ofmass distinguishes between a clockwise or counter-clockwisearrangement of halo particles in the mid-plane, permittingunambiguous identification of the right- and left-handed chiralbit states. By pairing the adjacency matrix and chirality informa-tion, the instantaneous configuration of an arbitrary sized digitalcolloid can be unambiguously assigned to one of its distinguish-able bit states.

2.5 Bit state transition kinetics

Having determined the rotationally distinguishable bit state ofeach digital colloid microstructure following the approachdetailed in Section 2.4, we can track transition events aschanges in colloidal bit state over time. To quantify the transi-tion kinetics and the dwell time (or volatility) of each bit state,we compute from our simulations the mean first passage time(MFPT) as the average duration for which a bit state is occupiedbefore a transition occurs.42,60 The minimum first passage timeidentified in our simulation trajectories is 80 times larger that

the sampling period, assuring that the identified transition eventsare ‘‘elementary transitions’’ in the sense that they correspond to asingle structural rearrangement events. We define for each bitstate a core region comprising an ensemble of configurationsaround the idealized bit state structure. The phase space betweencore regions comprises a ‘‘no man’s land’’ belonging to none ofthe bit states. The digital colloid is considered to occupy aparticular bit state when it passes into its corresponding coreregion, and to have left that bit state only when it passes into thecore region of a new bit state. Transient excursions into the noman’s land and then back into the same bit state core region arenot considered transitions. The use of non-contiguous core regionsseparated by a no man’s land leads to improved MFPT estimatesby eliminating diffusive recrossing events of the separatrix betweenbit states that can artificially depress the MFPT estimate.61 Theparticular definition of the core regions depends on the architec-ture of the digital colloid under consideration and is detailed inSection 3. Having defined the core regions we analyze ourBrownian dynamics production run to extract for each of thek = 1. . .K bit states all of the i = 1. . .P(k) first passage times of thesystem out of that bit state {FPTk

i }P(k)i=1 . The MFPT for bit state k then

follows straightforwardly as MFPTk ¼ 1

PðkÞPPðkÞi¼1

FPTki . Moreover,

since the bit states are all structurally identical, we can averagethe estimate over all N equivalent bit state transitions

MFPT ¼ 1

N

PKk¼1

PPðkÞi¼1

FPTki . Similarly, we compute the variance

of the first passage time as s2 ¼ 1

N

PKk¼1

PPðkÞi¼1

FPTki �MFPT

� �2.

Using the central limit theorem, we estimate 95% confidenceintervals from the sampling distribution of the MFPT as

MFPT� 1:96sffiffiffiffiNp ;MFPTþ 1:96

sffiffiffiffiNp

�.

The reciprocal of the MFPT is the escape rate G = 1/MFPT,42

which, since all bit states for a particular digital colloid arestructurally equivalent, is identical for all bit states. For theN = 4 system, the escape rate is precisely equal to the transitionrate between the two chiral bit states. For the N = 6 case, theescape rate is equal to the transition rate from any particularbit state to any one of eight other rotationally distinguishablebit states accessible by a single halo particle rearrangementevent (Fig. S1, ESI†). This quantification of mean first passagetimes and escape rates can be straightforwardly extended todigital colloids of arbitrary size.

As detailed in Section 2.1, we adopt experimentally relevantparameter values of e = 10 kBT at 298 K, s = 0.915 mm, andr = 1.228 g cm�3.1,36 This maps our reduced time stepdt* = 10�3 to dt = 0.54 ms, and permits us to report the MFPTin seconds and the escape rate in inverse seconds.

3 Results & discussion3.1 N = 4 (2-state) digital colloids

Using the approach detailed in Section 2.1, we performedBrownian dynamics simulations of the tetrahedral N = 4 digital

Soft Matter Paper

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 8: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

7126 | Soft Matter, 2016, 12, 7119--7135 This journal is©The Royal Society of Chemistry 2016

colloid for L = {0.4000, 0.4142, 0.4400, 0.5000, 0.7500, 1.0000,1.1256}. In each case we held the halo particle diameter fixedand varied the size of the central particle. These values of Lspan three regimes: (I) (LSC E 0.2247) r L o (LT = 0.4142)where the central particle diameter is sufficiently small that thedigital colloid bit state is effectively locked since the haloparticles possess insufficient free volume to transition to anew bit state, (II) (LT = 0.4142) r L o (LU = 0.5361) wherethe central particle is sufficiently large such that the cluster isunlocked and transitions between bit states are permitted, and(III) LZ (LU = 0.5361) where the central particle is so large thatbit state transitions are effectively barrierless and the digitalcolloid remains in constant flux between states. LSC E 0.2247is the minimum diameter ratio at which the halo particles canexist on the surface of the central particle without overlaps1

under the effective hard sphere mapping with dhalo = 3.0785s.(As observed above, the soft WCA potential does admit a closerdistance of approach but we never observe two halo particles toapproach closer than 2.95s over the course of any of ourcalculations.) The threshold LT = 0.4142 is the transitionthreshold diameter ratio below which bit state transitionsare geometrically forbidden under the effective hard spheremapping, and was previously computed from geometric con-siderations in ref. 1. The threshold LU = 0.5361 is defined by thevalue of L for which the free energy barrier height for bit statetransitions falls to 1 kBT and the transition free energy barrier iscomparable to the size of thermal fluctuations (cf. Fig. 4a).

From each simulation we extract 2 � 106 digital colloidmicrostates saved every 500 time steps, and aggregate theseinto a single composite library L(N = 4, L = {0.4000, 0.4142,0.4400, 0.5000, 0.7500, 1.0000, 1.1256}) comprising 1.4 � 107

microstates. Constructing diffusion maps over such a largenumber of microstates is computationally intractable, so wereduce the number of microstates used to construct the diffu-sion map embedding by subsampling the data. In principle,this could be done uniformly, but to ensure that we have good

coverage of the structural conformations we instead takeadvantage of the vast number of redundant (i.e., nearly identical)microstates in our library. We identify from each simulationtrajectory the set of unique adjacency matrices G and assign eachconformation in the trajectory as belonging to one of theseprototypical architectures. Aggregating the different architecturelists across all values of L yields a set of 34 prototypicalarchitectures. To obtain good coverage over this range of struc-tures, we sample 30 configurations from each architecture (whenthere are less than 30 configurations, we select all observedmicrostates) to define a representative set of 941 digital colloidmicrostates over the various values of L.

We then applied diffusion maps to these 941 microstatesusing the procedure detailed in Section 2.2.2, employing asoft-thresholding bandwidth of e = exp(3) determined via themethod defined in ref. 52. Using the L-method,53 we identifieda gap in the eigenvalue spectrum beyond l3, prompting theconstruction of a two-dimensional landscape in the top twonon-trivial diffusion map eigenvectors (C2, C3). The remaining(1.4 � 107 � 941) configurations were projected into this two-dimensional intrinsic manifold using the Nystrom extension.62–64

By constructing a diffusion map over microstates collected fromsimulations spanning all values of L, we can project eachsimulation trajectory into a shared intrinsic manifold andquantify the impact of L upon the morphology, stability, andkinetics of the digital colloid within a unified low-dimensionalbasis.18

Morphology and stability. We present in Fig. 2a the reducedensemble of digital colloid structures observed at the transitionthreshold value of LT = 0.4142 in the two dimensional intrinsicmanifold spanned by (C2, C3). Each point in this manifoldcorresponds to a single digital colloid configuration observed inour simulation. To aid in visualization, we have superimposed ontothe manifold renderings of representative snapshots visualizedusing VMD.65 Given a particular labeling of the distinguishablehalo particles (A – red, B – yellow, C – blue, D – green) illustrated in

Fig. 2 Diffusion map embeddings of the N = 4 tetrahedral digital colloid at the unlocking threshold LT = 0.4142. (a) Embedding into the top twocollective modes (C2, C3) furnished by the diffusion map. Each point represents one of the 2 � 106 microstates recorded over the course of thissimulation trajectory. To assist in interpretation of the embedding, we present visualizations of representative microstates and color points by theabsolute volume of the parallelepiped jVj ¼ AB

�! � AC�!� AD

�!� connecting the centers of mass of the halo particles, where XY��!

denotes the unit vectorpointing from X to Y. The magnitude of V provides a measure of the deviation of the halo particle configuration from planarity: |V| = 0 corresponds to all

halo particles residing in a single plane, and jV j ¼ffiffiffi2p

=2 to an idealized tetrahedral arrangement. (b) Augmentation of the 2D intrinsic manifold in panel (a)with a third axis containing the signed volume of the parallelepiped V. The sign of V distinguishes the chirality of the two tetrahedral bit states: V 4 0corresponds to right-handed tetrahedra and V o 0 to left-handed configurations. (c) Graphical illustration of the triple product defining the signed

parallelepiped volume V ¼ AB�! � AC

�!� AD�!�

.

Paper Soft Matter

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 9: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7127

Fig. 2c, we compute for each digital colloid microstate the vector

triple product V ¼ AB�! � AC

�!� AD�!�

defining the signed volume

of a parallelepiped connecting the halo particle centers of mass,

where XY�!

denotes the unit vector pointing from X to Y. Byemploying unit vectors, the magnitude of V is bounded on theinterval [0,1] irrespective of L and can be interpreted as a measureof the deviation of the halo particle configuration from planarity:|V| = 0 corresponds to all halo particles residing in a single plane,

and jVj ¼ffiffiffi2p

=2 to an idealized tetrahedral arrangement. The signof V distinguishes the chirality of the two tetrahedral bit states:V 4 0 corresponds to right-handed tetrahedra and V o 0 to left-handed configurations.

Coloring each point in Fig. 2a by |V| reveals a strongcorrelation between C3 and |V|, indicating that C3 was dis-covered by the diffusion map as a collective variable measuringthe relative planarity of the halo particle configuration. Theminimum value of |V| = 0 is attained at the top of the tear-shaped intrinsic manifold in the region of (C2 E 0.01, C3 E 0.085)where the digital colloid adopts a square planar geometry.

The largest values of jV j �ffiffiffi2p

=2 are reached at (C2 E �0.03,C3 E �0.04) where the digital colloid adopts approximatelytetrahedral geometries. The square planar geometry defines thetransition state for interconversions between bit states.1 At thetransition threshold value of LT = 0.4142 the transition statecomprises essentially a single point within the intrinsic mani-fold since the free volume available to the halo particles at thetransition state is so small as to admit essentially a uniquesquare planar configuration. Contrariwise, since (LT = 0.4142) 4LSC

the tetrahedral structures occupy a relatively larger volume atthe bottom of the tear-shaped intrinsic manifold correspondingto a diversity of structures defining the metastable bit staterattling around the caged volume centered on the idealizedtetrahedral geometry.

As a result of our choice to construct diffusion map embeddingswithout regard to halo particle distinguishability (Section 2.2.1),the two rotationally distinguishable chiral bit states accessible tothe N = 4 digital colloid are collapsed together onto the intrinsicmanifold in Fig. 2a. In order to separate these two bit stateswe present in Fig. 2b the diffusion map embeddingaugmented with a third axis containing the signed triple product

V ¼ AB�! � AC

�!� AD�!�

as a measure of both microstate morphology

and chiral bit state. This augmented embedding separates thetwo chiral bit states and makes clear their structural equivalence,with each occupying a topographically identical volume of theintrinsic manifold and connected by the square planar transitionstructure occupying the slim bottleneck at V = 0.

In Fig. 3, we present the augmented diffusion map embeddingsfor all seven values of L. In each case we present both the low-dimensional diffusion map projection into (C2, C3, V) of the2 � 106 microstates recorded over the course of this simulationtrajectory, and the free energy surface over the intrinsic mani-fold F(C2, C3, V) defined by collecting histograms over the embed-ding. As detailed in Section 2.3, by assuming a characteristicinteraction strength between halo particles of e = 10 kBT at 298 K

(cf. eqn (1)), we report bF as a dimensionless free energy measuredin units of kBT at T = 298 K. In Movie S1 in the ESI,† we presentrotating images of the images in Fig. 3 that more clearly illustratethe shape and topography of the intrinsic manifolds and freeenergy surfaces.

For (L = 0.4000) o (LT = 0.4142) (Fig. 3a and b), structuraltransitions are effectively forbidden and the digital colloid islocked into the bit state in which it was initialized. The squareplanar transition state at V = 0 is effectively inaccessible dueto excluded volume interactions and is never visited over thecourse of our simulations.

At L = (LT = 0.4142) (Fig. 3c and d), the digital colloid is atthe unlocking threshold and is able to access the square planartransition state |V| = 0 and make transitions between the twobit states. The transition bottleneck exists at the top of a highD(bF) = (9.07 � 0.23) free energy barrier. Accordingly, bit statetransitions are an activated process and a rare event, and weobserve only three bit state transitions over the course of thesimulation. The structural equivalence of the two chiral bitstates means that the topography of the free energy surfacewithin each metastable basin should be identical. The observeddifferences in the empirical free energy values computed withinthe two basins is a consequence of kinetic trapping and raretransition events leading to poor sampling of the accessiblephase space.

For L 4 (LT = 0.4142) (Fig. 3e–n), the volume of thetransition state ensemble at |V| = 0 expands as the increasedfree volume available to the halo particles admits a greaterdiversity of square planar transition structures. There is acommensurate drop in the transition barrier to D(bF) =(5.15 � 0.07) at L = 0.4400, to D(bF) = (1.93 � 0.02) at L = 0.5000,before the transitions become essentially barrierless withD(bF) o 1 for L 4 (LU = 0.5361).

We present in Fig. 4a the free energy barrier as a functionof L. An exponential fit of the form D(bF) = C0 exp(�C1L)provides a good fit to the data, with best fit parameters{C0 = (1.29 � 0.89) � 104, C1 = (17.65 � 1.55)} and an adjustedR2 = 0.98. Quantification of the free energy barrier for thetransition provides a measure of the reversible work requiredto actuate a transition between bit states, or equivalently thesize of a concerted thermal fluctuation necessary to inducea spontaneous transition. This stands as one of the keyvariables for the design of stable but addressable informa-tion storage elements.

To ascertain the change in the phase space volume andstructural morphologies of the transition state as a function ofL, we present in Fig. 4b a superposition of 2D (C2, C3) sectionsthrough the 3D free energy landscapes at V = 0, where the haloparticles occupy planar configurations at the transitionbetween the two chiral bit states. The broadening of thetransition region bottleneck with increasing L is a result ofthe increased free volume available to the halo particles movingover the surface of increasingly large central particles, permittingaccess to a greater diversity of planar transition structures and acommensurate reduction in the free energy barrier between thetwo bit states.

Soft Matter Paper

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 10: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

7128 | Soft Matter, 2016, 12, 7119--7135 This journal is©The Royal Society of Chemistry 2016

Kinetics. The depression of the transition state free energywith increasing L leads to more frequently observed transitionsbetween the two chiral bit states. To quantify the transition rateas a function of L we report in Fig. 5 the mean first passagetime (MFPT) computed from each simulation as the averageduration for which a bit state is occupied before a transitionoccurs.42,60 MFPT calculations were conducted for L = {0.4142,0.4200, 0.4300, 0.4400, 0.4500, 0.4750, 0.5000, 0.5100, 0.5200,0.5300, 0.5400, 0.5500, 0.6000, 0.6500, 0.7000, 0.7500, 1.000,1.1256}. To mitigate the impact of possible recrossing events atthe summit of the free energy barrier leading to artificiallydepressed MFPT estimates, we define a core regions withineach bit state separated by a no man’s land.61 We define the

core region of the right-handed bit state to be the region ofthe intrinsic manifold for which V 4 0.5, and that of theleft-handed bit state to be the region for which V o �0.5.The digital colloid is defined to enter a particular bit state thefirst time it passes into the corresponding core region, and isdefined to have transitioned out of that bit state only when itpasses into the core region of a new bit state. Excursions intothe interstitial no man’s land (i.e., �0.5 r V r 0.5) followed byre-entry into the core region of the same bit state are notconsidered transition events.

We present in Fig. 5 the calculated MFPT as a function of Land associated error bars delimiting 95% confidence intervals.A shifted exponential of the form ln(MFPT) = B0 exp(�B1L) + B2

Fig. 3 Diffusion map embeddings and the free energy surfaces they support of the N = 4 tetrahedral digital colloid as a function of L. Embeddings areconstructed in (C2, C3, V) where (C2, C3) are the top two non-trivial eigenvectors discovered by diffusion map corresponding to the slowest collectivemodes of the digital colloid dynamical rearrangements and V is the signed volume of a parallelepiped connecting the halo particle centers of mass(Fig. 2c). The sign of V distinguishes the chirality of the two tetrahedral bit states: V 4 0 corresponds to right-handed tetrahedra and V o 0 to left-handedconfigurations. The magnitude of V provides a measure of the deviation of the halo particle configuration from planarity: |V| = 0 corresponds to all haloparticles residing in a single plane, and jV j ¼

ffiffiffi2p

=2 to an idealized tetrahedral arrangement. In each pair of panels we present on the left the low-dimensional projection of the 2 � 106 microstates harvested from the Brownian dynamics simulation at a particular value of L into the intrinsic manifold,and on the right the free energy surface over the intrinsic manifold F(C2, C3, V) computed by application of eqn (4) employing rectilinear histogram binsof size [DC2, DC3, DV] = [0.01, 0.01, 0.05]. (a and b) L = 0.4000, (c and d) L = 0.4142, (e and f) L = 0.4400, (g and h) L = 0.5000, (i and j) L = 0.7500, (k and l)L = 1.0000, (m and n) L = 1.1256. Points in the diffusion map embeddings are colored by |V|. Isosurfaces are plotted at bF = {0, 2, 4, 6, 8, 10, 12}, with thearbitrary zero of free energy defined by the most populated voxel of the embedding. Representative microstates have been projected over the embeddingsto illustrate the halo particle configurations in each region of the intrinsic manifold.

Paper Soft Matter

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 11: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7129

provides a good fit to the data, with best fit parameters{B0 = (2.03 � 0.91) � 103, B1 = (14.08 � 0.92), B2 = (�2.53 � 0.04)}and an adjusted R2 = 0.97.

For Lo (LT = 0.4142), the cluster is effectively locked and theMFPT is undefined since transitions are sterically forbidden.

For L Z (LT = 0.4142), the cluster unlocks and MFPT rapidlydecreases with increasing L as the free volume available to thehalo particles increased and the transition free energy barrierdecreases (cf. Fig. 4). The MFPT approaches an approximateasymptotic limit at L 4 (LU = 0.5361) corresponding to acentral particle so large that the motions of the halo particlesare largely decorrelated and transitions are effectively barrier-less. The escape rate G = 1/MFPT defines the transition ratefrom one bit state to the other42 and is a critical design variablein the practical realization of digital colloids for informationstorage.

3.2 N = 6 (30-state) digital colloids

Using an analogous approach, we conducted Browniandynamics simulations of an N = 6 digital colloid for L = {0.4142,0.5100, 0.5275, 0.5500, 0.5750, 0.6000, 0.6500}, again holdingthe halo particle size fixed and varying the center particlediameter. There are three regimes of L: (I) (LSC E 0.4142) rL o (LT = 0.5275) where the digital colloid configuration iseffectively locked into a particular octahedral bit state configuration,and halo particle transitions between bit states are stericallyforbidden, (II) (LT = 0.5275) r L o (LU = 0.7021) where theconfiguration becomes unlocked as halo particles are able tocollectively transition between bit states, and (III) L Z (LU =0.7021) where transitions between bit states are effectively barrier-less, and the digital colloid configuration is in constant flux betweenstates. The spherical code diameter ratio LSC E 0.4142 and unlock-ing transition threshold LT = 0.5275 were previously reported inref. 1 for the effective hard sphere mapping. The barrierless

Fig. 4 Free energy barriers for bit state transitions and cross sections of the transition region from the N = 4 free energy surfaces in Fig. 3 as a function of L.(a) Free energy barrier height D(bF) computed as the free energy difference between the global minimum of the free energy surface and the minimum freeenergy within the transition state region defined by V A [�0.1, 0.1]. Error bars correspond to 95% confidence bounds computed by 10 roundsof bootstrapping. The blue dashed-line corresponds to an exponential of the form D(bF) = C0 exp(�C1L) with parameters {C0 = (1.29 � 0.89) � 104,C1 = (17.65 � 1.55)} and an adjusted R2 = 0.98. Best fit parameters were computed by weighted least squares regression in which we confronted theobserved heteroskedasticity in the data by weighting each free energy measurement in inverse proportion to its variance.68 The colored regions distinguishthe different regimes of L: (LSC E 0.2247) r L o (LT = 0.4142) (red, locked), (LT = 0.4142) r L o (LU = 0.5361) (green, unlocked), andL Z (LU = 0.5361) (blue, barrierless transitions). LT = 0.4142 defines the unlocking transition at which there is sufficient free volume for the halo particlesto transition between bit states through the square planar transition state. LU = 0.5361 defines the transition to effectively unconstrained barrierlesstransitions between bit states where D(bF) = 1. (b) Sections through the 3D free energy surfaces in F(C2, C3, V) constructed from the subset of approximatelyplanar halo particle configurations with V A [�0.1, 0.1]. At LT = 0.4142 the cluster is at the transition threshold and there exists an essentially unique squareplanar transition state. With increasing L the increased free volume available to the halo particles permits access to a greater diversity of planarconfigurations including trapezoidal and kite geometries. This increase in the phase space volume of the transition region leads to a commensuratedecrease in the transition state free energy barrier due to reduced energetic and entropic barriers.

Fig. 5 Mean first passage times (MFPT) for the N = 4 digital colloid as afunction of L. Errorbars denote 95% confidence intervals on the mean valuecomputed from the observed distribution of first passage times. The bluedashed-line corresponds to a shifted exponential of the form ln(MFPT) =B0 exp(�B1L) + B2, with parameters {B0 = (2.03 � 0.91) � 103, B1 = (14.08 �0.92), B2 = (�2.53 � 0.04)} and an adjusted R2 = 0.97. Best fit parameterswere computed by weighted least squares regression in which each MFPTmeasurement is weighted in inverse proportion to its variance to confrontthe observed heteroskedasticity in the data due to elevated uncertainties inthe low-L regime where bit state transitions are rare events.68 G = 1/MFPTdefines the transition rate from one chiral bit state to the other.

Soft Matter Paper

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 12: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

7130 | Soft Matter, 2016, 12, 7119--7135 This journal is©The Royal Society of Chemistry 2016

transition threshold LU = 0.7021 is defined as the value of L atwhich the (extrapolated) free energy barrier for bit state transitionsdrops to 1 kBT (cf. Fig. 7a).

We extract from our simulations 2 � 106 microstates savedevery 500 time steps for each value of L and compile these intoa composite library (N = 6, L = {0.4142, 0.5100, 0.5275, 0.5500,0.5750, 0.6000, 0.6500}) comprising 1.4 � 107 microstates.We extract a restricted set of 1932 representative microstatesspanning the 110 prototypical cluster architectures observed inthe simulations, and apply diffusion maps employing a soft-thresholding bandwidth of e = exp(2).52 Using the L-method weidentify a gap in the eigenvalue spectrum above l4,53 prompting

construction of a three-dimensional embedding into the topthree non-trivial diffusion map eigenvectors (C2, C3, C4). Weidentify three collective variables to parameterize the intrinsicmanifold of the N = 6 system compared to only two for theN = 4 case, reflecting the richer configurational dynamicsavailable to the larger system. The remaining (1.4 � 107 � 1932)configurations were projected into this reduced embedding viathe Nystrom extension.

Morphology and stability. We present in Fig. 6 the three-dimensional diffusion map embeddings for all seven valuesof L, constructed by projecting the 2 � 106 microstatesrecorded from each simulation into (C2, C3, C4). We also

Fig. 6 Diffusion map embeddings and free energy surfaces for the N = 6 octahedral digital colloid as a function of L. Embeddings are constructed in thetop three non-trivial eigenvectors discovered by diffusion maps corresponding to the slowest collective modes of the digital colloid dynamicalrearrangements (C2, C3, C4). In each pair of panels we present on the left the low-dimensional projection of the 2 � 106 microstates harvested from theBrownian dynamics simulation performed at a particular value of L into the intrinsic manifold, and on the right the free energy surface over the intrinsicmanifold F(C2, C3, C4) computed by application of eqn (4) employing rectilinear histogram bins of size [DC2, DC3, DC4] = [0.005, 0.005, 0.005]. (a and b)L = 0.4142, (c and d) L = 0.5100, (e and f) L = 0.5275, (g and h) L = 0.5500, (i and j) L = 0.5750, (k and l) L = 0.6000, (m and n) L = 0.6500. Points in thediffusion map embedding are colored by the L1,1 norm between the normalized halo particle pairwise distances matrix of the corresponding microstateand that of an idealized octahedron 8w0 � woct

08. Isosurfaces are plotted at bF = {0, 2, 4, 6, 8, 10, 12, 14}, with the arbitrary zero of free energy defined bythe most populated voxel of the embedding. Representative microstates have been projected over the embeddings to illustrate the halo particleconfigurations in each region of the intrinsic manifold. The viewing angles of the triangular prism configurations within the transition state ensemble wereselected to emphasize the stacking of the two triangular layers of halo particles such that those closer to the viewer eclipse from view those that arefurther away.

Paper Soft Matter

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 13: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7131

report the associated free energy landscapes F(C2, C3, C4)computed from histograms over the low-dimensional projec-tions. Rotating movies of these figures are provided in Movie S2of the ESI.†

Our diffusion map embeddings treat all halo particles asindistinguishable, collapsing together the structurally identical(N!/YN) = 30 bit states within the intrinsic manifold spannedby (C2, C3, C4). In the case of N = 4 we were able to distinguishthe two chiral bit states by augmenting our low-dimensionalprojection with the signed volume V of the parallelepipedmapped out by the halo particles. For the N = 6 case, no simpleorder parameters are available to separate out the 30 rotationallydistinguishable bit states within our low dimensional embedding.Accordingly, we do not augment our embeddings but appreciate –by virtue of the structural equivalence of all bit states – that thelow-dimensional projections and free energy surfaces pertainequally to all bit states. The connectivity graph illustrating thebit states that are mutually accessible by single collectiverearrangement events of the halo particles are illustrated inFig. S1 in the ESI.†

To gain insight into the physical interpretation of thediffusion map variables, we project representative microstatesonto the low-dimensional manifolds in Fig. 6 and color eachmicrostate according to the L1,1 norm between its normalizedhalo particle pairwise distances matrix and that of an idealizedoctahedron 8v0 � voct

08 (cf. Section 2.2.1). The collective variableC2 is well-correlated with 8v0 � voct

08 indicating that it provides

a measure of the octahedral character of the digital colloidarchitecture. The minimum value of 8w0 �woct

08 = 0 is found inthe region of (C2 E 0.05, C3 E 0.03, C4 E �0.03) where thedigital colloid adopts an idealized octahedral geometry. Each ofthe low-dimensional manifolds in Fig. 6 is funnel-shaped, withthe narrow tip containing the transition state ensemble andwide base corresponding to the ‘‘rattling’’ of halo particlesabout a particular stable bit state. The size of the tip of thefunnel increases with L as the increased free volume accessibleto the halo particles opens access to a greater diversity oftransition state configurations.

At L = (LSC E 0.4142) (Fig. 6a and b) – corresponding to theminimum diameter ratio for which effective hard sphere haloparticles can exist on the surface of the central particle withoutoverlapping1,24 – the digital colloid is confined to a small set ofpossible configurations localized around the ideal octahedron.At (L = 0.5100) o (LT = 0.5275) (Fig. 6c and d), the digitalcolloid remains effectively locked in a single bit state, but theincreased free volume permits the halo particles to rattlearound around the ideal octahedron and fill a greater volumeof the intrinsic manifold.

At the unlocking threshold of L = (LT = 0.5275) (Fig. 6e and f ), astructure very close to the triangular prism transition state betweenoctahedral bit states emerges at (C2 E �0.03, C3 E �0.03,C4 E �0.07).24 Nevertheless, bit state transitions remain stericallydisfavored and – in line with Fig. SI-I-3 (ESI†) in ref. 1 – notransition events are observed over the course of our simulations.

Fig. 7 Free energy barriers for bit state transitions and mean first passage time as a function of L for the N = 6 digital colloid. (a) Free energy barrierheight D(bF) computed as the free energy difference between the global minimum of the free energy surface and the minimum free energy within thetransition state region defined by the voxels containing microstates with 8v0 � vprism

08 o 1.75, where vprism0 is the normalized halo pairwise distances

matrix for the idealized triangular prism transition state structure. This threshold identifies microstates possessing an average 5% deviation away from thenormalized distance matrix for the idealized triangular prism. Error bars correspond to 95% confidence bounds computed by 10 rounds of bootstrapping.The blue dashed-line corresponds to an exponential of the form D(bF) = C0 exp(�C1L) with parameters {C0 = (1.41 � 0.58)� 104, C1 = (13.60 � 0.70)} andan adjusted R2 = 0.99. Best fit parameters were computed by weighted least squares regression in which we confronted the observed heteroskedasticityin the data by weighting each free energy measurement in inverse proportion to its variance.68 The colored regions distinguish the different regimes ofL: (LSC E 0.4142) r L o (LT = 0.5275) (red, locked), (LT = 0.5275) r L o (LU = 0.7021) (green, unlocked), and L Z (LU = 0.7021) (blue, barrierlesstransitions). LT = 0.5275 defines the unlocking transition at which there is sufficient free volume for the halo particles to transition between bit statesthrough the square planar transition state. LU = 0.7021 defines the (extrapolated) transition to effectively unconstrained barrierless transitions between bitstates where D(bF) = 1. (b) Mean first passage time computed as the average dwell time within a bit state before transitioning to a new bit state by acollective structural rearrangement of the halo particles. Errorbars denote 95% confidence intervals on the mean value computed from the observeddistribution of first passage times. The blue dashed-line corresponds to a shifted exponential of the form ln(MFPT) = B0 exp(�B1L) + B2, with parameters{B0 = (2.30� 0.16) � 106, B1 = (23.17� 0.12), B2 = (�1.91 � 0.08)} and an adjusted R2 = 0.93 best fit parameters were computed by weighted least squaresregression in which each MFPT measurement is weighted in inverse proportion to its variance.68 G = 1/MFPT is the escape rate from any particular bitstate to any one of eight other rotationally distinguishable bit states accessible by a single halo particle rearrangement event (Fig. S1, ESI†).

Soft Matter Paper

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 14: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

7132 | Soft Matter, 2016, 12, 7119--7135 This journal is©The Royal Society of Chemistry 2016

For (L = 0.5500) 4 (LT = 0.5275) (Fig. 6g and h), there issufficient free volume for the halo particles to realize thetriangular prism transition structure, and we observed multiplebit state transitions over the course of our calculations. The freeenergy surface develops a funnel-like structure with the cagedoctahedral structures occupying a broad low-free energy regionat the bottom of the funnel-shaped intrinsic manifold, and thetriangular prism transition state occupying a small high-freeenergy region at the narrow end of the funnel. We definemicrostates as belonging to the transition state ensembleby thresholding the L1,1 norm between the normalized haloparticle pairwise distance matrix and that of the idealizedtriangular prism, given as 8w0 �wprism

08 o 1.75. These config-urations possess an average 5% deviation in the normalizedpairwise distances matrix away from that of the idealizedtriangular prism. Productive transitions between bit statesoccur when the digital colloid enters the transition state andthe halo particles undergo a collective rearrangement into anew rotationally distinguishable configuration (cf. Fig. S1 inthe ESI†). The free energy barrier between the bottom ofthe octahedral free energy well and the transition state isD(bF) = (8.22 � 0.17).

At larger values of L (Fig. 6i–n), the volume of the intrinsicmanifold increases with the elevated free volume available tothe halo particles. The transition state ensemble becomesmarkedly enlarged since the two planar layers of the triangularprism are able to move more freely over the center particlesurface, leading to an expansion of the intrinsic manifoldtoward (C2 E �0.10, C3 E 0.00, C4 E �0.05) due to a greaterdiversity of transition structures. There is a commensurate decreasein the transition barrier height from D(bF) = (5.52 � 0.07) atL = 0.5750, to D(bF) = (3.99 � 0.06) at L = 0.6000, toD(bF) = (2.14 � 0.04) at L = 0.6500.

We report the dependence of the transition state free energybarrier as a function of L in Fig. 7a. An exponential fit of theform D(bF) = C0 exp(�C1L) provides an excellent fit to the data,with best fit parameters {C0 = (1.41 � 0.58) � 104, C1 = (13.60 �0.70)} and an adjusted R2 = 0.99. In the next section, we will usethis dependence to design of the digital colloids as informationstorage elements that have sufficiently low free energy barriersso as to be writable (i.e., their bit state can be changed by doingexternal work) but sufficiently high to be stable to thermally-activated spontaneous changes in the bit state.

Kinetics. We present in Fig. 7b the MFPT as a function of L.MFPT calculations were conducted for L = {0.5480, 0.5490,0.5500, 0.5550, 0.5600, 0.5700, 0.5750, 0.5800, 0.5900, 0.6000,0.6100, 0.6200, 0.6300, 0.6400, 0.6500}. The MFPT correspondsto the mean residence time of the digital colloid within a bitstate before it makes a transition to one of the eight otherbit states accessible by a single collective rearrangement of thehalo particles (Fig. S1 in the ESI†). To eliminate artificialdepression of the MFPT due to recrossing events, we definethe core region of a bit state to be that region of the intrinsicmanifold containing configurations possessing a halo particleadjacency matrix equivalent to that of the ideal octahedron.A transition is defined when the digital colloid adjacency matrix

departs from the octahedral architecture corresponding to onerotationally distinguishable bit state then reforms the octahedraladjacency matrix in different bit state. An illustration of aproductive transition event for L = 0.5750 is illustrated in MovieS3 of the ESI,† where we simultaneously visualize the digitalcolloid configuration, follow its location on the low dimensionalmanifold, and map the transition between a pair of the 30rotationally distinguishable bit states.

A shifted exponential of the form ln(MFPT) = B0 exp(�B1L) + B2

provides a good fit to the data, with best fit parameters{B0 = (2.30 � 0.16) � 106, B1 = (23.17 � 0.12), B2 = (�1.91 � 0.08)}and an adjusted R2 = 0.93. In the effectively locked regionL o (LT = 0.5275) the MFPT is undefined, before droppingsteeply beyond the unlocking threshold and attaining anapproximate plateau at L 4 (LU = 0.7021) where transitionsbecome effectively barrierless. The escape rate G = 1/MFPT quanti-fies the volatility of a digital colloid memory storage element due tothermally-driven bit state flips. In the next section, we will combine

Fig. 8 Rational design of a N = 6 digital colloid with tailored write freeenergy and memory volatility. Red points correspond to the calculatedtransition barrier heights D(bF) and transition rates G = 1/MFPT for the fourN = 6 digital colloid architectures with L = {0.5500, 0.5750, 0.6000,0.6500}. The dashed blue line is a parametric fit of the best fit shiftedexponential for the MFPT and the best fit exponential for D(bF) (cf. Fig. 7)constructed by eliminating L to obtain an expression for D(bF) as afunction of G. The ratio of the coefficients B1 and C1 of L in the two bestfit exponentials controls the nonlinearity of the parametric fit. At a ratio ofunity, the parametric fit corresponds to Arrhenius behavior and the dashedblue line would appear a straight line in the semi-logarithmic axes. Theactual ratio is B1/C1 = (1.70 � 0.09) leading to non-Arrhenius behavior andthe observed non-linearity of the dashed blue curve. D(bF) is the reversiblework required to actuate a halo particle rearrangement and write a new bitstate into the digital colloid. G is the expected rate at which errors areintroduced due to thermally-activated bit state transitions. The lower thewrite free energy, the higher the error rate. The design problem is tominimize the work required to write to the digital colloid elements whilemaintaining a the error rate above a particular error tolerance threshold. Asa simple example, the overlaid text boxes illustrate the information storagefidelity of a short text block over a one hour duration. A system of 246 N = 6digital colloids would be required to encode this text using two digitalcolloids per 8-bit ASCII character. Red text is illustrative of the expectednumber of errors due to thermally-activated bit state transitions at the endof one hour of storage.

Paper Soft Matter

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 15: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7133

this design variable with the transition free energy barrier to designN = 6 digital colloid memory elements.

3.3 Rational design of N = 6 digital colloid memory storageelements

We now proceed to use the calculated dependencies of thetransition barrier height D(bF) and escape rate G on L torationally design a N = 6 digital colloid with desired memoryvolatility and/or actuation free energy. In Fig. 8 we present aparametric plot of the escape rate G = 1/MFPT against the transitionbarrier height D(bF) for the four diameter ratios at which bit statetransitions were observed L = {0.5500, 0.5750, 0.6000, 0.6500}. Wemay interpret the transition barrier height as the reversible workrequired to actuate a halo particle rearrangement and intentionallywrite a new bit state into the digital colloid. The escape rate can beconceived as the rate at which errors are introduced due tothermally-activated spontaneous bit state transitions. An idealstorage medium would be both non-volatile and cheap to address,implying a low write free energy and a low error rate. Since thetransition barrier height controls both of these properties, they arecompeting design constraints and our design objective becomes toidentify the value of L that minimizes the write free energy whilemaintaining the error rate below a user-specified tolerance. As anillustrative example, we consider the storage of a short text block forthe period of one hour, and illustrate the tradeoff between theexpected error rate and write free energy. Our analysis indicates thatrelatively large transition barriers are required to preserve dataintegrity over any significant time period. Determining the tradeoffbetween volatility and ease of writing will depend largely on thespecifics of the system, but this approach can help guide the designprocess by imposing the engineering requirements of the writeelement and intended storage duration.

4 Conclusions

We show how to use low-dimensional embeddings to characterizethe stability and dynamical transition pathways for digitalcolloids. By applying diffusion maps to simulation-generatedtrajectories of the digital colloid dynamics, we discover a non-linear projections of the digital colloid dynamics into low-dimensional embeddings. These projections of the dynamicsinto a low-dimensional space reveal the metastable bit statesand transition pathways between them. The free energysurfaces supported by these low-dimensional embeddingsquantify the stability of the bit states and height of the transi-tion barriers for bit state interconversions as a measure of thereversible work required to write a new bit state into the digitalcolloid. The low-dimensional embeddings also reveal the diver-sity of morphologies within the metastable bit state minimaand transition state ensemble, and a means to robustly identifytransitions between bit states. Mean first passage times calcu-lated from our simulations provide a measure of the averagedwell time within a bit state, and a measure of memoryvolatility and expected rates of memory degradation due tothermally-activated bit state transitions.

We apply our method to the N = 4 and N = 6 digital colloidscapable of storing 0.125 bytes and 0.613 bytes, respectively.Using our calculated barrier heights and escape rates for theN = 6 system, we calculate the tradeoff between the reversiblework required to write a bit state change and the memorydegradation rate due to thermal fluctuations. The results of thiscalculation guide the rational design of the relative halo andcentral particle diameters for a memory storage application.

We also describe how to extend our analysis to largersystems. Our exhaustive graph matching methods to computethe microstate pairwise distances required by the diffusion mapare limited by their high computational cost. We are currentlyexploring and developing approximate methods with sufficientlyhigh efficiency and fidelity to robustly recover the low-dimensionalembeddings for larger digital colloids with higher informationstorage capacity, such as the 2.87 byte N = 12 system.1

The double-exponential dependence of mean first passagetimes on L shows that rearrangement kinetics of digitalcolloids are extremely sensitive to L near the locking transition.This fact highlights opportunities for designing digital colloidsthat can ‘‘lock’’ and ‘‘unlock’’ in response to relatively smallchanges in the central particle size, and also provides guide-lines for manufacturaing digital colloids that satisfy long-termdata storage requirements. Examples of N = 4 digital colloidshave been realized by Grier, Pine and coworkers using holo-graphic optical tweezers,1 but it remains a challenge to producelarger clusters and fabricate them at scale. Based on thefindings of this work, we anticipate next steps for manufacturingbulk quantities of digital colloids will depend on controlling themonodispersity of central particles and developing mechanismsfor shrinking and swelling the central particles controllably.Moreover, robust and scalable methods to read, store, and writeinformation to the colloids must be developed before this can beused as a practical information storage substrate. Techniquesbased on DNA functionalization,66 FRET barcoding,5 and swella-ble central particles67 have been suggested, and this remains anarea of active research. We anticipate that the mutually beneficialfeedback between simulation, machine learning-enabled dataanalysis, and advanced experimental synthesis and imaging tech-niques will improve our understanding and control of this noveland potentially ultra-high capacity information storage medium.

Acknowledgements

This material is based upon work supported by a National ScienceFoundation CAREER Award to A. L. F. (Grant No. DMR-1350008).

References

1 C. L. Phillips, E. Jankowski, B. J. Krishnatreya,K. V. Edmond, S. Sacanna, D. G. Grier, D. J. Pine andS. C. Glotzer, Soft Matter, 2014, 10, 7468–7479.

2 S. Sacanna, W. T. M. Irvine, P. M. Chaikin and D. J. Pine,Nature, 2010, 464, 575–578.

Soft Matter Paper

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 16: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

7134 | Soft Matter, 2016, 12, 7119--7135 This journal is©The Royal Society of Chemistry 2016

3 J. T. McGinley, I. Jenkins, T. Sinno and J. C. Crocker,Soft Matter, 2013, 9, 9119–9128.

4 L. Wang, W. Zhao and W. Tan, Nano Res., 2008, 1, 99–115.5 L. Wang and W. Tan, Nano Lett., 2006, 6, 84–88.6 L. Qian and E. Winfree, Science, 2011, 332, 1196–1201.7 C. L. Phillips, E. Jankowski, M. Marval and S. C. Glotzer,

Phys. Rev. E: Stat., Nonlinear, Soft Matter Phys., 2012, 86,041124.

8 T. W. Melnyk, O. Knop and W. R. Smith, Can. J. Chem., 1977,55, 1745–1761.

9 L. Whyte, The American Mathematical Monthly, 1952, 59,606–611.

10 J. Edmundson, Acta Crystallogr., Sect. A: Found. Crystallogr.,1992, 48, 60–69.

11 R. Tammes, Recl. Trav. Bot. Neerl., 1930, 27, 1–84.12 J. J. Thomson, London, Edinburgh Dublin Philos. Mag. J. Sci.,

1904, 7, 237–265.13 D. Weaire and T. Aste, The pursuit of perfect packing,

CRC Press, 2008.14 W. D. S. N. J. A. Sloane, with the collaboration of R. H.

Hardin et al., Tables of Spherical Codes, published electro-nically at NeilSloane.com/packings/.

15 A. W. Long and A. L. Ferguson, J. Phys. Chem. B, 2014, 118,4228–4244.

16 P. Das, M. Moll, H. Stamati, L. E. Kavraki and C. Clementi,Proc. Natl. Acad. Sci. U. S. A., 2006, 103, 9885–9890.

17 M. Ceriotti, G. A. Tribello and M. Parrinello, Proc. Natl. Acad.Sci. U. S. A., 2011, 108, 13023–13028.

18 R. A. Mansbach and A. L. Ferguson, J. Chem. Phys., 2015,142, 105101.

19 A. W. Long, J. Zhang, S. Granick and A. L. Ferguson,Soft Matter, 2015, 11, 8141–8153.

20 M. A. Bevan, D. M. Ford, M. A. Grover, B. Shapiro,D. Maroudas, Y. Yang, R. Thyagarajan, X. Tang and R. M.Sehgal, J. Process Control, 2015, 27, 64–75.

21 R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler,F. Warner and S. W. Zucker, Proc. Natl. Acad. Sci. U. S. A.,2005, 102, 7426–7431.

22 R. R. Coifman and S. Lafon, Appl. Comput. Harmon. Anal.,2006, 21, 5–30.

23 R. Coifman, I. Kevrekidis, S. Lafon, M. Maggioni andB. Nadler, Multiscale Model. Simul., 2008, 7, 842–864.

24 C. L. Phillips, E. Jankowski, M. Marval and S. C. Glotzer,Phys. Rev. E: Stat., Nonlinear, Soft Matter Phys., 2012,86, 041124.

25 C. L. Phillips, J. A. Anderson and S. C. Glotzer, J. Comput.Phys., 2011, 230, 7191–7201.

26 J. A. Anderson, C. D. Lorenz and A. Travesset, J. Comput.Phys., 2008, 227, 5342–5359.

27 J. Glaser, T. D. Nguyen, J. A. Anderson, P. Lui, F. Spiga,J. A. Millan, D. C. Morse and S. C. Glotzer, Comput. Phys.Commun., 2015, 192, 97–107.

28 J. D. Weeks, D. Chandler and H. C. Andersen, J. Chem. Phys.,1971, 54, 5237–5247.

29 H. C. Andersen, J. D. Weeks and D. Chandler, Phys. Rev. A:At., Mol., Opt. Phys., 1971, 4, 1597.

30 L. Filion, R. Ni, D. Frenkel and M. Dijkstra, J. Chem. Phys.,2011, 134, 134901.

31 J. A. Barker and D. Henderson, J. Chem. Phys., 1967, 47,4714–4721.

32 J. Rowlinson, Mol. Phys., 1964, 7, 349–361.33 J. Rowlinson, Mol. Phys., 1964, 8, 107–115.34 C. L. Phillips, J. A. Anderson and S. C. Glotzer, J. Comput.

Phys., 2011, 230, 7191–7201.35 T. Schlick, Molecular Modeling and Simulation: An Interdisci-

plinary Guide, Springer Science & Business Media, 2010.36 S. Sacanna, W. T. Irvine, L. Rossib and D. J. Pinea, Soft Matter,

2011, 7, 1631–1634.37 M. K. Transtrum, B. B. Machta, K. S. Brown, B. C. Daniels,

C. R. Myers and J. P. Sethna, J. Chem. Phys., 2015, 143, 010901.38 A. L. Ferguson, A. Z. Panagiotopoulos, P. G. Debenedetti and

I. G. Kevrekidis, Proc. Natl. Acad. Sci. U. S. A., 2010, 107,13597–13602.

39 A. L. Ferguson, A. Z. Panagiotopoulos, I. G. Kevrekidis andP. G. Debenedetti, Chem. Phys. Lett., 2011, 509, 1–11.

40 M. A. Rohrdanz, W. Zheng, M. Maggioni and C. Clementi,J. Chem. Phys., 2011, 134, 124116.

41 H. Stamati, C. Clementi and L. E. Kavraki, Proteins: Struct.,Proteins: Struct., Funct., Bioinf., 2010, 78, 223–235.

42 R. Zwanzig, Nonequilibrium Statistical Mechanics, OxfordUniversity Press, USA, 2001.

43 I. T. Jolliffe, Principal Component Analysis, Springer,New York, 2nd edn, 2002.

44 T. Cox and M. Cox, Multidimensional Scaling, CRC Press, 2000.45 A. L. Ferguson, S. Zhang, I. Dikiy, A. Z. Panagiotopoulos,

P. G. Debenedetti and A. J. Link, Biophys. J., 2010, 99,3056–3065.

46 P. G. Bolhuis, D. Chandler, C. Dellago and P. L. Geissler,Annu. Rev. Phys. Chem., 2002, 53, 291–318.

47 L. Maragliano, A. Fischer, E. Vanden-Eijnden andG. Ciccotti, J. Chem. Phys., 2006, 125, 024106.

48 H. Jonsson, G. Mills and K. W. Jacobsen, in Classical andQuantum Dynamics in Condensed Phase Simulations, ed.B. J. Berne, G. Ciccotti and D. F. Coker, World Scientific,Singapore, 1998, ch. Nudged elastic band method for find-ing minimum energy paths of transitions, pp. 385–404.

49 E. Weinan and E. Vanden-Eijnden, Annu. Rev. Phys. Chem.,2010, 61, 391–420.

50 A. L. Ferguson, A. Z. Panagiotopoulos, P. G. Debenedetti andI. G. Kevrekidis, J. Chem. Phys., 2011, 134, 135103.

51 B. Nadler, S. Lafon, R. R. Coifman and I. G. Kevrekidis, Advancesin Neural Information Processing Systems 18: Proceedings of the2005 Conference (Neural Information Processing), The MIT Press,2006, pp. 955–962.

52 R. Coifman, Y. Shkolnisky, F. Sigworth and A. Singer, IEEETrans. Image Process., 2008, 17, 1891–1899.

53 S. Salvador and P. Chan, Tools with Artificial Intelligence,2004. ICTAI 2004. 16th IEEE International Conference on,2004, pp. 576–584.

54 M. Scholz, M. Fraunholz and J. Selbig, Principal Manifoldsfor Data Visualization and Dimension Reduction, Springer,2008, pp. 44–67.

Paper Soft Matter

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online

Page 17: Nonlinear machine learning and design of …ferguson.matse.illinois.edu/resources/25.pdfNonlinear machine learning and design of reconfigurable digital colloids ... digital colloids

This journal is©The Royal Society of Chemistry 2016 Soft Matter, 2016, 12, 7119--7135 | 7135

55 A. Ma and A. R. Dinner, J. Phys. Chem. B, 2005, 109,6769–6779.

56 B. Peters and B. L. Trout, J. Chem. Phys., 2006, 125, 054108.57 J. Xing and K. S. Kim, J. Chem. Phys., 2011, 134, 044132.58 M. S. Shell, Thermodynamics and Statistical Mechanics:

An Integrated Approach, Cambridge University Press, 2015,ch. Chapter 16.

59 M. S. Shell, Thermodynamics and Statistical Mechanics:An Integrated Approach, Cambridge University Press, 2015,ch. Chapter 21.

60 S. Redner, A Guide to First-Passage Processes, CambridgeUniversity Press, 2001.

61 G. R. Bowman, in An Introduction to Markov State Models andTheir Application to Long Timescale Molecular Simulation, ed.G. R. Bowman, V. S. Pande and F. Noe, Springer Science &

Business Media, 2013, vol. 797, ch. 2. An overview andpractical guide to building Markov state models, pp. 20–21.

62 C. T. Baker and C. Baker, The Numerical Treatment of IntegralEquations, Clarendon Press, Oxford, 1977, vol. 13.

63 C. R. Laing, T. A. Frewen and I. G. Kevrekidis, NonlinearityBiol., Toxicol., Med., 2007, 20, 2127.

64 B. E. Sonday, M. Haataja and I. G. Kevrekidis, Phys. Rev. E:Stat., Nonlinear, Soft Matter Phys., 2009, 80, 031102.

65 W. Humphrey, A. Dalke and K. Schulten, J. Mol. Graphics,1996, 14, 33–38.

66 N. Geerts and E. Eiser, Soft Matter, 2010, 6, 4647–4660.67 K. Kratz, T. Hellweg and W. Eimer, Colloids Surf., A, 2000,

170, 137–149.68 S. Chatterjee and A. S. Hadi, Regression Analysis by Example,

John Wiley & Sons, Hoboken, New Jersey, 5th edn, 2012.

Soft Matter Paper

Publ

ishe

d on

01

Aug

ust 2

016.

Dow

nloa

ded

by U

nive

rsity

of

Illin

ois

- U

rban

a on

24/

08/2

016

16:1

9:15

. View Article Online