FoPra - DNA Origami Design and Structure Prediction

Honemann Max, Feigl Elija

September 2021

1 Introduction

1.1 Goals of this lab course

DNA Origami has evolved into a versatile tool for the construction of nanoscale objects. Structures can be rapidly folded into seemingly arbitrary shapes within minutes at a high yield and with high precision [Sobczak et al. 2012]. However, before structures can be assembled in the lab, they need to be designed. The designs need to be verified, and potential flaws should be identified and fixed prior to ordering. In this lab course you will design a DNA origami corner, simulate this structure with different coarse-grained tools, and compare the result of a simulation with experimental data from a cryo-EM acquisition. Overall, this should give you a good idea of how the design process of a DNA origami works and give you an insight into the field of molecular dynamics simulations.

1.2 Introduction

Deoxyribonucleic acid and DNA origami

DNA is found in the cells of all living organisms as the primary storage for genetic information. The information is stored as a sequence of 4 different nucleobases (adenine, thymine, guanine, and cytosine). These 4 nucleobases are bound to a constant, repetitive backbone formed by deoxyribose (a sugar) and phosphoric acid. The combination of these 3 components gives DNA its name. Figure 1 depicts the 4 different nucleotides of DNA and how they interact to form the 2 possible base pairs. Note that the 2 strands joined by these base pairs run antiparallel. The DNA double strand is stabilized by base stacking, where the π-electron ring systems of the aromatic nucleobases overlap, and by base pairing, which occurs between adenine and thymine and between guanine and cytosine and provides the specificity for the formation of the double helix. The overlapping π-electron systems force the DNA into a right-handed helix with a rotation of 34.3° per bp, resulting in roughly 21 bases for 2 full turns of the helix. The bare DNA helix has a rise of 3.4 Å/bp and a diameter of 20 Å. However, in DNA origami, the effective distance between helices is closer to 2.6 nm due to repulsive effects and the binding of cations such as magnesium; we will use this value to estimate the angle in exercise 1.

Figure 1: The 4 nucleotides of DNA. A - Guanine B - Adenine C - Cytosine and D - Thymine

Due to base pairing, DNA single strands in solution anneal onto each other with high specificity and high affinity. This pairing of strands can be used in a self-assembly process called DNA origami. Initially, Seeman suggested using DNA as a building material and forming small assemblies of DNA through Holliday junctions [Seeman 2003]. This method was then pushed further by a group of researchers who were able to build 2-dimensional structures from DNA through the formation of sheets of Holliday junctions [Rothemund 2006]. The third big step was achieved by Douglas et al. in 2009, who were able to assemble complex 3-dimensional nanostructures with bends and twists [Douglas et al. 2009].

In general, DNA Origami nanostructures are formed from one long single-stranded DNA (scaffold) and many short single-stranded DNA oligomers (staples or oligos) with complementary sequences. The final structure of the Origami is determined by the sequence-specific binding between the different DNA strands. The DNA helices are arranged in either a honeycomb or a square lattice pattern. The two differ in the tightness of the helical packing as well as in the position of the interhelical connections. The connections between the different helices are called crossovers, which mimic Holliday junctions, a DNA motif that occurs naturally during genetic recombination. In this lab course you will create a 6-helix bundle, which is based on the honeycomb lattice. We will therefore focus on the features of the honeycomb lattice.

As explained earlier, DNA is a double-stranded, right-handed helix. This allows the designer of a DNA origami to form connections between two adjacent DNA strands whenever their phosphate backbones point towards each other. The honeycomb lattice allows for crossovers at 3 different positions: a 0° crossover at bases 0 and 21, a 240° crossover at base 7, and a 120° crossover at base 14. Figure 2 illustrates this. Note that due to the right-handedness of the helix and the turn of roughly 34.3° per base, no other crossovers of this strand are possible without helical deformation, which would cause an energetic penalty and could prevent folding.
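The crossover angles quoted above follow from simple arithmetic on the helical twist. The following is a minimal sketch in Python, assuming the idealized twist of 720° per 21 bp used in the text; the function name `backbone_angle_deg` is introduced here purely for illustration and is not part of caDNAno or any other tool.

```python
# Accumulated helical twist at a given base index, assuming the idealized
# honeycomb-lattice value of 2 full turns (720 degrees) per 21 base pairs.
BP_PER_TWO_TURNS = 21


def backbone_angle_deg(base_index: int) -> float:
    """Return the twist accumulated after base_index bases, wrapped to [0, 360)."""
    angle = (base_index * 720.0 / BP_PER_TWO_TURNS) % 360.0
    # Guard against float round-off landing just below a full turn.
    if abs(angle - 360.0) < 1e-9:
        return 0.0
    return round(angle, 6)


if __name__ == "__main__":
    # Bases 0, 7, 14 and 21 are the crossover positions named in the text.
    for base in (0, 7, 14, 21):
        print(base, backbone_angle_deg(base))
```

Evaluating the function at other base indices shows that none of them falls on a multiple of 120°, which is the arithmetic behind the statement that no further crossover positions exist without deforming the helix.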

We will discuss the structure and the formation of DNA origami in more detail at the beginning of the lab course.

Figure 2: Different lattice patterns and possible crossover positions. A Comparison between square (left) and honeycomb (right) lattice. B Crossovers in multilayer objects with honeycomb lattice packing, spaced in constant intervals of 7 bp along the helical axis to link double-helical domains to each of three possible neighbors. Taken from Castro et al. 2011.

2 DNA Origami

2.1 Exercise 1: DNA Origami Design - Corner

For the first part of this lab course we will design a small DNA origami that contains a 90° corner. The structure should be formed from a scaffold with roughly 1000 bases. The design of the structure will be done in caDNAno, a computational tool that allows for the easy design of scaffolded DNA origami structures. However, prior to opening caDNAno, it is important to think about some details of the design:

Figure 3 shows an exemplary 120° corner. Look at the figure and try to understand which part of the design causes the angle. Based on that, think about how a 90° corner differs from it. Furthermore, try to make a sketch of a 90° honeycomb corner on paper and, based on that sketch, try to find a formula that relates the angle to the relative shortening of the helices.

Figure 3: 6-helix bundle with a 120° corner. A caDNAno strand diagram, with the scaffold in blue and the staples in grey. B Relative position of the helices. C CanDo simulation showing the 120° angle of the structure.

If you take this course at the university, a laptop running caDNAno will be provided. If you are taking the course remotely, we will send you a link to a caDNAno installation guide.

We will go through the GUI of caDNAno and how to use it at the beginning of the course. To help you understand the design in Figure 3, here are some general remarks.

Firstly, note that for clarity the design was simplified and some features of a normal design, such as staple crossovers, are not optimized. Secondly, the blue strand is the scaffold, which gives the structure its overall size, while the grey strands are the staples. Note that each staple has a square and an arrow-shaped end, which stand for the 5'-end and 3'-end of the staple, respectively. The scaffold is circular and does not have a starting point. Thirdly, each block in the depiction stands for a base pair, with a thicker line every 7 bases for clarity.

The design you create in this part of the course will be iteratively refined using CanDo, a fast web-hosted tool for structure prediction of DNA Origami nanostructures. CanDo approximates DNA as slices, with each slice representing a base pair. The slices are connected by elastic rods that simulate the properties of DNA [Castro et al. 2011]. This will allow you to rapidly prototype your design. In later steps of the course, more advanced and computationally expensive methods will be used to predict structures of other DNA nanostructures.

2.2 Report 1: DNA Origami Design - Corner

For the report please discuss the following things:

• Compare your intermediate and your final design files and describe which mistakes in your initial design files led to which deformations in the CanDo simulation.

• How would a corner design change if you added a second 90° corner that is orthogonal to your corner? What could be benefits or disadvantages of such a change?

• Which limits does this form of corner design have, and can you think of an alternative way to design corners?

3 Structure Prediction

3.1 Introduction Structure Prediction

Molecular Dynamics Simulation

Performing a Molecular Dynamics (MD) simulation corresponds to computing the time evolution of a many-particle system by solving the equations of motion of classical mechanics. The analysis of this time evolution allows the calculation of thermodynamic properties and structural observables, such as average bond lengths and bond angles. The forces acting on one particle result from a potential generated by all other particles and fields in the system. This couples the motion of one particle to the motion of all others.

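For a classical system within the Born-Oppenheimer approximation, the Equations of Motion (EOM) become

$$
\dot{\mathbf{r}}_i = \frac{\mathbf{p}_i}{m_i}, \qquad \dot{\mathbf{p}}_i = -\nabla_{\mathbf{r}_i} V = \mathbf{F}_i, \tag{1}
$$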

where F_i is the force exerted on particle i, r_i its nuclear coordinates, and p_i its momentum. For a system of N particles this amounts to a system of 6N coupled first-order differential equations, where the coupling is realized via the gradient of the potential energy V.

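These equations can be solved numerically using a discrete time step Δt. At each time step the potential energy V is computed and used to determine the forces acting on each particle. Using the forces, the velocity and position of that particle at t + Δt can be calculated with an algorithm such as velocity Verlet:

$$
\begin{aligned}
\mathbf{r}(t+\Delta t) &= \mathbf{r}(t) + \mathbf{v}(t)\,\Delta t + \frac{\mathbf{F}(t)}{2m}\,\Delta t^2 + O(\Delta t^3) \\
\mathbf{v}(t+\Delta t) &= \mathbf{v}(t) + \frac{\mathbf{F}(t) + \mathbf{F}(t+\Delta t)}{2m}\,\Delta t + O(\Delta t^2)
\end{aligned}
\tag{2}
$$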
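To make the two update rules of equation (2) concrete, here is a minimal Python sketch that applies them to a single particle in one dimension. The harmonic test force and the helper names `velocity_verlet_step` and `simulate_oscillator` are assumptions chosen for illustration; they are not taken from oxDNA, mrDNA or any other package used in this course.

```python
import math


def velocity_verlet_step(r, v, f, mass, dt, force_fn):
    """Advance position r and velocity v by one time step dt (equation 2)."""
    r_new = r + v * dt + f / (2.0 * mass) * dt * dt
    f_new = force_fn(r_new)                      # force at the new position
    v_new = v + (f + f_new) / (2.0 * mass) * dt  # average of old and new force
    return r_new, v_new, f_new


def simulate_oscillator(k=1.0, mass=1.0, r0=1.0, v0=0.0, dt=0.01, steps=1000):
    """Integrate a 1D harmonic oscillator (illustrative test system only)."""
    def force_fn(r):
        return -k * r

    r, v = r0, v0
    f = force_fn(r)
    trajectory = [r]
    for _ in range(steps):
        r, v, f = velocity_verlet_step(r, v, f, mass, dt, force_fn)
        trajectory.append(r)
    total_energy = 0.5 * mass * v * v + 0.5 * k * r * r
    return trajectory, total_energy


if __name__ == "__main__":
    traj, energy = simulate_oscillator()
    print("final position:", traj[-1], "analytic:", math.cos(10.0))
    print("final energy:", energy, "initial: 0.5")
```

Note that the force is evaluated only once per step, because the force computed at the new position is reused as the old force in the next step. For this test system the total energy stays close to its initial value, which is the practical reason symplectic integrators of this kind are preferred in MD.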

The basic structure of an MD simulation is depicted in Algorithm 1.

Algorithm 1: Basic MD structure

Input: simulation parameters: Δt, M, N

Initialization();
for M steps do
    forces = compute_forces();
    trajectory += integrate_EOM(Δt, forces, integrator);
    averages = update_averages();
end
return trajectory, averages

The main part of the algorithm is the for-loop over M time steps of size Δt. In each iteration three essential tasks have to be performed. Firstly, the forces acting on all particles are computed; this is usually the computationally most expensive part of the simulation and determines its accuracy. In general, the computational complexity of calculating the forces for all N particles is of order O(N²). Secondly, the positions of all particles have to be advanced by Δt by solving the EOM using a finite-difference integration algorithm. The new positions are then added to the system's trajectory, which already contains all previous time steps. Lastly, thermodynamic averages have to be updated according to the new data.

Some remarks:

• To actually compute forces, the way in which the simulated particles interact has to be specified via the potential energy V. These interactions are usually derived from experiments or quantum chemistry simulations. To reduce computational cost, interaction potentials are chosen to be as simple as possible. In general, modern hardware is optimized for addition and multiplication rather than square roots, exponential or trigonometric functions (50-100 times slower than addition). Harmonic bonds, for example, are often used to model covalent bonds, as they only require one addition and two multiplications.

• Using an interaction-distance cutoff in combination with long-range corrections, or neighbour lists that keep track of nearby molecules over multiple time steps, are two examples of effective computational schemes that have been developed to reduce the computational cost of force calculation to below O(N²).

• Due to the finite time step Δt and the finite numerical precision of computers, the computed trajectory is always an approximation and never represents a real trajectory. However, in most cases (symplectic algorithm, ergodic simulation and velocity-independent potentials) this trajectory has the same average properties as the real one.

• Simulations are started from an initial model, usually generated from idealized coordinates (for instance, perfect B-form). As this configuration is artificial, the system needs to change into a configuration in thermodynamic equilibrium. This process is called relaxation, and its progress is usually assessed by the potential energy, which has to stabilise. Depending on the simulation, this relaxation can take a significant amount of simulation time. Importantly, only trajectory frames after the relaxation phase can be used for averages!
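The O(N²) cost of the force calculation and the role of a distance cutoff, both mentioned in the remarks above, can be illustrated with a brute-force pair loop. This is a minimal sketch under stated assumptions: the soft linear repulsion is a made-up toy potential, and the function name `pair_forces` is hypothetical. It only demonstrates the loop structure and is not how oxDNA or mrDNA compute forces.

```python
import math


def pair_forces(positions, cutoff, strength=1.0):
    """Brute-force pairwise forces with a distance cutoff.

    positions: list of (x, y, z) tuples.
    Returns (forces, pairs_checked, pairs_within_cutoff).
    Toy potential (assumption): a repulsion that decays linearly to zero
    at the cutoff distance.
    """
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    pairs_checked = 0
    pairs_within_cutoff = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            pairs_checked += 1  # grows as N(N-1)/2, i.e. O(N^2)
            delta = [positions[j][k] - positions[i][k] for k in range(3)]
            dist = math.sqrt(sum(c * c for c in delta))
            if dist >= cutoff or dist == 0.0:
                continue  # beyond the cutoff: no force evaluation needed
            pairs_within_cutoff += 1
            magnitude = strength * (1.0 - dist / cutoff)
            for k in range(3):
                push = magnitude * delta[k] / dist
                forces[i][k] -= push  # particle i is pushed away from j
                forces[j][k] += push  # equal and opposite force on j
    return forces, pairs_checked, pairs_within_cutoff
```

In this sketch the cutoff only skips the force evaluation; every pair distance is still checked, so the loop itself remains O(N²). Neighbour lists address exactly this point by remembering which pairs are close, so that most distance checks can be skipped for several time steps.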

Structure Predictions

Lattice-based DNA Origami designs rely on an abstracted 2D representation of the model's topology. Design choices are made under the assumption of rigid helices composed of perfect B-form DNA. As this is hardly the case in reality, designers use iterative design pipelines to finalise the design of a structure. They start with a base design, use experimental and computational measures to validate or detect issues with their design, and update it accordingly. This cycle is repeated until the desired shape and functionality can be verified experimentally. While experimental methods like gel electrophoresis and electron microscopy are indispensable as a final means of validation, they are also time-consuming and expensive. Computational structure prediction is fast and cheap and can help a lot during the design process.

The trajectory of a simulation can be used to compute a variety of average quantities that can be an invaluable resource in assessing a design. In this context we will focus on two geometric quantities that are easy to compute and interpret. While the trajectory itself might not represent a real chain of events, it is an ensemble of viable configurations for the model. Therefore, it will (if the simulation is, among other things, long enough) produce the same average structure as a real trajectory. Importantly, only trajectory frames after the relaxation phase can be used for averages. We can use this average structure to inspect the geometry of the design.

The trajectories can also be used to assess the flexibility of a structure, information that is extremely hard to quantify experimentally. The typical measure used in the literature is the Root Mean Square Fluctuation (RMSF, also called the Root Mean Square Deviation from the average over time),

$$
\mathrm{RMSF} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \Delta r_i^2}, \tag{3}
$$

where Δr_i is the deviation from the average position of particle i.
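As an illustration of how such a fluctuation measure can be evaluated from a trajectory, the sketch below computes one RMSF value per particle. It rests on two assumptions that the formula itself does not spell out: the average is taken over the trajectory frames for each particle (a common convention), and all frames have already been aligned to a common reference and taken from after the relaxation phase. The function name `rmsf_per_particle` is hypothetical.

```python
import math


def rmsf_per_particle(trajectory):
    """Per-particle root mean square fluctuation.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    with the same particle order in every frame.
    Returns a list with one RMSF value per particle.
    """
    n_frames = len(trajectory)
    n_particles = len(trajectory[0])
    result = []
    for i in range(n_particles):
        # Average position of particle i over all frames.
        mean = [
            sum(frame[i][k] for frame in trajectory) / n_frames
            for k in range(3)
        ]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result
```

Mapping these per-particle values back onto the structure, for example as a color scale, gives a quick impression of which parts of a design are rigid and which are flexible.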

Coarse Grained Modelling

DNA Origami structure predictions pose a specific challenge for computer simulations. With up to 5 · 10⁶ atoms per typical structure, any simulation of all these atoms, including the surrounding solvent molecules (usually 10 times as many!), becomes very expensive and takes too long for viable use in iterative design. To solve this problem, simpler DNA models have been developed. In these coarse models, multiple atoms are combined into a single interaction site, drastically reducing the number of interactions to be calculated.

As an example, a DNA helix could be represented as a number of beads on a string. Each bead would correspond to a base pair, collapsing all its atoms into a single interaction site. Connected beads could interact via a harmonic potential to keep them firmly together, modelling the backbone connectivity of DNA. All other beads could repel each other via a Coulomb-like interaction term. In this example the Coulomb interaction would represent the electrostatic repulsion of the negatively charged backbone of DNA. Obviously, this bead-on-a-string model is an overly simplified representation, which reduces its predictive value drastically. However, it also reduces the number of interaction sites by two to three orders of magnitude. For a typical DNA Origami structure this reduces the total number of particles from 5 · 10⁶ to around 5 · 10³, resulting in a drastic speedup for the simulation (a factor of (10³)² when the force calculation scales as O(N²)).
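The bead-on-a-string picture described above can be written down in a few lines. The following Python sketch is an illustration only: the parameter names `k_bond`, `r0` and `q` and their default values are arbitrary assumptions in reduced units, and the function `bead_chain_energy` is hypothetical and unrelated to the actual mrDNA or oxDNA potentials.

```python
import math


def bead_chain_energy(positions, k_bond=1.0, r0=1.0, q=1.0):
    """Energy of a toy bead-on-a-string chain.

    positions: list of (x, y, z) tuples, ordered along the chain.
    Neighbouring beads are joined by a harmonic bond with rest length r0;
    all other pairs repel via a Coulomb-like q / r term.
    Returns (bond_energy, repulsion_energy).
    """
    n = len(positions)
    bond_energy = 0.0
    repulsion_energy = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dist = math.dist(positions[i], positions[j])
            if j == i + 1:
                # Connected beads: harmonic bond models backbone connectivity.
                bond_energy += 0.5 * k_bond * (dist - r0) ** 2
            else:
                # Non-bonded beads: Coulomb-like electrostatic repulsion.
                repulsion_energy += q / dist
    return bond_energy, repulsion_energy
```

For a straight chain at the rest spacing the bond energy vanishes and only the repulsion remains, which is why such a chain prefers to stay extended.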

Striking the right balance between speed and accuracy is the main challenge in developing these types of models. In this FoPra you will be using two models, oxDNA and mrDNA, that are regularly used in our lab to perform structure predictions of our designs.

oxDNA Model

In the oxDNA model [Doye et al. 2013] each DNA base is represented by two interaction sites: an outer site for modelling backbone connectivity and electrostatic repulsion, and an inner site for hybridisation and helical structure. All DNA-specific interactions are displayed in Figure 4. Understanding which properties are explicitly represented as an interaction, and how each is implemented, is important for evaluating the strengths and weaknesses of a coarse-grained model.

Figure 4: Model overview and possible interactions. From [Doye et al. 2013]

As an example, let us investigate base stacking interactions in oxDNA. π-π orbital stacking of adjacent bases has an important effect on the stability of DNA helices. In oxDNA this effect is captured by the potential V_stack that acts on the inner sites of adjacent bases. It is implemented using a Morse potential, which is of the form:

$$
V_{\mathrm{stack}}(r) = \varepsilon \left(1 - e^{-a (r - \sigma)}\right)^2 \tag{4}
$$

where r is the distance and a, ε, and σ affect the width, height and position of the potential well. In contrast to a harmonic bond, the Morse potential allows for bond breaking, allowing bases to stack and unstack.
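The difference between the Morse form of equation (4) and a harmonic bond can be checked numerically. The sketch below evaluates equation (4) as written, together with its second-order expansion around r = σ; the parameter values are arbitrary reduced units chosen for illustration, and the full oxDNA stacking term contains further factors that are not reproduced here.

```python
import math


def morse(r, eps=1.0, a=1.0, sigma=1.0):
    """Morse potential of equation (4): zero at r = sigma, plateau eps for large r."""
    return eps * (1.0 - math.exp(-a * (r - sigma))) ** 2


def harmonic(r, eps=1.0, a=1.0, sigma=1.0):
    """Second-order expansion of the Morse potential around r = sigma."""
    return eps * a * a * (r - sigma) ** 2


if __name__ == "__main__":
    for r in (1.0, 1.01, 2.0, 5.0, 50.0):
        print(f"r={r:6.2f}  morse={morse(r):10.6f}  harmonic={harmonic(r):12.6f}")
```

Close to the minimum both curves agree, so small vibrations look the same in either model. At large separation the harmonic energy keeps growing without bound, whereas the Morse energy levels off at ε, so two sites can separate at a finite energy cost. This plateau is what lets stacked bases unstack.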

Additionally, V_coaxialstack represents a weakened stacking interaction between bases that are not part of the same strand and therefore not covalently bonded. Without this second potential, oxDNA would not be able to accurately describe the destabilizing effect of so-called nick sites, which are a very frequent motif in DNA Origami designs.

In addition to the interactions shown in the figure, oxDNA features a volume exclusion potential for all non-bonded sites to avoid overlaps, and an electrostatic interaction between non-bonded backbone sites. As of oxDNA2 [Snodin et al. 2015] this interaction is based on the Debye-Hückel model. To experimentally stabilise DNA nanostructures it is important to shield the electrostatic repulsion of the negatively charged DNA helices. Without enough salt, helices would not pack densely enough to form a stable structure, while with too much salt structures form huge aggregates. The Debye-Hückel model allows for screened electrostatic interactions dependent on salt concentration and temperature, replicating this important aspect of experimental setups.
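To give a feeling for how salt concentration enters a screened electrostatic interaction, the following sketch evaluates the textbook Debye length and a generic screened Coulomb term. It is not the oxDNA2 parameterization: the relative permittivity of water is fixed at an assumed value of 78.4 (its temperature dependence is ignored), the prefactor is arbitrary, and both function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12     # vacuum permittivity in F/m
KB = 1.380649e-23           # Boltzmann constant in J/K
NA = 6.02214076e23          # Avogadro constant in 1/mol
E_CHARGE = 1.602176634e-19  # elementary charge in C


def debye_length_nm(ionic_strength_molar, temperature_k=298.15, eps_r=78.4):
    """Textbook Debye screening length in nm for a given ionic strength (mol/L)."""
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    length_m = math.sqrt(
        eps_r * EPS0 * KB * temperature_k
        / (2.0 * NA * E_CHARGE ** 2 * ionic_strength_si)
    )
    return length_m * 1e9


def screened_coulomb(r_nm, ionic_strength_molar, prefactor=1.0):
    """Generic Debye-Hueckel form: prefactor * exp(-r / lambda_D) / r."""
    lam = debye_length_nm(ionic_strength_molar)
    return prefactor * math.exp(-r_nm / lam) / r_nm


if __name__ == "__main__":
    for salt in (0.05, 0.1, 0.5, 1.0):
        print(salt, "M ->", round(debye_length_nm(salt), 3), "nm")
```

Higher ionic strength gives a shorter screening length, so the repulsion between two charged sites at a fixed distance, for instance the 2.6 nm inter-helix spacing quoted earlier, becomes weaker as salt is added.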

Compared to all-atom MD simulations, oxDNA not only drastically reduces the number of interaction sites but also reflects electrostatic effects of the solution without requiring the explicit simulation of water molecules and ions. Simulations with oxDNA can be run on a GPU, allowing for structure predictions of DNA Origami with up to 10,000 base pairs within a couple of hours. However, structures that start far away from the final configuration can take extremely long to relax. Structures that consist of multiple domains (for example, a corner designed in caDNAno, composed of two connected arms) have to be prealigned for the simulation to remain feasible.

mrDNA Model

In the early stages of relaxation towards a solution structure, hardly any detail is necessary. For our example of a corner structure it would be sufficient to start a simulation of two rigid corner arms and only transition to a higher resolution once these arms are properly aligned. In a simulation of this type not one but multiple coarse models are used in succession. Each succeeding model is less coarse than the previous one.

Figure 5: Gradual increase of granularity by increasing the number of interaction sites. From [Maffeo and Aksimentiev 2020]

Multi-Resolution-DNA (mrDNA), developed by C. Maffeo [Maffeo and Aksimentiev 2020], expands on the idea of the bead model presented in the introduction. At first, multiple base pairs are represented by single beads on a string with extremely simplified interaction potentials. However, as the simulation progresses, the granularity of the model is increased by introducing more beads while also increasing the complexity of their interactions. The closer the simulation gets to the final prediction, the less coarse the model becomes. The bead model used with mrDNA is also optimized for GPUs, further reducing the simulation time. As one of several options, the simulation can be concluded with an oxDNA simulation, allowing maximum accuracy in a short time.

This approach drastically shortens the relaxation time of the design and enables the relaxation of designs that would not relax reasonably well in oxDNA alone. Therefore, the oxDNA simulation requires only as many steps as are necessary for proper averaging.

3.2 Exercise 2: Structure Prediction - Experimental Data

In this exercise you will compare experimental cryo-EM data of a small DNA origami structure to a finished structure prediction of its design.

As experimental cryo-EM data is hard to analyse on its own, you will use the structure prediction as a reference to describe the global shape of the structure and its deviations from the idealized lattice pattern. Furthermore, you will learn about causes for varying signal intensity and resolution of the experimental data. Finally, as is the case with all structure predictions, you will discuss the accuracy of the prediction model used and its strengths and weaknesses. In the process you will install and use the visualization tool UCSF ChimeraX to investigate both cryo-EM volumetric data (mrc) and atomic model files (pdb/cif).

Prerequisites

Download the latest version of UCSF ChimeraX from https://www.cgl.ucsf.edu/chimerax and install it on your system.

For this exercise you will not have to create any data on your own. All files required for this exercise can be found in the FoPra folder: FoPra/StructurePrediction-2/.

Instructions

1. Bullet Design:

Use caDNAno to open the design file bullet.json.

The bullet is a small, compact 16-helix bundle, designed by P. Stommer as a projectile for constrained Brownian motion inside a tube-like assembly called the barrel. It is designed from a scaffold of approximately 3000 bases. To ensure that the bullet does not get stuck in the tube, its design balances compactness and rigidity. In this version a 10-helix bundle is used as the core of the structure. Additionally, three shorter helical pairs were added to the top of the design. These helical pairs are called protrusions and are part of a click mechanism that allows the insertion of the bullet into the barrel.

Have a look at the design and see if you can identify the parts mentioned above. Try to visualize what the final assembly of such a design should look like before looking at the other data. You should be familiar with caDNAno from exercise 1.

2. Structure Prediction:

Provided Data: Files can be opened with ChimeraX and the data has been aligned for you.

bullet-prediction.cif: atomic model generated from the oxDNA structure prediction

bullet-prediction.mrc: artificial electron density map generated from this atomic model at a resolution of 10 Å (average experimental resolution)

Start ChimeraX and open the two files. You can do this either by right-click + "open with..." or by drag and drop into ChimeraX. ChimeraX consists of 5 panels: QuickActions (top), RenderView (left, big), Log (upper-right), ModelViewer and VolumeViewer (lower-right).

In the ModelViewer you can select, hide/show and recolor each open model. Models that are not hidden are displayed in the RenderView.

The VolumeViewer is only visible if at least one .mrc file is open. It displays detailed information about the volumetric data set and allows changing the visualization settings. For the Bullet we recommend using an isosurface (default) at a level around 0.06 and step=1. Make the isosurface transparent to reveal the atomic model beneath it by clicking the color box and setting the opacity to 75%.

QuickActions can be used to change the appearance of the selected model in the RenderView or to change the general settings of the render engine, like background color or lighting. We prefer the "chain"-colored "sphere" representation for atomic models.

Adjust and experiment with the settings and look at the models both separately and together.

Discuss with your colleagues how the structure prediction deviates from your expectations from the design.

Use the generated cryo-EM map to investigate which level of detail you are able to see for this ideal 10 Å map. Can you identify the grooving of B-form DNA? Do you still see individual bases or even atoms? Are you able to identify crossovers?

3. Experimental Cryo-EM map:

Provided Data: File can be opened with ChimeraX and the data has been aligned with the prediction of the previous step.

bullet-experimental.mrc: experimental electron density map measured by cryo-EM

Open the experimental cryo-EM map bullet-experimental.mrc, adjust the visuals and compare it to the structure prediction. Hiding one model or changing the opacity might be helpful.

Varying the isosurface threshold for the experimental map has a much more significant effect than for the generated map. Some parts of the structure will only be visible at a very low threshold, while others will be obscured if the threshold is too high. Discuss with your colleagues why the experimental map is not as uniform as the generated one. Which parts are affected most, and why?

Compare how well the simulated model predicts the experimental map. Discuss the differences between experiment and prediction. How good is the predicted model? Which parts does it predict well and where is it less accurate? You can use the caDNAno design and the model to better understand the data.

3.3 Report 2: Structure Prediction - Experimental Data

As you have seen during the exercise, the Bullet shows some clear deviations from the ideal shape intended by the design. It turned out in the actual experiments that this version of the Bullet was not suitable as a projectile. Insights from the analysis of structure predictions and experimental cryo-EM data helped identify some of the issues. The designer of this structure used this information to design an improved variant that did not suffer from the same deformations. With his updated design it was possible to complete the project.

For this exercise, report your observations when comparing experimental data and structure prediction results. Also, focus on challenges when working with experimental data and answer the following questions.

• Explain the deviations of the structure prediction from the ideal lattice. Focus specifically on the structure's edges and use the design file as reference. Add screenshots of the areas you focus on to your report.

Compare the oxDNA model and the experimental map. Vary the isosurface threshold of the experimental map to get a better impression:

• Does the simulation give a good overall prediction?

• Is the prediction as twisted and deformed as the experimental map?

• One part of the structure is not visible in the experimental data, except at a very low threshold. Which part is it and why? (Use the design file for clues.)

Compare the map generated from the oxDNA model and the experimental map:

• The oxDNA map was generated at the overall resolution of the experimental data. Which parts of the experimental data appear to have a better/worse resolution?

• BONUS: Why is the resolution of the experimental map not uniform? (Give at least 3 reasons.)

3.4 Exercise 3: Structure Prediction - Iterative Design

In this exercise you will use mrDNA and oxDNA to make a structure prediction of your corner design from exercise 1.

As the simulation will take approximately 60-90 minutes on our compute nodes, make sure to start the simulation before you work on exercise 2 and preferably even before the break. Your task will be to use the structure prediction to verify the design goals of exercise 1 and assess your overall design. In the process you will learn some command-line basics, allowing you to set up the simulation yourself and check your simulation's progress.

Prerequisites

If you are not on site and are participating in the FoPra remotely, you need to set up a VPN connection to Dietzlab. Credentials and instructions for remote participation are available from your supervisor upon request.

If you are not familiar with basic UNIX commands (cd, ls, cp), use one of the many available internet resources to learn how to use a command-line interface (shell). You need to be able to move between folders, show folder contents and copy files.

Simulations will be performed on the Dietzlab Cluster. All necessary software has already been installed and a specific workspace is set up for you. You will need an SSH client to connect to the Dietzlab Cluster. Linux/Unix and Mac usually provide one with the base installation. On Windows 10 please install OpenSSH, while on older Windows versions install the open-source tool PuTTY. To get the most out of your limited time at the FoPra, please set up and test your system before the FoPra. Read basic instructions on how to connect your computer to a Linux server (ssh) and how to transfer data (scp). The PuTTY equivalents are called plink and pscp. Your supervisor will provide you with access credentials at the start of the experiments. Your access to the Cluster is limited to the duration of the FoPra.

OxDNA-Viewer is a powerful visualization tool for oxDNA simulations. It is browser-based and free to download or directly usable at https://sulcgroup.github.io/oxdna-viewer/. It fully supports only Mozilla Firefox and Google Chrome.

Instructions

1. Design Upload:

Transfer your design to the Dietzlab Cluster using Terminal on Linux/Unix or PowerShell on Windows. The exact command will vary depending on your system. On Linux/Unix, while in the directory containing your design, the command will look like this (words in all caps have to be replaced with your design name and your credentials):

$ scp NAME.json 'USER@CLUSTER:/home/fopra/'

After entering the password, your file NAME.json will be transferred to the FoPra folder on the Dietzlab Cluster.


2. Connect to Dietzlab Cluster:

You can connect to the Dietzlab Cluster using the SSH client. If you are not on site, you will need to establish a VPN connection first.

$ ssh USER@CLUSTER

After entering the password you will be located in the FoPra home directory /home/fopra/. Executing just cd without any target always returns you here.

3. Start simulation:

Starting an mrDNA structure prediction is simple. First switch to the appropriate virtual Python environment, where everything is set up for you, by executing:

$ conda activate mrdna

If successful, (mrdna) should be displayed at the start of your prompt. Now you can start your simulation by executing the following command:

$ nohup mrdna -d NAME -g 0 --coarse-steps 1e7 --fine-steps 5e6 --oxdna-steps 1e6 NAME.json > NAME.log &

The chosen settings have been optimized for this FoPra. You can copy and paste this into your console, but make sure you replace NAME with the appropriate name in all three places. You can change the GPU (-g 0) to ensure that your simulation runs as fast as possible (0, 1, 2 and 3 are available; please coordinate with your colleagues). A detailed explanation of this command can be found at the end of this chapter.

4. Check progress:

If you want to check the progress of your simulation, go to your folder and display the contents of your log file.

$ less NAME.log

You can scroll by pressing space and exit with q.

Another convenient way of keeping track of your simulation is the tail command.

$ tail -f NAME.log

The command tail displays the last few lines of a file. With the -f flag (follow), every new line is printed as it is written, allowing you to keep track in real time. Press CTRL + c to exit.

If your job fails with an error, the log file can help you identify what went wrong. It may be just a missing file or a typo that you can fix yourself.

5. Analysis Preparation:

Once the simulation is complete, the data has to be processed to make it suitable for analysis. Use the OxdnaAnalysisTools available at https://github.com/sulcgroup/ to compute an average final structure as well as the root-mean-square fluctuations (RMSF) of the structure.

These Python tools have already been installed on the cluster for you. To use them, navigate into your simulation folder, activate the mrdna environment and execute the following command (replace NAME):

$ python compute_mean.py -f oxDNA -o NAME-oxdna-average.oxdna -d NAME-oxdna-rmsf.dat output/NAME-oxdna.dat


This script computes the mean structure from your simulation trajectory (output/NAME-oxdna.dat). The options are: -f specifies the format, -o the name of the average structure file, and -d the name of the fluctuations file.

6. Download from Cluster:

Analysis of your simulation data will be performed on your own machine. To retrieve the simulation data from the Dietzlab Cluster you will again use the shell. Final simulation results (all files starting with NAME-oxdna) are sufficient for your analysis.

$ scp 'USER@CLUSTER:/home/fopra/NAME/NAME-oxdna*' ~/Downloads/

7. Analysis:

Open the OxDNA-Viewer in your browser. Files can be added via drag and drop. Select the following files and add them together:

• NAME-oxdna-average.oxdna

• NAME-oxdna.top

• NAME-oxdna-rmsf.dat

The topology (.top) stores the connectivity, the configuration (.oxdna) the base positions and orientations. Here we use an average configuration to represent the structure. By adding fluctuation data (-rmsf.dat) we can assess which parts of the structure deviate most from our average configuration throughout the oxDNA simulation. OxDNA-Viewer colors all bases according to the RMSF.

Use the left mouse button to rotate, the right mouse button to move and the wheel to zoom. Investigate your structure prediction and analyse your results. OxDNA-Viewer also provides a Screenshot button and a variety of other tools you will not need.
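If you would like numbers in addition to the coloring in OxDNA-Viewer, the fluctuation file can also be inspected with a few lines of Python. The following is only a sketch: it assumes that NAME-oxdna-rmsf.dat is a JSON object containing one list with one RMSF value per nucleotide, so please open your file in a text editor first and check that this holds. The function names load_rmsf and summarize_rmsf are made up for this illustration and are not part of the analysis tools.

```python
import json


def load_rmsf(path):
    """Return the per-nucleotide RMSF values stored in a fluctuation file.

    ASSUMPTION: the file is a JSON object that holds one list of numbers
    (one value per nucleotide). Verify this for your own file.
    """
    with open(path) as handle:
        data = json.load(handle)
    for value in data.values():
        if isinstance(value, list):
            return [float(v) for v in value]
    raise ValueError("no list of RMSF values found in " + path)


def summarize_rmsf(values, top=5):
    """Return mean, maximum and the indices of the most flexible nucleotides."""
    if not values:
        raise ValueError("empty RMSF list")
    # Sort nucleotide indices from largest to smallest fluctuation.
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return {
        "mean": sum(values) / len(values),
        "max": max(values),
        "most_flexible": ranked[:top],
    }
```

Calling summarize_rmsf(load_rmsf("NAME-oxdna-rmsf.dat")) in the folder that contains your downloaded files returns a small dictionary. The indices count entries in the file starting from 0; how they map to the nucleotide numbering shown in OxDNA-Viewer is something you should verify on your own structure.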

Additional Information: mrdna command

$ nohup mrdna -d NAME -g 0 --coarse-steps 1e7 --fine-steps 5e6 --oxdna-steps 1e6 NAME.json > NAME.log &

• nohup: usually a job is connected to your shell. If you disconnect, the job stops too. The command nohup detaches your job from the shell.

• mrdna: command for starting a structure prediction run. If you want to see all possible inputs for mrDNA, type mrdna -h. In our case the following input is passed:

• -d NAME: name of the output directory. You do not need to create it.

• -g 0: GPU used for the simulation. On this node of the Dietzlab Cluster 0,1,2,3 are available.

• --coarse-steps 1e7: number of steps for the coarse simulation step. Here 10 million.

• --fine-steps 5e6: number of steps for the fine simulation step. Here 5 million.

• --oxdna-steps 1e6: number of steps for the final oxDNA simulation. Here 1 million.

• NAME.json: your cadnano design file. mrdna is compatible with a variety of different formats and will detect automatically what to do.


• > NAME.log: because the job is detached from your shell, its standard output is no longer shown on your screen. > redirects the simulation output into a log file called NAME.log. You can use this log file to check the progress of your simulation.

• &: the ampersand pushes the current job to the background so that you can continue using your shell while the simulation is running.
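To make the anatomy of the command more tangible, the following Python sketch assembles the mrdna part of the call as an argument list and converts the step counts from scientific notation into plain integers. It uses only the options listed above. The helper names build_mrdna_command and describe_steps are hypothetical and exist only for this illustration. nohup, > and & are features of the shell and therefore do not appear in the list.

```python
import shlex


def build_mrdna_command(name, gpu=0, coarse="1e7", fine="5e6", oxdna="1e6"):
    """Assemble the mrdna call from this manual as an argument list.

    Step counts are kept as strings so that they appear exactly as typed
    in the manual (for example "5e6").
    """
    return [
        "mrdna",
        "-d", name,                # output directory
        "-g", str(gpu),            # GPU index
        "--coarse-steps", coarse,
        "--fine-steps", fine,
        "--oxdna-steps", oxdna,
        name + ".json",            # cadnano design file
    ]


def describe_steps(command):
    """Map every --*-steps option in the command to its integer step count."""
    steps = {}
    for option, value in zip(command, command[1:]):
        if option.startswith("--") and option.endswith("-steps"):
            steps[option] = int(float(value))
    return steps


if __name__ == "__main__":
    cmd = build_mrdna_command("NAME", gpu=0)
    print(shlex.join(cmd))      # the command line without nohup, > and &
    print(describe_steps(cmd))  # step counts as plain integers
```

Reading the step counts as integers shows directly that 1e7 is 10 million, 5e6 is 5 million and 1e6 is 1 million steps.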

3.5 Report 3: Structure Prediction - Iterative Design

For the report of this exercise, focus on how structure prediction of DNA origami can help you in the design process of your DNA origami structure. Make sure to answer the following questions, but also include short descriptions of problems and potential solutions of your specific design.

Give a qualitative overview of your design and structure prediction.

• What turned out as expected? Which part of the structure turned out differently?

• Which part of the structure deviated the most from your expectation?

• Estimate the angle of your corner (e.g. by measuring the angle on a screenshot of OxDNA-Viewer). How much does it deviate from the desired value?

• Your design is still just a prediction. How confident are you in this prediction and why?

Add an image of your average structure colored by RMSF (using the sulcgroup tools and OxDNA-Viewer) to the report and use it to address the following questions.

• Which parts of your structure are most flexible and why?

• Use the absolute values of the RMSF to discuss how strongly the angle of your design can fluctuate.

• BONUS: How could you improve your design to get a more precise and/or more rigid corner?
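One possible way to estimate the corner angle from a screenshot is to read off the pixel coordinates of three points by hand: one point on the axis of each arm and one at the vertex where the arms meet. The sketch below computes the angle between the two arms from these points. The function name corner_angle and the way of picking the points are assumptions made for this illustration, not a prescribed procedure. The result is only meaningful if the screenshot looks perpendicular onto the plane spanned by the two arms, because a tilted view distorts the projected angle.

```python
import math


def corner_angle(arm_a, vertex, arm_b):
    """Angle in degrees at `vertex` between the rays towards arm_a and arm_b.

    All three arguments are (x, y) pixel coordinates read off a screenshot.
    """
    ax, ay = arm_a[0] - vertex[0], arm_a[1] - vertex[1]
    bx, by = arm_b[0] - vertex[0], arm_b[1] - vertex[1]
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0:
        raise ValueError("arm points must differ from the vertex")
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cosine = max(-1.0, min(1.0, (ax * bx + ay * by) / norm))
    return math.degrees(math.acos(cosine))
```

Repeating the measurement with several different point choices along each arm gives a rough idea of how much your reading error contributes to the estimate.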


4 Protocol Requirements

The protocol should explain what you learned in the lab course in a precise and scientific manner. We would therefore prefer that you adhere to some style guidelines.

4.1 Title Page

Include your names, the enrollment numbers of all members, your team number, your degree program (biochemistry/physics) and your e-mail addresses.

4.2 Introduction (0.5-1 pages)

Briefly describe DNA origami and the simulation tools used. Explain the similarities and differences between cando, oxDNA and mrDNA. What are some advantages and disadvantages?

4.3 Results (1-3 pages)

Describe the results from the different simulations you ran.

References

[1] Carlos Ernesto Castro et al. “A primer to scaffolded DNA origami”. In: Nature Methods 8.3 (2011), pp. 221–229.

[2] Shawn M. Douglas et al. “Self-assembly of DNA into nanoscale three-dimensional shapes”. In: Nature 459.7245 (2009), pp. 414–418. doi: 10.1038/nature08016. url: https://doi.org/10.1038/nature08016.

[3] Jonathan P. K. Doye et al. “Coarse-graining DNA for simulations of DNA nanotechnology”. In: Phys. Chem. Chem. Phys. 15.47 (2013), p. 20395. issn: 1463-9076. doi: 10.1039/c3cp53545b. url: http://xlink.rsc.org/?DOI=c3cp53545b.

[4] Christopher Maffeo and Aleksei Aksimentiev. “MrDNA: a multi-resolution model for predicting the structure and dynamics of DNA systems”. In: Nucleic Acids Res. 48.9 (May 2020), pp. 5135–5146. issn: 0305-1048. doi: 10.1093/nar/gkaa200. url: https://academic.oup.com/nar/article/48/9/5135/5814051.

[5] Paul W. K. Rothemund. “Folding DNA to create nanoscale shapes and patterns”. In: Nature 440.7082 (2006), pp. 297–302.

[6] Nadrian C. Seeman. “DNA in a material world”. In: Nature 421.6921 (2003), pp. 427–431.

[7] Benedict E. K. Snodin et al. “Introducing improved structural properties and salt dependence into a coarse-grained model of DNA”. In: J. Chem. Phys. 142.23 (June 2015), p. 234901. issn: 0021-9606. doi: 10.1063/1.4921957. arXiv: 1504.00821. url: http://aip.scitation.org/doi/10.1063/1.4921957.


[8] Jean-Philippe J. Sobczak et al. “Rapid Folding of DNA into Nanoscale Shapes at Constant Temperature”. In: Science 338.6113 (2012), pp. 1458–1461. doi: 10.1126/science.1229919.
