quiz section week 5 april 26, 2018 - github pages · dictionarybasics d =...

35
Quiz Section Week 5 April 26, 2018 Review

Upload: others

Post on 27-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Quiz Section Week 5April 26, 2018

Review

Page 2: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Topics (not guaranteed to be comprehensive!)• Alignments

• Reasons to align sequences• Needleman-Wunsch algorithm• Smith-Waterman algorithm• Effects of parameter variation (including gap penalties)• Testing for statistical significance of an alignment

• Phylogenetic trees• Rooted and unrooted topologies• Defining the best tree with UPGMA and Neighbor Joining• Concept of parsimony• Fitch algorithm: quantifying how parsimonious a tree is, assigning internal states• Finding the most parsimonious tree: Hill climbing w/ Nearest-Neighbor interchanges• Bootstrapping to quantify confidence in tree partitions

• Clustering• Defining a clustering problem• Hierarchical clustering

• Impact of using single/complete/average linkage• K-means: Objective and algorithm

• Networks• Reasons to make and analyze networks• Basic network definitions• Dijkstra’s Algorithm• Network motifs and their uses

Page 3: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Topics (not guaranteed to be comprehensive!)• Programming

• Variables and types• String methods• List methods• Conditionals• Loops• Functions

Page 4: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Phylogenetic treesUPGMA/Neighbor Joining• Define the best tree: based on distance between leaves• Find the best tree using: polynomial time algorithm to construct the

best tree from a distance matrix

Parsimony approach• Define the best tree: Minimum # of mutations required to traverse

tree• Find the best tree: by enumerating all trees (exhaustive search), or by

heuristic approach like Nearest-Neighbor Interchange Hill-Climbing

Page 5: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Tree topologies

Are these the same tree?ABCD

BACD

BADC

Page 6: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Tree topologies

Are these the same tree?

How about these?

ABCD

BACD

BADC

Page 7: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Counting tree topologies

# of unrooted topologies = 3*5*7*…*(2N-5)# of branches = 2N-3

For N leaves

E.g. an unrooted tree with 6 nodes

BA

C

D

E

F How many different topologies?

Page 8: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Counting tree topologies

# of unrooted topologies = 3*5*7*…*(2N-5)# of branches = 2N-3

For N leaves

E.g. an unrooted tree with 6 nodes

BA

C

D

E

F How many different topologies?3*5*7 = 105

Page 9: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Counting tree topologies

# of unrooted topologies = 3*5*7*…*(2N-5)# of branches = 2N-3

For N leaves

E.g. an unrooted tree with 6 nodes

BA

C

D

E

F How many different topologies?3*5*7 = 105

How many branches?

Page 10: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Counting tree topologies

# of unrooted topologies = 3*5*7*…*(2N-5)# of branches = 2N-3

For N leaves

E.g. an unrooted tree with 6 nodes

BA

C

D

E

F How many different topologies?3*5*7 = 105

How many branches?2N-3 = 9

Page 11: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Counting tree topologies

# of unrooted topologies = 3*5*7*…*(2N-5)# of branches = 2N-3

For N leaves

E.g. an unrooted tree with 6 nodes

BA

C

D

E

F How many different topologies?3*5*7 = 105

How many branches?2N-3 = 9

Page 12: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

The root could be placed on any branchE.g. an unrooted tree with 6 nodes

BA

C

D

E

FHow many different topologies?3*5*7 = 105

How many branches?2N-3 = 9

BA

C

D

E

F

Page 13: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

The root could be placed on any branchE.g. an unrooted tree with 6 nodes

BA

C

D

E

FHow many different topologies?3*5*7 = 105

How many branches?2N-3 = 9

BA

C

D

E

F

ED

FC

ABTotal options = 105*9

= 945

Page 14: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

B

A

C

D

EFor each internal branchgenerate two variant trees that swap the relationships of the four outside branches

Nearest Neighbor Interchange trees

Page 15: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Nearest Neighbor Interchange trees

B

A

C

D

EFor each internal branchgenerate two variant trees that swap the relationships of the four outside branches

B

A

C

D

E

BA

D

CE

Page 16: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Nearest Neighbor Interchange trees

B

A

C

D

EFor each internal branchgenerate two variant trees that swap the relationships of the four outside branches

B

A

C

D

E

BA

D

CE

BA

E

DC

Page 17: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Nearest Neighbor Interchange trees

B

A

C

D

EFor each internal branchgenerate two variant trees that swap the relationships of the four outside branches

B

A

C

D

E

BA

D

CE

CA

B

D

E

CB

A

D

E

BA

E

DC

Page 18: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

NN Practice: Draw both interchanges from swapping this branch

BA

C

D

E

F

Page 19: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

NN Practice: Draw both interchanges from swapping this branch

BA

C

D

E

FCA

B

D

E

F

BC

A

D

E

F

Page 20: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Fitch algorithm practice

ED

FC

AB

Page 21: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Fitch algorithm practice: bottom-up

ED

FC

ABOr /Blk

Or/Blk/Bl

BlBl

Blk / Bl

Page 22: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Fitch algorithm practice: top-down

ED

FC

AB

BlkBl

BlBl

Bl

Page 23: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Hierarchical clustering with complete linkage example

A B C DA 0 1 2 4B 1 0 2 5C 2 2 0 5D 4 5 5 0

A B C DA,B C D

A,B 0 2 5C 2 0 5D 5 5 0

A,B,C DA,B,C 0 5D 5 0

0.5 1 2.5

Page 24: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Dijkstra’s Algorithm

Page 25: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Another data type: Dictionaries

• a data structure that consists of an unordered set of key: value pairs• think of as word: definition pairs!

https://docs.python.org/3/tutorial/datastructures.html

Q: How could we encode the entire genetic code?

Page 26: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Dictionaries: How could we encode the entire genetic code?

https://docs.python.org/3/tutorial/datastructures.html

>>> genetic_code = {"ATG": "Start", "TGA": "Stop", "TAG": "Stop"}>>> genetic_code["TAA"] = "Stop">>> genetic_code.get("TGA")'Stop'>>> genetic_code["TGA"]'Stop'>>> genetic_code.get("sss") #nothing or 'None' if not defined>>> genetic_code["sss"]KeyError: 'ttt'

Page 27: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Creating a dictionary

#create an empty dictionarymyDict = {}

#create a dictionary with three entriesmyDict = {"Curly":4123, "Larry":2057, "Moe":1122}

#add another entrymyDict["Shemp"] = 2232

#change Moe's phone numbermyDict["Moe"] = 4040

#delete Moe from dictionarydel myDict["Moe"]

Page 28: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Rules for dictionaries• The first item is a key.

• Each key can appear only once in a dict.

• A key must be an immutable object: number, string, or tuple.

• Lists cannot be keys (they are mutable).

• The key is the item you'll use to do look-ups.

• Each key is paired with a value.

Page 29: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Some useful dictionary methods>>> genetic_code.items()[('TAA', 'Stop'), ('TGA', 'Stop'), ('TAG', 'Stop'), ('ATG', 'Start')]>>> genetic_code.keys()['TAA', 'TGA', 'TAG', 'ATG']>>> genetic_code.values()['Stop', 'Stop', 'Stop', 'Start']

Page 30: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Another use of dictionaries: store counts of named elements

sequence = "GACCCT"nuc_counts = {'A': 0, 'C': 0, 'T':0, 'G': 0}for nuc in sequence:

#Add to the count for the given nucleotide

Example: Calculate # of each nucleotide in a sequence

Page 31: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Another common use of dictionaries: store counts of named elements

sequence = "GACCCT"nuc_counts = {'A': 0, 'C': 0, 'T':0, 'G': 0}for nuc in sequence:

nuc_counts[nuc] = nuc_counts[nuc] + 1

Calculate # of each nucleotide in a sequence

Page 32: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

Traversing a dictionary by key

# birthdays is a dictionary with names as keys# and birth dates as values

for person in birthdays.keys():print "Send", person, "a card on", birthdays[person]

Page 33: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the

dictionary basicsD = {'dna':'T','rna':'U'} # dictionary literal assignmentD = {} # make an empty dictionaryD.keys() # get the keys as a listD.values() # get the values as a listD['dna'] # get a value based on keyD['dna'] = 'T' # set a key:value pairdel D['dna'] # delete a key:value pair'dna' in D # True if key 'dna' is found in D, else False

The keys must be immutable objects (e.g. string, int, tuple).

The values can be anything (including a list or another dictionary).

The order of elements in the list returned by D.keys() or D.values() is arbitrary (effectively random).

Each key can be stored only once in the dictionary, so if you set the value for a key for a second time it OVERWRITES the old value!

Page 34: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the
Page 35: Quiz Section Week 5 April 26, 2018 - GitHub Pages · dictionarybasics D = {'dna':'T','rna':'U'}# dictionary literal assignment D = {} # make an empty dictionary D.keys() # get the