git: the lean, mean, distributed machine


Upload: err

Post on 17-May-2015


TRANSCRIPT

Page 1: Git: The Lean, Mean, Distributed Machine

Chris Wanstrath

http://defunkt.github.com

hi everyone, i’m chris wanstrath.

how many people here use git?

Page 2: Git: The Lean, Mean, Distributed Machine

i play guitar

i have a schecter classic similar to this. mine is prettier.

Page 3: Git: The Lean, Mean, Distributed Machine

i’m from cincinnati ohio

Page 4: Git: The Lean, Mean, Distributed Machine

but live in san francisco

Page 5: Git: The Lean, Mean, Distributed Machine

i started as a lowly paid consultant

Page 6: Git: The Lean, Mean, Distributed Machine

Then I worked at CNET

then worked at CNET for a few years

Page 7: Git: The Lean, Mean, Distributed Machine

Then I worked at CNET

(which is now owned by CBS)

Page 8: Git: The Lean, Mean, Distributed Machine

Then I was a highly paid consultant

after that i became a highly paid consultant

Page 9: Git: The Lean, Mean, Distributed Machine

before co-founding github

Page 10: Git: The Lean, Mean, Distributed Machine

git: The Lean Mean Distributed Machine

By Chris Wanstrath

anyway, i want to talk a bit about git today

Page 11: Git: The Lean, Mean, Distributed Machine

if we’re going to talk about git, we need to start by talking about source control management (SCM)

or version control

Page 12: Git: The Lean, Mean, Distributed Machine

basically version control is like wikipedia

Page 13: Git: The Lean, Mean, Distributed Machine

for your code

Page 14: Git: The Lean, Mean, Distributed Machine

you use it to see what changes others made

Page 15: Git: The Lean, Mean, Distributed Machine

inspect those changes

Page 16: Git: The Lean, Mean, Distributed Machine

and contribute your own

Page 17: Git: The Lean, Mean, Distributed Machine

who uses git?

token promo slide

Page 18: Git: The Lean, Mean, Distributed Machine

these companies use git

Page 19: Git: The Lean, Mean, Distributed Machine

and so do these open source projects

Page 20: Git: The Lean, Mean, Distributed Machine

let’s briefly go through the history of open source SCMs

(at least the ones we care about)

Page 21: Git: The Lean, Mean, Distributed Machine

Revision Control System (RCS) was written in the early 80s and used to store history on a file by file basis. A directory of source code could contain many RCS repositories, each concerning itself with a single file.

it’s like a hut - very basic, primitive even, but works great when all you need is shelter

Later that decade, a professor began working with two grad students on a C compiler (in the name of scholarly pursuits). As they began using RCS, the professor noted a number of limitations. It was difficult to share files, and even more difficult to share entire projects.

Page 22: Git: The Lean, Mean, Distributed Machine

So, they wrote CVS - the Concurrent Versions System - and released it as open source in the early 1990s. Concurrent because it allowed multiple individuals to collaborate on a project together, without stepping on each other's toes, and Versions System because it was initially a collection of RCS repositories with network awareness.

CVS was like a cabin. better than a hut, but still pretty crappy

it worked well for a while, but there were limitations. Dealing with directories was difficult, and much different than the way one would normally deal with directories in Unix.

Page 23: Git: The Lean, Mean, Distributed Machine

Ten years later (see a pattern?) a new revision control system was released, called Subversion (or SVN). Subversion was intended to replace CVS by improving on CVS. History, directories, deletions, and other CVS warts were fixed. mod_dav integration was included, as well as anonymous checkout. (Anonymous checkout in CVS was literally a hack added by the OpenBSD project.)

Subversion was not subversive, but it did work well enough. Many felt it a welcome relief and hurriedly switched. Big, lumbering organizations spent years converting their repositories to SVN. IDEs and editors included Subversion integration.

it was like a house, same idea but much better than a cabin

Page 24: Git: The Lean, Mean, Distributed Machine

Committer Committer Committer

Server

this is the rcs / cvs / svn model.

Page 25: Git: The Lean, Mean, Distributed Machine

Committer Committer Committer

Server

someone commits to the server

Page 26: Git: The Lean, Mean, Distributed Machine

Committer Committer Committer

Server

and everyone else pulls down the changes

Page 27: Git: The Lean, Mean, Distributed Machine

this is bad

why?

Page 28: Git: The Lean, Mean, Distributed Machine

Server

first off, the server is the babysitter

Page 29: Git: The Lean, Mean, Distributed Machine

SVN’s down!

“The Subversion server’s down”

you can’t do anything without the server’s permission

Page 30: Git: The Lean, Mean, Distributed Machine

second, there’s low visibility into your coworkers’ and subordinates’ activity.

Page 31: Git: The Lean, Mean, Distributed Machine

at cnet, we used bugzilla. it was awesome (as you can see)

Page 32: Git: The Lean, Mean, Distributed Machine

we’d also get diffs emailed to us after each commit

Page 33: Git: The Lean, Mean, Distributed Machine

because our group was large, and fluid, i’d often get commits emailed to me i didn’t care about

Page 34: Git: The Lean, Mean, Distributed Machine

or understand

Page 35: Git: The Lean, Mean, Distributed Machine

this meant i spent extra time throwing away junk

Page 36: Git: The Lean, Mean, Distributed Machine

emails that come to me should be for me

Page 37: Git: The Lean, Mean, Distributed Machine

Trac

if you’re lucky, you use trac to watch what everyone is doing.

it has rss and is a bit smarter.

usually you have to set it up yourself

Page 38: Git: The Lean, Mean, Distributed Machine

another problem: your subversion workflow is single threaded

it’s hard to stop working on a feature to quickly fix a bug without losing your feature’s changes

changed files are either committed or discarded

well, not entirely true...

Page 39: Git: The Lean, Mean, Distributed Machine

you could always use a branch

but that sucks, and big changes are usually disasters

i once worked on a project for 3 months where we worked on a massive new feature

Page 40: Git: The Lean, Mean, Distributed Machine

as it neared completion, we had a big meeting

Page 41: Git: The Lean, Mean, Distributed Machine

a 2 hour meeting

Page 42: Git: The Lean, Mean, Distributed Machine

there we decided how to merge in the changes from our branch to trunk

it was one of the worst meetings ever

Page 43: Git: The Lean, Mean, Distributed Machine

and i’ve been in some bad meetings

Page 44: Git: The Lean, Mean, Distributed Machine

one person was assigned the task of merging the branch and trunk

Page 45: Git: The Lean, Mean, Distributed Machine

he was merging and fixing bugs in code he did not write

Page 46: Git: The Lean, Mean, Distributed Machine

or understand

Page 47: Git: The Lean, Mean, Distributed Machine

it did not go well

Page 48: Git: The Lean, Mean, Distributed Machine

another problem with the ‘babysitter’ model is that experimentation is difficult

Page 49: Git: The Lean, Mean, Distributed Machine

all your experiments are public

Page 50: Git: The Lean, Mean, Distributed Machine

everyone sees everything you commit

Page 51: Git: The Lean, Mean, Distributed Machine

solution? don’t commit

at least, that was my solution

Page 52: Git: The Lean, Mean, Distributed Machine

and that’s just work stuff

the centralized model, when applied to open source, is a huge pain

it’s difficult to maintain patched versions of open source independently

if a project dies on the internet, does anyone care?

Page 53: Git: The Lean, Mean, Distributed Machine

what’s the answer?

dvcs! git! happy!

Page 54: Git: The Lean, Mean, Distributed Machine

git was started by linus torvalds

a DVCS built as an efficient content addressable file system

Page 55: Git: The Lean, Mean, Distributed Machine

the guy who started the linux kernel

Page 56: Git: The Lean, Mean, Distributed Machine

if rcs is a hut

Page 57: Git: The Lean, Mean, Distributed Machine

cvs is a cabin

Page 58: Git: The Lean, Mean, Distributed Machine

and subversion is a house

Page 59: Git: The Lean, Mean, Distributed Machine

git is a castle

Page 60: Git: The Lean, Mean, Distributed Machine

or a ninja

Page 61: Git: The Lean, Mean, Distributed Machine

or shaq

Page 62: Git: The Lean, Mean, Distributed Machine

Committer Committer Committer

Server

the thing that makes git, and all distributed version control systems like it, different is the idea that it’s “distributed”

take this centralized model

Page 63: Git: The Lean, Mean, Distributed Machine

Server

Committer Committer Committer

and make every copy of the code its own, full fledged repository

any copy can accept or create commits. anyone can pull commits from any copy.

Page 64: Git: The Lean, Mean, Distributed Machine

Server

Committer Committer Committer

now you can push to the server

Page 65: Git: The Lean, Mean, Distributed Machine

Server

Committer Committer Committer

or committers can push and pull from each other

Page 66: Git: The Lean, Mean, Distributed Machine

Committer

Server

instead of “checking out” code

Page 67: Git: The Lean, Mean, Distributed Machine

Server

Committer

you “clone” or copy a repository

Page 68: Git: The Lean, Mean, Distributed Machine

if github explodes, you don’t lose any code
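a rough sketch of the clone idea on the command line. local directories stand in for the server and the committer; the paths, file names, and commit messages are made up for illustration, and `git init -b` assumes git 2.28 or newer:

```shell
set -e
# throwaway identity so commits work on a machine with no git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# a stand-in "server" repository with one commit
git init -q -b main "$work/server"
echo "hello" > "$work/server/README"
git -C "$work/server" add README
git -C "$work/server" commit -q -m "first commit"

# clone copies the full history, not just the latest files
git clone -q "$work/server" "$work/committer"

# the clone accepts new commits without asking the server
echo "local change" >> "$work/committer/README"
git -C "$work/committer" commit -q -a -m "commit made with no server involved"

# the server explodes; the clone still has every commit
rm -rf "$work/server"
git -C "$work/committer" log --oneline
```

the last command should list both commits even though the original repository is gone.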

Page 69: Git: The Lean, Mean, Distributed Machine

Server

Committer

Server Server

in fact, because you have a full copy of your repository at all times, you don’t need to tie yourself to a single remote repository

Page 70: Git: The Lean, Mean, Distributed Machine

you can push to as many servers as you want
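pushing to more than one server might look something like this. the two bare repositories are local stand-ins for real servers, and the remote names `server_a` and `server_b` are invented for the example (`git init -b` assumes git 2.28 or newer):

```shell
set -e
# throwaway identity so commits work on a machine with no git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# two independent "servers" (bare repositories have no working files)
git init -q --bare "$work/server_a.git"
git init -q --bare "$work/server_b.git"

# your own full repository
git init -q -b main "$work/repo"
cd "$work/repo"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "first commit"

# register both servers as remotes and push to each
git remote add server_a "$work/server_a.git"
git remote add server_b "$work/server_b.git"
git push -q server_a main
git push -q server_b main

# show where this repository can push and pull
git remote -v
```

neither server is special here; each one simply received the same commits.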

Page 71: Git: The Lean, Mean, Distributed Machine

another word for ‘clone’ or ‘copy’ is ‘fork’

Page 72: Git: The Lean, Mean, Distributed Machine

it may seem like anarchy at first

but sane and useful workflows have evolved

Page 73: Git: The Lean, Mean, Distributed Machine

#1

in fact, the first workflow i want to talk about is called Anarchy

Page 74: Git: The Lean, Mean, Distributed Machine

Server

Committer Committer Committer

you remove the server

Page 75: Git: The Lean, Mean, Distributed Machine

Committer Committer Committer

then make everyone a peer

Page 76: Git: The Lean, Mean, Distributed Machine

Committer

Committer Committer

then take away commit access from each other

Page 77: Git: The Lean, Mean, Distributed Machine

Coder

Coder Coder

and you end up with repositories floating in the void

this is how the internet works, or how git works by default

Page 78: Git: The Lean, Mean, Distributed Machine

Coder

Coder Coder

everyone pushes and pulls from each other, managing their own version of the code

for small projects with very few contributors, it works fine

and it would work great on small, experimental projects inside of any organization

(you just need a place to publish your changes)

Page 79: Git: The Lean, Mean, Distributed Machine

as an example of this, let’s say i was on github and i wanted to add a patch to schacon’s ticgit

Page 80: Git: The Lean, Mean, Distributed Machine

i’d click the fork button

Page 81: Git: The Lean, Mean, Distributed Machine

i now have a copy of schacon’s ticgit called defunkt’s ticgit

now i can make changes and ask scott to check them out. if he likes them, he’ll merge them in

if he doesn’t like them, oh well. i can still use them in my project and keep up to date with his changes.

someone can come along and fork from me if they want, too
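outside the browser, the same fork-and-follow flow can be sketched with plain git. this is only an illustration: local directories stand in for the two github repositories, the remote name `upstream` and the branch name `my_patch` are conventions picked for the example, and `git init -b` assumes git 2.28 or newer:

```shell
set -e
# throwaway identity so commits work on a machine with no git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# scott's repository (local stand-in for schacon's ticgit)
git init -q -b main "$work/schacon_ticgit"
cd "$work/schacon_ticgit"
echo "ticgit" > README
git add README
git commit -q -m "initial commit"

# the fork: a full copy i control (local stand-in for defunkt's ticgit)
git clone -q "$work/schacon_ticgit" "$work/defunkt_ticgit"
cd "$work/defunkt_ticgit"
git remote rename origin upstream

# my patch lives on its own branch
git checkout -q -b my_patch
echo "my change" > patch.txt
git add patch.txt
git commit -q -m "add my patch"

# meanwhile scott keeps working on his copy
cd "$work/schacon_ticgit"
echo "more docs" >> README
git commit -q -a -m "expand README"

# whether or not he merges my patch, i can keep up with his changes
cd "$work/defunkt_ticgit"
git fetch -q upstream
git merge -q -m "merge scott's latest changes" upstream/main
```

after the merge, the fork holds both the patch and scott's newer commit.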

Page 82: Git: The Lean, Mean, Distributed Machine

#2 blessed

but anarchy doesn’t scale

the second workflow is called Blessed

Page 83: Git: The Lean, Mean, Distributed Machine

Coder

Coder

Coder Coder

the blessed workflow has the same basic idea as Anarchy

Page 84: Git: The Lean, Mean, Distributed Machine

Blessed

Coder

Coder Coder

but one of the repositories is the Blessed repo

Page 85: Git: The Lean, Mean, Distributed Machine

Blessed

Coder

Coder Coder

everyone takes their cues from the blessed repository

its development is considered the mainline, or trunk

deploys and packages are pushed from the blessed repo

Page 86: Git: The Lean, Mean, Distributed Machine

Blessed

Coder

Coder Coder

others can still push and pull from each other, remember

Page 87: Git: The Lean, Mean, Distributed Machine

in the business world this works great for dealing with contractors

Page 88: Git: The Lean, Mean, Distributed Machine

Blessed

Coder

Coder Coder

don’t give them push access, just pull access

Page 89: Git: The Lean, Mean, Distributed Machine

Blessed

Coder

they pull down your code, make their changes, then tell you when the changes are ready

Page 90: Git: The Lean, Mean, Distributed Machine

Blessed

Coder

if you like what you see, you merge in the contractor’s changes

the contractor never has direct write access to your company’s code
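here is a minimal sketch of the company side of that arrangement. it assumes the contractor publishes their work somewhere you can read (a local directory plays that role here); the remote name `contractor`, the branch `new_report`, and the file names are hypothetical, and `git init -b` assumes git 2.28 or newer:

```shell
set -e
# throwaway identity so commits work on a machine with no git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# the company's blessed repository
git init -q -b main "$work/blessed"
echo "v1" > "$work/blessed/app.txt"
git -C "$work/blessed" add app.txt
git -C "$work/blessed" commit -q -m "initial commit"

# the contractor pulls down the code and works in their own copy
git clone -q "$work/blessed" "$work/contractor"
git -C "$work/contractor" checkout -q -b new_report
echo "report" > "$work/contractor/report.txt"
git -C "$work/contractor" add report.txt
git -C "$work/contractor" commit -q -m "add report"

# back at the company: fetch the contractor's work and look before merging
cd "$work/blessed"
git remote add contractor "$work/contractor"
git fetch -q contractor
git log --oneline main..contractor/new_report      # commits we don't have yet
git diff --stat main...contractor/new_report       # files those commits touch

# if it looks good, merge it in ourselves
git merge -q --no-ff -m "merge contractor's report" contractor/new_report
```

the contractor never runs `git push` against the blessed repository in this flow; the only write to it is the merge made by someone at the company.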

Page 91: Git: The Lean, Mean, Distributed Machine

if there were a bunch of us working on ticgit, scott’s may be the Blessed repository

he started the project and is in charge of merging in all changes. we all watch his changes

Page 92: Git: The Lean, Mean, Distributed Machine

this is how rails works

rails/rails is the Blessed repository, which david controls and from which the gems are built.

we all follow this repo’s development and treat it as “official” by convention only

Page 93: Git: The Lean, Mean, Distributed Machine

this is also how rentzsch’s click to flash works

click to flash is an amazing safari plugin that disables flash, similar to the firefox extension

Page 94: Git: The Lean, Mean, Distributed Machine

it was forked from google code and has been given a life of its own on github, under rentzsch’s guidance. contributors fork his repository and he merges in good changes.

the plugin’s development has been a perfect example of how distributed version control puts the power in the hands of the developer, not the server

Page 95: Git: The Lean, Mean, Distributed Machine

#3 lieutenant

the next workflow is called ‘lieutenant’ - great for massive projects, like the kernel

Page 96: Git: The Lean, Mean, Distributed Machine

Blessed

Coder

Lieutenant Lieutenant

Coder Coder

Coder

in this model, there is a blessed repository and a few designated lieutenants

the lieutenants are people trusted by the blessed repository

Page 97: Git: The Lean, Mean, Distributed Machine

Blessed

Lieutenant Lieutenant

Coder Coder

Coder

Coder

coders will pull from a lieutenant, make their changes, then request the lieutenant merge in their changes

Page 98: Git: The Lean, Mean, Distributed Machine

Blessed

Lieutenant Lieutenant

Coder Coder

Coder

Coder

lieutenants are usually in charge of a specific subsystem or part of the large system

if they like the change, they will pull it in

Page 99: Git: The Lean, Mean, Distributed Machine

Blessed

Lieutenant Lieutenant

Coder Coder

Coder

Coder

they’ll then inform the blessed repository that they have changes which need to be merged in

the blessed repo, trusting the lieutenant, pulls in the changes

Page 100: Git: The Lean, Mean, Distributed Machine

Blessed

Lieutenant Lieutenant

Coder Coder

Coder

Coder

this is typically coordinated over a mailing list
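one way a change can travel over a mailing list is as a patch file. the sketch below uses `git format-patch` and `git am`, with a directory called `outbox` standing in for the email step; it is not a description of any particular project's exact process, and the repository names and file names are made up (`git init -b` assumes git 2.28 or newer):

```shell
set -e
# throwaway identity so commits work on a machine with no git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# the lieutenant's repository
git init -q -b main "$work/lieutenant"
echo "subsystem" > "$work/lieutenant/driver.txt"
git -C "$work/lieutenant" add driver.txt
git -C "$work/lieutenant" commit -q -m "initial commit"

# a coder pulls from the lieutenant and makes a change
git clone -q "$work/lieutenant" "$work/coder"
cd "$work/coder"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix the widget"

# turn the latest commit into an email-ready patch file
git format-patch -q -o "$work/outbox" -1 HEAD

# the lieutenant applies the patch they received
cd "$work/lieutenant"
git am -q "$work/outbox"/*.patch
git log --oneline
```

the applied commit keeps the coder's message and authorship, so history still shows who wrote what.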

Page 101: Git: The Lean, Mean, Distributed Machine

this is the kernel’s model

Page 102: Git: The Lean, Mean, Distributed Machine

#4 centralized

finally the centralized model, one repository acts as the ‘server’

Page 103: Git: The Lean, Mean, Distributed Machine

Server

Committer Committer Committer

this mimics the traditional babysitter model

but you only need your babysitter to pull and push changes - branches and commits can still be created locally whenever

Page 104: Git: The Lean, Mean, Distributed Machine

Server

Committer Committer Committer

the server being down does not dramatically hamper your work

Page 105: Git: The Lean, Mean, Distributed Machine

Server

Committer Committer Committer

yet the flow stays mostly the same

Page 106: Git: The Lean, Mean, Distributed Machine

Deploy Server

Committer Committer Committer

the central server, as in the old model, can also be used to deploy

Page 107: Git: The Lean, Mean, Distributed Machine

Production

Committer Committer Committer

Staging

and staging servers can easily be set up

Page 108: Git: The Lean, Mean, Distributed Machine

git isn’t all about distributed servers

in fact, one of the best parts about git is its branching support

Page 109: Git: The Lean, Mean, Distributed Machine

branches are local, incredibly lightweight, and easy to switch between

it’s easy to devote each branch to a single feature or bug

we call these ‘topic branches’

Page 110: Git: The Lean, Mean, Distributed Machine

$ git checkout -b bug_2342

from your working directory

you just made a new branch
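to show why that matters, here is a small sketch of parking a half-finished feature, fixing a bug on its own topic branch, and coming back. the branch name `bug_2342` is from the slide; `new_feature`, the file names, and the commit messages are invented for the example, and `git init -b` assumes git 2.28 or newer:

```shell
set -e
# throwaway identity so commits work on a machine with no git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q -b main "$work/project"
cd "$work/project"
echo "v1" > app.txt
git add app.txt
git commit -q -m "initial commit"

# start a feature on its own topic branch
git checkout -q -b new_feature
echo "half done" > feature.txt
git add feature.txt
git commit -q -m "wip: new feature"

# an urgent bug comes in: hop back to main and branch for the fix
git checkout -q main
git checkout -q -b bug_2342
echo "v1 fixed" > app.txt
git commit -q -a -m "fix bug 2342"

# merge the fix into main and throw the topic branch away
git checkout -q main
git merge -q bug_2342
git branch -q -d bug_2342

# pick the feature back up exactly where it was left
git checkout -q new_feature
git log --oneline --all
```

the feature branch never saw the bug fix, and main never saw the unfinished feature.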

Page 111: Git: The Lean, Mean, Distributed Machine

buckets of different things

because branches are so cheap, you can keep around buckets filled with experiments, new ideas, or new features

no one will ever see them unless you want them to be seen

Page 112: Git: The Lean, Mean, Distributed Machine

Production

Committer Committer Committer

Staging

this changes the staging server idea

Page 113: Git: The Lean, Mean, Distributed Machine

Production

Committer Committer Committer

Feature A Feature B

you may start to have topic staging servers where you boot up staging for a single branch and test out a new feature.

no more generic ‘staging’ branch - each person may even have their own staging server
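as a sketch, sending just one topic branch to its own staging server could look like this. a bare repository stands in for the staging box, the names `staging_feature_a` and `feature_a` are hypothetical, and how the server actually checks out and boots the code is left out (`git init -b` assumes git 2.28 or newer):

```shell
set -e
# throwaway identity so commits work on a machine with no git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# stand-in for a staging server dedicated to one feature
git init -q --bare "$work/staging_feature_a.git"

git init -q -b main "$work/project"
cd "$work/project"
echo "v1" > app.txt
git add app.txt
git commit -q -m "initial commit"

# the feature lives on its own topic branch
git checkout -q -b feature_a
echo "feature a" > feature_a.txt
git add feature_a.txt
git commit -q -m "start feature a"

# push only that branch to its staging server
git remote add staging_feature_a "$work/staging_feature_a.git"
git push -q staging_feature_a feature_a

# the staging server now knows about feature_a and nothing else
git ls-remote --heads staging_feature_a
```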

Page 114: Git: The Lean, Mean, Distributed Machine

Coder

because every copy is its own repository, we are given the freedom to structure our workflows socially rather than technically

Page 115: Git: The Lean, Mean, Distributed Machine

Team A Team B

it’s easy to have multiple small teams, move people between projects, and monitor multiple projects

no need for one monolithic subversion server - git repositories are a breeze to set up

Page 116: Git: The Lean, Mean, Distributed Machine

with something like github, watching your team’s development is trivial.

do it with rss...

Page 117: Git: The Lean, Mean, Distributed Machine

or with a service you’re already comfortable using

integration with campfire, email, fogbugz, lighthouse, friendfeed, twitter, etc

Page 118: Git: The Lean, Mean, Distributed Machine

the site also lets you comment on commits, providing dead simple and effective code review

git and github are what we use in our private client work and on our own websites, as well as for our open source

Page 119: Git: The Lean, Mean, Distributed Machine

as far as git IDE support goes, the textmate ProjectPlus extension shows you the status of tracked files right in the drawer

Page 120: Git: The Lean, Mean, Distributed Machine

there’s also a git textmate bundle available on github

Page 121: Git: The Lean, Mean, Distributed Machine

if you’re an eclipse user, the egit plugin lets you commit to, manage, and track git repositories from within eclipse

it’s written using jgit, a pure-java implementation of git

Page 122: Git: The Lean, Mean, Distributed Machine

emacs people can use DVC which aims to provide a common interface for all distributed version control systems

there’s also a git mode

Page 123: Git: The Lean, Mean, Distributed Machine

or my personal favorite, magit

Page 124: Git: The Lean, Mean, Distributed Machine

if you use os x, an open source program called GitX is under active development

Page 125: Git: The Lean, Mean, Distributed Machine

which is based on the cross platform Git-GUI

Page 126: Git: The Lean, Mean, Distributed Machine

for OS X there’s also GitNub which isn’t as actively developed

Page 127: Git: The Lean, Mean, Distributed Machine

as far as libraries go, a search for ‘git’ on github returns almost 3000 unique repositories

darcs or hg to git converters, git vim projects, git in .NET, even blogs and wikis based on git

Page 128: Git: The Lean, Mean, Distributed Machine

remember when i said git was a content addressable file system? well, it’s true
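you can poke at the content addressable part directly with git's low-level commands. a minimal sketch, where the repository path and the sample text are made up for the example:

```shell
set -e
work=$(mktemp -d)
git init -q "$work/store"
cd "$work/store"

# store some content; git answers with a key derived from the content itself
key=$(echo 'hello git' | git hash-object -w --stdin)

# hashing the same content again (without storing it) yields the same key
same=$(echo 'hello git' | git hash-object --stdin)
echo "$key"
echo "$same"

# ask for the content back by its key
git cat-file -p "$key"    # prints: hello git
git cat-file -t "$key"    # prints: blob
```

nothing here involves commits, branches, or file names; the key alone is enough to get the content back.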

this is gist. it’s a git powered pastie

Page 129: Git: The Lean, Mean, Distributed Machine

you paste in code and share it with coworkers or friends

Page 130: Git: The Lean, Mean, Distributed Machine

but the best part is these clone URLs

i can check out a pastie i made, make changes, then push a new version

Page 131: Git: The Lean, Mean, Distributed Machine

these are the revisions

you can’t tell the difference between changes i made on the web and changes i made locally then pushed

Page 132: Git: The Lean, Mean, Distributed Machine

this kind of stuff is the future

imagine a distributed, versioned wiki or documentation project or book

a distributed, versioned bug tracker

a distributed, versioned chat application

Page 133: Git: The Lean, Mean, Distributed Machine

in fact, a number of books are already being written on github

Page 134: Git: The Lean, Mean, Distributed Machine

scott’s book is being translated right now

Page 135: Git: The Lean, Mean, Distributed Machine

a distributed, versioned everything

Page 136: Git: The Lean, Mean, Distributed Machine

http://git-scm.com

this has been a fairly basic overview of git

for more information, check out git-scm.com

Page 137: Git: The Lean, Mean, Distributed Machine

thanks

questions?

Page 138: Git: The Lean, Mean, Distributed Machine

http://flickr.com/photos/dooleymtv/2660521051/
http://flickr.com/photos/shankrad/219945665/
http://flickr.com/photos/thomashawk/268524287/
http://flickr.com/photos/drp/41370809/
http://flickr.com/photos/keithmarshall/432924465/
http://flickr.com/photos/ivviphotography/2555026221/
http://flickr.com/photos/raybyrne/43604789/
http://flickr.com/photos/robertlz/481261885/
http://flickr.com/photos/petervanallen/774522321/
http://flickr.com/photos/seantubridy/389310649/
http://flickr.com/photos/millzero/705902956/
http://flickr.com/photos/larskflem/95757299/
http://flickr.com/photos/dc5dugg/2330760034/
http://flickr.com/photos/will-lion/2629598996/
http://flickr.com/photos/probablykat/377911954/
http://flickr.com/photos/pixelbuffer/5375185/
http://flickr.com/photos/gordonhamilton/2096653379/
http://flickr.com/photos/poolie/2250698836/
http://flickr.com/photos/doergn/425407884/
http://flickr.com/photos/miscellaneous/531864821/
http://flickr.com/photos/hchalkley/30724738/
http://flickr.com/photos/absence-is-steel/419515663/
http://flickr.com/photos/samiksha/408007916/
http://flickr.com/photos/keithallison/3230928864/

http://flickr.com/photos/baonguyen/2557970434/
http://flickr.com/photos/bocavermelha/66759796/
http://flickr.com/photos/dark-o/408532292/
http://flickr.com/photos/vividbreeze/480057824/
http://flickr.com/photos/12693492@N04/1338136415/
http://flickr.com/photos/theducks/2236111019/
http://flickr.com/photos/pfenwick/2229585929/
http://flickr.com/photos/41232123@N00/457119628/
http://flickr.com/photos/pedacitosdemi/2346980097/
http://flickr.com/photos/aleks/2759584309/
http://flickr.com/photos/bigfrank/71691236/
http://flickr.com/photos/chinapix/2757393280/
http://flickr.com/photos/28960190@N03/2766239909/
http://flickr.com/photos/68226666@N00/444253460/
http://flickr.com/photos/chilledsalad/2335688523/
http://flickr.com/photos/thefuzz90/444183705/
http://flickr.com/photos/domk/291808114/
http://flickr.com/photos/csb13/66558459/
http://flickr.com/photos/kalhusoru/1346688882/
http://flickr.com/photos/heather_shade/254728862/
http://flickr.com/photos/ostromentsky/479147675/
http://flickr.com/photos/flopper/1336613530/
http://flickr.com/photos/jagelado/16631508/
http://flickr.com/photos/ansy/2533871446/
http://flickr.com/photos/cdm/54246114/
http://flickr.com/photos/chibnall/2278573838/

flickr

Page 139: Git: The Lean, Mean, Distributed Machine

flickr

http://www.flickr.com/photos/andrer69/246847221/
http://www.flickr.com/photos/willstotler/758104340/