Introduction to Git for developers
DESCRIPTION
This is an introduction to Git, a distributed version control system. You will learn about Git's history, the reasons behind its invention, its design considerations and internal structure, and see how to use Git for your projects.
TRANSCRIPT
git --intro
lenin.gil · lenin.gif · lenin.budet.git
History
History must be written of, by and for the survivors.
2
Brief history
• Local only
– Open-source: SCCS (1972) · RCS (1982)
– Proprietary: PVCS (1985)
• Client-server
– Open-source: CVS (1990) · CVSNT (1998) · Subversion (2000)
– Proprietary: Software Change Manager (1970s) · ClearCase (1992) · Visual SourceSafe (1994) · Perforce (1995) · Team Foundation Server (2005)
• Distributed
– Open-source: GNU arch (2001) · Darcs (2002) · DCVS (2002) · SVK (2003) · Monotone (2003) · Codeville (2005) · Git (2005) · Mercurial (2005) · Bazaar (2005) · Fossil (2007)
– Proprietary: TeamWare (1990s?) · Code Co-op (1997) · BitKeeper (1998) · Plastic SCM (2006)
3
Background
• For most of the Linux kernel's lifetime (1991-2002), changes to the code were made by accepting patches and archiving versions. In 2002 the project moved to the proprietary BitKeeper
• In 2005 the relationship between the Linux kernel developer community and the owner of BitKeeper broke down, and the right to use the product free of charge was revoked
• Guess who started git? Linus Torvalds
4
Problems
• Linus’ April 7, 2005 email:
– “SCMs I've looked at make this hard. One of the things (the main thing, in fact) I've been working at is to make that process really efficient.”
– “If it takes half a minute to apply a patch […] a series of 250 emails takes two hours”
• Linus’ 2007 Google Tech Talk:
– “When I say I hate CVS with a passion, I have to also say that if there are any SVN (Subversion) users in the audience, you might want to leave […] I see Subversion as being the most pointless project ever started”
– “The slogan of Subversion for a while was "CVS done right", or something like that, and if you start with that kind of slogan, there's nowhere you can go. There is no way to do CVS right”
5
“git”?
• “I’m an egotistical bastard, and I name all my projects after myself. First Linux, now git.” – Linus
• git – (British) a foolish or worthless person
– Examples of GIT
• That git of a brother of yours has ruined everything!
• <oh, don't be such a silly git, of course your mates want you around>
6
PRINCIPLES
Less is more.
7
Design criteria
1. Take CVS as an example of what not to do; if in doubt, make the exact opposite decision.
2. Support a distributed, BitKeeper-like workflow.
3. Very strong safeguards against corruption, either accidental or malicious
4. Very high performance
8
Characteristics
• Non-linear development
– rapid branching and merging
– specific tools for visualizing and navigating a non-linear development history
• Distributed development
– Like Darcs, BitKeeper, Mercurial, SVK, Bazaar and Monotone
• Compatibility with existing systems/protocols
– Repositories can be published via HTTP, FTP, rsync, or a Git protocol over either a plain socket or ssh
– CVS server emulation
– Subversion and svk repositories can be used directly with git-svn
9
Characteristics
• Efficiency
– an order of magnitude faster than some revision control systems
– fetching revision history from a locally stored repository can be two orders of magnitude faster than fetching it from the remote server
– Git does not get slower as the project history grows larger
• Toolkit-based design
– set of programs written in C
– shell scripts that provide wrappers around those programs
10
INTERNALS
No man should marry until he has studied anatomy and dissected at least one woman. - Honore de Balzac
11
Storage model
• Subversion, CVS, Perforce, Mercurial are delta storage systems
– they store the differences between one commit and the next
– yes, Mercurial is a delta-storage system
• Git is different
– it stores a snapshot of what all the files in your project look like, as a tree structure, each time you commit.
12
Everything has a hash
• All the information needed to represent the history of a project is stored in files referenced by a 40-digit "object name" that looks something like this:
6ff87c4664981e4397625791c8ea3bbb5f2279a3
• The name is the SHA1 hash of the contents of the object. Advantages:
– Git can quickly determine whether two objects are identical or not, just by comparing names.
– Since object names are computed the same way in every repository, the same content stored in two repositories will always be stored under the same name.
– Git can detect errors when it reads an object, by checking that the object's name is still the SHA1 hash of its contents.
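The naming rule above can be sketched in a few lines of Python. This is a minimal illustration, not Git's own code: the function names `git_object_id` and `verify` are made up for this example. One detail worth knowing is that Git hashes a short header (`<type> <size>` plus a NUL byte) followed by the raw content, which matches the "type, size, content" layout of an object.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the 40-hex-digit object name for the given type and content."""
    # Header is "<type> <size>\0", followed by the raw bytes of the object.
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


def verify(obj_type: str, content: bytes, name: str) -> bool:
    """Integrity check: does the stored name still match the content?"""
    return git_object_id(obj_type, content) == name


name = git_object_id("blob", b"hello world\n")
print(name)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(verify("blob", b"hello world\n", name))   # same bytes, same name
print(verify("blob", b"hello w0rld\n", name))   # one changed byte is detected
```

The first value should agree with what `git hash-object` reports for a file containing `hello world` and a trailing newline.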
13
Objects
• Every object consists of three things - a type, a size and content
• There are four different types of objects: "blob", "tree", "commit", and "tag".
– A "blob" is used to store file data - it is generally a file
– A "tree" is basically like a directory
– A "commit" points to a single tree, marking it as what the project looked like at a certain point in time
– A "tag" is a way to mark a specific commit as special in some way, for example as a release
14
Blob Object
• Chunk of binary data
• Files with the same content (anywhere in the repo) share the same blob
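Why identical files share one blob follows directly from naming objects by their content. Here is a toy content-addressed store as a sketch; `BlobStore` and the file contents are hypothetical, and real Git additionally compresses objects and writes them to disk.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store: each blob is keyed by the hash of its bytes."""

    def __init__(self):
        self.objects = {}  # object name -> raw content

    def put(self, content: bytes) -> str:
        header = f"blob {len(content)}\0".encode("ascii")
        name = hashlib.sha1(header + content).hexdigest()
        # Storing the same bytes again maps to the same key, so nothing new is kept.
        self.objects[name] = content
        return name


store = BlobStore()
a = store.put(b"MIT License\n")      # imagine this is LICENSE
b = store.put(b"MIT License\n")      # and this is vendor/lib/LICENSE
c = store.put(b"int main() {}\n")    # a different file
print(a == b, len(store.objects))
```

Three files were stored, but only two objects exist, because the file name and location play no part in the blob's name.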
15
Tree Object
• Simple object with pointers to blobs and other trees – like directory.
• Two trees have the same hash name if and only if their contents (including, recursively, the contents of all subdirectories) are identical
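The "if and only if" property comes from hashing bottom-up: a tree's name depends on the names of its entries, which in turn depend on their contents. The sketch below models a directory as a nested `dict` (an assumption made for this example) and handles only regular files and subdirectories. The entry encoding follows Git's tree format as I understand it, but treat it as an illustration rather than a reference implementation.

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> str:
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


def tree_id(directory: dict) -> str:
    """Hash a nested dict {name: bytes or dict} bottom-up, tree-object style."""
    entries = []
    for name, value in directory.items():
        if isinstance(value, dict):
            mode, child = "40000", tree_id(value)            # subdirectory -> tree
            sort_key = (name + "/").encode("utf-8")           # directories sort as "name/"
        else:
            mode, child = "100644", object_id("blob", value)  # regular file -> blob
            sort_key = name.encode("utf-8")
        # One entry is "<mode> <name>\0" followed by the 20 raw bytes of the child's hash.
        entry = f"{mode} {name}".encode("utf-8") + b"\0" + bytes.fromhex(child)
        entries.append((sort_key, entry))
    entries.sort(key=lambda pair: pair[0])
    return object_id("tree", b"".join(entry for _, entry in entries))


v1 = {"cities.h": b"// header\n", "src": {"cities.cpp": b"// code\n"}}
v2 = {"src": {"cities.cpp": b"// code\n"}, "cities.h": b"// header\n"}    # same content
v3 = {"cities.h": b"// header\n", "src": {"cities.cpp": b"// changed\n"}}
print(tree_id(v1) == tree_id(v2), tree_id(v1) == tree_id(v3))
```

Changing one file deep inside `src` changes the name of `src` and therefore the name of the root tree, while an identical directory always hashes to the same name.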
16
Commit Object
• Links a physical state of a tree with a description of how we got there and why.
17
Commit Object
• Commit is defined by
– a tree: The SHA1 name of a tree object, representing the contents of a directory at a certain point in time.
– parent(s): The SHA1 name of some number of commits which represent the immediately previous step(s) in the history of the project. A commit with no parents is called a "root" commit, and represents the initial revision of a project.
– an author: The name of the person responsible for this change, together with its date.
– a committer: The name of the person who actually created the commit, with the date it was done.
– a comment describing this commit.
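The five fields listed above are stored as plain text, and the commit's name is the hash of that text. A rough sketch follows; the identity string, timestamps and the helper name `commit_id` are made up for illustration, and optional headers such as signatures or encodings are ignored.

```python
import hashlib

# Name of the empty tree, used here only to keep the example self-contained.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree, parents, author, committer, message):
    """Serialize the commit fields as text and hash them like any other object."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]   # zero parents means a root commit
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")
    header = f"commit {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical identity: "Name <email> <unix time> <timezone>".
who = "A U Thor <author@example.com> 1298643317 +0200"
root = commit_id(EMPTY_TREE, [], who, who, "first commit")
second = commit_id(EMPTY_TREE, [root], who, who, "second commit")
print(root)
print(second)
```

Because the parent's name is part of the hashed text, a commit's name pins down its entire history: altering any ancestor would change every name after it.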
18
The Object Model
19
REVISION HISTORY
You can either have software quality or you can have pointer arithmetic, but you cannot have both at the same time. -- Bertrand Meyer
20
History is a DAG
• In computer science speak, the Git object data is a directed acyclic graph.
• That is, starting at any commit you can traverse its parents in one direction and there is no chain that begins and ends with the same object
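To make the graph idea concrete, here is a small sketch that stores history as a mapping from each commit to its parents. The single-letter commit names and both helper functions are invented for the example.

```python
def ancestors(parents, start):
    """Return every commit reachable from `start` by following parent links."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def is_acyclic(parents):
    """True if no commit can reach itself through one of its parents."""
    for commit, direct_parents in parents.items():
        for parent in direct_parents:
            if commit in ancestors(parents, parent):
                return False
    return True


# A is the root, B and C branch off A, D merges B and C.
history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
print(sorted(ancestors(history, "D")))
print(is_acyclic(history))
```

In real Git a cycle cannot be constructed in the first place, since a commit's name depends on its parents' names, so a parent must exist before its child.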
21
History is a DAG
To keep all the information and history on the three versions of this tree, Git stores 16 immutable, signed, compressed objects.
22
BRANCHES
There are two major products that come out of Berkeley: LSD and UNIX. We don’t believe this to be a coincidence. -- Jeremy S. Anderson
23
Objects vs References
• Git objects are immutable
• Besides objects, there are references
– Unlike the objects, references can change
– References are simple pointers to a particular commit
24
Branches
• Examples of references are branches and remotes
– A branch in Git is just a file that contains the SHA-1 of the most recent commit of that branch
– Creating a branch is nothing more than just writing 40 characters to a file.
– As you continue to commit, one of the branches will keep changing to point to the new commit SHA-1s, while the other one can stay where it was.
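The "branch is just a file" idea can be imitated with ordinary file operations. This sketch writes into a temporary directory, not a real repository. The helpers `write_ref` and `read_ref` are hypothetical, the two commit names are the ones from the `git log` slide, and it only models loose ref files (Git can also keep refs packed in a single file).

```python
import tempfile
from pathlib import Path


def write_ref(git_dir: Path, branch: str, commit: str) -> None:
    """Create or move a branch: write the commit name into refs/heads/<branch>."""
    ref = git_dir / "refs" / "heads" / branch
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit + "\n", encoding="ascii")  # 40 hex digits plus a newline


def read_ref(git_dir: Path, branch: str) -> str:
    return (git_dir / "refs" / "heads" / branch).read_text(encoding="ascii").strip()


FIRST = "207b79dd89469a75c9e92a38c4b3eac904bea603"
SECOND = "57e762203d0b522fa3a47afcc907af313b5d6d78"

with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    write_ref(git_dir, "master", FIRST)
    write_ref(git_dir, "mytest", FIRST)   # "creating a branch" is one small file
    write_ref(git_dir, "master", SECOND)  # committing on master moves only master
    print(read_ref(git_dir, "master"))
    print(read_ref(git_dir, "mytest"))
```

No objects are copied when a branch is created, which is why branching in Git is cheap.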
25
The Model
26
Local branching
27
Local branching
28
Local branching
• Suppose we need a hot fix to production
29
Local branching
• After experiment merged to master
30
What to do with fast branches
• New branch each time you begin to work on a story or feature
• If you get blocked and need to put it on hold, it doesn’t affect anything else.
• Often you merge the branch back into development and delete it the same day that you created it
• If you get a huge project or idea (refactoring, etc), you create a long-term branch, continuously rebase it to keep it in line with other development, and once everything is tested and ready, merge it in with your master.
31
Real workflow example
32
PRACTICE: LOCAL REPOSITORY
The function of good software is to make the complex appear to be simple. -- Grady Booch
33
git init
c:\> mkdir test
c:\> cd test
c:\test> git init
Initialized empty Git repository in c:/test/.git/
34
git add
c:\test> dir /b
cities.cpp
cities.h
c:\test> git add cities.h
c:\test> git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#       new file:   cities.h
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       cities.cpp
35
git commit
c:\test> git commit -m "first commit"
[master (root-commit) 207b79d] first commit
 1 files changed, 44 insertions(+), 0 deletions(-)
 create mode 100644 cities.h
c:\test> git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       cities.cpp
nothing added to commit but untracked files present (use "git add" to track)
36
git commit -a
c:\test> git status
# On branch master
nothing to commit (working directory clean)
c:\test> echo "aaa" > cities.cpp
c:\test> git commit -m "test"
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   cities.cpp
#
no changes added to commit (use "git add" and/or "git commit -a")
c:\test> git commit -m "test" -a
[master 6eaf41e] test
 1 files changed, 1 insertions(+), 210 deletions(-)
 rewrite cities.cpp (100%)
37
change -> add -> commit
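The change, add, commit cycle is easier to follow with the three areas spelled out: the working directory, the staging area (index), and the committed history. The class below is a deliberately tiny model invented for this example, not how Git is implemented. `all_tracked=True` stands in for `git commit -a`.

```python
class ToyRepo:
    """Toy model of the working directory, the index and the commit history."""

    def __init__(self):
        self.working = {}   # path -> content currently on disk
        self.index = {}     # path -> content staged for the next commit
        self.commits = []   # list of (message, snapshot) pairs, oldest first

    def add(self, path):
        """Like `git add <path>`: copy the current content into the index."""
        self.index[path] = self.working[path]

    def commit(self, message, all_tracked=False):
        """Snapshot the index. Returns False when nothing is staged."""
        if all_tracked:
            # Like `git commit -a`: restage files that are already tracked.
            for path in list(self.index):
                if path in self.working:
                    self.index[path] = self.working[path]
        head = self.commits[-1][1] if self.commits else {}
        if self.index == head:
            return False
        self.commits.append((message, dict(self.index)))
        return True


demo = ToyRepo()
demo.working = {"cities.h": "v1", "cities.cpp": "v1"}
demo.add("cities.h")
print(demo.commit("first commit"))            # only cities.h is in the snapshot
demo.add("cities.cpp")
print(demo.commit("second commit"))
demo.working["cities.cpp"] = "aaa"            # change without add
print(demo.commit("test"))                    # nothing staged, no commit
print(demo.commit("test", all_tracked=True))  # -a stages the change first
```

This mirrors the sessions on the preceding slides: an edited file is not part of the next commit until it is added, unless `-a` is used for files Git already tracks.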
38
git log
c:\test> git commit
Aborting commit due to empty commit message.
c:\test> git log
commit 57e762203d0b522fa3a47afcc907af313b5d6d78
Author: Dmitry Guyvoronsky <[email protected]>
Date:   Fri Feb 25 16:18:15 2011 +0200

    second commit

commit 207b79dd89469a75c9e92a38c4b3eac904bea603
Author: Dmitry Guyvoronsky <[email protected]>
Date:   Fri Feb 25 16:15:17 2011 +0200

    first commit

c:\test> git log --pretty=oneline
57e762203d0b522fa3a47afcc907af313b5d6d78 second commit
207b79dd89469a75c9e92a38c4b3eac904bea603 first commit
39
git branch
c:\test> git status
# On branch master
nothing to commit (working directory clean)
c:\test> git branch
* master
c:\test> git branch mytest
c:\test> git branch
* master
  mytest
c:\test> git checkout mytest
Switched to branch 'mytest'
c:\test> git branch
  master
* mytest
40
git checkout -b
c:\test> git branch
  master
* mytest
c:\test> git checkout -b another
Switched to a new branch 'another'
c:\test> git branch
* another
  master
  mytest
41
DISTRIBUTED WORKFLOW
Nothing is more fairly distributed than common sense: no one thinks he needs more of it than he already has
42
Cloning
• To clone a repo = to create a copy
• Git can clone a repository over several transports, including local, HTTP, HTTPS, SSH, its own git protocol, and rsync.
43
Remote branches
• Remotes are pointers to branches in other people's copies of the same repository
• If you got your repository by cloning it, the repository you cloned from is automatically added as a remote named origin, and its branches show up as remote branches.
44
Remote branches
• A fetch pulls all the refs and objects that you don’t already have from the remote repository you specify.
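Conceptually, a fetch is a set difference plus an update of the remote-tracking refs. The sketch below models both repositories as dictionaries. The function `fetch` and the short commit names (borrowed from the console slides) are illustrative assumptions, and the real protocol negotiates which objects to send instead of comparing whole object stores.

```python
def fetch(local_objects, tracking_refs, remote_objects, remote_refs, remote_name="origin"):
    """Copy missing objects and update <remote>/<branch> refs. Local branches are untouched."""
    missing = {name: data for name, data in remote_objects.items()
               if name not in local_objects}
    local_objects.update(missing)
    for branch, commit in remote_refs.items():
        tracking_refs[f"{remote_name}/{branch}"] = commit
    return sorted(missing)


local_objects = {"3ade0ca": "commit 1"}
local_branches = {"master": "3ade0ca"}
tracking_refs = {"origin/master": "3ade0ca"}

remote_objects = {"3ade0ca": "commit 1", "6309355": "commit 2"}
remote_refs = {"master": "6309355"}

print(fetch(local_objects, tracking_refs, remote_objects, remote_refs))
print(tracking_refs["origin/master"], local_branches["master"])
```

After the fetch, `origin/master` has moved but the local `master` has not. Bringing `master` up to date is the job of a separate merge.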
45
Remote branches
• We look at the origin/idea branch and like it, but we also want the changes they’ve made on their origin/master branch
• So we do a 3-way merge of their two branches and our master.
• We don’t know how well this is going to work, so we make a tryidea branch first and then do the merge there.
46
Just for your information
• The current record for number of commit parents in the Linux kernel is 12 branches merged in a single commit
47
git clone
c:\test> git clone [email protected]:dreamiurg/test.git
Initialized empty Git repository in c:/test/test/.git/
remote: Counting objects: 10, done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 10 (delta 1), reused 0 (delta 0)
Receiving objects: 100% (10/10), 5.69 KiB, done.
Resolving deltas: 100% (1/1), done.
48
Local branches are yours only
c:\test\test> git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
c:\test\test> git checkout -b working
Switched to a new branch 'working'
c:\test\test> git branch -a
  master
* working
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
49
git fetch ; git merge
c:\test\test> git st
# On branch master
nothing to commit (working directory clean)
c:\test\test> git fetch
remote: Counting objects: 4, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From dreamiurg.unfuddle.com:dreamiurg/test
   3ade0ca..6309355  master     -> origin/master
c:\test\test> git st
# On branch master
# Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
#
nothing to commit (working directory clean)
c:\test\test> git merge origin/master
Updating 3ade0ca..6309355
Fast-forward
 new.cpp |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 new.cpp
50
git pull = git fetch ; git merge
c:\test\test> git pull
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From dreamiurg.unfuddle.com:dreamiurg/test
   6309355..a4212c7  master     -> origin/master
Updating 6309355..a4212c7
Fast-forward
 new.cpp |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
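The "Fast-forward" in the output above means the local branch had no commits of its own, so merging only had to move the branch pointer. A sketch of that decision follows. The helper names and the hypothetical local commit `1111111` are made up for this example, while the other short commit names come from the console output.

```python
def is_ancestor(parents, maybe_ancestor, commit):
    """True if `maybe_ancestor` is reachable from `commit` (or is the same commit)."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current == maybe_ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return False


def merge(parents, refs, current, other):
    """Decide what merging `other` into `current` requires. Moves the ref on fast-forward."""
    if is_ancestor(parents, refs[other], refs[current]):
        return "already up to date"
    if is_ancestor(parents, refs[current], refs[other]):
        refs[current] = refs[other]  # just move the pointer, no new commit
        return "fast-forward"
    return "merge commit needed"


parents = {"3ade0ca": [], "6309355": ["3ade0ca"], "a4212c7": ["6309355"],
           "1111111": ["3ade0ca"]}  # 1111111: hypothetical commit made locally
refs = {"master": "6309355", "origin/master": "a4212c7"}
print(merge(parents, refs, "master", "origin/master"), refs["master"])
```

Had `master` pointed at `1111111` instead, neither tip would be an ancestor of the other, and Git would have to create a merge commit with two parents.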
51
GUIs
Simplicity is not a matter of dumbing things down. Simplicity is when someone takes care of the details.
52
Graphical interfaces
• gitk
• TortoiseGit
• SmartGit
• msysgit
• … at least 25 more
• Interfaces, frontends, and tools
53
Graphical interfaces - gitk
54
Graphical interfaces - TortoiseGit
55
Graphical interfaces - SmartGit
56
COMPARISON
There ain't no such thing as a free lunch?
57
git
• History is a DAG
• Manipulate history – rebase, reset, commit amend, etc.
• Branch is just a reference (head)
• Faster on Linux systems
• C
• Linux, Rails, Perl, Android, Wine, Fedora, Gnome etc.
• github.com (619,333 users, 1,783,177 repos)
• That’s it
Mercurial
• History is a DAG, but tries to be linear, causing negative effects in some places (same rev number over different repos)
• No tools to manipulate history by default
• Confusion working with branches – named/unnamed, etc.
• Python
• Mozilla, OpenJDK, OpenSolaris, Xen, Symbian, Go etc.
• bitbucket.org (100,000+ users, 49,334 repos)
• That’s it
58
… investigate it yourself
• Rebase
• Git stash
• Git bisect (binary search)
• Tagging (with/without message, +signed tags possible)
• …
• Profit!
59
Q & A
• Start here - http://book.git-scm.com/
• For those who know SVN - http://git.or.cz/course/svn.html
• Git for Windows - http://code.google.com/p/msysgit/
60
[email protected]
http://demiurg.com.ua