We are moving to github. I haven’t done much with git and tend to get confused now and then. And I’m more comfortable with mercurial. But… I get to give an introduction/explanation on git this afternoon for my colleagues anyway.
Perhaps this relatively fresh look at git/github can help others, so I’m writing it down here. Handy in any case as one of my colleagues isn’t here now as he’s preparing for a marathon :-)
Oh, and I’ll probably get things wrong in here as I’m not familiar enough yet. So consider yourself warned.
The basic concept of git, paraphrased a bit:
You’ve got multiple repositories in various places.
Every repository is basically a big bucket of changesets and a handful of pointers.
Git effectively starts out empty, applies a string of changesets and ends up at a directory filled with your source code.
Which string of changesets? That is determined by the pointer. You point at a certain changeset and you get that one and its parents. (They all have a pointer to their parent).
A pointer can be a tag or a branch or even a pointer at something in another repository.
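To see the “bucket of changesets plus a handful of pointers” idea for yourself, here is a small sketch in a throwaway repository. The temporary directory, the file name, the commit messages and the tag name v0.1 are all made up for the illustration.

```shell
# Throwaway repository in a temporary directory (all names are made up).
repo=$(mktemp -d)
git init -q "$repo"
# Shorthand: run git inside that repository with a dummy identity.
g() { git -C "$repo" -c user.name=Demo -c user.email=demo@example.com "$@"; }

echo "first" > "$repo/notes.txt"
g add notes.txt
g commit -q -m "First changeset"
echo "second" >> "$repo/notes.txt"
g commit -q -a -m "Second changeset"

# A tag is just a pointer at a changeset; here at the first one.
g tag v0.1 HEAD^

# Every changeset points at its parent: the parent of HEAD is what v0.1 points at.
parent=$(g rev-parse HEAD^)
tagged=$(g rev-parse v0.1)
echo "parent of HEAD: $parent"
echo "tag v0.1:       $tagged"

# Following the parent pointers from the current branch gives two changesets.
count=$(g rev-list --count HEAD)
echo "changesets reachable from HEAD: $count"
```

The two hashes it prints should be identical: the tag and “parent of the newest changeset” are two ways of pointing at the same thing.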
“Multiple repositories in various places”? Yep. A project on github is a repository. If you grab a local copy (=you clone it), your local copy is also a repository with all the contents. If you “fork” a project on github? Yep, a full copy. Getting stuff from one to the other means “push” and “pull”.
But if your local thingy is a full repository, including all branches/tags/whatever, what ends up in your actual code directory? To answer that, you need to know that git has two layers (and effectively three if you count a remote repository):
Your actual code directory. This is your visible directory. The git name for it is your “working copy”.
The full contents of the repository including pointers and changesets. Hidden in your working copy in the .git/ directory. When you grab stuff from a remote repository, it ends up in here. And when you push, it is the contents of the .git directory that you push over.
One or more remote repositories. (Technically the contents of those repositories’ .git directories: you can’t muck around directly in someone else’s working copy, of course).
Let me sneak in one extra layer:
The so-called “index”. These are the local changes you’ve collected for your next commit. Once you commit them, nothing is staged anymore and what was in your index is now in your repository.
So, compared to subversion, you’ve got extra layers to keep track of, which of course complicates the mental model of what’s happening. With subversion, you only have a central repository and a local working copy. Git (and mercurial, bzr and the rest) add that full local repository; git additionally puts the “index” between your working copy and that local repository.
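A minimal sketch of one change travelling through those layers: working copy, then index, then repository. It runs in a throwaway directory; the file content and commit messages are invented.

```shell
repo=$(mktemp -d)
git init -q "$repo"
# Shorthand: run git inside the throwaway repository with a dummy identity.
g() { git -C "$repo" -c user.name=Demo -c user.email=demo@example.com "$@"; }

echo "include *.rst" > "$repo/MANIFEST.in"
g add MANIFEST.in
g commit -q -m "Initial commit"

# 1. Change the file: the change only exists in the working copy.
echo "include *.txt" >> "$repo/MANIFEST.in"
in_working_copy=$(g diff --name-only)

# 2. Stage it: the change is now in the index, waiting for the next commit.
g add MANIFEST.in
in_index=$(g diff --staged --name-only)

# 3. Commit it: the change is in the repository and nothing is staged anymore.
g commit -q -m "Also include txt files"
left_in_index=$(g diff --staged --name-only)

echo "changed in working copy: $in_working_copy"
echo "staged in index:         $in_index"
echo "staged after commit:     '$left_in_index'"
```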
Here are the git commands I’ve personally used in the last few weeks:
62 git status
31 git diff
26 git push
20 git add
18 git commit
17 git show
13 git mv
13 git help
10 git pull
(Output of history|grep git|awk '{print $2 " " $3}'|sort|uniq -c|sort -nr|head).
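If you want to see what that pipeline does step by step, here it is applied to a few invented history lines. The sample_history function is a stand-in I made up; in your own interactive shell you would pipe the real history into it instead.

```shell
# Made-up sample in the format that bash's `history` prints: number, then command.
sample_history() {
  printf '%s\n' \
    '  101  git status' \
    '  102  ls -l' \
    '  103  git diff' \
    '  104  git status' \
    '  105  git add MANIFEST.in' \
    '  106  git status'
}

# Keep the git lines, cut out "git <subcommand>", count duplicates, biggest first.
counts=$(sample_history | grep git | awk '{print $2 " " $3}' | sort | uniq -c | sort -nr | head)
echo "$counts"
```

With this sample, git status comes out on top with a count of 3.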
I won’t repeat git’s own documentation here. You’ll probably have to do quite some googling before you’re comfortable with those commands. I’ll have to do that googling myself, too.
Some comments on most of those commands that might help. Note that I explicitly say “your repository” for the local one and “the remote repository” for the other one (assuming you have a remote repository at github). Here you go:
git status. Relatively helpful, as it often suggests what you need to do. If files are changed, but they haven’t been “added to the index” yet, it suggests that you might want to git add them. The index is what ends up as your next commit, so this is the set of changes you want to stuff into your repository:
$ git status
# On branch master
# Changes not staged for commit:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: MANIFEST.in
#
no changes added to commit (use "git add" and/or "git commit -a")
git diff. All by itself, without options, it is the difference between your local working copy and the index. So the changes that you haven’t marked yet for inclusion in your next commit:
$ git diff
diff --git a/MANIFEST.in b/MANIFEST.in
index c57b4ff..fe6332b 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,2 +1,2 @@
include *.rst
-recursive-include lizard_map *.rst *.py *.html *.css *.js *.jpg *.png *.gif *.pdf *.shp *.json
+recursive-include lizard_map *.rst *.py *.html *.css *.js *.jpg *.png *.gif *.pdf *.shp *.json *.po *.mo
After adding a couple of things to the index, you probably want to review what you’ve “staged for commit in the index”:
$ git diff --staged
diff --git a/MANIFEST.in b/MANIFEST.in
index c57b4ff..fe6332b 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,2 +1,2 @@
include *.rst
-recursive-include lizard_map *.rst *.py *.html *.css *.js *.jpg *.png *.gif *.pdf *.shp *.json
+recursive-include lizard_map *.rst *.py *.html *.css *.js *.jpg *.png *.gif *.pdf *.shp *.json *.po *.mo
git add. Just add filenames as suggested by git status. A helpful variant is git add -u, which automatically adds changes to all already-known files. (-A adds all changes, including brand-new files, but at the risk of including unwanted files).
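The difference between git add -u and git add -A (capital A, the “all changes” variant) is easiest to see by trying both. A small sketch in a throwaway repository; the file names known.txt and unwanted.tmp are invented.

```shell
repo=$(mktemp -d)
git init -q "$repo"
# Shorthand: run git inside the throwaway repository with a dummy identity.
g() { git -C "$repo" -c user.name=Demo -c user.email=demo@example.com "$@"; }

echo "v1" > "$repo/known.txt"
g add known.txt
g commit -q -m "Initial commit"

# Change an already-known file and create a brand-new, untracked one.
echo "v2" >> "$repo/known.txt"
echo "scratch" > "$repo/unwanted.tmp"

# -u only stages changes to files git already knows about.
g add -u
staged_u=$(g diff --staged --name-only)

# -A stages everything, including the new untracked file.
g add -A
staged_all=$(g diff --staged --name-only)

echo "staged after add -u:"
echo "$staged_u"
echo "staged after add -A:"
echo "$staged_all"
```

After add -u only known.txt is staged; after add -A the scratch file has sneaked in as well, which is exactly the risk mentioned above.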
git commit. This stuffs the changes you added to (=staged in) the index into your repository.
git push. This pushes what’s in your repository to the remote repository. (Assuming git knows what your remote repository is and which branch in the remote repository it has to talk to. And when the push doesn’t break the remote repository. And when you’ve got access to that remote repository.)
git pull. This is basically a combination of git fetch followed by git merge. Fetch fetches the current state of the remote repository (all changesets and pointers) into your repository; merge then merges those fetched changes into your current branch and your working copy. So just do a git pull to grab the changes of the remote repository and get them in your repository and your working copy.
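Here is a sketch of those two halves of a pull done by hand, so you can see what each one changes. Everything lives in a temporary directory: a bare repository stands in for github, and the “colleague” and “mine” directories are invented for the example.

```shell
base=$(mktemp -d)
# Shorthand: git with a dummy identity, so commits work anywhere.
g() { git -c user.name=Demo -c user.email=demo@example.com "$@"; }

# A colleague's repository with one changeset on "master".
g init -q "$base/colleague"
g -C "$base/colleague" symbolic-ref HEAD refs/heads/master
echo "one" > "$base/colleague/README.txt"
g -C "$base/colleague" add README.txt
g -C "$base/colleague" commit -q -m "First changeset"

# A bare copy plays the role of the remote repository on github.
g clone -q --bare "$base/colleague" "$base/remote.git"
g -C "$base/colleague" remote add origin "$base/remote.git"

# Your own clone of that remote repository.
g clone -q "$base/remote.git" "$base/mine"

# The colleague pushes a second changeset to the remote repository.
echo "two" >> "$base/colleague/README.txt"
g -C "$base/colleague" commit -q -a -m "Second changeset"
g -C "$base/colleague" push -q origin master

# Step 1 of "git pull": fetch. origin/master moves, your own master does not.
g -C "$base/mine" fetch -q origin
after_fetch=$(g -C "$base/mine" rev-list --count master)

# Step 2 of "git pull": merge origin/master into your master and working copy.
g -C "$base/mine" merge -q origin/master
after_merge=$(g -C "$base/mine" rev-list --count master)

echo "changesets on my master after fetch: $after_fetch"
echo "changesets on my master after merge: $after_merge"
```

After the fetch your master still has one changeset; only after the merge does it have two and does README.txt in your working copy contain the new line.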
git help. git help some_command or git some_command --help both do the same. Quite decent documentation, actually! Use “git help” often when starting out with git!
Don’t forget you can add -v to get more verbose output of what git is doing. This can be very helpful to see what’s happening! Compare:
$ git pull
Already up-to-date.
with the more explanatory:
$ git pull -v
From github.com:lizardsystem/lizard-map
= [up to date] alexandr-workspace-changes -> origin/alexandr-workspace-changes
= [up to date] gijs-nepal -> origin/gijs-nepal
= [up to date] master -> origin/master
= [up to date] reinout-api -> origin/reinout-api
= [up to date] reinout-tastypie -> origin/reinout-tastypie
Already up-to-date.
When you create a new repository on github, you get a “clone url”. Something like https://github.com/jcrocholl/pep8.git. To get a local copy, clone it:
$ git clone https://github.com/jcrocholl/pep8.git
Cloning into pep8...
remote: Counting objects: 861, done.
remote: Compressing objects: 100% (369/369), done.
remote: Total 861 (delta 424), reused 833 (delta 398)
Receiving objects: 100% (861/861), 160.20 KiB, done.
Resolving deltas: 100% (424/424), done.
This sets up some defaults in your repository. “Your repository” means
that “pep8” directory that got set up (and more specifically the .git/
directory inside it).
Some of the defaults are helpful to know, as they partially explain why several of the commands I showed don’t need that many options:
git branch shows just a “master” branch. That’s “trunk” in svn-speak. The star in front of it marks it as the branch you currently have checked out:
$ git branch
* master
git branch -a also shows the remote branches:
$ git branch -a
* master
remotes/origin/HEAD -> origin/master
remotes/origin/master
That remotes/origin/master is a pointer at the master (so: the trunk) of the remote github repository.
Your local “master” branch has that remote repository’s “master” as its default push/pull location. So git pull grabs new revisions from github into your repository and updates your working copy.
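You can look up those defaults yourself, as git clone simply writes them into the clone’s .git/config. A sketch with a local stand-in for the github repository; the “upstream” and “clone” directory names are invented.

```shell
base=$(mktemp -d)
# Shorthand: git with a dummy identity, so commits work anywhere.
g() { git -c user.name=Demo -c user.email=demo@example.com "$@"; }

# Local stand-in for the repository on github, with a "master" branch.
g init -q "$base/upstream"
g -C "$base/upstream" symbolic-ref HEAD refs/heads/master
echo "hello" > "$base/upstream/README.txt"
g -C "$base/upstream" add README.txt
g -C "$base/upstream" commit -q -m "Initial commit"

g clone -q "$base/upstream" "$base/clone"

# Defaults that "git clone" wrote into the clone's .git/config:
remote_name=$(g -C "$base/clone" config --get branch.master.remote)
remote_branch=$(g -C "$base/clone" config --get branch.master.merge)
remote_url=$(g -C "$base/clone" config --get remote.origin.url)

echo "master pushes/pulls to remote: $remote_name"
echo "...and to its branch:          $remote_branch"
echo "the remote's clone url:        $remote_url"
```

This is why a bare git pull or git push on master needs no further options in a fresh clone.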
Many colleagues (and others!) use my checkoutmanager tool. Based on a config file, it manages your various svn/git/hg/bzr checkouts. Normally you have a couple of directories with checkouts and doing “svn up” in all of them is a bit of a chore. That’s where checkoutmanager comes in. checkoutmanager up does an svn up in all your svn directories.
And… for git it does a git pull. Yes, it supports git.
checkoutmanager status does git status in your git directories that it knows about.
Very handy: checkoutmanager out, which tells you which of your repositories have commits that you haven’t pushed to github yet!
As a quick example, here’s a snippet from my .checkoutmanager.cfg:
[git]
vcs = git
basedir = ~/git/
checkouts =
    git@github.com:reinout/pep8.git
    git@github.com:reinout/gitignore.git
[nensgit]
vcs = git
basedir = ~/git/nens/
checkouts =
    git@github.com:lizardsystem/nensskel.git
    git@github.com:lizardsystem/lizard-ui.git
    git@github.com:lizardsystem/lizard-map.git
We use buildout for all our projects. Handier than pip/virtualenv as it provides some extra functionality like easy django setup, apache config files from templates, cronjob setup and so on. And… it has mr.developer. A buildout extension for managing a couple of git (or svn/bzr/hg) checkouts inside your project.
Because… git and mercurial don’t have something that works like svn:externals. Or at least nothing that works as nicely/reliably/integrated/whatever. But mr.developer takes care of that in buildout. Here’s a relevant snippet from a buildout.cfg:
[buildout]
...
extensions =
    mr.developer
    buildout-versions
parts =
    mkdir
    django
...
develop = .
...
[sources]
lizard-ui = git git@github.com:lizardsystem/lizard-ui.git
...
This does three things:
It adds a bin/develop command for managing your sources.
It adds a src/ directory where it places any checkouts/clones it makes.
bin/develop now knows that the “lizard-ui” package is a git checkout with a certain clone url.
mr.developer does a couple of things. Here’s the help:
$ bin/develop -h
usage: develop [-h] [-v] ...
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
commands:
activate, a Add packages to the list of development packages.
checkout, co Checkout packages
deactivate, d Remove packages from the list of development packages.
help, h Show help
info Lists informations about packages.
list, ls Lists tracked packages.
rebuild, rb Run buildout with the last used arguments.
reset Resets the packages develop status.
status, stat, st
Shows the status of tracked packages.
update, up Updates all known packages currently checked out.
Once you want to work on the “trunk” of one of the packages managed by mr.developer, check it out (which also activates it):
$ bin/develop checkout lizard-ui
INFO: Queued 'lizard-ui' for checkout.
INFO: Cloned 'lizard-ui' with git.
INFO: Activated 'lizard-ui'.
WARNING: Don't forget to run buildout again, so the checked out packages are used as develop eggs.
As mentioned in that message, run buildout again and it will be installed as a development egg.
bin/develop status shows the current status of the checkouts, but the output isn’t very clear. Run bin/develop status --help to see what it all means.
bin/develop update does a git pull for every package.
Starting a new project and getting it in github is pretty simple:
First create it in github (just clicky-clicky on the github website).
Then set up your project locally (we use nensskel).
Run git init in that directory to turn it into a git directory. Add/commit/adjust where needed.
Do a git push to the clone url you see on your project’s github webpage!
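Those steps can be sketched as follows. I’m assuming a local bare repository as a stand-in for the empty project you created on github, so the path used as the clone url, the myproject directory and the README.rst file are all made up; with a real project you would use the clone url github shows you.

```shell
base=$(mktemp -d)
# Shorthand: git with a dummy identity, so commits work anywhere.
g() { git -c user.name=Demo -c user.email=demo@example.com "$@"; }

# Stand-in for the empty repository you clicked together on github.
g init -q --bare "$base/github-project.git"

# Your freshly set up project directory.
mkdir "$base/myproject"
echo "My project" > "$base/myproject/README.rst"

# Turn it into a git directory and commit what is there.
g -C "$base/myproject" init -q
g -C "$base/myproject" symbolic-ref HEAD refs/heads/master
g -C "$base/myproject" add README.rst
g -C "$base/myproject" commit -q -m "Initial project skeleton"

# Tell git about the clone url and push; -u remembers it for later pushes/pulls.
g -C "$base/myproject" remote add origin "$base/github-project.git"
g -C "$base/myproject" push -q -u origin master

pushed=$(g -C "$base/github-project.git" rev-list --count master)
echo "changesets that arrived in the remote repository: $pushed"
```

Naming the remote “origin” and pushing with -u gives you the same defaults that a clone would have set up, so later on a plain git push and git pull are enough.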
What I haven’t covered yet: branches and workflow. I’ll do that in a later article.
My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.