[PATCH] Simplified GIT usage guide

From: David Howells
Date: Fri Dec 12 2008 - 13:29:27 EST


Add a guide to using GIT's simpler features.

Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
---

Documentation/git-haters-guide.txt | 1283 ++++++++++++++++++++++++++++++++++++
1 files changed, 1283 insertions(+), 0 deletions(-)
create mode 100644 Documentation/git-haters-guide.txt


diff --git a/Documentation/git-haters-guide.txt b/Documentation/git-haters-guide.txt
new file mode 100644
index 0000000..51e4dac
--- /dev/null
+++ b/Documentation/git-haters-guide.txt
@@ -0,0 +1,1283 @@
+ ===================================
+ THE GIT HATER'S GUIDE TO THE GALAXY
+ ===================================
+
+By David Howells <dhowells@xxxxxxxxxx>
+
+Contents:
+
+ (*) Introduction.
+
+ - Disclaimer.
+
+ (*) Overview of GIT.
+
+ - Git objects.
+ - Symbolic pointers.
+ - The GIT tree.
+ - GIT trees after merging.
+
+ (*) Downloading upstream trees.
+
+ - Local mirroring.
+ - Automatic updates.
+ - Using your local mirror.
+
+ (*) Accessing the repository.
+
+ - Viewing the history.
+ - Viewing a commit.
+ - Viewing source differences.
+
+ (*) Making changes.
+
+ - Applying patches.
+ - Applying formatted patches.
+ - Incorporating GIT trees.
+
+ (*) Amending and reverting changes.
+
+ - Amending committed changes.
+ - Discarding committed changes.
+ - Reverting committed changes.
+
+ (*) Publishing changes by GIT tree.
+
+ - Setting up.
+ - Updating your development tree.
+ - Publishing your changes.
+
+ (*) Manually merging failed fetches.
+
+ (*) Locating bugs.
+
+ - Bisection.
+ - Blame.
+
+
+============
+INTRODUCTION
+============
+
+So, you want to do some Linux kernel development? And you hear there's this
+piece of software called 'GIT' that you probably ought to be using when dealing
+with the kernel community? Then you find out that not only was Linux started
+by this Linus Torvalds person, but GIT was too! Perhaps it doesn't seem fair:
+Linus has not just _one_ huge piece of software named after himself, but _two_!
+And on top of that, globe spanning hardware vendors just queue up to give him
+all the herring he can eat!!
+
+Then you look at webpages about GIT. You look at the manpages! You run the
+commands with --help! And you *still* don't know how to do anything complex
+with it!! You feel certain that there's some secret rite you have to perform
+to become a GIT initiate - probably something involving two goats, an altar and
+a full moon - oh, and lots of beer (we *are* talking about kernel developers
+after all).
+
+Then you ask around, and people look at you blankly, hedge or say that it's
+easy and obvious (they should know - they wrote the damned thing). You realise
+that the manpages are more an aide-memoire and that what you really want is
+some sort of crib sheet; something that can hold your hand whilst you cut and
+paste things from of it until you can see the point.
+
+Well, let's see if I can help...
+
+
+DISCLAIMER
+----------
+
+I don't really know what I'm doing with GIT either. I'm not sure anyone really
+does, apart from Linus (and then only after some strange Finnish snack
+involving red and white mushrooms). If you'd pause to wonder why things are
+like they are, you'd realise that only someone totally barking would try to
+write a kernel in the first place... and then it'd dawn on you what the mental
+state must be like of someone who'd try writing something like a source code
+management system from scratch... and then you'd consider what it must take to
+be someone who'd do *both*.
+
+
+===============
+OVERVIEW OF GIT
+===============
+
+GIT is a source code management system. You give it your sources to retain,
+and it manages the history of all the changes and provides you with a set of
+tools by which that history can be viewed, extracted and extended.
+
+GIT is unusual in its design in that the objects it retains are referred to by
+hashes of their content. Because it is mathematically possible for object IDs
+to collide, large hash IDs are used to reduce the probability of a collision.
+If the content of an object changes, rather than updating the existing object,
+GIT will create a new object with a new hash ID. Objects are _invariant_.
+
+The GIT database in a GIT tree has two sets of data:
+
+ (1) A set of objects, indexed by the object hash ID.
+
+ (2) A set of symbolic object tree heads, as object hash IDs.
+
+
+GIT OBJECTS
+-----------
+
+There are three basic types of object:
+
+ (1) File objects.
+
+ A file object contains the contents of a source file and the attributes of
+ that file (such as file mode).
+
+ (2) Directory objects.
+
+ A directory object contains the attributes of that directory plus a list
+ of file and directory objects that are members of this directory. The
+ list includes the names of the entries within that directory and the
+ object ID of each object.
+
+ (3) Commit objects.
+
+ A commit object contains the attribute of that commit (the author and the
+ date for instance), a textual description of the change imposed by that
+ commit as provided by the committer, a list of object IDs for the commits
+ on which this commit is based, and the object ID of the root directory
+ object representing the result of this commit.
+
+ Note that a commit does not literally describe the changes that have been
+ made in the way that, say, a diff file does; it merely carries the current
+ state of the sources after that change, and points to the commits that
+ describe the state of the sources before that change. GIT's tools then
+ infer the changes when asked.
+
+ A commit object will typically refer to one base commit when someone has
+ merely committed some changes on top of the current state, and two base
+ commits when a couple of trees have been merged.
+
+Because objects are invariant, and because they can thus be referred to by a
+hash of their contents, objects can be shared between trees simply by using the
+same object ID in two different places. This allows objects to be compared to
+see whether they are the same thing or not simply by comparing the object ID.
+
+
+SYMBOLIC POINTERS
+-----------------
+
+GIT retains its historical information in a set of overlapping, shared trees,
+but the notion of where a tree starts isn't really a primary concept with GIT.
+What it has instead is a number of symbolic pointers to commits within the tree
+that are considered to be of some sort of significance. These are called
+'heads' and include:
+
+ (1) The base for the current working state of the checked out sources (HEAD).
+
+ (2) Branches (by branch name).
+
+ (3) Tags (by tag name).
+
+ (4) Merge base (for incomplete merges).
+
+ (5) Points of interest, such as those that pertain to a git fetch (FETCH_HEAD
+ and ORIG_HEAD).
+
+ (6) Bisection points (when bisection is being used to find a bug).
+
+In essence, these symbolic pointers are just names or conventions for
+particular roots in the tree. They are a name that maps to the object ID of a
+commit object.
+
+Some of them have special meanings, such as branches, that can be configured to
+behave in various ways under certain conditions (such as when a git fetch is
+performed).
+
+
+THE GIT TREE
+------------
+
+The GIT tree in its simplest terms is a backbone of commits that point to
+directories that point to files. To give a simple example of the commit
+process, consider the sources for a project that contains one directory, D,
+which contains three files, F1, F2 and F3.
+
+This could then be committed into GIT to begin a project, in this case as
+commit C0. This would hold version D0 of the directory, and versions F1A, F2A
+and F3A of the three files, and the GIT repository HEAD pointer would point to
+C0:
+
+ +-----+
+ +-->| F3A |
+ | +-----+
+ |
+ +-----+ +-----+ | +-----+
+ HEAD--->| C0 |------->| D0 |------+-->| F2A |
+ +-----+ +-----+ | +-----+
+ |
+ | +-----+
+ +-->| F1A |
+ +-----+
+
+Now imagine that someone changes file F2 and commits the change. F1A and F3A
+are still useful, and can be shared by the new view of the world, but F2 is now
+on a new version, F2B. The old directory object, D0, pointed to F2A, so that
+cannot be reused, and so D1 is generated. The commit process then writes a new
+commit object, C1, that points to D1 as the state of the tree after this
+commit, and points to C0 as the commit on which C1 was based. Finally, HEAD is
+changed to point to C1.
+
+ +-----+
+ +---->| F2B |
+ +-----+ +-----+ | +-----+
+ HEAD--->| C1 |------->| D1 |----+
+ +-----+ +-----+ |
+ | |
+ | | +-----+
+ | +---->| F3A |
+ | | +-->+-----+
+ V | |
+ +-----+ +-----+ | | +-----+
+ | C0 |------->| D0 |------+-->| F2A |
+ +-----+ +-----+ | | +-----+
+ | |
+ +-|-->+-----+
+ +-->| F1A |
+ +-----+
+
+Then imagine that someone changes file F1 and commits the change. F3A is still
+viable in its original state, and F2B is usable from commit C1, but F1A is now
+obsolete and gets replaced by version F1B. This means that neither D0 nor D1
+are usable, so directory object D2 has to be created, and new commit C2 is
+created to point to that and base commit C1. Then HEAD is set to point to C2:
+
+ +-----+
+ +------>| F1B |
+ +-----+ +-----+ | +-----+
+ HEAD--->| C2 |------->| D2 |--+
+ +-----+ +-----+ |
+ | |
+ | +------>+-----+
+ V | +---->| F2B |
+ +-----+ +-----+ | | +-----+
+ | C1 |------->| D1 |----+
+ +-----+ +-----+ | |
+ | | |
+ | +-|---->+-----+
+ | +---->| F3A |
+ | | +-->+-----+
+ V | |
+ +-----+ +-----+ | | +-----+
+ | C0 |------->| D0 |------+-->| F2A |
+ +-----+ +-----+ | | +-----+
+ | |
+ +-|-->+-----+
+ +-->| F1A |
+ +-----+
+
+Now, consider what would have happened if, instead of changing F1A to be F1B to
+produce C2, F2B had been reverted to the same state as F2A. GIT would realise
+that it already has a file object to represent F2A (by comparing object IDs)
+and would use that rather than creating a new one. The new set of files in the
+directory would then be F1A, F2A and F3A - but there's already a directory
+object for that: D0. This would also be discovered by object ID matching, and
+would be used instead. Commit C3 would then point to base commit C1 and
+directory D0, and HEAD would be moved to point to C3:
+
+ +-----+
+ HEAD--->| C3 |---+
+ +-----+ |
+ | |
+ | | +-----+
+ V | +---->| F2B |
+ +-----+ | +-----+ | +-----+
+ | C1 |------->| D1 |----+
+ +-----+ | +-----+ |
+ | | |
+ | | | +-----+
+ | | +---->| F3A |
+ | | | +-->+-----+
+ V | | |
+ +-----+ +--->+-----+ | | +-----+
+ | C0 |------->| D0 |------+-->| F2A |
+ +-----+ +-----+ | | +-----+
+ | |
+ +-|-->+-----+
+ +-->| F1A |
+ +-----+
+
+
+GIT TREES AFTER MERGING
+-----------------------
+
+Now, imagine that two GIT trees are merged. You start off with two sets of
+commits (for convenience, I'm going to leave out the directories and files, but
+you can just assume they're there):
+
+ +-----+ +-----+
+ HEAD--->| C3 | Branch->| B3 |
+ +-----+ +-----+
+ | |
+ V V
+ +-----+ +-----+
+ | C2 | | B2 |
+ +-----+ +-----+
+ | |
+ V V
+ +-----+ +-----+
+ | C1 |<------------------------| B1 |
+ +-----+ +-----+
+ |
+ V
+ +-----+
+ | C0 |
+ +-----+
+
+In the above example, I've assumed that you've got your own tree with the head
+at commit C3, and that you've got a branch that you want to merge, which has
+its head at commit B3. After merging them, you'd end up with a directed,
+cyclic tree:
+
+ +-----+
+ HEAD--->| C4 |----------------------------+
+ +-----+ |
+ | |
+ V V
+ +-----+ +-----+
+ | C3 | Branch->| B3 |
+ +-----+ +-----+
+ | |
+ V V
+ +-----+ +-----+
+ | C2 | | B2 |
+ +-----+ +-----+
+ | |
+ V V
+ +-----+ +-----+
+ | C1 |<------------------------| B1 |
+ +-----+ +-----+
+ |
+ V
+ +-----+
+ | C0 |
+ +-----+
+
+and the C4 commit will have pointers to *both* contributing commits, C3 and B3.
+If GIT stored the differences at each commit rather than the terminal state, it
+would have to store a delta for each contributing commit.
+
+
+==========================
+DOWNLOADING UPSTREAM TREES
+==========================
+
+The first thing you'll usually want to do with GIT is to grab a copy of the
+cutting edge version of an upstream project and build it; perhaps you want to
+work on it, perhaps because it has a fix in it that you need or perhaps because
+you like living on the cutting edge and enjoy grepping your disks to recover
+your data when things go wrong. Whatever your reasons, you need to be able
+make a local copy of an upstream GIT tree.
+
+With GIT-based projects, grabbing a local copy of an upstream repository is
+very easy:
+
+ git clone %UPSTREAM_REPO %MY_DIR
+
+This will create a checked-out copy of the the upstream repository
+(%UPSTREAM_REPO) by pulling over the internet and sticking it in a directory on
+the local machine.
+
+For example, to fetch Linus's cutting edge kernel tree, you'd do:
+
+ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git \
+ linux-2.6-local
+
+Then you look in linux-2.6-local and there is what you're looking for.
+
+
+LOCAL MIRRORING
+---------------
+
+You might find that you wish to run several concurrent, separate developments
+all based upon a single upstream repository. You could simply clone each one
+as mentioned above, but that has the potential to use excessive amounts of disk
+space as each clone would include an independent copy of the entire source
+repository.
+
+What you might want to do is to set up a mirror of the upstream repository, and
+then share that mirror with each of the clones. Even better, you can share it
+with other people who can also access the filesystem it is stored upon.
+
+So what you can do is create a local mirror:
+
+ git clone -n %UPSTREAM_REPO %MIRROR_DIR
+
+For example:
+
+ git clone -n git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git \
+ /warthog/git/linux-2.6.git
+
+The -n flag tells git to save space by not bothering to check the files out of
+the repository. You don't really need the checkout if all you're going is to
+use this as a reference, but you can still check out if you like by omitting
+the -n.
+
+
+AUTOMATIC UPDATES
+-----------------
+
+Furthermore, you might want to automatically update your sources at some
+unfeasible hour of the morning when only Australians are awake because, say,
+your internet supply is rated more cheaply then - but you don't necessarily
+want the automatic update to dump into the sources you're actively meddling
+with. A local mirror can help with this too.
+
+One way of automatically updating your mirror is to use cron. To do this
+create a script that looks something like:
+
+ #!/bin/sh
+ cd %MIRROR_DIR || exit $?
+ exec git pull >/tmp/git-pull.log
+
+and chmod u+x it. Then run the crontab program to modify your personal cron
+schedule and add something like the following line to it (not forgetting to
+remove the leading tab!):
+
+ 0 %HOUR * * * %MIRROR_SCRIPT
+
+where %HOUR is the hour you want it to go off every day. For my local mirror
+of Linus's upstream kernel, I use:
+
+ #!/bin/sh
+ cd /warthog/git/linux-2.6 || exit $?
+ exec git pull >/tmp/git-pull.log
+
+and:
+
+ 0 6 * * * /home/dhowells/bin/do-git-pull.sh
+
+which will do the update every day at 6am.
+
+
+USING YOUR LOCAL MIRROR
+-----------------------
+
+You can then create a directory to actually do your development in by:
+
+ git clone -l -s %MIRROR_DIR %MY_DIR
+
+The "-l" tells git clone that the source (mirror) repository is on the local
+machine, that it shouldn't go over the internet for it, and that it should
+hardlink GIT objects from the source repository rather than copying them where
+possible.
+
+The "-s" says that git clone should insert a reference under %MY_DIR that
+points to the %MIRROR_DIR's collection of objects. This means that GIT won't
+bother to copy the objects that it can get from %MIRROR_DIR at all, it'll just
+use them out of %MIRROR_DIR.
+
+ [!] NOTE: This makes %MY_DIR dependent on %MIRROR_DIR: if you delete
+ %MIRROR_DIR or prune it you may make %MY_DIR unusable!
+
+You can repeat this again and again from the same mirror. You can even share a
+mirror with other people that can access the filesystem holding the mirror.
+You don't need write access to it, only read.
+
+
+========================
+ACCESSING THE REPOSITORY
+========================
+
+One of the things you'll want to be able to do with what you've downloaded is
+look at changes other people have made. GIT has some powerful tools to allow
+you to do this.
+
+
+VIEWING THE HISTORY
+-------------------
+
+You might wish, for example, to look back through the commit tree and see what
+changes have been made. The command to do this is:
+
+ git log
+
+This will take you back through the commit information, starting at the current
+HEAD and going all the way back to the beginning if you let it:
+
+ warthog>git log
+ commit 8b1fae4e4200388b64dd88065639413cb3f1051c
+ Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
+ Date: Wed Dec 10 15:11:51 2008 -0800
+
+ Linux 2.6.28-rc8
+
+ commit f9fc05e7620b3ffc93eeeda6d02fc70436676152
+ Merge: b88ed20... 9a2bd24...
+ Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
+ Date: Wed Dec 10 14:41:06 2008 -0800
+
+ Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
+
+ * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
+ sched: CPU remove deadlock fix
+
+ commit b88ed20594db2c685555b68c52b693b75738b2f5
+ Author: Hugh Dickins <hugh@xxxxxxxxxxx>
+ Date: Wed Dec 10 20:48:52 2008 +0000
+ ...
+
+
+VIEWING A COMMIT
+----------------
+
+Now that you can see the commit IDs in the history, you can examine one more
+closely:
+
+ git show
+
+to see the current HEAD commit, or:
+
+ git show %COMMIT_ID
+
+to see a particular commit:
+
+ warthog>git show 1da177e
+ commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
+ Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxx>
+ Date: Sat Apr 16 15:20:36 2005 -0700
+
+ Linux-2.6.12-rc2
+
+ Initial git repository build. I'm not bothering with the full history,
+ ...
+ diff --git a/COPYING b/COPYING
+ new file mode 100644
+ index 0000000..2a7e338
+ --- /dev/null
+ +++ b/COPYING
+ @@ -0,0 +1,356 @@
+ +
+ + NOTE! This copyright does *not* cover user programs that use kernel
+ ...
+
+
+VIEWING SOURCE DIFFERENCES
+--------------------------
+
+The 'git-show' command shows you what it thinks the differences are that you
+want to see, between a commit and its first listed base commit. However, there
+are other differences you might wish to see.
+
+Firstly, you might like to see the differences between what's in the current
+HEAD commit, and what you've got checked out:
+
+ git diff
+
+or you might wish to see the differences between two particular commits, for
+example:
+
+ git diff v2.6.24 v2.6.25
+
+
+==============
+MAKING CHANGES
+==============
+
+So you've got a fresh development GIT tree and you want to make changes in it
+and commit them to it. The first is easy enough: just use your preferred text
+editor to edit the files directly, or you could use sed or perl to apply some
+textual transformations - that's entirely up to you.
+
+However, once you've made those changes and you've compiled and tested them,
+you'll probably want to consign them to GIT.
+
+Files you've added must be marked by:
+
+ git add <filename>
+
+and files you've deleted must be noted by:
+
+ git rm <filename>
+
+so that GIT knows to include or exclude these files from its tree.
+Furthermore, you must tell GIT about any files that have changed that you want
+to be updated also:
+
+ git add <filename>
+
+You can then commit your changes. This is done by running:
+
+ git commit
+
+Rather than doing lots of git add and git rm commands to register updated and
+removed files, you can give git commit a '-a' flag. Note, though, that this
+takes no account of new files that git doesn't already know about. Those must
+be added manually.
+
+git commit will pop up your favourite editor, asking you to enter a commit
+message describing your changes (don't forget to add your sign-off). It will
+list the files it sees that have been added, altered and removed, and will
+differentiate between those that it has been told about (and thus will include)
+and those it hasn't (which will be ignored).
+
+After git commit completes successfully, 'git show' should show the new commit
+you've just made, and gitk should show the new tree structure with your new
+commit at the top.
+
+
+APPLYING PATCHES
+----------------
+
+If you have a patch file you wish to apply, you can do that with:
+
+ git apply <patch-file>
+
+This will make the changes specified by the patch, but it won't register any of
+the changes and won't record any of the metadata that might be in the patch
+file, such as authorship, description or attribution. That has to be done
+manually as if you'd made the changes yourself.
+
+
+APPLYING FORMATTED PATCHES
+--------------------------
+
+Sometimes you may wish to incorporate a patch that someone has emailed you.
+You could use the 'patch' or 'git apply' programs and then set up the commit
+information manually, but if someone has sent you an appropriately formatted
+message - perhaps in an email - you can have GIT import the metadata from the
+message rather than you having to type it manually.
+
+If someone has given you an email or appropriately formatted patch file, the
+following command can import it:
+
+ git am <patch-file>
+
+If successful, this will automatically register all added, altered and removed
+files and commit the changes for you. The commit message will be concocted
+from the description and email headers (From: and Subject: for instance). If
+you want to add your own sign-off to the bottom of the commit message whilst
+you're at it, you can add a '-s' flag:
+
+ git am -s <patch-file>
+
+You may find it convenient to edit unformatted patches to make it possible to
+use 'git am' rather than 'git apply'.
+
+
+INCORPORATING GIT TREES
+-----------------------
+
+And sometimes, rather then sending you patches, people may attempt to
+contribute changes to you that are contained within GIT trees and you may wish
+to incorporate these into your development tree.
+
+To do this, the following command will work:
+
+ git pull %CONTRIB_REPO %CONTRIB_BRANCH
+
+where %CONTRIB_REPO is the URL of a repository and %CONTRIB_BRANCH is the name
+of the branch within that repository (usually this will be 'master').
+
+If successful, this will either just stack the pulled changes directly on top
+of your tree (assuming the contributed tree is based on the head of your tree)
+or it will automatically produce a merge commit indicating that the resulting
+tree is a union of the changes in your tree and the contributed tree.
+
+If unsuccessful due to conflicting changes, you'll need to perform the merge
+manually and perform the commit yourself. See the "Manually merging failed
+fetches" section.
+
+An example of the command line you might use is:
+
+ git pull git://git.infradead.org/mtd-2.6.git master
+
+which will pull master branch of the upstream MTD tree into the GIT tree you're
+currently in.
+
+
+==============================
+AMENDING AND REVERTING CHANGES
+==============================
+
+There will be times when you make a mistake in your changes, and you find that
+you either want to amend them, or you want to discard them entirely. GIT
+provides a number of tools to do this.
+
+If you make a mistake in changes you haven't yet committed, you can just edit
+them again with your text editor, or if you'd prefer to discard all the changes
+you made to a particular file, you can do:
+
+ git checkout <filename>
+
+This will just wipe away the changes that you've made and restore the file to
+the state it has recorded for it as part of the topmost commit.
+
+
+AMENDING COMMITTED CHANGES
+--------------------------
+
+If you've committed some changes and you realise that those changes are
+incorrect, you can amend them without precisely making a whole new commit -
+provided you haven't committed anything else on top of them.
+
+To do this, you make your changes, run git add and git rm as normal, and then
+do:
+
+ git commit --amend
+
+This will replace the topmost commit with a similar commit that includes the
+amendments. The old commit will be displaced from the tree and will not appear
+again.
+
+Changes that are buried beneath further commits unfortunately have to be
+altered by making a new commit with the amendments, unless you wish to discard
+all the commits down to the one that needs amending, and then apply them all
+again.
+
+
+DISCARDING COMMITTED CHANGES
+----------------------------
+
+Upon occasion, you'll want to discard one or more commits entirely from the top
+of your tree. To do this you need to find the ID of the latest commit that you
+want to keep. Everything from the commit after that to the current commit will
+be discarded.
+
+You can find the commit ID in a number of ways. Firstly, you can use 'git log'
+to look back through the commits. The commit ID is shown as something like:
+
+ commit 6c34bc2976b30dc8b56392c020e25bae1f363cab
+
+Secondly, you can use gitk: select the commit of interest; the commit ID
+appears in the box labelled "SHA1 ID".
+
+You can then perform the discard with the following command:
+
+ git reset --hard %COMMIT_ID
+
+Using the above commit ID as an example, you could do:
+
+ git reset --hard 6c34bc2976b30dc8b56392c020e25bae1f363cab
+
+
+REVERTING COMMITTED CHANGES
+---------------------------
+
+And sometimes you'll want to revert changes that you've committed, but that are
+now buried beneath other commits. Short of discarding and reapplying commits,
+you have to apply a reverse patch:
+
+ git diff %COMMIT_ID | patch -p1 -R
+
+and then commit it. Both the original application and the reversion will be
+retained by GIT.
+
+
+==============================
+PUBLISHING CHANGES BY GIT TREE
+==============================
+
+Now that you've got a tree and have mangled it in unspeakable ways, you
+probably want to donate the glory of your works back to the community - usually
+with an eye to getting your changes pulled into an upstream maintainer's
+repository. Your upstream maintainer may then push your changes on to their
+upstream maintainer, until it ends into the ultimate upstream repository
+(Linus's linux-2.6 tree in the case of the Linux kernel).
+
+You could, of course, just push patches to the upstream maintainer, be that
+Linus or one of his cronies in the case of the Linux kernel, or some other
+person if some other project.
+
+GIT, however, leans strongly towards another option. If you can get access to
+a computer that is accessible by way of the internet, you might be able to set
+up a public GIT tree upon it and ask an appropriate upstream maintainer to pull
+from that.
+
+That computer, however, may not be particularly convenient for developing on:
+it may be remote from where you're working, for example, perhaps even on a
+different continent - so you'll probably want to have two trees: a remote,
+public, published tree, and a local private tree where you can break stuff at
+will. I'm going to assume the two trees approach.
+
+
+SETTING UP
+----------
+
+First of all, you'll need to set up your two trees. There are a number of
+steps to go through to do this:
+
+ (1) Find somewhere that's accessible by the internet (%REMOTE_BOX) that you
+ have SSH access to, and set up a public GIT tree that's a clone of the
+ upstream tree you want to use as a base:
+
+ ssh %REMOTE_BOX
+ cd /my/git/trees
+ git clone -n --bare %UPSTREAM_REPO %MY_DIR
+
+ Where %UPSTREAM_REPO is something like:
+
+ git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+
+ This will create a directory called /my/git/trees/%MY_DIR that contains a
+ bare GIT repository to which you can upload your changes. There will be
+ no checked out files here, and everything that would usually be in the
+ .git directory is in the top directory instead.
+
+ If your tree is on the same box as the tree you want to fork, you can
+ tell GIT to use that rather than going to the internet:
+
+ git clone -l -s -n --bare %UPSTREAM_DIR %MY_DIR
+
+ For example, I might wish to set up a tree to publish NOMMU changes so
+ that they're available through git.kernel.org. To that end, I would do:
+
+ ssh master.kernel.org
+ cd /pub/scm/linux/kernel/git/dhowells
+ git clone -l -n -s --bare \
+ /pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-2.6-nommu
+
+
+ (2) You should set the description on your public repository:
+
+ echo %DESCRIPTION >%MY_DIR/description
+
+ For example:
+
+ echo "NOMMU development" >linux-2.6-nommu/description
+
+ This will be published through the GIT web interface if one is set up, and
+ so can be viewed by going to the appropriate URL. For instance:
+
+ http://git.kernel.org/?p=linux/kernel/git/dhowells/linux-2.6-nommu.git
+
+
+ (3) Now go to the work machine on which you'll be doing your development.
+ You'll need to create a local fork of your public GIT repository. You can
+ do this by:
+
+ git clone ssh://%REMOTE_BOX/my/git/trees/%MY_DIR %DEVEL_DIR
+
+ This will create a checked-out GIT tree in a directory (%DEVEL_DIR) that
+ you can later use for development. If you have a local mirror of the
+ upstream tree that you're using as a base, you can tell git to use the
+ objects from that to save space:
+
+ git clone --reference %LOCAL_UPSTREAM_MIRROR \
+ ssh://%REMOTE_BOX/my/git/trees/%MY_DIR \
+ %DEVEL_DIR
+
+ [!] NOTE: You must use ssh: and not git: to clone your tree because you
+ need to be able to push back (write) to your public tree.
+
+ To continue my example, I have a local mirror of Linus's kernel, regularly
+ updated by cron, and so to make my local NOMMU development tree, I would
+ do:
+
+ git clone --reference /warthog/git/linux-2.6 \
+ ssh://master.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu.git \
+ linux-2.6-nommu
+
+
+ (4) Now you need to set up your local GIT tree to make it possible (a) update
+ your development tree by pulling in the upstream tree, and (b) publish
+ your changes by pushing them to your public tree.
+
+ cd %DEVEL_DIR
+
+ Tell your repository where to find the upstream tree:
+
+ git remote add %UPSTREAM %UPSTREAM_REPO
+
+ where %UPSTREAM is the name you by which you want to refer to the upstream
+ repository to git pull. For Linus's upstream kernel, you might wish to
+ use 'linus' for example.
+
+ In my example, I did the following to pull Linus's tree into branches of
+ my tree:
+
+ cd linux-2.6-nommu
+ git remote add linus \
+ git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+
+ Looking in .git/config, I now see section that looks like this:
+
+ [remote "origin"]
+ url = ssh://master.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu.git
+ fetch = +refs/heads/*:refs/remotes/origin/*
+ [branch "master"]
+ remote = origin
+ merge = refs/heads/master
+ [remote "linus"]
+ url = git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ fetch = +refs/heads/*:refs/remotes/linus/*
+
+
+ (5) You should now be able to update your development tree from the upstream
+ repository to make sure that works:
+
+ git fetch -v %UPSTREAM
+
+ In my case, that's:
+
+ git fetch -v linus
+
+ If you've just created the repository, it'll probably just say that things
+ are up to date:
+
+ From git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
+ = [up to date] master -> linus/master
+
+ [!] NOTE: I cannot determine a way of making "git pull linus" work without
+ setting branch.master.remote to 'linus'.
+
+
+ (6) And then you should be able to publish your development tree by pushing it
+ to your public tree, thus allowing the rest of the world to see your
+ changes.
+
+ git push -v origin
+
+
+ (7) Finally you should be able to pull your published tree back into your
+ development tree, and it should just say that it's up to date:
+
+ git pull -v
+
+
+UPDATING YOUR DEVELOPMENT TREE
+------------------------------
+
+Okay: so you've got your tree, and you've made changes to it, and now Linus has
+gone and dumped five thousand patches into his tree, making the base for your
+changes obsolete. You need to update your tree and fix up your changes.
+
+If you haven't yet committed your changes, you'll have to siphon them off into
+a file:
+
+ git diff >a.diff
+
+and deapply them:
+
+ patch -p1 -R <a.diff
+
+You can then update your tree from the upstream tree with no fear of a conflict
+(assuming you don't also have changes that you have committed). Once you've
+updated your tree, you can reapply your changes:
+
+ patch -p1 <a.diff
+
+And then fix up the rejects with your favourite editor and a few choice curses.
+
+
+To actually update your tree, you can do the following:
+
+ git fetch %UPSTREAM
+
+In my example, that'd be:
+
+ git fetch linus
+
+If you have committed changes, this will attempt to merge them, but you may
+still need to fix them up. If everything went smoothly this will automatically
+commit a merge on top of the tree and set the HEAD pointer to that. This merge
+will point at your last tree and the tree you just merged from upstream, and
+will indicate that the resulting tree is a combination of both. Of course, you
+shouldn't assume it will still compile, let alone still work...
+
+If you do need to fix them up, refer to the "Manually merging failed fetches"
+section for guidance.
+
+You can view the merge that git pull committed by:
+
+ git show
+
+And you can view the tree structure at that point with the gitk command.
+
+
+PUBLISHING YOUR CHANGES
+-----------------------
+
+Finally, you're in a position to make your changes available. Firstly, you
+have to commit them to your development tree (as mentioned previously) and then
+you have to make them available to the rest of the world. To do that, simply
+run:
+
+ git push
+
+which will apply the changes to your public tree. If you have web access to
+your git tree, these will eventually become visible through there.
+
+You may then have to tell your upstream maintainer what you'd like them to pull
+from your tree. The standard way to do this is to do:
+
+ git request-pull %BASE_ID %MY_REPO >/tmp/request.txt
+
+where %BASE_ID is the head of the tree on which your changes are based, and
+%MY_REPO is the public URL of your repository. If you have your development
+git tree configured to know where the upstream remote repository is, then if
+you've ever done 'git fetch' you should have a branch for it, named something
+like "%UPSTREAM/%UPSTREAM_BRANCH" where %UPSTREAM is the name you gave to 'git
+remote' and %UPSTREAM_BRANCH is the upstream branch on which you've based your
+development (almost certainly 'master').
+
+This command will generate a list of all the patches between %BASE_ID and the
+head of your tree that you are asking to be pulled.
+
+In my example, I can do:
+
+ git request-pull linus/master \
+ git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu.git \
+ >/tmp/request.txt
+
+You should then edit /tmp/request.txt to include a description of what you're
+trying to achieve with these patches, and then mail the whole file to the
+upstream maintainer.
+
+[!] NOTE: It may take some time for the git push to take full effect. Before
+ that time is up, git request-pull may give spurious warnings and the test
+ it produces may say that the branch is unverified.
+
+
+===============================
+MANUALLY MERGING FAILED FETCHES
+===============================
+
+Occasionally, when you pull someone else's tree in to your repository, either
+because the base needs updating or because you're incorporating stuff from a
+contributor, the merge will fail due to conflicts between the changes you have
+made in your tree, and the changes you're importing.
+
+GIT will try and automatically merge where possible, but it can't always manage
+it. In such cases you have to unlimber your text editor and fix it manually.
+
+GIT will report the files that need merging during the git fetch/git pull:
+
+ CONFLICT (content): Merge conflict in drivers/char/tty_audit.c
+
+and they can also be determined by looking in ".git/MERGE_MSG".
+
+GIT will interpolate markers into the affected files, along with both versions
+of the code:
+
+ <<<<<<< HEAD:drivers/char/tty_audit.c
+ tsk->pid, uid, loginuid, sessionid,
+ =======
+ tsk->pid, tsk->uid, loginuid, sessionid,
+ >>>>>>> b3985e2bf6ce51ae943208af4bd336287fb34ed6:drivers/char/tty_audit.c
+
+The first section (<<<<<<<< to =======) is the version from your tree, the
+second section (======= to >>>>>>>) is the version from the tree being
+imported. The markers must be removed, and the conflicting code resolved down
+to the appropriate final version.
+
+
+Once that is done, git add (or git rm) must be called on the changed files so
+that git commit knows to include them in the new head. It works exactly like
+changing files normally (as per the "Making changes" section), except that GIT
+has stored extra data that will go into the merge commit when git commit
+creates it.
+
+
+=============
+LOCATING BUGS
+=============
+
+There will be times when the program you've built malfunctions. It happens now
+and then even to the best of projects. Sometimes you can easily locate the bug
+by looking at the symptoms and the debugging output and then eyeballing the
+code, and sometimes you can't.
+
+For very big projects such as the Linux kernel, finding a bug that someone else
+has inadvertently introduced can be very hard, but GIT allows you to take
+advantage of the fact that the changes are introduced a bit at a time with
+clear boundaries (commits) to make life a bit easier.
+
+
+BISECTION
+---------
+
+What you really want to be able to do is to isolate the commit that's causing
+the malfunction, but with automation support so that you don't have to trace
+the commit tree yourself. GIT has a tool to do this: git bisect.
+
+The way this works is to take two points in the tree: one at which you know the
+program malfunctions, and one at which you know it doesn't, and then chop its
+way through the tree to locate the failing commit.
+
+To illustrate this:
+
+ (1) Assume that you're dealing with the kernel, and that you find that after
+ Linus's merge window, 2.6.25-rc1 does not boot for you, but you know that
+ 2.6.24 did prior to the window.
+
+ Firstly you have to start your search and describe the bounds (the working
+ and non-working points). This is done with the following commands:
+
+ git bisect start [%BAD_COMMIT [%GOOD_COMMIT]]
+ git bisect bad [%BAD_COMMIT]
+ git bisect good [%GOOD_COMMIT]
+
+ where %BAD_COMMIT and %GOOD_COMMIT are optional commit object IDs or
+ symbolic representations thereof. The 'bad' command is unnecessary if
+ %BAD_COMMIT is given to 'start', and the 'good' command is not required if
+ %GOOD_COMMIT is given to 'start'.
+
+ So, in the example we're looking at, you could do:
+
+ git bisect start
+ git bisect bad v2.6.25-rc1
+ git bisect good v2.6.24
+
+ or:
+
+ git bisect start v2.6.25-rc1
+ git bisect good v2.6.24
+
+ or:
+
+ git bisect start v2.6.25-rc1 v2.6.24
+
+ [!] NOTE: This is using a symbolic tag 'v2.6.24' to refer to the last
+ commit before 2.6.24 was declared.
+
+
+ However, if 2.6.25-rc1 is at currently at the head of your tree, you can
+ do:
+
+ git bisect start
+ git bisect bad
+
+ to indicate that this malfunctioned, or you could do this in a single
+ command:
+
+ git bisect start HEAD
+
+ to start bisection _and_ indicate that the HEAD revision is bad.
+
+
+ Alternatively, if you're at a point where the program _does_ work, you can
+ pass either HEAD or no parameter to the 'good' bisection command, or pass
+ HEAD as the %GOOD_COMMIT parameter to the 'start' bisection command.
+
+
+ (2) Now GIT will rumble through the commits between the two points you have
+ declared, and set the current HEAD of the repository to a point that
+ approximates midway between the two:
+
+ warthog>git bisect start v2.6.25-rc1 v2.6.24
+ Bisecting: 4814 revisions left to test after this
+ [d2e626f45cc450c00f5f98a89b8b4c4ac3c9bf5f] x86: add PAGE_KERNEL_EXEC_NOCACHE
+
+ and then it will check out the sources to reflect their state at this point.
+
+
+ (3) You should now attempt to compile this and test it. If the test succeeds,
+ you should run the command:
+
+ git bisect good
+
+ If the test fails, run the command:
+
+ git bisect bad
+
+ These will tell GIT to binary chop the commits between either the current
+ point and the good end or the current point and the bad end to find a new
+ commit to test:
+
+ warthog>git bisect bad
+ Bisecting: 2406 revisions left to test after this
+ [fb46990dba94866462e90623e183d02ec591cf8f] [NETFILTER]: nf_queue: remove unnecessary hook existance check
+ warthog>git bisect good
+ Bisecting: 1203 revisions left to test after this
+ [936722922f6d2366378de606a40c14f96915474d] [IPV4] fib_trie: compute size when needed
+
+ As for when bisection started, GIT will set the current HEAD pointer and
+ then check out the sources. You should repeat step (3).
+
+ If the commit is broken for you and the compile fails, run the command:
+
+ git bisect skip
+
+ this will cause the bisection algorithm to move onto the next commit in
+ the hope that this one will be better:
+
+ warthog>git bisect skip
+ Bisecting: 1203 revisions left to test after this
+ [1328042e268c936189f15eba5bd9a5a4605a8581] [IPV4] fib_trie: use hash list
+
+ this will change the HEAD pointer and check out the sources. Repeat step
+ (3).
+
+
+ (4) Eventually, after you've tested a number of different commits, GIT will
+ tell you that it has narrowed the problem down to either a single commit,
+ or if there were compile errors that got in the way, a range of commits:
+
+ warthog>git bisect bad
+ e3ac5298159c5286cef86f0865d4fa6a606bd391 is first bad commit
+ commit e3ac5298159c5286cef86f0865d4fa6a606bd391
+ Author: Patrick McHardy <kaber@xxxxxxxxx>
+ Date: Wed Dec 5 01:23:57 2007 -0800
+
+ [NETFILTER]: nf_queue: make queue_handler const
+
+ Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>
+ Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
+ ...
+
+
+ (5) At any time during the bisection process, you can use:
+
+ git show
+
+ to examine the commit currently selected for testing, and:
+
+ git bisect log
+
+ to view the log of information provided by you through git bisect start,
+ good and bad, and:
+
+ git bisect visualize
+
+ to start up the gitk program to show you a graphical view of the current
+ good-to-bad range of commits as narrowed down by bisection.
+
+
+ (6) You should then end the bisection process by:
+
+ git bisect reset
+
+
+BLAME
+-----
+
+Now imagine that rather than indulging in bisection you've found a bug by
+simply looking at the code: who do you tell about it? You could look at the
+banner comment at the top of the file to look for names and email addresses,
+and you could also look in the kernel MAINTAINERS file or its equivalent, but
+the person you really want to harangue is whoever made the change...
+
+There's a very useful GIT tool to help determine this:
+
+ git blame <file>
+
+also known as:
+
+ git annotate <file>
+
+which will give you a list of lines in a source file against who changed them
+last and in what commit. You may find that your favourite editor has a
+facility to run this for you (Emacs has vc-annotate, bound to C-x v g, for
+example).
+
+Running git blame on the kernel's README file, for example, might show:
+
+ warthog>git blame README
+ 620034c8 (Jesper Juhl 2006-12-07 00:45:58 +0100 1) Linux kernel release 2.6.xx <http://kernel.org/>
+ ^1da177e (Linus Torvalds 2005-04-16 15:20:36 -0700 2)
+ ^1da177e (Linus Torvalds 2005-04-16 15:20:36 -0700 3) These are the release notes for Linux version 2.6. Read them carefully,
+ ...
+
+The hex number that occurs first on the line is a truncated commit object ID,
+and this can be passed to git-show (remove the '^' symbol first, if given).
+
+ warthog>git show 1da177e
+ commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
+ Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxx>
+ Date: Sat Apr 16 15:20:36 2005 -0700
+ ...

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/