| title | author | date |
|---|---|---|
| A brief introduction to Git and GitHub | Peter Humburg | 24th November 2014 |
. . .
- Keeping track of changes
- Collaborate
- Maintain multiple versions
- Understand what happened
- Recover previous versions
- Backup source code
- Can maintain several parallel versions.
- Use one branch for the latest stable version (master).
- Other branches for development.
- Changes to one branch will not interfere with use of other branches.
- Different workflows use branches in a variety of ways.
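The claim that changes on one branch do not interfere with other branches can be tried out in a throwaway repository. The following is only a minimal sketch: the temporary directory, the file name `notes.txt`, the branch name `development` and the identity settings are made up for illustration.

```shell
# Throwaway repository in a temporary directory (all names are illustrative)
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a stable version on master
echo "stable" > notes.txt
git add notes.txt
git commit --quiet -m "Stable version"

# Make a change on a separate development branch
git checkout --quiet -b development
echo "experimental" >> notes.txt
git commit --quiet -am "Try something new"

# Back on master the file still has its stable content
git checkout --quiet master
master_content=$(cat notes.txt)
echo "$master_content"
```

Switching back to `development` would bring the experimental line back into the working directory.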
GitHub hosts Git repositories. Great backup for source code and other documents (like this presentation).
Make sure it is initialised with a README.
Quickly show how files (in this case the README) can be edited on the website.
We will use this to add a few files and process them.
Two ways to access repositories on GitHub
#. HTTPS
git clone https://github.com/jknightlab/git-tutorial.git
* No additional setup required
* Works from behind firewalls/proxies
* Requires user name and password for every `push`, `pull` or `fetch`
* Git can cache the credentials for you (here for 36000 seconds, i.e. 10 hours)
git config --global credential.helper 'cache --timeout=36000'
#. SSH
git clone git@github.com:jknightlab/git-tutorial.git
* Need to generate and deploy SSH keys
* If the private key is protected by a passphrase, it has to be entered for each `push`, `pull` or `fetch` command.
* [Can use `ssh-agent` to take care of passphrases](#ssh-agent-setup).
Tell Git your name and email address. These will be used to attribute commits.
git config --global user.name "<your name>"
git config --global user.email "<your email>"
. . .
If you are working on Windows, also set this option
git config --global core.autocrlf true
Depending on the setup, user name and/or email may already be populated correctly.
Also make sure Git is converting line endings on Windows (but not on Mac/Linux).
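To check which values Git will actually use, `git config --get` prints a single setting. The sketch below is an illustration under assumptions: it sets the values at repository level (without `--global`) in a temporary repository so that real settings are not touched, and the name and email address are placeholders.

```shell
# Temporary repository so that no real configuration is modified
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

# Repository-level settings (no --global); placeholder identity
git config user.name "Example User"
git config user.email "user@example.com"

# --get prints the value Git would use inside this repository
configured_name=$(git config --get user.name)
configured_email=$(git config --get user.email)
echo "$configured_name <$configured_email>"
```

The same `--get` call works for `core.autocrlf` when checking the line ending setting.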
Currently we are on the master branch
git status
. . .
List all existing branches
git branch -a
. . .
Switch to data-collection
git checkout data-collection
Create a new file (using your favourite text editor).
. . .
We now have an untracked file in our repository.
git status
. . .
Add the file to the staging area
git add <your file>
Run `git status` after each step to see how the file is recognised by Git.
Time to commit all staged changes (don't forget to add a descriptive commit message)
git commit -m "Added file with very important information"
. . .
Now we can push the new files to GitHub
git push
Again, use `git status` to see the status change after each step.
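The status changes described above can be followed in a small local experiment. This is a sketch under assumptions: it uses a temporary repository without a remote (so the `push` step is left out), a made-up file name `alice.txt`, and `git status --porcelain`, which prints a compact, script-friendly form of the status.

```shell
# Temporary repository without a remote (names are illustrative)
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# A new file is untracked at first: "?? alice.txt"
echo "42" > alice.txt
untracked=$(git status --porcelain)
echo "$untracked"

# After git add it is staged as a new file: "A  alice.txt"
git add alice.txt
staged=$(git status --porcelain)
echo "$staged"

# After the commit there is nothing left to report
git commit --quiet -m "Added file with very important information"
clean=$(git status --porcelain)
echo "status after commit: '$clean'"
```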
After files have been pushed to GitHub, go to the website to see them appear there.
Before doing anything else, make sure your local repository is up to date.
git pull
Create a new branch (off the `data-collection` branch)
git checkout -b analysis
. . .
Combine all the data (in R) ...
files <- dir(pattern="\\.txt$")
data <- lapply(files, read.table)
data <- do.call(rbind, data)
names(data) <- "value"
names <- lapply(files, strsplit, ".", fixed=TRUE)
data$name <- sapply(names, sapply, "[[", 1)
write.table(data, file="combined.tab", row.names=FALSE)
. . .
and add it to the repository.
git add combined.tab
git rm *.txt
git commit -m "Combined data into single file"
When pushing a new branch for the first time we need to tell Git where it should go.
git push --set-upstream origin analysis
Let's plot the data
library(ggplot2)
data <- read.table("combined.tab", header=TRUE)
data$rank <- rank(data$value)
ggplot(data, aes(y=value, x=rank)) + geom_point() + theme_bw()
. . .
and add the plot to the repository.
git add figure/combined.png
git commit -m "added plot of data"
git push
Make sure the output directory for the figure exists.
After pushing the plot back to GitHub it may be a good opportunity to show what things look like now.
Maybe that plot could be improved?
ggplot(data, aes(y=value, x=rank)) + geom_point() +
  geom_text(aes(label=name), hjust=0, vjust=0) +
  theme_bw()
. . .
git add figure/combined.png
git commit -m "Added labels to data points."
git push
#. Create pull request for branch that should be merged
#. Approve pull request and merge
#. Delete merged branch
Branches that have been deleted on GitHub still exist in the local repository. Best to clean them up.
git pull
git branch --merged | grep -v "\*" | grep -v master | xargs -n 1 git branch -d
git pull --prune
The `--prune` option then removes all remote-tracking branches whose counterpart on GitHub no longer exists.
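To see what the clean-up pipeline does without risking real work, it can be run in a throwaway repository. The sketch below is an assumption-laden illustration: there is no remote (so `git pull` is left out), and the branch names `analysis` (merged) and `wip` (not merged) as well as the file names are invented.

```shell
# Throwaway repository (all names are illustrative)
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "start" > README.md
git add README.md
git commit --quiet -m "Initial commit"

# A branch that gets merged into master
git checkout --quiet -b analysis
echo "result" > combined.tab
git add combined.tab
git commit --quiet -m "Add results"
git checkout --quiet master
git merge --quiet --no-edit analysis

# A branch with work that has not been merged yet
git checkout --quiet -b wip
echo "draft" > draft.txt
git add draft.txt
git commit --quiet -m "Work in progress"
git checkout --quiet master

# The clean-up pipeline: only fully merged branches are deleted
git branch --merged | grep -v "\*" | grep -v master | xargs -n 1 git branch -d

# List the local branches that are left
remaining=$(git for-each-ref --format='%(refname:short)' refs/heads | xargs)
echo "$remaining"
```

Note that `grep -v master` also skips any branch whose name merely contains `master`, so such branches are never deleted by this pipeline.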
You can always merge locally and then push to GitHub.
Here we merge `data-collection` into `master`
git checkout master
git merge data-collection
. . .
and then delete the local branch and push everything to GitHub
git branch -d data-collection
git push
. . .
Finally, delete the remote branch as well.
git push origin --delete data-collection
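The local part of this workflow can be rehearsed without GitHub. The following sketch assumes a temporary repository without a remote, so both `git push` steps are omitted; the file names are placeholders.

```shell
# Temporary repository without a remote (file names are illustrative)
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Tutorial" > README.md
git add README.md
git commit --quiet -m "Initial commit"

# Work that lives on the data-collection branch
git checkout --quiet -b data-collection
echo "42" > alice.txt
git add alice.txt
git commit --quiet -m "Added data file"

# Merge it into master, then delete the local branch
git checkout --quiet master
git merge --quiet --no-edit data-collection
git branch --quiet -d data-collection

# master now tracks both files and is the only local branch left
merged_files=$(git ls-files | xargs)
branches=$(git for-each-ref --format='%(refname:short)' refs/heads | xargs)
echo "files: $merged_files"
echo "branches: $branches"
```

`git branch -d` refuses to delete a branch that has not been merged, which makes it a safe default for this step.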
#. Network graph
#. View all changes made by a commit
#. View history of individual files
#. See who made changes to a file
Click on a node in the graph to get details of the commit.
Look at the history of a file, compare two commits to see the changes.
Make sure to show the image diff:
#. Find the commit SHA for the first version of the plot and copy it.
#. Go to the compare view for the repository and set the base to the SHA.
git clone
~ Create a copy of a remote repository.
git add
~ Stage new or changed files for the next commit.
git commit
~ Commit a change set to the local repository.
git push
~ Push committed changes to the remote repository.
git pull
~ Get latest version of files from remote repository and merge them with the local copies.
git status
~ Show status of files in working directory relative to index.
git branch
~ Create a new branch or list existing branches. Can also delete local or remote-tracking branches
(you may want to merge into another branch first).
git checkout
~ Switch to a different branch.
git merge
~ Merge two branches.
git rm
~ Delete files from index and working directory.
git reset
~ Reset the index (and, with `--hard`, the working directory) to a previous commit.
git stash
~ Temporarily undo changes that you don't want to commit immediately.
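Of the commands listed above, `git stash` is the only one not used earlier, so here is a small sketch of how it behaves. It assumes a temporary repository and a made-up file `file.txt`; the identity settings are placeholders.

```shell
# Temporary repository with one committed file (names are illustrative)
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"
echo "v1" > file.txt
git add file.txt
git commit --quiet -m "First version"

# Start a change that is not ready to be committed
echo "unfinished" >> file.txt

# Stash it: the working directory goes back to the last commit
git stash --quiet
during=$(cat file.txt)
echo "while stashed: $during"

# Bring the unfinished change back
git stash pop --quiet
restored=$(tail -n 1 file.txt)
echo "after pop: $restored"
```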
- GitHub for Beginners part 1 and part 2
- Git: Your new best friend
- Git for Scientists
- Interactive online tutorial
- GitHub bootcamp
- Tutorials from Atlassian
- Git documentation including installation instructions
- GitHub GUI for Windows and Mac
- GitHub workflow explained.
- Comparison of Git workflows.
- Detailed description of Git configuration.
If you are working on a Linux machine that doesn't start an `ssh-agent` instance automatically,
you can start one by adding the following code to `.profile`:
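SSH_ENV="$HOME/.ssh/environment"  # assumed location of the agent environment file; adjust as needed
function start_agent {
    echo "Initialising new SSH agent..."
    /usr/bin/ssh-agent -s | sed 's/^echo/#echo/' > "${SSH_ENV}"
    echo succeeded
    chmod 600 "${SSH_ENV}"
    . "${SSH_ENV}" > /dev/null
    # Add the default keys; this is where the passphrase is requested
    /usr/bin/ssh-add
}
# Source SSH settings, if applicable
if [ -f "${SSH_ENV}" ]; then
    . "${SSH_ENV}" > /dev/null
    # Start a new agent if the recorded one is no longer running
    ps -ef | grep ${SSH_AGENT_PID} | grep "$(whoami).*ssh-agent\s" > /dev/null || {
        start_agent
    }
else
    start_agent
fi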
The ssh passphrase then only needs to be entered once when the ssh agent is started.