(Some of this information is borrowed from CME 257, taught by Brad Nelson: https://github.com/icme/cme257-advanced-julia/blob/master/class/class4/class4.md)
What is git? It is a version control system. This enables:
- periodic saving of work (called committing)
- returning to old versions when a problem is introduced
- creation of experimental code branches without disturbing the main or working code
- merging the concurrent work of independent developers
- remote backup and storage of work
- tracking a log of project history
Git is not the only tool used for version control, although it is one of the most popular. Others include
- CVS (Concurrent Versions System)
- SVN (Subversion)
- HG (Mercurial)
- Git homepage: http://git-scm.com/
- Git documentation: http://git-scm.com/doc
- Git Book: http://git-scm.com/book/en/v2
While git handles version control and backing up files, the repository still needs to be hosted somewhere. GitHub is one of the most common hosts for remote git repositories (BitBucket is another). Instead of using either of these services, you could set up a remote repository on a private server. GitHub and BitBucket allow you to work privately (with a paid or student account) or share your projects with the world, like I am doing with these notes.
git clone <url>
- Create a copy of a remote repository on your machine
git init
- Create a new local repository
git add <file>
- Add file to staging; this also begins tracking a new file
git add --patch <file>
- Interactively go through file and choose changes to stage
git reset <file>
- Remove a file from staging
git reset --hard <commit>
- Reset the branch and all tracked files to their state at the given commit, discarding uncommitted changes
git rm <file>
- Delete the file and stop tracking it; git rm --cached <file> stops tracking but keeps the file on disk
git commit <file>
- Save changes to local repository, -a will commit all changes to tracked files, -m "<message>" will add the commit message
git push
- Send committed changes on the current branch to the remote repository
git pull
- Fetch and merge remote changes into your local repository
git status
- List the files you've changed, staged, or not yet tracked
git log
- Show change history
gitk
- Open a GUI that functions as a log and a status
git diff
- Show changes in your working files that have not been staged yet
git branch
- Print available branches
git branch <branch name>
- Create a new branch
git checkout <branch name>
- Switch to branch
git checkout master
- Switch to master branch
git merge <branch name>
- Merge the named branch into the active branch
git mergetool
- Tool to help resolve merge conflicts
git checkout <commit number>
- Revert code back to the state it was at that commit number; this is a temporary change
.gitignore
- File that tells git to ignore certain files
The basic Git workflow is as follows:
- Create a repository by cloning from GitHub or initializing
git clone <url>
- Create a copy of a remote repository on your machine
git init
- Create a new local repository
- Make new files and change old files
- Add new files to staging
git add <file>
- Add file to staging to begin tracking
- Commit changes to files
git commit <file>
- Save changes to local repository, -a will commit all changes to tracked files, -m "<message>" will add the commit message
- Push changes to remote repository
git push
- Send committed changes on the current branch to the remote repository
- Repeat
When other users make changes to the repository, you will need to use git pull to get those changes into your local repository.
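The workflow above can be sketched end to end as a shell script. This is only an illustration under stated assumptions: a local bare repository stands in for a GitHub remote so the script runs offline, and the directory names (alice, bob), the file notes.txt, and the user identities are placeholders that do not come from these notes.

```shell
#!/bin/sh
set -eu

# Scratch area. A local bare repository plays the role of the GitHub remote (assumption).
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
# Pin the branch name to master so the example matches the notes on any git version.
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/master

# Create a repository by cloning (git warns that the remote is still empty).
git clone -q "$work/remote.git" "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice Example"          # placeholder identity
git config user.email "alice@example.com"

# Make a new file and add it to staging.
echo "first draft" > notes.txt
git add notes.txt

# Commit the staged change with a message.
git commit -q -m "Add notes"

# Push. -u records origin/master as the upstream, so later pushes need no arguments.
git push -q -u origin master

# A second user clones the repository.
git clone -q "$work/remote.git" "$work/bob"

# Repeat: change, commit (-a stages edits to tracked files), push.
echo "second line" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q

# The second user fetches and merges the new commit.
cd "$work/bob"
git pull -q
cat notes.txt
```

After the final git pull, both clones point at the same commit and contain the same two-line notes.txt.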
- Checking staged changes
git status
- List the files you've changed
- Checking unstaged differences
git diff
- Show changes in your working files that have not been staged yet
- Checking change history
git log
- Show change history
- A tool for all of this
gitk
- Open a GUI that functions as a log and a status
- You can also edit the .gitignore file to make git ignore certain files or types of files that you don't want to track
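A small sketch of the inspection commands in the list above, run against a throwaway repository. The file names, the ignore patterns, and the user identity are made up for illustration. Note that git diff with no arguments compares your working files with the staging area, and git diff --staged compares the staging area with the last commit.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "Example User"           # placeholder identity
git config user.email "user@example.com"

# Ignore every file ending in .log and everything under build/.
printf '*.log\nbuild/\n' > .gitignore
echo "hello" > notes.txt
echo "noise" > debug.log

git add .gitignore notes.txt
git commit -q -m "Add notes and ignore rules"

# Edit a tracked file without staging it.
echo "world" >> notes.txt

git status --short      # prints " M notes.txt"; debug.log is not listed because it is ignored
git diff                # unstaged edits: working files compared with the staging area
git add notes.txt
git diff --staged       # staged edits: staging area compared with the last commit
git log --oneline       # one line per commit: abbreviated hash and message
```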
One of the things I find most confusing about git is the different kinds of ways you can change a repository.
A clone is a local copy of a repository. Here you can change the code locally as much as you like. If you have access to the repository (if it is your own or you are working with the people that maintain the repository), then you can directly push changes to the code base. If it is a bigger project, like an open source project, you can send a pull request to the maintainers and they will choose whether your changes get added to the main code base. This is the procedure documented above in the basic workflow.
A branch is a temporary version of the code that is usually used to implement a specific feature, which is later merged into the master or main branch. This is good if you are working on multiple features at once and you don't want them to interact with each other during development. This also allows you to accidentally break the code with your new features and still have a working version available without having to revert. Once a feature is completed, it can be merged back into the master branch, from which it can be pushed to the remote repository. The general workflow for branches is:
- Create a new branch
git branch <branch name>
- Create a new branch
- Make changes and commit those changes, like in the basic workflow
- Move between branches to make changes where needed
git checkout <branch name>
- Switch to branch
git checkout master
- Switch to master branch
git branch
- Print available branches
- Merge changes to master branch
git merge <branch name>
- Merges the branch with the active branch
- Resolve merge conflicts
- Git does a reasonable job of combining different branches, but there are often areas where it can't figure out how to merge the changes. In these places it marks the conflicting sections with lines of <<<<<<<, =======, and >>>>>>> to indicate it doesn't know what to do. You can resolve these issues using any text editor or using the mergetool
git mergetool
- Tool to help resolve merge conflicts
- Stage the resolved files and commit to complete the merge
- Push changes to remote repository if desired
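The branch workflow above, including a merge conflict, might look like the following sketch. The branch name feature, the file settings.txt, and the user identity are placeholders chosen for this example. The conflict is forced on purpose by changing the same line on both branches, and it is resolved here by overwriting the file rather than with an editor or git mergetool.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master       # pin the branch name used in these notes
git config user.name "Example User"           # placeholder identity
git config user.email "user@example.com"

echo "color = blue" > settings.txt
git add settings.txt
git commit -q -m "Initial settings"

# Create a new branch, switch to it, and commit a change there.
git branch feature
git checkout -q feature
echo "color = green" > settings.txt
git commit -q -a -m "Use green"

# Back on master, change the same line in a different way.
git checkout -q master
echo "color = red" > settings.txt
git commit -q -a -m "Use red"

# The merge stops with a conflict, so git merge exits with a non-zero status.
git merge feature || echo "merge stopped: conflict in settings.txt"
cat settings.txt        # both versions appear between the conflict markers

# Resolve by writing the final content, then stage the file and commit
# to complete the merge.
echo "color = green" > settings.txt
git add settings.txt
git commit -q -m "Merge feature into master"
git branch              # lists feature and master, with * on the active branch
```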
A fork is a method to copy a repository with the intention of creating a separate project from it. When you create a fork, you get the history from the repository, but a new origin/master is created from the split, so changes will not go back to the original repository. I have worked in places where we created a fork of the original code, made our changes, and then used a pull request to get them back into the original code; however, this is generally not the workflow. The usual way to create a fork is to simply click the fork button on GitHub or whatever hosting site you are using. From there you can download the repository with clone like any other repository.
In conclusion, clones are local copies of the repository that you can work in, branches are versions of the code used for creating new features which are later combined into the original version, and forks are copies of the code designed to split off into a separate project.
When you work in Git, especially on big projects with multiple developers, tracking all of the changes that are made and the state of your code can get confusing. To help with this, each commit has an associated commit number (a hash). This commit number can be used when reverting back to previous versions of the code. git checkout <commit number> does this temporarily and git reset --hard <commit number> will permanently revert back, discarding any uncommitted changes. git log will list all of the past changes and commit messages so that you can see what was done. gitk offers a more detailed view of this that also shows the changes to the files in each commit.
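The difference between the temporary and the permanent way of going back can be seen in a short sketch. The repository, the file notes.txt, and the user identity are placeholders. The commit numbers are read with git rev-parse here so the script is self-contained; when working by hand you would copy them from git log instead.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master       # pin the branch name used in these notes
git config user.name "Example User"           # placeholder identity
git config user.email "user@example.com"

# Three commits, each replacing the content of notes.txt.
for n in 1 2 3; do
    echo "version $n" > notes.txt
    git add notes.txt
    git commit -q -m "Version $n"
done

git log --oneline                   # three commits, newest first

first=$(git rev-parse HEAD~2)       # commit number of "Version 1"
second=$(git rev-parse HEAD~1)      # commit number of "Version 2"

# Temporary: inspect the first commit, then return to master.
git checkout -q "$first"
cat notes.txt                       # version 1
git checkout -q master
cat notes.txt                       # version 3

# Permanent: move master back to the second commit and discard later work.
git reset -q --hard "$second"
cat notes.txt                       # version 2
git log --oneline                   # "Version 3" no longer appears
```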
The most common way to think of a git history is by using a directed acyclic graph. Each commit is a vertex that points to its parent commits, so branches, forks, and merges shape the graph. This is a very powerful way to think about a repository and can often clear up confusion about how to resolve merge conflicts and other issues. Once again, gitk can help with this and show the graph view. We will not go into any more detail about this in this course, but for those that are interested in learning more, here is a link with some information. http://eagain.net/articles/git-for-computer-scientists/
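For a text-only view of the same graph, git log can draw it in the terminal. The sketch below builds a tiny history with one branch and one merge; the branch and file names and the user identity are placeholders for illustration. The last command prints the merge commit followed by its two parents, which are the edges leaving that vertex.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master       # pin the branch name used in these notes
git config user.name "Example User"           # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# Work on a feature branch...
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# ...and, independently, on master.
git checkout -q master
echo "mainline" > main.txt
git add main.txt
git commit -q -m "Mainline work"

# The branches touch different files, so this merge has no conflicts.
git merge -q -m "Merge feature" feature

# Draw the commit graph for all branches.
git log --graph --oneline --all

# The merge commit followed by the hashes of its two parents.
git rev-list --parents -n 1 HEAD
```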