Git Fundamentals
This section covers the basics of setting up the git environment and the most commonly used commands.
Install the latest version of git for your computer
Simply click on the link for your operating system to download the installer
Follow the instructions to install the package
When prompted to choose an editor, choose nano for now.
Optionally install GitHub's desktop GUI client
If you don't already have one, sign up for a free GitHub account
Commands are executed in the terminal window. Examples in this document show the command to be executed after a dollar sign ($) prompt. Do not type the dollar sign when entering commands. Note that some commands include quotes around words or phrases. These quotes are important -- don't skip them. Lines that begin with the hash symbol (#) are comments, for informational purposes only.
Important: git provides helpful information -- pay attention to the git output messages.
There are many git references and tutorials online. Here is the link to GitHub's bootcamp.
Understanding git
Git is a version control system. It keeps track of changes that the developer makes to the files in the repo, and allows these changes to be shared with others, removed at a later time, or re-ordered and combined to package several small changes into one.
A developer causes a change by editing files in the repo, or by adding files to or removing files from the repo. After editing the files, git must be told what the change is so that it can track it. To do so, the developer groups the changes together into a commit. The commit is the basic trackable unit of git, and it is identified by a hash, a string of hexadecimal digits that uniquely identifies that particular change to the repo.
In order to create the commit, the developer will add (or stage) the files and other changes to a staging area, where git keeps track of items that will be included in the next commit. In this way, changes can be grouped together into a batch that is included in a single commit.
All of the behind-the-scenes work performed by git goes on in a special directory in the top directory of the repository: .git. This is the place where changes are staged, and the rest of the repository history is kept. Basically, the state of every file for the life of the repo is instantly available to the developer through git commands that act on the .git directory. (Note that the dot in front of the directory name causes it to be hidden in a normal ls directory listing. Use ls -a to reveal these hidden files.)
Important set-up for the new git user
Username & Email
When a developer commits a change to a repository, the commit is labeled with their user name and email address. It is important to set up your development machine with valid identity information so that your teammates know who is working on what. While you can set up any name/address you like, for the purposes of our team please use your real name and the email address that you used to sign up for GitHub.
You only need to set this up once when you install git on your computer. After that, all repositories that you work in will be aware of your identity. The information that you are entering here will be stored in the .gitconfig file in your user home directory.
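A minimal sketch of this one-time setup, in script form (no $ prompt). The name and email are placeholders. The first line points HOME at a scratch directory only so that the sketch does not overwrite a real ~/.gitconfig; leave it out when configuring your own machine.

```shell
# Assumption: a scratch HOME keeps this demo away from your real ~/.gitconfig.
export HOME="$(mktemp -d)"

# Placeholder identity; use your real name and your GitHub email instead.
git config --global user.name "Jane Developer"
git config --global user.email "jane@example.com"

# Show what git stored in the global configuration.
git config --global --list
```

The quotes around the name matter: without them the shell would pass "Jane" and "Developer" as two separate arguments.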
Default editor
Git brings up a text editor whenever the user is required to enter information about a commit. The default editor that it chooses may not be what the developer likes. It is possible to change the default editor in a similar way to setting the user identity.
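As a sketch, the editor can be set with the core.editor setting. This uses nano as the example editor and, as an assumption for the demo only, a scratch HOME so that a real ~/.gitconfig is left alone.

```shell
# Assumption: scratch HOME so the demo does not modify your real settings.
export HOME="$(mktemp -d)"

# Tell git to open nano whenever it needs a commit message.
git config --global core.editor "nano"

# Read the setting back to confirm it.
git config --global core.editor
```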
Creating and contributing to a repo
Git init
You can create a repo easily on your computer. Doing so will create a private repo, as it is not automatically associated with a server (i.e., GitHub) that will allow it to be shared. You can create these repos wherever you like, but be aware that creating a repo inside another repo can be confusing to work in. Creating and using repos from scratch is a great way to try out git without fear of messing up a shared group repo.
If you are creating a brand new repo, create your working directory first, and switch to that directory before initializing the repo.
Initializing a git repo automatically creates a .git directory to manage tracked files. Deleting this directory will delete all project history. Don't do that unless you really want to destroy your copy of the repository.
At this point, our project tracking is set up and ready to go, but there are no files in the repo. Remember the earlier comment about how helpful git is? Let's take a look at its status after git init. It tells us to use git add to track files we want to add to version control.
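The steps above can be sketched as follows, in script form (no $ prompt). The directory name my_project is hypothetical, and the repo is created under a temporary directory so that it cannot land inside an existing repo.

```shell
# Create the working directory first, then switch into it.
workdir="$(mktemp -d)/my_project"
mkdir -p "$workdir"
cd "$workdir"

# Initialize the repo; this creates the hidden .git directory.
git init

# ls -a reveals .git, which a plain ls would hide.
ls -a

# With no files yet, git status suggests using "git add" to track files.
git status
```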
Adding files
To add some files to the repository, or to track the changes that are made to files that are already in the repo, there is a two-step process: staging and then committing. The first step (the add) copies the file into a special staging area inside the .git directory of the repo. It is possible to add multiple files, one at a time, in groups with wildcards, or even an entire directory of files all at once. After adding files, you can check what has been staged by using the git status command.
With files ready to go, it's time to finalize the change to the repo. To do this you use the git commit command. This command takes the contents of the staging area and bundles all of these new and changed files as a group, and then asks you to describe the change with a commit message. After the change has been described by the developer, it gets added to the current branch history. git status will report that the staging area is empty, but git log will now include the most recent commit message, and the files will be updated with the changes.
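A minimal sketch of the two-step process, in script form. The file name, identity, and commit message are made up for illustration. The two git config lines set a repo-local identity only so that the sketch runs on a machine that has no global identity yet.

```shell
# Scratch repo for the demo.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"      # placeholder, repo-local only
git config user.email "jane@example.com"   # placeholder, repo-local only

# Step 1: create a file and stage it.
echo "hello" > README.md
git add README.md
git status            # README.md is listed under "Changes to be committed"

# Step 2: commit the staged change with a message.
git commit -m "Add README"

git status            # nothing left to commit
git log --oneline     # the new commit appears in the history
```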
Git commit messages
git commit messages are how changes to a repo are documented and described. They are critical for collaboration, and also for simply remembering what work was done before. It is very important that all developers write high-quality commit messages that are brief and to the point, and that follow best practices for formatting. This is where you get to tell others what your change does, how to use the feature, and even what additional work needs to be done to enhance the feature in the future.
What is a bad commit message? Check out examples here.
The commit message format is as follows.
1st line is a summary, 50 characters or less
2nd line is a blank line
3rd line and onward, 80 characters or less per line, describes the change in detail
Read more about writing good commits here.
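As an illustration of the format described above, here is a sketch that writes a multi-line message without opening an editor, by passing it to git commit -F - on standard input. The file, identity, and message text are invented for the example.

```shell
# Scratch repo with one staged file.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"      # placeholder identity
git config user.email "jane@example.com"
echo "agenda" > notes.txt
git add notes.txt

# Line 1: summary (50 characters or less). Line 2: blank. Line 3 on: details.
git commit -F - <<'EOF'
Add notes file for meeting minutes

Create notes.txt to hold the weekly meeting minutes. The file starts
with a single placeholder line; later commits will add real content.
EOF

# Print the full message of the most recent commit.
git log -1 --format=%B
```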
Git log and git show
git log is a great way to explore the contents of a repo. An important thing to note is the git hash, the long string of hexadecimal digits that identifies each unique commit. This hash is important for manipulating individual commits, and also for "going back in time" to examine the project at any previous point in its history. Some handy git log options.
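A few commonly used options, sketched against a scratch repo with two made-up commits (script form, no $ prompt; the identity is a placeholder):

```shell
# Scratch repo with two commits to look at.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "hello" > greeting.txt
git add greeting.txt
git commit -m "Add greeting"
echo "goodbye" >> greeting.txt
git commit -am "Add farewell"

git log --oneline                      # one line per commit: short hash and summary
git log -n 1                           # only the most recent commit
git log --stat                         # which files changed in each commit
git log --oneline --graph --decorate   # history drawn as a graph with branch names
```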
git show displays detailed information on a given commit. As the parameter, you provide the first several characters of the commit hash you want to see.
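For example, in a scratch repo (script form; names are placeholders). Here the short hash is computed with git rev-parse so the sketch can run unattended; in day-to-day use you would copy it from the git log output.

```shell
# Scratch repo with one commit.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "hello" > greeting.txt
git add greeting.txt
git commit -m "Add greeting"

# The first several characters of the commit hash are enough.
short_hash="$(git rev-parse --short HEAD)"

# Show the author, date, message, and diff for that commit.
git show "$short_hash"
```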
Git clone
git clone is the process of creating a copy of a shared remote repository. The command takes a parameter that is the location of the remote repo, and this location can be another git directory on the same machine, or a URL that points to a remote network location.
With GitHub you can simply copy the URL for the repo and execute a git clone command on your system to create a copy.
GitHub provides HTTPS or SSH URLs for cloning. HTTPS is the easiest way to clone a public repository that you don't intend to push changes to. SSH uses the secure shell protocol to create an authenticated two-way link between your computer and GitHub, and requires that you generate an SSH key pair on your machine and add the public key to your GitHub account. You can read more about this here.
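A sketch of cloning, using another git directory on the same machine as the source so that it runs without network access. The GitHub URLs in the trailing comments are hypothetical and only show the two URL shapes.

```shell
# Build a small source repo to clone from.
base="$(mktemp -d)"
git init "$base/source_repo"
cd "$base/source_repo"
git config user.name "Jane Developer"      # placeholder identity
git config user.email "jane@example.com"
echo "hello" > README.md
git add README.md
git commit -m "Add README"

# Clone it into a new directory named "copy".
cd "$base"
git clone "$base/source_repo" copy
git -C copy log --oneline     # the clone carries the full history

# The same command with remote locations (hypothetical URLs):
#   git clone https://github.com/example-user/example-repo.git
#   git clone git@github.com:example-user/example-repo.git
```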
Git fork
GitHub supports the use of the forking workflow. In this workflow, the developer duplicates the main project repo into their own GitHub account workspace, and then clones that duplicated repo onto their computer in order to make changes to the project. In this way, the developer can make all the changes that they want to the clone on their machine, and can also push those changes to their copy of the repo on GitHub, without affecting the main project at all.
Here are examples of forks in my GitHub account.
Once you have a fork in your account, you can git clone it to your system. Any changes you make to your forked repo will not impact the main project repo. When you are satisfied that you have a change that should be included in the main project, create a pull request to allow the team to review the changes and then merge them from your repo into the main team repo. If issues are found, the reviewer can make suggestions for changes to be made before merging.
GitHub pull request
You can contribute your changes (that you pushed to your fork) to the upstream repository by submitting a pull request. You do this on GitHub, as pull requests are not a part of the basic git tool. They are part of a workflow that GitHub provides, to make it easier for groups to work together on the same repository.
The basic idea is that you select the branch with changes in your repo that you want to submit to the main project repo, and also specify the destination branch in the main repo. When you submit the pull request, the managers of the main repo are notified and can review your changes for any issues, and can make comments if there are things that they need you to fix. If they accept the changes, they click a button to merge your changes into the main repository, after which everyone on the team can fetch or pull the changes into their local repositories.
Merging
When a pull request is accepted, the changes from the developer repo are merged into the main repo. It is also possible to merge one branch into another within your own repo, or merge a remote branch on GitHub into your private working branch.
The merge operation is how git reconciles two different histories into one. In the simplest case, a merge operation performs a fast forward, adding the new change to the end of the commit history. The new commits appear at the very top of the git log output, before any of the old, existing commits.
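A fast-forward can be reproduced in a scratch repo, as sketched here (script form; branch, file, and identity names are made up). Because the base branch gains no commits of its own while the feature branch advances, git only has to move the base branch pointer forward.

```shell
# Scratch repo with one commit on the default branch.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "one" > file.txt
git add file.txt
git commit -m "Add file"
base_branch="$(git rev-parse --abbrev-ref HEAD)"   # master or main, depending on setup

# Make a new commit on a feature branch.
git checkout -b feature
echo "two" >> file.txt
git commit -am "Extend file"

# Merge it back: git reports a fast-forward and creates no merge commit.
git checkout "$base_branch"
git merge feature
git log --oneline     # newest commit is listed first
```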
Usually things are not so simple, however. It is common for two or more developers to all be making changes to their copies of the repository at the same time, and the first developer to make a pull request will have no issues with their changes, as they will be a fast forward merge. However, the remaining developers' merges will be trying to add their changes to a history that doesn't match the local repo (as the main repo now contains a new commit). In this case, git tries to take care of everything and creates a merge commit to reconcile any differences between the two histories.
Often a merge commit goes fine. Sometimes there are odd artifacts as a result of two developers working on the same area of functionality. But sometimes git cannot resolve the two histories into one, and throws up a scary looking message about a merge conflict. This can happen if two developers both made changes to the same file, especially if their changes were very close together (or even on the same line of code).
Merge conflicts
In a collaborative environment, merge conflicts are inevitable. A merge conflict occurs when changes in your history are near changes made by another developer, and that other developer merged their changes into the main repo first. Basically it is a conflict that the git tool is not smart enough to fix on its own, as it doesn't know whose changes are more important, or how one set of changes might affect the other.
For example, if one developer changed the variable names in a function to make a formula easier to read, and another developer broke up the confusing formula and put comments in between lines, these changes could not be merged together and git would report a conflict.
When a merge conflict happens, git requires the second developer to decide which changes are valid in the final merge. In the case of this example, the developer can choose to keep the new variable names or the comments, or both.
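The scenario above can be sketched in a scratch repo, with two branches standing in for the two developers (script form; all names and file contents are invented). Both branches edit the same line, so git stops and asks for help.

```shell
# Scratch repo with a one-line "formula" file.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "area = w * h" > formula.txt
git add formula.txt
git commit -m "Add area formula"
base_branch="$(git rev-parse --abbrev-ref HEAD)"

# First developer: rename the variables on a branch.
git checkout -b rename-vars
echo "area = width * height" > formula.txt
git commit -am "Use descriptive variable names"

# Second developer: comment the same line on the base branch.
git checkout "$base_branch"
echo "area = w * h  # rectangle area" > formula.txt
git commit -am "Comment the formula"

# git cannot reconcile the two edits; the merge stops with a conflict.
git merge rename-vars || true
cat formula.txt       # both versions appear between conflict markers

# Resolve by hand (here: keep both the new names and the comment), then commit.
echo "area = width * height  # rectangle area" > formula.txt
git add formula.txt
git commit -m "Merge rename-vars, keeping names and comment"
```

The `|| true` is only there so that a script keeps going after the expected conflict; at an interactive prompt you would simply run git merge and read its message.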
Frequently syncing your local repository with your upstream/main repo helps avoid potential merge conflicts. This requires frequent git pull from the upstream to ensure you have a complete set of changes from other developers before creating your changes. The more out of sync your local repo is from the upstream, the more merge conflicts you can expect to have.
Basic repo configuration
Git remotes
For background on git remotes, see [What is a repo and remotes?](./git_about.md#"What-is-a-repo-and-remotes").
Using these remotes, I can pull updated history from origin, i.e. Spartronics' repo for developershandbook, to stay in sync with other developers on the repo. Using the _binnur remote I can push my updates to my GitHub repo in order to submit a pull request to Spartronics.
Note that your remotes can be named anything. However, we will use the standard 'upstream' to refer to the source, and 'origin' to refer to our fork. It is not wise to give a remote the same name as a branch (i.e., don't name a remote master), or vice versa. It can make git operations very confusing.
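A sketch of the 'upstream' and 'origin' naming convention, using hypothetical GitHub URLs. git remote add only records the URL and does not contact the server, so this runs offline.

```shell
# Scratch repo to attach the remotes to.
repo="$(mktemp -d)" && cd "$repo" && git init

# Hypothetical URLs: origin is your fork, upstream is the main project.
git remote add origin https://github.com/example-user/example-repo.git
git remote add upstream https://github.com/example-team/example-repo.git

# List each remote with its fetch and push URLs.
git remote -v
```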
.gitignore
The .gitignore file lists all files (by name, or via wildcard characters) that should not be tracked by git. This is useful to keep build artifacts or developer-specific configuration files out of git, so that they don't affect the development environment of other contributors. See GitHub's ignoring files for more information.
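A small sketch of .gitignore in action (script form; the patterns and file names are made up). One pattern ignores a whole directory and the other uses a wildcard.

```shell
# Scratch repo.
repo="$(mktemp -d)" && cd "$repo" && git init

# Ignore the build directory and every file ending in .log.
printf 'build/\n*.log\n' > .gitignore

# Create one file that should be tracked and two that should be ignored.
mkdir build
touch build/output.bin debug.log main.c

# Only .gitignore and main.c are reported as untracked.
git status --short

# Ask git directly whether a path is ignored (exit status 0 means yes).
git check-ignore -v debug.log
```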
Git Commands
git help
You can access git help anytime -- and don't forget to pay attention to git messages after git commands are executed! (They are trying to tell you something...)
git init
See git init section for more information.
git status
git status is your friend! It is an overview of the status of the repo.
Looking at the git status, I can tell:
I am on the 'gitintro' branch
My branch is ahead of its remote by 1 commit, meaning other developers following my fork on binnur/gitintro cannot see my most recent change
I have several untracked files and I need to use git add to add them to the repo
git pull vs git fetch && git merge
git pull and git fetch && git merge have a similar outcome but different intents.
git pull pulls the changes from the remote and magically applies them to your local repo. It is a fetch && merge in one: it downloads content from a remote repository and immediately updates the current local branch to match that content.
However, as it is magical, the recommendation is to use fetch and merge so that you understand how your working directory will be updated. This becomes important when you get to the next level of git with branches.
git fetch downloads all related commits, files, etc. from the remote repo to your local repo. It allows you to see what everyone has been working on. As it does not apply the changes to the currently checked out branch of your local repo, git fetch allows you to review changes and decide how to git merge the updates.
git merge will apply the changes to your local repo after git fetch.
The merge process can result in conflicts -- this is basically git's way of saying it cannot automatically decide how to handle conflicts and it needs help to determine which version/entry is the correct one. Read more on handling merge conflicts.
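The fetch, review, then merge sequence can be sketched with two local repos, one standing in for the remote (script form; all names are invented, and no network is needed).

```shell
# A repo that plays the role of the remote.
base="$(mktemp -d)"
git init "$base/remote_repo"
cd "$base/remote_repo"
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -m "Add notes"

# Your local clone of it.
git clone "$base/remote_repo" "$base/local_repo"

# Meanwhile, a teammate adds a commit to the remote.
echo "v2" >> notes.txt
git commit -am "Update notes"

# Back in the clone: download the new commit without applying it.
cd "$base/local_repo"
git fetch origin
branch="$(git rev-parse --abbrev-ref HEAD)"

# Review what arrived, then merge it into the checked-out branch.
git log --oneline "HEAD..origin/$branch"
git merge "origin/$branch"
```

Running git pull origin in the clone would do the last two steps at once, without the chance to review in between.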
What is git checkout? And a branch?
When you perform git clone, git will automatically check out the default branch that is set by the remote repo. git checkout allows the developer to switch between branches or individual commits. This process updates the contents of your working directory to match the state of that branch or commit.
In general, you can create and check out any number of branches. A common pattern is to do your work on a feature branch so that your master branch is production ready at all times. You will always be able to pull or merge upstream changes into your master branch without conflicts because you make your modifications only to the feature branch(es), and push those branches to GitHub to submit pull requests.
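A sketch of working on a feature branch, showing how the working directory changes when you switch back (script form; branch and file names are made up):

```shell
# Scratch repo with one commit on the default branch.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "stable" > app.txt
git add app.txt
git commit -m "Add app"
base_branch="$(git rev-parse --abbrev-ref HEAD)"

# Create a feature branch and switch to it in one step.
git checkout -b my-feature
echo "experimental" >> app.txt
git commit -am "Try a new feature"

# Switch back: the working directory returns to the base branch's state.
git checkout "$base_branch"
cat app.txt       # the experimental line is not here
git branch        # lists both branches, with * on the current one
```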
git add and git commit
git add and git commit are used in combination to 'save' the git repo's current state. git add copies changes from the working directory into the repo's staging area. Changes are not recorded in the history until you git commit.
As discussed in the git status section, git status highlights the status of the repo.
git commit will record the git_intro/git_fundamentals.md file in the local repo
Passing the -m flag allows you to write your commit message without the use of a visual editor. As such:
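A minimal sketch of the -m form in a scratch repo. The file is a stand-in for the page mentioned above, and the message text and identity are made up.

```shell
# Scratch repo with one staged file.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "# Git Fundamentals" > git_fundamentals.md
git add git_fundamentals.md

# The quoted text after -m becomes the commit message; no editor opens.
git commit -m "Add git fundamentals page"

git log -1 --oneline
```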
git push
git push uploads the commits in your local repo to a remote repository. git push takes two arguments:
remote name, ex. origin
branch name, ex. master
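A sketch of a push, with a local bare repo standing in for GitHub so that it runs offline (script form; paths and names are invented). The branch name is read from git rather than hard-coded, since the default may be master or main depending on your setup.

```shell
# A bare repo plays the role of the remote server.
base="$(mktemp -d)"
git init --bare "$base/remote.git"

# Clone it (git warns that the clone is empty) and make one commit.
git clone "$base/remote.git" "$base/work"
cd "$base/work"
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "hello" > README.md
git add README.md
git commit -m "Add README"

# Push: first the remote name, then the branch name.
branch="$(git rev-parse --abbrev-ref HEAD)"
git push origin "$branch"
```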
git reset
Sometimes you mess up, and don't realize it until after committing your changes.
Git makes it easy to undo your commits with the git reset command. git reset moves your branch back to a previous commit, identified by its hash, which you have to find first with git log.
Usage:
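A sketch in a scratch repo with two made-up commits. The hash is computed with git rev-parse so the script can run unattended; normally you would copy it from the git log output.

```shell
# Scratch repo with a good commit followed by a mistaken one.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "one" > file.txt
git add file.txt
git commit -m "First"
echo "two" >> file.txt
git commit -am "Second (a mistake)"

# Find the hash of the commit to return to.
git log --oneline
good_hash="$(git rev-parse --short HEAD~1)"

# Move the branch back to that commit.
git reset "$good_hash"

git log --oneline      # only "First" remains in the history
git status --short     # file.txt is reported as modified: the edit itself is kept
```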
An even easier way to do this is using HEAD. git reset HEAD resets the staging area to the last commit, and git reset HEAD^ moves the branch back to the commit before that. In both cases your edits stay in the working directory.
Passing the --hard flag between reset and the commit hash also discards all changes to tracked files in the working directory. This is dangerous, and should be used with great caution.
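The difference between a plain reset and --hard can be sketched in a scratch repo (script form; names are made up). Only run --hard on work you are willing to lose.

```shell
# Scratch repo with two commits.
repo="$(mktemp -d)" && cd "$repo" && git init
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "one" > file.txt
git add file.txt
git commit -m "First"
echo "two" >> file.txt
git commit -am "Second"

# Step back one commit; the "two" edit survives as an uncommitted change.
git reset "HEAD^"
git status --short

# DANGER: --hard also throws away the uncommitted edit.
git reset --hard HEAD
cat file.txt           # the "two" line is gone
```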