Git
Git is for collaboration. Its like a non-automated version of google-drive sync that is designed to show you every single file change, deletion or addition. It has multiple ‘branches’ which are parallel versions of the repository. You work on a branch and only merge the changes with the main periodically, when you’re confident it works
- git software is the back-end. Its a ubiquitous open source software you have have to download.
- GitHub is a remote server which you can upload or download your repositories to and from and view it online using the git protocol
- git-graph is the visualisation of all the changes to all the branches. PyCharm has functionality to switch the git-branch and upload and download changes built in to the GUI. In VScode, you have to download an extension called git graph to do it properly.
Technical Terms
The following terms have very specific meanings in the context of git, which are essential to understand.
- Checkout - choose a different branch or inspect a previous commit. To move to another branch, you need to commit or cache any changes in the current worktree. Checkout closes all the files from that branch and creates a new folder structure locally inside the repository folder which corresponds to the other branch.
- Clone: download an entire repository from GitHub, including all branches on to your local machine. It creates a folder for the repo and stores all the data in a ‘.git’ folder inside. The folder structure will initially be that of the ‘main’ branch.
- Commit Pointer a.k.a. HEAD: The commit you have checked out.
- Commit - the act of updating the local branch with any changes you have made. You have to attach a message describing the changes to each commit.
- Current Branch - The branch which you currently have checked-out. This is the current folder structure of your local repository directory.
- Dot-file i.e.
.gitignore
or.gitattributes
or.git
: A settings configurations file for git. You can write file names separated by linebreaks or commands into.gitignore
, and the git program will read it and ignore those files. - Feature branch: In ‘git-flow’, a management technique, this means the branch which you are habitually check-outing and edit in order to develop a feature.
- Fetch - Compare the changes aka. ‘commits’ to the remote version of the current branch from online and show in your local git-graph diagram, BUT it doesn’t merge them into your local branches. If changes have been made remotely, it will cause the remote (origin) branch to be different to the local branch and it will show these two branches separately in git-graph.
- Head a.k.a. commit pointer: The commit which you have checked out.
- Index: Git documents sometimes call it the index, sometimes calls it the staging area, and sometimes—rarely these days—calls it the cache. The index / staging-area acts as your proposed next commit. They are frozen copies of the files made with
git add
. If you go to the git log. The location of the index will be shown by the location of HEAD. Files in the index are referenced in commands by the--staged
flag--staged
is a synonym of--cached
. - ‘Main’ or ‘Master’ branch: the ‘production’ version of the code.
- Merge into current branch - Pull the changes from another branch into the current branch. Adds merge
- Origin branch - this represents the version of the a given branch which is online on git-hub aka. remote.
- Pull - download and merge all the changes from the remote version of the current branch into the local version of the current branch, BUT don’t upload your changes onto the remote.
- Pull (Rebase) [should be done from feature to main]- It effectively makes the fork of your current branch begin at the current state of main, rather than where you forked it originally. It is done because when merging changes from a feature branch to main using merge, the feature branch will have an extraneous merge commit every time you need to incorporate changes. Pull (Rebase) to pull from main to feature, should be used prior to merging in order to remove those superfluous markers and make the history far easier to understand by halving the number of logged commits.
- ‘Pull request’ merge feature into main. Normally has to be done on github. Can download externally.
- Push - upload and merge your changes into the remote branch, but don’t download any changes.
- Sync - Push and Pull i.e. synchronise local and remote so they are identical.
- Tree-ish path: means either a branch name, or commit. As opposed to a file-path. Normally. This means commit-ish. Previous commit is HEAD, HEAD^ or HEAD~1, HEAD ^^ or HEAD~2
- Upstream changes: changes made to the remote of the branch which you haven’t yet pulled.
- Worktree: A state of files. Typically, ‘the worktree’ means the file structure you are viewing in your repo.
--
: When this appears in a git command line command it means that everything after it is a path.
Head, Index and Working-Tree
Inside your repo, in the .git
folder, there is a complex file structure which represents an archived copy of every commit. In every branch. Every commit is effectively a save point, so try to commit only at meaningful points to avoid wasteful archiving. Commits are saved by compressing them in ‘blob’ format. Blob is basically a file compression similar to ‘zipping’.
When the repo is small, when you commit, git stores snapshots of each and every file in blob, it does not store diffs from the previous commits. But as the repository grows bigger, it becomes inefficient for Git to store snapshots of every version of every file. So git stores only the most recent version, and all the other files are stored as ‘packs’ which just represent the delta from the most recent.
At any given moment, the entire repository has a single commit checked out, from a single branch. That checked-out commit is known as ‘HEAD’. The files you are seeing in your repository are checkout-out from the HEAD commit and they are known as the ‘working-tree’. Each branch also has an index which represents the proposed next commit. So you can swap branches without loosing staged changes. But un-staged changes are only present in the current working-tree. So you must stage or commit changes before swapping branch. See a summary of the tracked, staged and un-staged files with git status
.
- Un-Staged changes means difference between working tree vs Index and can be viewed with
git diff
. - Staged changes means differences between Index vs head and can be seen with
git diff --staged
.
Confusingly, when you look at GitHub, every file has a commit message next to it. That does not mean that each file is currently saved at a different commit and has a different HEAD. This is just GitHub trying to be useful by showing you the commit when it most most recently updated. All the files are saved in the current commit under head. A commit is always a repository-wide thing but you can erase the changes from the git log with the git reset command..
Just after you have checked-out or committed a HEAD commit, ‘working-tree’ will match the HEAD commit exactly and the index will be empty.
- The index is filled when you stage changes and;
- the working-tree differs from the head commit when you edit any file in the repo (known as un-staged changes).
A summary of the three trees:
- HEAD:
- The pointer to a given commit in the commit history. Almost always at the tip of a branch. HEAD is mostly, synonymous with branch for basic operations
- Represents a frozen snapshot of committed files compressed into the
.git
folder. - To change the HEAD commit you either make a new commit by committing the index with
git commit
, or move to another commit withgit checkout
. - When you move the HEAD commit by committing or checking-out another branch or commit, you get a new working-tree derived from the new HEAD commit. If the HEAD is the tip of a branch, it also has and index saved to it. Otherwise, you have an empty index.
- Index:
- The staging area: a frozen snapshot of files in the proposed next commit compressed into the
.git/index
folder. - Empty unless there are Staged changes. Staged is a synonym of Cached.
- To change the contents of the index, you need to ‘stage’ files with
git add …
. Inspect the list of staged files (a.k.a. the index) with:git ls-files -s
. - See the difference of Index vs HEAD with
git diff --staged
- The staging area: a frozen snapshot of files in the proposed next commit compressed into the
- Working-Tree:
- The visible repository file structure on your PC at any given moment.
- Initially the same as the HEAD commit.
- Any difference with the Index is known as ‘unstaged_changes’..
- To change the contents of the working-tree you just edit any of the files in the repo.
- See the difference of all changed files: working-tree vs Index with
git diff
Use git status
to see a summary of how both index and working directory differ from HEAD.
There is one more thing. The git stash save your un-committed changes in a “stash”. Note: this removes changes from working tree!
Detached HEAD
You may also encounter the detached-HEAD state if you check out an earlier commit in the git log.
- ‘Attached-HEAD’ means HEAD is pointing at the most recent commit in a branch, known as the ‘tip’. Attached-HEAD is the default state and is always the case unless you are messing with commit history.
- ‘Detached-HEAD’ means HEAD is at an earlier commit. Used to inspect or use the files from earlier commits. It is a temporary, auto-named branch. Any changes are thrown away when you move back to the tip. If you make changes in the detached head branch, you have to manually branch away and then merge it back in.
- ‘Enter detached-head’: detach to an earlier commit with
git checkout ###
and; - ‘Attach’ back the tip with
git checkout <branchname>
.
- ‘Enter detached-head’: detach to an earlier commit with
Origin
The ‘origin’ is the remote server version of the repo which you pull and push to. Normally GitHub, GitLab or Bitbucket but you also can have a git repository somewhere else on your local network. That would just be a conventional file path. The remote URL address to this server location is attached when you clone the repository with git clone <URL>
. But, if you change between HTTPS or SSH authentication, you need a different URL.
See the remote URL with:
git remote -v
Remove remote URL with:
git remote remove origin
Add a replacement remote URL:
git remote add origin <URL>
Sometimes you loose the connection between a local and remote branch and will have to use the following command which sets the default remote branch for the current local branch:
git branch --set-upstream-to <remote-branch>
This can equally be represented within a push command as:
git push -u origin <remote-branch>
When you run git fetch
then git log
, unless you have diverged:
- If you are up to date,
origin/<your_branch>
,origin/HEAD
, andHEAD -> <your_branch>
should all be on the tip commit. - If you are ahead, you should see
origin/<your_branch>
andorigin/HEAD
attached to an earlier commit andHEAD -> <your_branch>
at the tip. You can obviously push your code with no conflicts. - If you are behind, you will not see
origin/<your_branch>
andorigin/HEAD
at all. You have to rungit log origin/<your_branch>
to see where you appear. You should pull before committing, to assure there are no conflicts.
If however, you are behind, and you commit. You will diverge from origin. You have to pull before you can push. The git pull
will open a merge mode. The easiest way to deal with this is open the repo folder in VSCode and go to the ‘Source Control’ page. Here you can decide what to keep. The merge will be saved as another commit.
Check-out a new remote branch
git fetch
git branch -r
Copy the name of the branch, without the origin/
part.
git switch <name of branch, without the origin/ part>
Set your user name and user-email
Configure the author name and email address to be used with your commits. When performed from within the repo without the -global
or -system
flag, this user-name and -email update only applies to this repo. Note that Git strips some characters (for example trailing periods) from user.name
. Check if you have these assigned by running:
git config user.name
git config user.email
These are what will be attached to each commit. You can set them for the current repo by adding no flags:
git config user.name sebanalysis
git config user.email sam@example.com"
The user account, by adding the --global
flag:
git config --global user.name sebanalysis
git config --global user.email sam@example.com"
Or the entire PC by adding the --system
flag:
git config --system user.name sebanalysis
git config --system user.email sam@example.com
Stage Changes, Commit and Push
Initially at least, these are the only commands you will need on a regular basis.
git status
See what is staged and if you have any untracked files.git add -A
Add new, modified and deleted everywhere in branch.git commit -m "my commit message"
Commit with a basic message.git push
Upload to GitHub.
You can combine add and commit into one step with:
git commit -am "my message"
Git add also has a few other options which you will need:
git add -A
: New, modified and deleted files everywhere in branchgit add -u
: modified and deleted files everywhere in branch.git add .
: New and modified ( in.file
) in current- and sub-directories (including files whose names begin with a dot)git add *
: Advise against using as not part of git. It is file globbing done by the shell. In summary, it means new and modified files in current- and sub-directories (but without files whose names begin with a dot)
Note on syncing repos with other software like google drive: It is not a good idea because the changes are already made outside git, so when you open it git reads them as local changes. You then have to pull, anyway to overwrite the tracked changes.
Undo with git reset
You will inevitably want to undo your commits but undoing in Git is complicated. Unlike a Google Drive synced My Documents folder, there are three different things you might want to undo to different commits:
- Restore Working-tree: Erase un-staged changes. Change the visible files in the repo root back to a previous commit or index (if present). You can’t delete files here, need to remove from the index.
- Reset Index: Erase staged changes but retain working-tree. Equally could be called un-stage changes.
- Reset Commit history: Remove the archived ‘save-points’ from history back to a previous commit. Equally sometime called cleaning the log when you erase the change or squashing commits when you just want to compress them into a single commit.
There are two primary ways to undo changes in Git; restore and reset:
git restore
- mainly to restore-files in the working-tree
- cannot ever effect the commit history.
- Default flags:
--worktree
(-W
) andHEAD
git reset
- mainly to reset the index and commit history
- only effects the working tree in a hard reset of whole repo
- Default flags:
--mixed
andHEAD
There are technical terms for common combinations of these undo actions:
- ‘Un-Stage-Files‘:__ Is when you delete the index of the present branch.
- ‘Restore-to-Index’ is when you revert just the work-tree to a the index.
- ‘Restore-to-Head’: Delete index and revert working tree back to HEAD.
- A ‘soft-reset’ is when you erase the commit history back to commit ###. This allows you to squash the current commits down.
- A ‘mixed-reset’ is the same as a soft reset, but you also deletes the index.
- A ‘hard-reset’ is when you reset all of these: working-tree, index and commit-history back to the a prior commit.
These terms can be represented as follows:
Mode | Commit History | Index | Worktree |
---|---|---|---|
Un-Stage Files | = | 🔙 | = |
Restore-to-Index | = | = | 🔙 |
Restore-to-Head | = | 🔙 | 🔙 |
Soft-Reset | 🔙 | = | = |
Mixed-Reset | 🔙 | 🔙 | = |
Hard-Reset | 🔙 | 🔙 | 🔙 |
Restore Files to Index or HEAD
Restore is most useful as a way to remove un-staged changes from specific paths back to HEAD. The default flags are HEAD
and --worktree
(equally -W
) .
The --
path separator is always good practice.
Mode | Path |
---|---|
Un-Stage Files | restore -S -- . or reset . |
Restore-to-Index | restore . |
Restore-to-Head | restore -SW -- . |
Reset Repo to any Commit
Reset is essentially a rollback feature for the whole repo that allows you to jump to earlier commits without preserving intermediate history in a single command. The default flags areHEAD
and --mixed
.
But when specifying a given path, it can only apply the --mixed
flag. To hard reset a specific path you need run reset
followed by restore
.
To push changes, to remote, you will need to use git push -f
as you are ‘behind’ remote at the time of the push and need to forcibly and permanently overwrite the remote.
Mode | Path | Whole Repo |
---|---|---|
Soft-Reset | Not possible | reset ## --soft |
Mixed-Reset | reset ## -- . |
reset ## |
Hard-Reset | reset ## -- . ; restore . |
reset ## --hard |
This is a typical workflow to hard-reset the repo to commit hash ##
git log
to find the commit you want to revert to. Copy the hash.git reset ## --hard
git commit -am "my commit message"
git push --force
The next push will need the –force option as you are overwriting the contents on the remote server.
There is also a way to do a hard reset and preserve the commits you removed in the archive for later reference:
git revert --no-commit ###..HEAD
Squashing Commit History
Git reset is also important for combining the list of n most recent commits into a single commit in order to have a clean and understandable archive.
To reset commit-history of whole repo back to commit ###:
git reset ### --soft ; git commit -am "message"
To squash commits in a range run git rebase referencing the commit you want to squash back to:
git rebase -i commit
Clean Repo
Finally you can delete or untrack files:
git clean -d -f
: Delete all un-tracked files and directoriesgit rm -rf --cached ; git add -A
: Un-tracks all files not in.gitignore
.git stash drop
: delete the stash.
Rename a Branch
You cannot re-name a branch. You have to create a totally new branch and force push to overwrite the existing remote branch.1st rename locally. 2nd push new branch. 3rd delete old branch. have to literally create a new branch..
git branch -m <old> <new>
git push origin -u <new>
git push origin --delete origin/<old>
All other users will have to set upstream URL for the branch:
git branch --set-upstream-to <new>
Revert a Commit
Otherwise, if you want to reverse the file diff specific commit and leave all the other commits in the commit history, you can run: git revert ###
. This will add a new commit to the tip, which reverts just that specific change.
Reverting is relatively simple. It involves reversing the changes made in the repos’ history from a specific commit. Not all the changes from any subsequent commits. It can only be done to an entire repo, not specific paths. For example:
git revert ###
git checkout command
Make a branch from your uncommitted changes in a repo:
git checkout -b newbranch
Git checkout is a large and complicated command and it can mean vastly different things with just a small change of syntax. Git checkout has, in effect, two modes of operation. One mode is “safe”: it won’t accidentally destroy any unsaved work. The other mode is “unsafe”: if you use it, and it tells Git to wipe out some unsaved file.
This is not very friendly, so the Git folks finally—after years of users griping—split git checkout
into two new commands called switch and restore. This leads us to:
- Safe:
- Switch:
git switch <branch>
equivalent togit checkout <branch>
: Switch branch or re-attach to tip. - Detach:
git switch --detach ###
equivalent togit checkout ###
: Browse in ‘detached head state’. Inspect the state of the work-tree, at a previous commit, whilst not loosing subsequent commits. It is called a detached-head state. Get back to the tip (the most recent commit) withgit checkout <branch>
).
- Switch:
- Unsafe:
- Restore:
git restore --source=### -SW -- <path>
**: equivalent to:git checkout ### -- <path>
. Restores working-tree to a previous commit. BUT git restore also deletes untracked files at that path, whereas checkout doesn’t.
- Restore:
** Note that the
--
just separates flags and branches from paths. It is good practice to use this in case a branch and path share a name, or a file begins with-…
and git reads it as a flag
One thing both Git checkout and git reset can do is moving between commits. But they differ:
git reset ### <flag>
will navigate back to a previous commit AND delete:- the interceding commits-history (with
--soft
), - the current index (
--mixed
) and; - the current working-tree (
--hard
).
- the interceding commits-history (with
git checkout ###
navigates between views of commits. It does this without affecting the rest of the commit history by viewing a copy of the commit from an auto temporary branch called ‘detached HEAD’. Any changes while in detached head will be to the working-tree and index of that temporary branch. To save changes, you need to branch off withgit branch <your_name>
. It won’t overwrite unless you add the--force
flag to make it do so.
Git Stash
Is a convenient method to save your un-committed (staged and un-staged) changes to disk in a recoverable “stash”. Note: this removes changes from working tree and index!)
git stash
git stash list
(list stashes)
You can see:
stash@{0}: WIP on {branch_name}: {SHA-1 of last commit} {last commit of you branch}
stash@{1}: WIP on master: 085b095c6 modification for test
git stash apply
(to apply stash to working tree in current branch)
git stash apply stash@{12}
(if you will have many stashes you can choose what stash will apply – in this case we apply stash 12
)
git stash drop stash@{0}
(to remove from stash list – in this case stash 0
)
git stash pop stash@{1}
(to apply selected stash and drop it from stash list)
Branch vs Rebase vs Merge
Create a new branch with:
git branch feature
Eventually, you need to merge this branch back with the main. You would do this from the main branch:
git switch main
git merge feature
Merge brings in the commits to the current branch. Its always bringing-in. If the current branch is called ‘main’, this is bringing the commits into main, leaving feature unchanged.
If there are no conflicts, when you merge, all the commits be integrated integrated seamlessly into the main git log.
If there are conflicts, git enters a new mode called ‘merge mode’ in which you need to resolve the conflicts. I prefer to open VSCode at this point, open the repo folder, and look at the staging area. VSCode has a good interface for dealing with conflicts.
Once you have resolved the conflicts, you get a special commit called a merge commit which refers you to the other branch to see the origin of all the changes, giving you a non-linear history.
To avoid this, you have another option. First merge all the commits from main into the feature first, them move all of the feature commits to the tip of main with the rebase command :
git merge main
git switch main
git rebase -i feature
Again, you are brining in the commits from feature to main.
This moves all the commits from the feature branch to the tip of the main branch, and re-brands them all with the timestamp of the rebase, rather than integrating the histories and gives you a linear history.
The golden rule of git rebase
is to never rebase public branch into a private branch. I.e. Never rebase main into a feature you are working on. Only ever rebase your private branch onto to public.
Never do:
❌ git rebase main ❌
Configure
Make the diff colour prettier:
git config --global diff.colorMoved zebra
Always associate with remote of same name:
git config --global push.autoSetupRemote True
Checkout as-is, commit unix style line-endings:
git config --global core.autocrlf input
Errors
Try restarting first, but otherwise:
git config --global core.compression 0
git clone --depth 1 <repo_URI>
git fetch --unshallow
git pull --all