GitHub Collaboration Basics

In this article we will focus on some basic concepts that take you beyond merely backing up your own code and into working within a group and collaborating. We will do this in the context of GitHub, since it is the de facto standard.

Typically, when you work solo in Git, you set up the place you send your code to when you want to back it up. This is called the remote. To back your code up, you push it upstream, which generally speaking is the repo you cloned the code from. Downstream, then, is any repo that integrates the upstream code.

When working on a team, or contributing to a project you like, rather than cloning the author's code and working directly on their source, you can create what's called a fork. A fork is just a copy of the repo. So if a fork is a copy and a clone is a copy, what's the difference?

Fork vs. Clone.

In essence these are two sides of the same coin. The difference comes down to one thing: access. When you clone a repo you can't push your code up unless someone has given you access. Forking essentially copies the repo into your own GitHub account, creating a whole new repo that you control. When you're done, you can submit a pull request to have your code merged back into the original repository.

Keeping the fork in sync

A copy is nice and all, but anything that is more than a basic fix will require you to keep your code in sync with the original. You can do this by adding an upstream remote that points to the original. This is done with:

git remote add upstream https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git

After that you can sync your code with a few commands.

  • git fetch upstream: Downloads the latest commits and branch info from the upstream remote without touching your local branches.
  • git checkout master: Switch to your local version of the master branch.
  • git merge upstream/master: This merges the upstream version of master with your local master, syncing you to the latest changes.

When you fetch from upstream, Git stores the upstream repository's changes in the remote-tracking branch upstream/master. Your local master does not change until you merge.
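The sync steps above can be tried safely in a sandbox. The following is a minimal sketch, not a recipe for a real project: it assumes local throwaway repositories in a temporary directory stand in for GitHub, and the names original, fork.git, and local are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway identity and sandbox directory so nothing real is touched.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)

# "original" stands in for the repository you forked.
git init -q "$work/original"
git -C "$work/original" symbolic-ref HEAD refs/heads/master  # use the branch name from this article
echo "v1" > "$work/original/README"
git -C "$work/original" add README
git -C "$work/original" commit -q -m "Initial commit"

# "fork.git" stands in for your fork on GitHub, "local" for your clone of it.
git clone -q --bare "$work/original" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"
cd "$work/local"
git remote add upstream "$work/original"

# Someone else lands a change in the original repository.
echo "v2" > "$work/original/README"
git -C "$work/original" commit -q -am "Upstream change"

# The three sync steps from the list above.
git fetch -q upstream
git checkout -q master
git merge -q upstream/master

# Optionally publish the synced master to your fork as well.
git push -q origin master

git log --oneline -1
```

With a real fork, the only differences should be that origin and upstream are GitHub URLs instead of local paths.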

The pull request - submitting your fix to the original repo

Now that you have made your fix or implemented your feature, it's time to give your work back to the author. When collaborating in a team, code is usually peer reviewed before being merged into the master branch. In the case of git flow, submitting a pull request for a feature branch will highlight all the changes made in the branch and allow the peer reviewer to make comments visible to all collaborators. You may continue to push commits to the branch even after the pull request has been opened, and your changes will show up in it. Once the code is peer reviewed and accepted, you can merge the branch into master, or into develop if you're using git flow.
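On the command line, the part of this workflow that happens before and after opening the pull request might look like the sketch below. It assumes a local bare repository stands in for your fork on GitHub, and the branch name fix-typo is hypothetical. The pull request itself is opened on GitHub and is not shown.

```shell
#!/bin/sh
set -eu

# Throwaway identity and sandbox directory so nothing real is touched.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)

# Seed repository, a bare "fork.git" (your fork), and your local clone.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "Helo world" > "$work/seed/README"
git -C "$work/seed" add README
git -C "$work/seed" commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"
cd "$work/local"

# Do the work on a feature branch rather than on master.
git checkout -q -b fix-typo
echo "Hello world" > README
git commit -q -am "Fix typo in README"

# Publish the branch; -u remembers where it was pushed.
git push -q -u origin fix-typo

# After review feedback, further commits on the same branch
# are picked up by the open pull request once pushed.
echo "Welcome!" >> README
git commit -q -am "Address review comments"
git push -q

# The commits a pull request from fix-typo into master would contain.
git log --oneline master..fix-typo
```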

Taking it up a level: Git Submodules

A submodule is basically a Git repo within another Git repo. That means when you update your main repo, the submodules will not be changed, and you can add the same submodule to other repositories, allowing you to easily share that component across multiple repositories.

If you use Nx Workspaces, or anything similar, or have ever made edits to a third-party library for use in your app and haven't heard of submodules, then you're in for a treat. When you add a submodule in Git you're not actually adding the code to the repo, but rather a reference that records which commit the submodule should be checked out at. This is what allows you to run git pull and get the latest changes in the main repo without pulling any changes made to the submodule, if any.

Adding a submodule

To add a submodule do:

git submodule add git@github.com:url_to/awesome_submodule.git path_to_awesome_submodule

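After a git status you will notice a new .gitmodules file and the folder where you put your submodule. Any collaborators will need to do a git pull followed by a git submodule update --init to get the submodule, since git submodule init on its own only registers the submodule without downloading it.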
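To see both sides of this, here is a small sandbox sketch. It assumes local repositories stand in for GitHub, and the directory names main and collaborator are made up. The -c protocol.file.allow=always option is only there because recent Git versions refuse to clone submodules from local paths by default; with a real GitHub URL it should not be needed.

```shell
#!/bin/sh
set -eu

# Throwaway identity and sandbox directory so nothing real is touched.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)

# The repository that will be used as a submodule.
git init -q "$work/awesome_submodule"
git -C "$work/awesome_submodule" symbolic-ref HEAD refs/heads/master
echo "lib v1" > "$work/awesome_submodule/lib.txt"
git -C "$work/awesome_submodule" add lib.txt
git -C "$work/awesome_submodule" commit -q -m "Initial library commit"

# The main repository.
git init -q "$work/main"
git -C "$work/main" symbolic-ref HEAD refs/heads/master
echo "app" > "$work/main/app.txt"
git -C "$work/main" add app.txt
git -C "$work/main" commit -q -m "Initial commit"

# Add the submodule, then commit .gitmodules and the submodule folder.
cd "$work/main"
git -c protocol.file.allow=always submodule add "$work/awesome_submodule" path_to_awesome_submodule
git status --short
git commit -q -m "Add awesome_submodule"

# A collaborator clones the main repo; the submodule folder starts out empty.
git clone -q "$work/main" "$work/collaborator"
cd "$work/collaborator"
git -c protocol.file.allow=always submodule update --init

ls path_to_awesome_submodule
```

A collaborator starting from scratch could also pass --recurse-submodules to git clone, which performs the clone and the submodule update in one step.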

Making Changes to submodules and syncing code.

When you make changes to a submodule, you commit and push them from inside the submodule's folder just like normal. To bring the reference to the submodule in the main repo up to date, you then repeat the process in the main repo: add the submodule folder, commit, and push.

If you do a git status in the main repo, you'll see the submodule listed under Changes not staged for commit. This means that your main repo is pointing to an older commit and needs to be updated. If someone else updates a submodule, you can pull the main repo and then bring the submodule to the new commit with git submodule update.
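The two-step process described above, first in the submodule and then in the main repo, can be sketched in a sandbox like this. Local repositories are assumed to stand in for GitHub, the names seed and main are made up, and -c protocol.file.allow=always is only needed because the submodule URL is a local path.

```shell
#!/bin/sh
set -eu

# Throwaway identity and sandbox directory so nothing real is touched.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)

# A bare repository plays the role of the submodule's remote.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "lib v1" > "$work/seed/lib.txt"
git -C "$work/seed" add lib.txt
git -C "$work/seed" commit -q -m "Initial library commit"
git clone -q --bare "$work/seed" "$work/awesome_submodule.git"

# Main repository with the submodule already added and committed.
git init -q "$work/main"
git -C "$work/main" symbolic-ref HEAD refs/heads/master
cd "$work/main"
git -c protocol.file.allow=always submodule add "$work/awesome_submodule.git" path_to_awesome_submodule
git commit -q -m "Add awesome_submodule"

# Step 1: change, commit, and push inside the submodule.
cd "$work/main/path_to_awesome_submodule"
git checkout -q master            # make sure we are on a branch, not a detached HEAD
echo "lib v2" > lib.txt
git commit -q -am "Improve library"
git push -q origin master

# Step 2: the main repo now sees the submodule as modified,
# so record the new commit there as well.
cd "$work/main"
git status --short
git add path_to_awesome_submodule
git commit -q -m "Update awesome_submodule to latest commit"
```

In a real project the last step would be followed by a git push of the main repo so collaborators receive the updated reference.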

If you want, you can make an alias that updates submodules while pulling new code: git config --global alias.update '!git pull && git submodule update --init --recursive'. After that, git update does both steps.

Note that the submodule update command will only update the submodules to the latest commit specified in the main repo.

What that means is that if the main repo hasn't been updated to the latest commit via the method listed above, it will not be aware of any newer commits made on the submodule's own remote. You can use the --remote flag when running the update command to get around this.
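The difference between a plain update and one with --remote is easier to see side by side. This is a sandbox sketch under the same assumptions as before: local repositories stand in for GitHub, the names seed and main are made up, and -c protocol.file.allow=always is only needed for local submodule paths.

```shell
#!/bin/sh
set -eu

# Throwaway identity and sandbox directory so nothing real is touched.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)

# A bare repository plays the role of the submodule's remote.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "lib v1" > "$work/seed/lib.txt"
git -C "$work/seed" add lib.txt
git -C "$work/seed" commit -q -m "Initial library commit"
git clone -q --bare "$work/seed" "$work/awesome_submodule.git"

# Main repository with the submodule pinned at the first library commit.
git init -q "$work/main"
git -C "$work/main" symbolic-ref HEAD refs/heads/master
cd "$work/main"
git -c protocol.file.allow=always submodule add "$work/awesome_submodule.git" path_to_awesome_submodule
git commit -q -m "Add awesome_submodule"

# A new commit lands on the submodule's remote; the main repo is not told.
echo "lib v2" > "$work/seed/lib.txt"
git -C "$work/seed" commit -q -am "Improve library"
git -C "$work/seed" push -q "$work/awesome_submodule.git" master

# Plain update: stays on the commit recorded in the main repo.
git -c protocol.file.allow=always submodule update
before=$(git -C path_to_awesome_submodule rev-parse HEAD)

# With --remote: moves to the latest commit on the submodule's remote.
git -c protocol.file.allow=always submodule update --remote
after=$(git -C path_to_awesome_submodule rev-parse HEAD)

echo "recorded commit: $before"
echo "remote commit:   $after"
```

After an update with --remote, the main repo shows the submodule as modified, so the new reference still has to be added and committed there.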

Mono vs Multi Repos

Put simply, a mono repo is one repo that holds several projects. Keeping a project, a group of related projects, or even all of your projects in a single mono repo is common, for a number of reasons.

More Organized

Having a mono repo keeps things more organized and easier to navigate. You can group projects that are similar in nature or communicate with each other in some way, and have easy access to multiple projects from one place.

Simpler Dependency Management

This probably goes without saying, but with multiple repos, you need to have some way of specifying and versioning dependencies between them. That sounds like it ought to be straightforward, but in practice, most solutions are cumbersome and involve a lot of overhead. With a monorepo, it's easy to have one universal version number for all projects.

Tooling

The simplification of navigation and dependencies makes it much easier to write tools. Instead of having tools that must understand relationships between repositories, as well as the nature of files within repositories, tools basically just need to be able to read files (including some file format that specifies dependencies between units within the repo).

References