GitHub Logo

GitHub Repositories

Overview:

  1. Explore GitHub Repositories

Prerequisites

Concepts

Importance

Notes

What is GitHub?

Necessary

  • Time to learn: 15 minutes


What is a GitHub repository?

GitHub gives the following explanation of a repository:

A repository is usually used to organize a single project. Repositories can contain folders and files, images, videos, spreadsheets, and data sets – anything your project needs. Often, repositories include a README file, a file with information about your project. GitHub makes it easy to add one at the same time you create your new repository. It also offers other common options such as a license file.

In short, it is a collection of files. Each GitHub repository has an owner, which could be an individual or an organization. Repositories can also be set to public or private, determining who can see and interact with it. While a repository can simply store files, GitHub is designed with collaboration in mind. Three key collaborative tools in GitHub are:

  1. Issues: report a bug, plan improvements, or provide feedback to others working on the repository.

  2. Discussions: post ideas or other conversations that are not as specific or actionable as an Issue.

  3. Pull requests: We will go into the specifics later, but a Pull request allows a user to propose a change to any of the files within a repository.

Tip

Typically, a GitHub repository will always include the Issues and Pull requests tabs. Discussions are not enabled by default, but are increasingly prevalent.

What are some examples of repositories?

All of the Python packages covered (e.g. Numpy and Xarray) in this Foundations book have associated GitHub repositories, as well as Python itself:

NumPy GitHub Xarray GitHub Python GitHub

As you can see by the recent timestamps, these repositories are actively changing; this reflects the adaptability of the open-source software ecosystem surrounding Python.

Tip

Notice that each of the three Repositories each exist as part of their own Organization. In other words, the NumPy repository exists within the NumPy organization; the Xarray repo exists within the Pydata org, and so forth.

When you create your own GitHub account, your user ID functions as the organization. Any repositories you create (and therefore, own) will exist within that org.

Another example is this project’s Pythia Foundations repository, on which this tutorial is stored. It is owned by the Project Pythia organization. This organization also owns several other repositories that store the files needed to generate https://projectpythia.org/, among other things.

GitHub’s distributed repositories

Finally, we introduce an important concept that is vital to your understanding when working with GitHub. It is the source of GitHub’s power, as well as much of its complexity. GitHub repositories are distributed; in the general case, there is more than one repository for any project. In fact, repositories can come and go at any time, created and deleted as need dictates. Creating new repositories from existing ones, synchronizing them, and managing them are the topics of later sections. For now, it is only important to understand that for a GitHub-managed project, there is typically one “official” repository, often called the “upstream” repository, and it lives on GitHub.com. There may be any number of copies of the “official” repository, known as forks (or origins, if it is owned by you), that also reside on GitHub.com. Repos that are hosted on GitHub.com are referred to as remotes. In addition to the remotes, there may be one or more copies of the remotes on your desktop or laptop computer that are referred to as locals. A conceptual diagram of the various repos is shown in the image below.

GitHub repositories


Things to try:

  1. Browse the NumPy, Xarray, Python, and Pythia Foundations repos.

  2. Browse the organizations (e.g., Pydata) which house the repos within.

  3. Check out GitHub’s “Create a repo” tutorial to learn how to create your own repository!


Summary

  • GitHub’s Repositories are collections of files.

  • Issues, Discussions, and Pull requests can be used to collaborate within a repository.

  • A GitHub Organization contains Repositories.

What’s Next?

We will further explore Issues and Discussions.

References

  1. GitHub’s quickstart guide