Time to learn: 15 minutes
GitHub gives the following explanation of a repository:
A repository is usually used to organize a single project. Repositories can contain folders and files, images, videos, spreadsheets, and data sets – anything your project needs. Often, repositories include a README file, a file with information about your project. GitHub makes it easy to add one at the same time you create your new repository. It also offers other common options such as a license file.
In short, a repository is a collection of files. Each GitHub repository has an owner, which could be an individual or an organization. Repositories can also be set to public or private, which determines who can see and interact with them. While a repository can simply store files, GitHub is designed with collaboration in mind. Three key collaborative tools in GitHub are:
Issues: report a bug, plan improvements, or provide feedback to others working on the repository.
Discussions: post ideas or other conversations that are not as specific or actionable as an Issue.
Pull requests: We will go into the specifics later, but a Pull request allows a user to propose a change to any of the files within a repository.
A GitHub repository typically includes the Issues and Pull requests tabs. Discussions are not enabled by default, but are increasingly prevalent.
As you can see by the recent timestamps, these repositories are actively changing (and, as mentioned previously, the above screenshots each represent just one specific moment in time); this reflects the adaptability of the open-source software ecosystem surrounding Python.
Notice that each of the three repositories exists as part of its own organization. In other words, the NumPy repository exists within the NumPy organization; the Xarray repo exists within the PyData org, and so forth.
When you create your own GitHub account, your user ID functions as the organization. Any repositories you create (and therefore own) will exist within that org.
Another example is this project’s Pythia Foundations repository, in which this tutorial is stored. It is owned by the Project Pythia organization, which also owns several other repositories that store the files needed to generate https://projectpythia.org/, among other things.
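The owner/repository relationship described above is visible in a repository's web address. The following is a minimal sketch, assuming the common `https://github.com/<owner>/<repo>` URL form; the helper name `split_repo_url` is hypothetical and exists only for illustration.

```python
from urllib.parse import urlparse


def split_repo_url(url):
    """Return (owner, repo) for a URL shaped like https://github.com/<owner>/<repo>.

    Hypothetical helper: it only inspects the URL text and does not contact GitHub.
    """
    # Keep the non-empty path segments, e.g. "/numpy/numpy" -> ["numpy", "numpy"]
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {url}")
    owner, repo = parts[0], parts[1]
    # Clone URLs often end in ".git"; drop the suffix if present
    if repo.endswith(".git"):
        repo = repo[:-4]
    return owner, repo


for url in (
    "https://github.com/numpy/numpy",
    "https://github.com/pydata/xarray",
    "https://github.com/ProjectPythia/pythia-foundations",
):
    owner, repo = split_repo_url(url)
    print(f"{repo} is owned by {owner}")
```

The first path segment is the owner (a user or an organization) and the second is the repository name, which is why two different owners can each have a repository with the same name.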
Finally, we introduce a concept that is vital to working with GitHub. It is the source of both GitHub’s power and much of its complexity. GitHub repositories are distributed; in the general case, there is more than one repository for any project. In fact, repositories can come and go at any time, created and deleted as need dictates. Creating new repositories from existing ones, synchronizing them, and managing them are the topics of later sections.
For now, it is only important to understand that a GitHub-managed project typically has one “official” repository, often called the “upstream” repository, which lives on GitHub.com. There may be any number of copies of the “official” repository, known as forks (or origins, if the fork is owned by you), that also reside on GitHub.com. Repos hosted on GitHub.com are referred to as remotes. In addition to the remotes, there may be one or more copies of the remotes on your desktop or laptop computer; these are referred to as locals. A conceptual diagram of the various repos is shown in the image below.
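The relationship between the remotes can be sketched in a few lines of Python. This is an illustration only, not a GitHub API: the account name `your-username` is a hypothetical placeholder, and the labels `upstream` and `origin` follow the naming used in the paragraph above.

```python
# Hypothetical account name, used only for illustration.
USER = "your-username"

# The remotes a local copy might know about, keyed by their conventional labels.
remotes = {
    "upstream": "https://github.com/ProjectPythia/pythia-foundations",
    "origin": f"https://github.com/{USER}/pythia-foundations",
}


def describe(name, url):
    """Summarize a remote by its label and the owner found in its URL."""
    # The owner is the second-to-last segment of the URL path.
    owner = url.rstrip("/").split("/")[-2]
    role = "official repository" if name == "upstream" else "your fork"
    return f"{name}: {role}, owned by {owner}"


for name, url in remotes.items():
    print(describe(name, url))
```

Both entries point at repositories hosted on GitHub.com, so both are remotes; the local is the copy on your own computer that keeps track of them.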
GitHub’s Repositories are collections of files.
Issues, Discussions, and Pull requests can be used to collaborate within a repository.
A GitHub Organization contains Repositories.