How to Use the GitHub API in Python

Using the GitHub API v3 to search for repositories and users, make a commit, delete a file, and more in Python with the requests and PyGithub libraries.
  · 8 min read · Updated dec 2022 · Application Programming Interfaces

GitHub is a Git repository hosting service that adds many features of its own, such as a web-based graphical interface for managing repositories, access control, wikis, organizations, gists, and more.

As you may already know, GitHub exposes a large amount of data through its API. In addition to using GitHub API v3 in Python, you might also be interested in learning how to use the Google Drive API in Python to automate tasks related to Google Drive. Or perhaps you need to use the Gmail API in Python to automate tasks related to your Gmail account.

In this tutorial, you will learn how to use GitHub API v3 in Python with both the requests and PyGithub libraries.

To get started, let's install the dependencies:

$ pip3 install PyGithub requests

Related: How to Extract YouTube Data using YouTube API in Python.

Getting User Data

GitHub API v3 is pretty straightforward to use: you can make a simple GET request to a specific URL and retrieve the results:

import requests
from pprint import pprint

# github username
username = "x4nth055"
# url to request
url = f"https://api.github.com/users/{username}"
# make the request and return the json
user_data = requests.get(url).json()
# pretty print JSON data
pprint(user_data)

Here I used my account; here is a part of the returned JSON (you can see it in the browser as well):

{'avatar_url': 'https://avatars3.githubusercontent.com/u/37851086?v=4',
 'bio': None,
 'blog': 'https://www.thepythoncode.com',
 'company': None,
 'created_at': '2018-03-27T21:49:04Z',
 'email': None,
 'events_url': 'https://api.github.com/users/x4nth055/events{/privacy}',
 'followers': 93,
 'followers_url': 'https://api.github.com/users/x4nth055/followers',
 'following': 41,
 'following_url': 'https://api.github.com/users/x4nth055/following{/other_user}',
 'gists_url': 'https://api.github.com/users/x4nth055/gists{/gist_id}',
 'gravatar_id': '',
 'hireable': True,
 'html_url': 'https://github.com/x4nth055',
 'id': 37851086,
 'login': 'x4nth055',
 'name': 'Rockikz',
<..SNIPPED..>

That's a lot of data, and pulling out the pieces you need by hand with the requests library alone quickly becomes tedious. This is where PyGithub comes to the rescue.
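For comparison, here is roughly what picking fields by hand looks like when you stay with the raw dictionary. This is only a minimal sketch: summarize_user is a hypothetical helper, and sample is a trimmed copy of the response shown above rather than a live API call.

```python
def summarize_user(user_data, keys=("login", "name", "blog", "followers", "following")):
    """Return only the selected keys from a GitHub user dictionary.

    Keys that are missing from the response map to None instead of
    raising KeyError.
    """
    return {key: user_data.get(key) for key in keys}


# trimmed copy of the JSON shown above (not a live request)
sample = {
    "login": "x4nth055",
    "name": "Rockikz",
    "blog": "https://www.thepythoncode.com",
    "followers": 93,
    "following": 41,
    "bio": None,
}

summary = summarize_user(sample)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With a live response, you would pass requests.get(url).json() instead of sample.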

Related: Webhooks in Python with Flask.

Getting Repositories of a User

Let's get all the public repositories of that user using the PyGithub library we just installed:

import base64
from github import Github
from pprint import pprint

# Github username
username = "x4nth055"
# pygithub object
g = Github()
# get that user by username
user = g.get_user(username)

for repo in user.get_repos():
    print(repo)

Here is my output:

Repository(full_name="x4nth055/aind2-rnn")
Repository(full_name="x4nth055/awesome-algeria")
Repository(full_name="x4nth055/emotion-recognition-using-speech")
Repository(full_name="x4nth055/emotion-recognition-using-text")
Repository(full_name="x4nth055/food-reviews-sentiment-analysis")
Repository(full_name="x4nth055/hrk")
Repository(full_name="x4nth055/lp_simplex")
Repository(full_name="x4nth055/price-prediction")
Repository(full_name="x4nth055/product_recommendation")
Repository(full_name="x4nth055/pythoncode-tutorials")
Repository(full_name="x4nth055/sentiment_analysis_naive_bayes")

Alright, so I made a simple function to extract some useful information from this Repository object:

def print_repo(repo):
    # repository full name
    print("Full name:", repo.full_name)
    # repository description
    print("Description:", repo.description)
    # the date of when the repo was created
    print("Date created:", repo.created_at)
    # the date of the last git push
    print("Date of last push:", repo.pushed_at)
    # home website (if available)
    print("Home Page:", repo.homepage)
    # programming language
    print("Language:", repo.language)
    # number of forks
    print("Number of forks:", repo.forks)
    # number of stars
    print("Number of stars:", repo.stargazers_count)
    print("-"*50)
    # repository content (files & directories)
    print("Contents:")
    for content in repo.get_contents(""):
        print(content)
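    try:
        # repo license (the API returns its text base64-encoded)
        print("License:", base64.b64decode(repo.get_license().content.encode()).decode())
    except Exception:
        # no license found for this repository (or the request failed)
        pass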

The Repository object has a lot of other fields. I suggest you use dir(repo) to find the ones you want to print. Let's iterate over the repositories again and use the function we just wrote:

# iterate over all public repositories
for repo in user.get_repos():
    print_repo(repo)
    print("="*100)

This will print some information about each public repository of this user:

====================================================================================================
Full name: x4nth055/pythoncode-tutorials
Description: The Python Code Tutorials
Date created: 2019-07-29 12:35:40
Date of last push: 2020-04-02 15:12:38
Home Page: https://www.thepythoncode.com
Language: Python
Number of forks: 154
Number of stars: 150
--------------------------------------------------
Contents:
ContentFile(path="LICENSE")
ContentFile(path="README.md")
ContentFile(path="ethical-hacking")
ContentFile(path="general")
ContentFile(path="images")
ContentFile(path="machine-learning")
ContentFile(path="python-standard-library")
ContentFile(path="scapy")
ContentFile(path="web-scraping")
License: MIT License
<..SNIPPED..>

I've truncated the output, as it covers every repository and its information. Notice that we used the repo.get_contents("") method to retrieve the files and folders at the root of each repository. PyGithub parses each entry into a ContentFile object; use dir(content) to see its other useful fields.
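Once the fields are available as attributes, ordinary Python is enough to rank or filter repositories. The sketch below sorts by star count; top_repos is a hypothetical helper, and the SimpleNamespace objects are stand-ins so the example runs without network access.

```python
from types import SimpleNamespace


def top_repos(repos, n=3):
    """Return the n repositories with the most stars, highest first."""
    return sorted(repos, key=lambda repo: repo.stargazers_count, reverse=True)[:n]


# stand-in objects so the sketch runs offline; with PyGithub you would
# pass user.get_repos() instead
fake_repos = [
    SimpleNamespace(full_name="user/a", stargazers_count=5),
    SimpleNamespace(full_name="user/b", stargazers_count=150),
    SimpleNamespace(full_name="user/c", stargazers_count=42),
]

for repo in top_repos(fake_repos, n=2):
    print(repo.full_name, repo.stargazers_count)
```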

Extracting Private Repositories of a Logged-in User

Also, if you have private repositories, you can access them by authenticating with PyGithub. GitHub no longer accepts account passwords for API requests, so use a personal access token (which you can create in your GitHub account settings) instead:

# personal access token (keep it secret)
token = "your_personal_access_token"

# authenticate to github
g = Github(token)
# get the authenticated user
user = g.get_user()
for repo in user.get_repos():
    print_repo(repo)
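Hard-coding credentials in a script makes them easy to leak, so a common alternative is to read a personal access token from an environment variable. The following is a minimal sketch: load_token is a hypothetical helper, and GITHUB_TOKEN is simply the variable name assumed here.

```python
import os


def load_token(env=None, var="GITHUB_TOKEN"):
    """Read a personal access token from an environment mapping.

    GITHUB_TOKEN is just the variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var} environment variable first")
    return token


# usage with PyGithub (not executed here):
#   from github import Github
#   g = Github(load_token())
```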

GitHub also recommends making authenticated requests: unauthenticated access is limited to 60 requests per hour, and PyGithub raises a RateLimitExceededException once you exceed the limit.
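If you are working with requests directly, the remaining quota is reported in the response headers X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset (a Unix timestamp). Here is a small sketch of reading them; parse_rate_limit is a hypothetical helper and the header values are made up for illustration.

```python
from datetime import datetime, timezone


def parse_rate_limit(headers):
    """Pull the rate-limit fields out of a GitHub API response's headers."""
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
        # the reset time is sent as seconds since the Unix epoch (UTC)
        "reset_at": datetime.fromtimestamp(
            int(headers["X-RateLimit-Reset"]), tz=timezone.utc
        ),
    }


# made-up header values for illustration; with requests you would pass
# requests.get(url).headers, which matches header names case-insensitively
sample_headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "57",
    "X-RateLimit-Reset": "1669852800",
}

info = parse_rate_limit(sample_headers)
print(f"{info['remaining']}/{info['limit']} requests left, resets at {info['reset_at']}")
```

Checking the remaining count before a long loop lets a script pause until the reset time instead of failing halfway through.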

Downloading Files in a Repository

You can also download any file from any repository you want. To do that, I'm editing the print_repo() function to look for Python files in a given repository. When one is found, we build an appropriate file name and write the file's content to disk using the content.decoded_content attribute. Here's the edited version of the print_repo() function:

import os

# make a directory to save the Python files (no error if it already exists)
os.makedirs("python-files", exist_ok=True)

def print_repo(repo):
    # repository full name
    print("Full name:", repo.full_name)
    # repository description
    print("Description:", repo.description)
    # the date of when the repo was created
    print("Date created:", repo.created_at)
    # the date of the last git push
    print("Date of last push:", repo.pushed_at)
    # home website (if available)
    print("Home Page:", repo.homepage)
    # programming language
    print("Language:", repo.language)
    # number of forks
    print("Number of forks:", repo.forks)
    # number of stars
    print("Number of stars:", repo.stargazers_count)
    print("-"*50)
    # repository content (files & directories)
    print("Contents:")
    try:
        for content in repo.get_contents(""):
            # check if it's a Python file
            if content.path.endswith(".py"):
                # save the file
                filename = os.path.join("python-files", f"{repo.full_name.replace('/', '-')}-{content.path}")
                with open(filename, "wb") as f:
                    f.write(content.decoded_content)
            print(content)
        # repo license
        print("License:", base64.b64decode(repo.get_license().content.encode()).decode())
    except Exception as e:
        print("Error:", e)

After you run the code again (you can get the complete code of the entire tutorial here), you'll notice a new folder named python-files containing Python files from that user's repositories:

[Image: Downloaded files from a GitHub repository]

Learn also: How to Make a URL Shortener in Python.
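Note that print_repo() only inspects the root of each repository, since get_contents("") lists a single directory. One way to reach nested files is to walk the tree directory by directory, as sketched below. Both iter_files and local_name are hypothetical helpers; they assume each entry exposes the type ("file" or "dir") and path attributes that PyGithub's ContentFile provides.

```python
def iter_files(repo, suffix=".py", path=""):
    """Yield every file under `path` whose name ends with `suffix`.

    `repo` is expected to behave like a PyGithub Repository:
    get_contents(path) returns the entries of a directory, and each
    entry has `type` ("file" or "dir") and `path` attributes.
    """
    pending = list(repo.get_contents(path))
    while pending:
        entry = pending.pop(0)
        if entry.type == "dir":
            # queue the directory's entries for a later iteration
            pending.extend(repo.get_contents(entry.path))
        elif entry.path.endswith(suffix):
            yield entry


def local_name(repo_full_name, content_path):
    """Flatten 'owner/repo' and 'dir/file.py' into a single file name."""
    return f"{repo_full_name}-{content_path}".replace("/", "-")


# usage with a real repository (not executed here):
#   for content in iter_files(repo):
#       filename = os.path.join("python-files", local_name(repo.full_name, content.path))
#       with open(filename, "wb") as f:
#           f.write(content.decoded_content)
```

Keep in mind that every directory visited costs one API request, so large repositories can use up the rate limit quickly.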

Searching for Repositories

The GitHub API is quite rich; you can search for repositories by a specific query just like you do on the website:

# search repositories by name
for repo in g.search_repositories("pythoncode tutorials"):
    # print repository details
    print_repo(repo)

At the time of writing, this returned 9 repositories and their information; the number will change as repositories are created and deleted.

You can also search by programming language or topic:

# search by programming language
for i, repo in enumerate(g.search_repositories("language:python")):
    print_repo(repo)
    print("="*100)
    if i == 9:
        break

To search for a particular topic, pass a query such as "topic:machine-learning" to the search_repositories() method.
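Qualifiers such as language: and topic: can be combined with free-text keywords in a single query string. As a rough sketch, a hypothetical helper like build_search_query below can assemble that string from keyword arguments:

```python
def build_search_query(keywords="", **qualifiers):
    """Join free-text keywords and name:value qualifiers into one query."""
    parts = [keywords] if keywords else []
    parts += [f"{name}:{value}" for name, value in qualifiers.items()]
    return " ".join(parts)


query = build_search_query("tutorials", language="python", topic="machine-learning")
print(query)

# usage with PyGithub (not executed here):
#   for repo in g.search_repositories(query):
#       print_repo(repo)
```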

Read also: How to Extract Wikipedia Data in Python.

Manipulating Files in your Repository

If you're using the authenticated version, you can also create, update and delete files very easily using the API:

# searching for my repository
repo = g.search_repositories("pythoncode tutorials")[0]

# create a file (the API records this as a commit)
repo.create_file("test.txt", "commit message", "content of the file")

# delete that created file
contents = repo.get_contents("test.txt")
repo.delete_file(contents.path, "remove test.txt", contents.sha)

The above code is a simple use case. I searched for a particular repository, added a new file called test.txt with some content in it, and committed it. After that, I fetched that new file and deleted it, which counts as a git commit as well.
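Updating an existing file follows the same pattern through update_file(), which additionally needs the SHA of the current version. Here is a minimal sketch of a "create or update" helper; upsert_file is a hypothetical name, and it assumes the file lives in the repository root and that repo is an authenticated PyGithub Repository.

```python
def upsert_file(repo, path, message, content):
    """Create `path` in the repository root, or update it if it exists.

    Assumes `path` is a top-level file name (no sub-directories) and
    that `repo` behaves like an authenticated PyGithub Repository.
    """
    # index the root directory entries by their path
    existing = {entry.path: entry for entry in repo.get_contents("")}
    if path in existing:
        # updating requires the SHA of the file's current version
        repo.update_file(path, message, content, existing[path].sha)
        return "updated"
    repo.create_file(path, message, content)
    return "created"


# usage (not executed here):
#   upsert_file(repo, "test.txt", "add or refresh test.txt", "content of the file")
```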

And sure enough, after running the create_file() and delete_file() calls, both commits show up in the repository's history:

[Image: GitHub commits]

Conclusion

We have just scratched the surface of the GitHub API. There are many other functions and methods you can use, and we can't cover all of them here. These are some useful ones you can test on your own:

  • g.get_organization(login): Returns an Organization object that represents a GitHub organization.
  • g.get_gist(id): Returns a Gist object representing a gist in GitHub.
  • g.search_code(query): Returns a paginated list of ContentFile objects representing matched files across repositories.
  • g.search_topics(query): Returns a paginated list of Topic objects, each representing a GitHub topic.
  • g.search_commits(query): Returns a paginated list of Commit objects, each representing a commit in GitHub.

There are a lot more; use dir(g) to list the other methods. Check the PyGithub documentation or the GitHub API documentation for detailed information.

Learn also: How to Use Google Custom Search Engine API in Python.

Happy Coding ♥
