GitHub Integration
CellMage provides integration with GitHub through the %github magic command, allowing you to fetch GitHub repositories and pull requests directly into your notebook and use them as context for LLM queries.
Installation
To use the GitHub integration, install CellMage with the GitHub extra dependency:
pip install "cellmage[github]"
This installs the required dependencies, including PyGithub and python-dotenv.
Configuration
The GitHub integration requires a GitHub personal access token. You can set it using environment variables:
# In your terminal
export GITHUB_TOKEN="your_personal_access_token"
# Or in a .env file
GITHUB_TOKEN=your_personal_access_token
To create a GitHub Personal Access Token:
Go to your GitHub account settings → Developer settings → Personal access tokens
Generate a new token with the repo scope (for private repositories) or just public_repo for public repositories
Copy the token and set it as your GITHUB_TOKEN environment variable
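Before running %github, it can help to confirm that the notebook process actually sees a token. The following is a minimal sketch using only the standard library; it is not CellMage's own loading logic (which is not shown here), and the helper names parse_dotenv, find_token, and mask are hypothetical. It assumes a .env file made of simple KEY=VALUE lines and that the process environment should take precedence over the file.

```python
import os
from typing import Dict, Mapping, Optional


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_token(env: Mapping[str, str], dotenv_text: str = "") -> Optional[str]:
    """Assumed precedence: process environment first, then .env contents."""
    return env.get("GITHUB_TOKEN") or parse_dotenv(dotenv_text).get("GITHUB_TOKEN")


def mask(token: str) -> str:
    """Hide all but the last four characters so the value is safe to print."""
    if len(token) <= 4:
        return "*" * len(token)
    return "*" * (len(token) - 4) + token[-4:]


if __name__ == "__main__":
    dotenv_text = ""
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as handle:
            dotenv_text = handle.read()
    token = find_token(os.environ, dotenv_text)
    print("GITHUB_TOKEN:", mask(token) if token else "not found")
```

Printing a masked value rather than the token itself keeps the secret out of saved notebook output.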
Basic Usage
To fetch a specific repository:
%github username/repo
This fetches the repository summary and adds it as a user message in the chat history.
Advanced Usage
Fetching Pull Requests
You can also fetch a specific pull request from a repository:
%github username/repo --pr 123
Command Options
--pr ID: Fetch a specific pull request by ID
--system: Add content as a system message instead of a user message
--show: Only display the content without adding it to the chat history
--clean: Clean the repository content to focus on code (removes non-essential files)
--full-code: Include all code content from the repository (may be very large)
--exclude-dir PATTERN: Exclude directories matching the pattern (can be used multiple times)
--exclude-file PATTERN: Exclude files matching the pattern (can be used multiple times)
--exclude-ext EXT: Exclude files with the specified extension (can be used multiple times)
--exclude-regex PATTERN: Exclude files matching the regex pattern (can be used multiple times)
--contributors-months N: Include contributors from the last N months (default: 6)
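To make the four exclusion options more concrete, here is a rough sketch of how such filters could combine over a list of repository paths. It is an illustration only, not CellMage's implementation: the function keep_path is hypothetical, and the matching rules are assumptions (glob-style matching for --exclude-dir and --exclude-file, exact suffix comparison for --exclude-ext, and a regex search over the full path for --exclude-regex). The page above does not state which matching semantics the command actually uses.

```python
import fnmatch
import re
from pathlib import PurePosixPath
from typing import Sequence


def keep_path(
    path: str,
    exclude_dirs: Sequence[str] = (),
    exclude_files: Sequence[str] = (),
    exclude_exts: Sequence[str] = (),
    exclude_regexes: Sequence[str] = (),
) -> bool:
    """Return True if the path survives every exclusion rule (assumed semantics)."""
    p = PurePosixPath(path)
    # Directory rule: compare each directory component (everything but the file name).
    for part in p.parts[:-1]:
        if any(fnmatch.fnmatchcase(part, pattern) for pattern in exclude_dirs):
            return False
    # File rule: compare only the file name.
    if any(fnmatch.fnmatchcase(p.name, pattern) for pattern in exclude_files):
        return False
    # Extension rule: accept "json" or ".json" by normalising to a leading dot.
    normalised = {ext if ext.startswith(".") else "." + ext for ext in exclude_exts}
    if p.suffix in normalised:
        return False
    # Regex rule: search anywhere in the full path.
    if any(re.search(pattern, path) for pattern in exclude_regexes):
        return False
    return True


paths = [
    "src/app.py",
    "node_modules/lib/index.js",
    "package.json",
    "README.md",
    "tests/test_app.py",
]
kept = [p for p in paths if keep_path(p, exclude_dirs=["node_modules"], exclude_exts=[".json", ".md"])]
print(kept)
```

Under these assumptions, the rules are applied as a logical OR: a file is dropped as soon as any single rule matches it.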
Examples
Fetch a repository and add it to history:
%github username/repo
Fetch a repository and add it as system context:
%github username/repo --system
Just view a repository summary without adding to history:
%github username/repo --show
Fetch a pull request:
%github username/repo --pr 123
View a pull request without adding to history:
%github username/repo --pr 123 --show
Exclude certain directories and file types:
%github username/repo --exclude-dir "node_modules" --exclude-ext ".json" --exclude-ext ".md"
Using GitHub Content with LLM Queries
Once you've fetched GitHub content, you can reference it in your LLM queries:
# First, fetch the repository
%github username/repo
# Then, reference it in your prompt
%%llm
Based on the GitHub repository above, can you explain the project architecture and suggest improvements?
Or with pull requests:
# First, fetch the pull request
%github username/repo --pr 123
# Then, use it as context in your prompt
%%llm
Please review the pull request above and suggest any improvements or issues to address.
Troubleshooting
Authentication Issues
Verify your token is set properly:
import os
print("GITHUB_TOKEN is set:", os.environ.get("GITHUB_TOKEN") is not None)
Check token scope and permissions:
Ensure your token has the required scopes (repo for private repositories, public_repo for public ones)
Verify the token hasn't expired
Regenerate the token if necessary
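The scope check above can also be done programmatically. For classic personal access tokens, GitHub's REST API reports the granted scopes in the X-OAuth-Scopes response header; fine-grained tokens do not return that header, so an empty result is not conclusive. The sketch below works directly against the GitHub API with the standard library and does not go through CellMage; the helper names parse_scopes, can_read, and fetch_scopes are hypothetical.

```python
import urllib.request
from typing import Optional, Set


def parse_scopes(header_value: Optional[str]) -> Set[str]:
    """Split an X-OAuth-Scopes header such as 'repo, read:org' into a set."""
    return {scope.strip() for scope in (header_value or "").split(",") if scope.strip()}


def can_read(scopes: Set[str], private: bool) -> bool:
    """repo covers private and public repositories; public_repo covers public only."""
    if "repo" in scopes:
        return True
    return (not private) and "public_repo" in scopes


def fetch_scopes(token: str) -> Set[str]:
    """Call the authenticated-user endpoint and read the scopes header.

    Raises urllib.error.HTTPError (401) if the token is invalid or expired.
    """
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_scopes(response.headers.get("X-OAuth-Scopes"))
```

In a notebook, can_read(fetch_scopes(os.environ["GITHUB_TOKEN"]), private=True) would indicate whether a classic token is expected to reach private repositories.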
Rate Limiting
GitHub has API rate limits that may affect your usage:
Authenticated rate limits: With a token, you get 5,000 requests per hour
Unauthenticated rate limits: Without a token, only 60 requests per hour
Rate limit errors: If you see 403 Rate Limit Exceeded errors, wait for your rate limit to reset
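To find out how long to wait, you can query GitHub's rate_limit endpoint, which reports the remaining quota and the reset time as a Unix timestamp. This sketch calls the GitHub REST API directly with the standard library rather than going through CellMage; the function names are hypothetical, and it assumes the "core" resource is the quota that matters for repository fetches.

```python
import json
import time
import urllib.request
from typing import Optional, Tuple


def seconds_until_reset(payload: dict, now: Optional[float] = None) -> Tuple[int, int]:
    """Return (remaining requests, seconds until reset) for the core quota."""
    core = payload["resources"]["core"]
    current = time.time() if now is None else now
    # Clamp at zero in case the reset time has already passed.
    return core["remaining"], max(0, int(core["reset"] - current))


def fetch_rate_limit(token: Optional[str] = None) -> dict:
    """GET https://api.github.com/rate_limit, with a token if one is given."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request("https://api.github.com/rate_limit", headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

For example, remaining, wait = seconds_until_reset(fetch_rate_limit(token)) gives the number of requests left and how many seconds remain before the quota resets.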
Repository Access Issues
Private repositories: Ensure your token has the repo scope for accessing private repositories
Organization repositories: You need appropriate organization permissions if accessing org repositories
Repository not found: Check that the repository exists and that you've spelled the name correctly
Large Repository Problems
Timeout errors: For very large repositories, you might experience timeouts
Memory issues: Large repositories may cause memory problems
Solutions:
Use --clean to reduce the amount of data
Use --exclude-dir, --exclude-file, and --exclude-ext to filter content
Avoid --full-code for large repositories
For any persistent issues, you can enable debug logging:
import logging
from cellmage.utils.logging import setup_logging
setup_logging(level=logging.DEBUG)
# The logs will be written to cellmage.log in your working directory