Google Docs Integration
CellMage provides integration with Google Docs through the %gdocs magic command, allowing you to fetch Google Document content directly into your notebook and use it as context for LLM queries.
Installation
To use the Google Docs integration, install CellMage with the gdocs extra:
pip install "cellmage[gdocs]"
This will install the necessary dependencies including:
google-auth
google-auth-oauthlib
google-auth-httplib2
google-api-python-client
Configuration
The Google Docs integration requires OAuth 2.0 credentials or a service account. You can configure it using environment variables:
# OAuth configuration (default)
CELLMAGE_GDOCS_AUTH_TYPE=oauth
CELLMAGE_GDOCS_TOKEN_PATH=~/.cellmage/gdocs_token.pickle
CELLMAGE_GDOCS_CREDENTIALS_PATH=~/.cellmage/gdocs_credentials.json
# Or service account configuration
CELLMAGE_GDOCS_AUTH_TYPE=service_account
CELLMAGE_GDOCS_SERVICE_ACCOUNT_PATH=~/.cellmage/gdocs_service_account.json
# Configure request timeout (default: 300 seconds)
CELLMAGE_GDOCS_REQUEST_TIMEOUT=600
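If you prefer to set these variables from inside a notebook, one option is to assign them in Python before the first %gdocs call. This is only a minimal sketch: it assumes CellMage reads the variables from the process environment when the integration is first used, which this page does not confirm, and it expands ~ explicitly in case the library does not.

```python
import os

# Assumption: CellMage picks these values up from os.environ, so they are
# set before loading the extension or running %gdocs for the first time.
cellmage_dir = os.path.expanduser("~/.cellmage")

gdocs_settings = {
    "CELLMAGE_GDOCS_AUTH_TYPE": "oauth",
    "CELLMAGE_GDOCS_TOKEN_PATH": os.path.join(cellmage_dir, "gdocs_token.pickle"),
    "CELLMAGE_GDOCS_CREDENTIALS_PATH": os.path.join(cellmage_dir, "gdocs_credentials.json"),
    "CELLMAGE_GDOCS_REQUEST_TIMEOUT": "600",  # seconds; environment values are always strings
}
os.environ.update(gdocs_settings)
```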
OAuth 2.0 Authentication
Go to the Google Cloud Console
Create a new project or use an existing one
Enable the Google Docs API and Google Drive API
Create OAuth 2.0 credentials and download the credentials JSON file
Rename it to gdocs_credentials.json and place it in the ~/.cellmage/ directory
The first time you use the integration, a browser window will open to authenticate
Make sure to grant access to both Documents and Drive when authorizing
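Before running the first command, it can help to confirm that the downloaded file is where the steps above expect it and that it is an OAuth client file rather than some other key. The helper below is a hypothetical sanity check and not part of CellMage; it relies only on the fact that Google's OAuth client files wrap their settings in an "installed" or "web" object.

```python
import json
from pathlib import Path


def describe_oauth_client_file(path):
    """Return a short status string for an OAuth client credentials file."""
    file_path = Path(path).expanduser()
    if not file_path.is_file():
        return f"missing: {file_path}"
    try:
        data = json.loads(file_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return "invalid JSON"
    if not isinstance(data, dict):
        return "unexpected JSON structure"
    # Downloaded client files contain an "installed" (desktop app) or
    # "web" (web app) object that holds the client_id.
    for client_type in ("installed", "web"):
        section = data.get(client_type)
        if isinstance(section, dict) and "client_id" in section:
            return f"ok: {client_type} client"
    if data.get("type") == "service_account":
        return "this is a service account key, not an OAuth client file"
    return "unrecognized credentials format"
```

Calling describe_oauth_client_file("~/.cellmage/gdocs_credentials.json") should report an "ok" status once the file is in place.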
Service Account Authentication
Go to the Google Cloud Console
Create a new project or use an existing one
Enable the Google Docs API and Google Drive API
Create a Service Account and download the JSON key file
Rename it to gdocs_service_account.json and place it in the ~/.cellmage/ directory
Share your Google Documents with the service account email address
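The address to share documents with is stored in the key file itself, in the client_email field. As a small sketch (the function name is hypothetical and not a CellMage API), it can be read like this:

```python
import json
from pathlib import Path


def service_account_email(path):
    """Return the client_email stored in a service account key file."""
    data = json.loads(Path(path).expanduser().read_text(encoding="utf-8"))
    # Service account keys identify themselves with type == "service_account".
    if not isinstance(data, dict) or data.get("type") != "service_account":
        raise ValueError("not a service account key file")
    return data["client_email"]
```

For example, print(service_account_email("~/.cellmage/gdocs_service_account.json")) shows the address to paste into the Share dialog of each document.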
Required Scopes
By default, CellMage uses the following scopes:
https://www.googleapis.com/auth/documents.readonly - For reading documents
https://www.googleapis.com/auth/drive.readonly - For searching and listing documents
You can customize these with the CELLMAGE_GDOCS_SCOPES environment variable:
# Optional: Override the default scopes (comma-separated)
CELLMAGE_GDOCS_SCOPES=https://www.googleapis.com/auth/documents.readonly,https://www.googleapis.com/auth/drive.readonly
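To make the comma-separated format concrete, here is one way such a value could be turned into a list of scopes. How CellMage actually parses the variable is not documented here, so treat parse_scopes as an illustrative assumption rather than the library's behavior.

```python
import os

DEFAULT_SCOPES = [
    "https://www.googleapis.com/auth/documents.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]


def parse_scopes(raw):
    """Split a comma-separated scope string, falling back to the defaults."""
    if not raw:
        return list(DEFAULT_SCOPES)
    # Tolerate stray whitespace and trailing commas.
    scopes = [part.strip() for part in raw.split(",") if part.strip()]
    return scopes or list(DEFAULT_SCOPES)


scopes = parse_scopes(os.environ.get("CELLMAGE_GDOCS_SCOPES"))
```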
Basic Usage
To fetch a specific Google Document by ID:
%gdocs your_google_doc_id
To fetch a document using its URL:
%gdocs https://docs.google.com/document/d/YOUR_DOC_ID/edit
This fetches the document content and adds it as a user message in the chat history.
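Both forms identify the same document, because the ID is the path segment that follows /document/d/ in the URL. The sketch below shows one way to normalize either input to an ID; it is an illustration only and may differ from how CellMage resolves its argument.

```python
import re

# Matches the ID segment of a Google Docs URL such as
# https://docs.google.com/document/d/<ID>/edit
_DOC_URL = re.compile(r"docs\.google\.com/document/d/([A-Za-z0-9_-]+)")


def extract_doc_id(id_or_url):
    """Return the document ID from a Docs URL, or the input if it is already an ID."""
    match = _DOC_URL.search(id_or_url)
    return match.group(1) if match else id_or_url.strip()
```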
Advanced Usage
Searching for Documents
You can search for Google Docs documents containing specific terms:
%gdocs --search "project documentation"
This returns a table of matching documents with their metadata.
To customize the number of search results:
%gdocs --search "project documentation" --max-results 20
Fetching Document Content from Search Results
To fetch and display content from the top search results:
%gdocs --search "project documentation" --content
By default, this fetches content for the top 3 documents. You can customize this:
%gdocs --search "project documentation" --content --max-content 5
Filtering Search Results
You can filter search results by various criteria:
# Filter by author/owner
%gdocs --search "project documentation" --author "user@example.com"
# Filter by creation date (supports natural language)
%gdocs --search "project documentation" --created-after "3 days ago"
%gdocs --search "project documentation" --created-before "2023-12-31"
# Filter by modification date
%gdocs --search "project documentation" --modified-after "last week"
%gdocs --search "project documentation" --modified-before "2023-12-31"
# Sort results (options: relevance, modifiedTime, createdTime, name)
%gdocs --search "project documentation" --order-by "modifiedTime"
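For readers curious how filters like these can map onto the Google Drive API, the sketch below builds a Drive v3 search query string from the same kinds of inputs. The query fields used (mimeType, fullText, owners, createdTime, modifiedTime, trashed) are Drive search terms, but build_drive_query is a hypothetical helper: the query CellMage actually sends is not documented here, and natural-language dates such as "3 days ago" are assumed to have been converted to datetime objects beforehand.

```python
from datetime import datetime, timezone

GOOGLE_DOC_MIME = "application/vnd.google-apps.document"


def _quote(value):
    # Drive query strings escape backslashes and single quotes with a backslash.
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def _timestamp(moment):
    # Drive compares RFC 3339 timestamps; naive datetimes are treated as UTC here.
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


def build_drive_query(term, author=None, created_after=None, created_before=None,
                      modified_after=None, modified_before=None):
    """Assemble a Drive v3 'q' string restricted to Google Docs files."""
    clauses = [
        f"mimeType = {_quote(GOOGLE_DOC_MIME)}",
        f"fullText contains {_quote(term)}",
        "trashed = false",
    ]
    if author:
        clauses.append(f"{_quote(author)} in owners")
    date_filters = (
        ("createdTime", ">", created_after),
        ("createdTime", "<", created_before),
        ("modifiedTime", ">", modified_after),
        ("modifiedTime", "<", modified_before),
    )
    for field, operator, moment in date_filters:
        if moment is not None:
            clauses.append(f"{field} {operator} {_quote(_timestamp(moment))}")
    return " and ".join(clauses)


query = build_drive_query(
    "project documentation",
    author="user@example.com",
    modified_after=datetime(2023, 12, 1),
)
```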
Handling Timeouts
When dealing with large documents or many documents in parallel, you might encounter timeout issues. You can customize the timeout duration:
# Increase timeout to 10 minutes (600 seconds) for a large document search
%gdocs --search "project documentation" --content --max-content 10 --timeout 600
This is especially useful when:
Fetching large documents
Retrieving content from many documents in parallel
Experiencing connectivity issues
The default timeout is 300 seconds (5 minutes), which is sufficient for most operations. For very large operations, consider using a timeout of 600-900 seconds.
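To illustrate why parallel retrieval interacts with timeouts, the sketch below runs several fetch functions concurrently and gives up on any that miss a shared deadline. It is a generic standard-library pattern, not CellMage's implementation; the fetcher names and the fetch_with_deadline helper are made up for the example, and the cancel_futures argument requires Python 3.9 or later.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fetch_with_deadline(fetchers, timeout):
    """Run zero-argument callables in parallel.

    Returns (results, timed_out): results maps names to return values for
    tasks that finished in time, timed_out lists the names that did not.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(fetchers)))
    futures = {pool.submit(fn): name for name, fn in fetchers.items()}
    done, not_done = wait(futures, timeout=timeout)
    # Do not block on stragglers: queued work is cancelled, running work
    # is left to finish in the background and its result is ignored.
    pool.shutdown(wait=False, cancel_futures=True)
    results = {futures[f]: f.result() for f in done}
    timed_out = sorted(futures[f] for f in not_done)
    return results, timed_out


def fast_fetch():
    return "small document body"


def slow_fetch():
    time.sleep(0.5)  # stands in for a very large document
    return "large document body"


results, timed_out = fetch_with_deadline(
    {"small-doc": fast_fetch, "large-doc": slow_fetch}, timeout=0.1
)
```

With a longer deadline both documents would be returned, which is the effect a higher timeout value is meant to achieve.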
Authentication Options
You can specify the authentication type for a specific command:
%gdocs your_google_doc_id --auth-type service_account
System Context
To add the document as system context instead of a user message:
%gdocs your_google_doc_id --system
Display Only
To only display the document content without adding it to chat history:
%gdocs your_google_doc_id --show
Command Options
| Option | Description |
|---|---|
| --system | Add as system message instead of user message |
| --show | Only display the content without adding to chat history |
| --auth-type | Authentication type to use (oauth or service_account) |
| --search | Search for Google Docs files containing the specified term |
| --content | Retrieve and display content for search results |
| --max-results | Maximum number of search results to return (default: 10) |
| --max-content | Maximum number of documents to retrieve content for (default: 3) |
| --timeout | Request timeout in seconds (default: 300) |
| --author | Filter documents by author/owner email |
| --created-after | Filter documents created after this date (YYYY-MM-DD or natural language) |
| --created-before | Filter documents created before this date |
| --modified-after | Filter documents modified after this date |
| --modified-before | Filter documents modified before this date |
| --order-by | How to order search results (relevance, modifiedTime, createdTime, or name) |
Using Google Docs Content with LLM Queries
After fetching a Google Document, you can directly reference it in your LLM prompts:
# First, fetch the document content
%gdocs https://docs.google.com/document/d/YOUR_DOC_ID/edit
# Then, in a new cell, reference it in your prompt
%%llm
Based on the Google Document above, summarize the key points and provide actionable insights.
Troubleshooting
Authentication Issues
OAuth Error: If OAuth authentication fails, check that your credentials file is valid and that both the Google Docs API and the Google Drive API are enabled in your project.
Service Account Error: If using a service account, ensure the document is shared with the service account email address.
Token Refresh Error: If your token has expired or cannot be refreshed, delete the token file (the path set by CELLMAGE_GDOCS_TOKEN_PATH) and run the command again to re-authenticate.
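Removing the cached token can also be done from the notebook. The helper below is a small sketch rather than a CellMage function: it looks up CELLMAGE_GDOCS_TOKEN_PATH and otherwise assumes the token lives at the path shown in the configuration example.

```python
import os
from pathlib import Path

# Assumed fallback: the path shown in the configuration example on this page.
DEFAULT_TOKEN_PATH = "~/.cellmage/gdocs_token.pickle"


def remove_token_file(path=None):
    """Delete the cached OAuth token; return True if a file was removed."""
    raw = path or os.environ.get("CELLMAGE_GDOCS_TOKEN_PATH", DEFAULT_TOKEN_PATH)
    token_file = Path(raw).expanduser()
    if token_file.is_file():
        token_file.unlink()
        return True
    return False
```

After remove_token_file() returns True, the next %gdocs command should open the browser flow again.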
Access Permission Issues
Document Not Found: Ensure the document exists and you have access to it.
Permission Denied: Ensure you have at least read access to the document.
Connection Problems
API Rate Limits: Google APIs enforce rate limits. If you hit them, wait a few minutes before retrying.
Network Issues: Check your internet connection.
For persistent issues, examine the CellMage log:
import logging
from cellmage.utils.logging import setup_logging
setup_logging(level=logging.DEBUG)
# The logs will be written to cellmage.log in your working directory