๐ผ๏ธ Image Integrationยถ
CellMage helps you work with images directly in your notebooks by providing the %img magic command, which processes images for optimal use with Large Language Models (LLMs). This integration allows you to display images inline, resize them, convert formats, and add them to the conversation context for visual analysis by LLMs.
Using the Image Magic Commandยถ
The %img magic command provides a simple interface for image processing:
%img path/to/image.jpg [options]
Key Featuresยถ
Automatic Format Conversion: Converts images to LLM-compatible formats
Resizing: Optimize images for better LLM processing
Quality Control: Adjust compression levels for lossy formats
Metadata Display: View image details including dimensions and format
LLM Context Integration: Add images to your conversation history for visual analysis
Command Optionsยถ
Option |
Description |
|---|---|
|
Path to the image file to process |
|
Width to resize the image to (maintains aspect ratio) |
|
Quality for lossy image formats (0.0-1.0) |
|
Display the image inline after processing |
|
Display information about the image |
|
Add the image to the current chat session (default: always added) |
|
Force conversion to a compatible format |
|
Format to convert the image to (e.g., โjpgโ, โpngโ, โwebpโ) |
Examplesยถ
Basic Usage: Add Image to Contextยถ
Simply process an image and add it to the LLM context without displaying it:
%img path/to/your/image.jpg
# Output: โ
image.jpg processed and added to conversation history.
Display Image with Informationยถ
Process the image, display it inline, and show detailed metadata:
%img path/to/your/image.jpg --show --info
# Displays the image and shows metadata like dimensions, format, and file size
Resize and Convertยถ
Resize an image to a specific width and convert it to a different format:
%img path/to/your/image.png --resize 800 --format webp --quality 0.85 --show
# Resizes to 800px width, converts to WebP format with 85% quality, and displays
Image Processing Utilitiesยถ
Under the hood, CellMageโs image integration uses utility functions from cellmage.integrations.image_utils module, which provides:
Format Detection: Automatically identify image formats
Format Conversion: Convert between different image formats
Resizing: Intelligently resize images while maintaining aspect ratios
Base64 Encoding: Encode images for API compatibility
Metadata Extraction: Get detailed information about images
For more details about the image utilities, see Image Utilities.
Requirementsยถ
Image processing requires the Pillow library:
pip install pillow
Technical Implementationยถ
The image magic integration consists of two main components:
ImageMagicsClass: Defined incellmage.magic_commands.tools.image_magic, this implements the IPython magic command and handles user interaction.ImageProcessorClass: Found incellmage.integrations.image_utils, this handles the core image processing tasks.
For details about the technical implementation of the image magic integration, see the Image Magic Integration page.
Using Images with LLMsยถ
When you process an image with the %img command, it is automatically:
Processed for optimal size and format
Added to your conversation history
Included in the context for future LLM queries
This allows the LLM to โseeโ and analyze the image in subsequent interactions, enabling visual analysis, description, and question-answering about image content.
Example Workflowยถ
# First, process and add an image to context
%img path/to/chart.png --resize 1024 --show
# Now ask the LLM about the image
%%llm
What trends can you identify in this chart? What might explain the spike in February?
Configurationยถ
Image processing behavior can be customized through CellMageโs configuration options:
image_default_width: Default width for image resizing (if not specified)image_default_quality: Default quality setting for lossy formats (0.0-1.0)image_target_format: Default format for conversion when neededimage_formats_llm_compatible: List of formats considered compatible with LLMs
These settings can be adjusted in your .env file or through environment variables.