🖼️ Image Magic Integration

The Image Magic integration provides an easy way to process, display, and incorporate images into your LLM conversations within IPython notebooks. This feature is implemented through the %img magic command and relies on the ImageProcessor class and utility functions from the cellmage.utils.image_utils module.

Features

  • Display images directly in your notebook

  • Resize images while maintaining aspect ratios

  • Convert image formats for optimal LLM compatibility

  • Adjust image quality for lossy formats

  • View detailed image metadata

  • Add images to LLM context automatically

  • Encode images in Base64 for LLM API compatibility

Installation Requirements

The Image Magic integration requires the Pillow library:

pip install pillow

Using the %img Magic Command

Basic Usage

%img path/to/image.jpg

This command processes the image and adds it to your conversation history/context, allowing the LLM to “see” and analyze the image in subsequent queries.

Command Syntax

%img image_path [options]

Command Options

Option                 Description
image_path             Path to the image file to process
-r, --resize WIDTH     Width to resize the image to (maintains aspect ratio)
-q, --quality VALUE    Quality for lossy image formats (0.0-1.0)
--show                 Display the image inline after processing
-i, --info             Display information about the image
-a, --add-to-chat      Add the image to the current chat session (default: always added)
-c, --convert          Force conversion to a compatible format
-f, --format FORMAT    Format to convert the image to (e.g., “jpg”, “png”, “webp”)

Examples

Display an Image with Information

%img path/to/image.jpg --show --info

This command will display the image inline and show detailed metadata including dimensions, format, and file size.

Resize and Convert an Image

%img path/to/image.png --resize 800 --format webp --quality 0.85 --show

This resizes the image to 800px width (maintaining aspect ratio), converts it to WebP format with 85% quality, and displays it in the notebook.
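The arithmetic behind this example can be sketched in a few lines. This is a standalone illustration, not CellMage's code: the helper names are hypothetical, and the mapping of the 0.0-1.0 quality value onto an integer percentage is an assumption about how such a value would typically be passed to an image encoder.

```python
def resized_dimensions(orig_width: int, orig_height: int, target_width: int) -> tuple:
    """Scale the height by the same factor as the width to keep the aspect ratio."""
    if min(orig_width, orig_height, target_width) <= 0:
        raise ValueError("dimensions must be positive")
    scale = target_width / orig_width
    return target_width, max(1, round(orig_height * scale))


def to_percent_quality(quality: float) -> int:
    """Map a 0.0-1.0 quality value to a 1-100 integer (assumed mapping)."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be between 0.0 and 1.0")
    return max(1, round(quality * 100))


print(resized_dimensions(1600, 1200, 800))  # (800, 600)
print(to_percent_quality(0.85))             # 85
```

For a 1600x1200 source, --resize 800 would therefore yield an 800x600 image under these assumptions.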

Add Multiple Images to Context

%img image1.jpg
%img image2.png --resize 1024
%img image3.webp --info

# Then, in a new cell, ask the LLM about the images (%%llm must be the first line of its cell)
%%llm
Compare the three images I just shared. What are the key differences?

How It Works

The %img magic command is powered by the ImageProcessor class from cellmage.utils.image_utils. When you process an image:

  1. The image is loaded using Pillow (PIL)

  2. It’s optionally resized and/or converted to a compatible format

  3. Basic metadata is extracted and can be displayed

  4. The processed image is encoded in base64 format

  5. The image is added to your conversation history/context for LLM analysis

  6. If requested with --show, it’s displayed inline in the notebook
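Steps 1 to 4 above can be sketched with Pillow and the standard library. This is a minimal illustration under stated assumptions, not the ImageProcessor implementation: the function names, the returned dictionary keys, the downscale-only rule, and the use of a data URL (rather than a bare Base64 string) are all hypothetical.

```python
import base64
import io
from typing import Optional


def encode_for_llm(data: bytes, mime_type: str) -> str:
    """Step 4: wrap raw image bytes in a Base64 data URL (assumed output shape)."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


def process_image(path: str, width: Optional[int] = None,
                  fmt: str = "PNG", quality: float = 0.85) -> dict:
    """Illustrative version of steps 1-4; not CellMage's actual code."""
    from PIL import Image  # lazy import so encode_for_llm works without Pillow

    with Image.open(path) as src:                      # step 1: load with Pillow
        original_size = src.size
        img = src
        if width and width < src.width:                # step 2: downscale only (assumption)
            height = max(1, round(src.height * width / src.width))
            img = src.resize((width, height))
        if fmt.upper() == "JPEG" and img.mode != "RGB":
            img = img.convert("RGB")                   # JPEG cannot store an alpha channel
        buffer = io.BytesIO()
        # Encoders that have no quality setting (e.g. PNG) ignore the parameter.
        img.save(buffer, format=fmt, quality=round(quality * 100))
        mime_type = f"image/{fmt.lower()}"
        return {
            "original_size": original_size,            # step 3: basic metadata
            "size": img.size,
            "mime_type": mime_type,
            "data_url": encode_for_llm(buffer.getvalue(), mime_type),
        }


print(encode_for_llm(b"abc", "image/png"))  # data:image/png;base64,YWJj
```

Steps 5 and 6 depend on the chat manager and the notebook display machinery, so they are left out of this sketch.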

Implementation Details

  • The ImageMagics class in cellmage/integrations/image_magic.py implements the IPython magic command

  • The ImageProcessor class in cellmage/utils/image_utils.py handles the image processing

  • Format conversion prioritizes LLM-compatible formats

  • Default values (width, quality, target format) come from the configuration settings listed under Customizing Image Processing

ImageMagics Class Architecture

The ImageMagics class extends BaseMagics and registers the img line magic. In outline:

@magics_class
class ImageMagics(BaseMagics):
    """IPython magic commands for displaying and processing images."""

    def __init__(self, shell):
        super().__init__(shell)
        self._image_processor = get_image_processor() if is_image_processing_available() else None

    @magic_arguments()
    @argument(...)  # Arguments definition
    @line_magic
    def img(self, line):
        """Process an image for LLM context and optionally display it."""
        # Implementation

Error Handling

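The implementation wraps image processing in a try/except block. Any exception is logged with its traceback and returned to the user as an error message:

try:
    # Image processing logic
except Exception as e:
    # exc_info=True attaches the full traceback to the log record
    logger.error(f"Error processing image: {e}", exc_info=True)
    return f"Error processing image: {e}"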

Integration with Chat Manager

When an image is processed, it’s automatically added to the conversation context:

chat_manager = self._get_chat_manager()
if chat_manager and hasattr(chat_manager, "conversation_manager"):
    llm_image = format_image_for_llm(image_data, mime_type, metadata)
    msg = Message(
        role="user",
        content="[Image sent]",
        metadata={"source": image_path, "llm_image": llm_image, **metadata},
    )
    chat_manager.conversation_manager.add_message(msg)

Customizing Image Processing

Configuration Settings

Image processing behavior can be customized through settings in cellmage.config:

  • image_default_width: Default width to resize images to

  • image_default_quality: Default quality for lossy formats (0.0-1.0)

  • image_target_format: Default format for conversion

  • image_formats_llm_compatible: List of formats compatible with LLMs

You can override these settings in your CellMage configuration.

Extending the Integration

To extend or customize image processing:

  1. Subclass ImageProcessor: Create a custom processor with additional capabilities

from cellmage.utils.image_utils import ImageProcessor

class MyCustomImageProcessor(ImageProcessor):
    def process_image(self, image_path, **kwargs):
        # Custom preprocessing
        # ...
        return super().process_image(image_path, **kwargs)

  2. Register a custom processor factory:

# Replace the default processor with your custom one
import cellmage.utils.image_utils as utils

# Keep a reference so the default factory can be restored later
original_get_processor = utils.get_image_processor

def custom_get_processor():
    return MyCustomImageProcessor()

# Note: code that already imported get_image_processor by name keeps the original
utils.get_image_processor = custom_get_processor

Advanced Usage Examples

Multiple Image Analysis

# Process and add three images to context
%img image1.jpg --resize 800
%img image2.jpg --resize 800
%img image3.jpg --resize 800

# Then, in a new cell, ask the LLM to compare the images
%%llm
Compare the three images I sent. What are their similarities and differences?

Working with Scientific Images

# Process and analyze a microscopy image
%img microscopy_sample.tiff --convert --format png --show --info

# In a new cell:
%%llm
Describe the cellular structures visible in this microscopy image.
What abnormalities, if any, do you notice?

Image Processing Workflow

# Process a chart and ask for analysis
%img sales_chart.png --resize 1200 --show

# In a new cell:
%%llm
Analyze this sales chart. What are the key trends?
Can you identify seasonal patterns?
What recommendations would you make based on this data?

Performance Considerations

  • Image Size: Larger images require more memory and processing time

  • Format Conversion: Converting between formats adds processing overhead

  • LLM Token Usage: Images consume tokens from your LLM quota

  • Quality vs. Size: Higher quality settings increase file size
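Because processed images are Base64-encoded before being added to the context, the encoded payload is about one third larger than the image bytes. The sketch below shows the calculation using only the standard library; the helper name is illustrative, and how a given LLM provider converts images into tokens is not covered here.

```python
import base64


def base64_length(num_bytes: int) -> int:
    """Length in characters of the padded Base64 encoding of num_bytes bytes."""
    return 4 * ((num_bytes + 2) // 3)


raw = bytes(300_000)  # stand-in for a 300 kB image file
encoded = base64.b64encode(raw)
print(len(raw), len(encoded), base64_length(len(raw)))  # 300000 400000 400000
```

Resizing or lowering the quality before encoding reduces both numbers proportionally.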

Best Practices

  • Resize large images for better performance

  • Prefer WebP when your LLM provider accepts it; it usually gives a good quality-to-size ratio

  • Add the --info flag to see details about the original and processed images

  • Always verify the image has been added to context by checking the confirmation message

  • For better LLM analysis, ensure images are clear and focused on the relevant subject

  • When analyzing multiple images, use consistent sizing for fair comparison

Troubleshooting

Common issues and their solutions:

  1. “Image processing not available”: Install Pillow with pip install pillow

  2. Image not found: Check the file path and ensure it’s accessible

  3. Format not recognized: Try converting the image to a standard format like PNG or JPEG

  4. Large images: Use the --resize option to reduce memory usage and improve LLM processing
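The checks above can be run in one pass before calling %img. The following is a hypothetical helper, not part of CellMage; the function name, the messages, and the 2048-pixel threshold for "large" are assumptions chosen for illustration.

```python
import importlib.util
from pathlib import Path


def diagnose_image(path: str, large_width: int = 2048) -> str:
    """Return a short diagnosis that mirrors the troubleshooting list."""
    p = Path(path).expanduser()
    if not p.is_file():
        return f"Image not found: {p}"
    if importlib.util.find_spec("PIL") is None:
        return "Image processing not available: run `pip install pillow`"

    from PIL import Image, UnidentifiedImageError

    try:
        with Image.open(p) as img:
            width, height = img.size
            fmt = img.format
    except UnidentifiedImageError:
        return f"Format not recognized: {p.name}; try converting to PNG or JPEG"
    if width > large_width:
        return f"Large image ({width}x{height}): consider --resize {large_width}"
    return f"OK: {fmt} {width}x{height}"


print(diagnose_image("path/to/image.jpg"))
```

The file check runs first so the result does not depend on whether Pillow is installed when the path is simply wrong.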