What is describe-images?

This Claude skill automates the conversion of Wiki-style image links into standard Markdown format while leveraging AI to generate descriptive alt-text. By analyzing visual content and surrounding context, it transforms static image references into semantically enriched Markdown, making documents more accessible and ready for advanced AI workflows like automated presentation generation.

When should I use describe-images?

describe-images is useful in the following scenarios:

• Workflow Pre-processing: Prepare Markdown files for AI-driven PPT generation by providing text-based descriptions that help planners understand visual context without consuming excessive tokens.
• Accessibility Optimization: Automatically generate descriptive alt-text for images in large documentation sets to improve SEO and meet web accessibility (WCAG) standards.
• Knowledge Base Migration: Convert Obsidian or Wiki-style notes into standard Markdown for better compatibility with static site generators and other documentation tools.
• Enhanced Searchability: Transform visual data into searchable text descriptions, allowing users to find specific images within a knowledge base using standard text-based queries.

name	Describe-Images
description	Convert Wiki-style image links to Markdown format with AI-generated descriptions. Use when preprocessing markdown files for PPT generation, converting ![[image.png]] to ![description](path) format, or enabling text-based image understanding.
allowed-tools	Read, Bash, Glob
version	1.0.0
updated	2026-01-04
status	active
color	green

Image-Describer: Wiki to Markdown Image Link Converter

Convert Wiki-style image links (![[image.png]]) to Markdown format (![AI-generated description](path)) with AI-powered image analysis.

Quick Reference (30 seconds)

Purpose: Convert Wiki-style image links to Markdown format with AI-generated alt text descriptions.

Execution Command:

cd "{working_directory}/.claude/skills/Describe-Images/Scripts" && \
source .venv/bin/activate && \
python Convert_Image-Link_Wiki-to-Markdown.py "{markdown_path}" -m gpt

Script Location: .claude/skills/Describe-Images/Scripts/Convert_Image-Link_Wiki-to-Markdown.py

Prerequisites:

Python venv with: openai, google-generativeai, pillow, python-dotenv
API Key: .claude/skills/Describe-Images/Scripts/.env with OPENAI_API_KEY or GOOGLE_API_KEY

Output: In-place modification of the markdown file.

Implementation Guide (5 minutes)

Basic Usage

Step 1: Ensure virtual environment is set up

cd "{working_directory}/.claude/skills/Describe-Images/Scripts"
python -m venv .venv
source .venv/bin/activate
pip install python-dotenv openai google-generativeai Pillow

Step 2: Create .env file in Scripts directory

# Create at: .claude/skills/Describe-Images/Scripts/.env
OPENAI_API_KEY=sk-your-openai-key-here
OPENAI_MODEL=gpt-5-mini

# Or for Gemini
GOOGLE_API_KEY=AIza-your-google-key-here
GOOGLE_MODEL=gemini-3-flash-preview

Step 3: Execute the script

# Using GPT (default)
python Convert_Image-Link_Wiki-to-Markdown.py "/path/to/document.md" -m gpt

# Using Gemini
python Convert_Image-Link_Wiki-to-Markdown.py "/path/to/document.md" -m gemini

# Path conversion only (no AI analysis)
python Convert_Image-Link_Wiki-to-Markdown.py "/path/to/document.md" --no-describe

# Dry-run mode (preview changes without modifying file)
python Convert_Image-Link_Wiki-to-Markdown.py "/path/to/document.md" -n

Input/Output Format

Input (Wiki-style):

![[image.png]]
![[folder/diagram.jpg]]
![[screenshot.webp]]

Output (Markdown-style):

![Detailed AI-generated description of the image content](absolute/path/to/image.png)
![System architecture diagram showing...](absolute/path/to/folder/diagram.jpg)
![Screenshot of the application interface...](absolute/path/to/screenshot.webp)

CLI Options

Option	Description
`-n, --dry-run`	Preview changes without modifying the file
`-m, --model`	AI model selection: `gpt` (default) or `gemini`
`--no-describe`	Skip AI description generation (path conversion only)
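
To make the option table concrete, here is a minimal argparse sketch that accepts the same flags. It is an illustration only: the function name build_parser and the positional argument name markdown_path are assumptions, and the actual script may define its parser differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented CLI options."""
    parser = argparse.ArgumentParser(
        description="Convert Wiki-style image links to Markdown format"
    )
    # Positional argument: the markdown file to process (name is assumed).
    parser.add_argument("markdown_path")
    # -n / --dry-run: preview changes without modifying the file.
    parser.add_argument("-n", "--dry-run", action="store_true")
    # -m / --model: gpt (default) or gemini.
    parser.add_argument("-m", "--model", choices=["gpt", "gemini"], default="gpt")
    # --no-describe: path conversion only, no AI description.
    parser.add_argument("--no-describe", action="store_true")
    return parser
```

For example, build_parser().parse_args(["doc.md", "-m", "gemini", "-n"]) selects Gemini in dry-run mode.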

Script Configuration

Configurable constants in the script:

CONTEXT_CHARS = 500   # Characters before/after image for context
MAX_RETRIES = 3       # API call retry attempts
CONCURRENT = 20       # Maximum parallel image analyses

Image Search Behavior

The script searches for images in the following order:

1. Current directory of the markdown file
2. Parent directory
3. Grandparent directory
4. Common asset folders (Attachments, images, assets)
5. Full vault search as fallback
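
The search order above can be sketched roughly as follows. This is not the script's actual code: the function name find_image and the vault_root parameter are hypothetical, and the sketch assumes the asset folders are looked up relative to the same three directories, which the description does not specify.

```python
from pathlib import Path

# Asset folder names taken from the list above.
ASSET_FOLDERS = ("Attachments", "images", "assets")


def find_image(name, md_path, vault_root):
    """Return the first matching image path, or None if nothing is found."""
    md_dir = Path(md_path).resolve().parent
    # Steps 1-3: markdown directory, its parent, its grandparent.
    bases = [md_dir, md_dir.parent, md_dir.parent.parent]
    candidates = [base / name for base in bases]
    # Step 4 (assumption): asset folders under each of those directories.
    candidates += [base / folder / name for base in bases for folder in ASSET_FOLDERS]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    # Step 5: fall back to scanning the whole vault by file name.
    target = Path(name).name
    for path in Path(vault_root).rglob("*"):
        if path.is_file() and path.name == target:
            return path
    return None
```

Scanning the vault on every lookup would be slow for large vaults, which is presumably why the script builds an image index once up front.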

Output Example

Building image index...
Found 10751 image files

============================================================
  Images requiring description: 5
  Model: GPT
  Concurrent processing: 20
============================================================

[1/5] Analyzing: chart.png
        Completed: This chart shows sales growth from 2020 to 2024...
[2/5] Analyzing: diagram.jpg
        Completed: System architecture diagram illustrating...

============================================================
  Processing Results Summary
   Success: 5
   Failed: 0
   Skipped: 0
   Total: 5
============================================================

Completed: document.md

Advanced Implementation (10+ minutes)

Two-Phase Processing

The script operates in two distinct phases:

Phase 1 - Link Conversion:

Converts Wiki links (![[file]]) to Markdown format (![](path)) with empty alt text
Resolves image paths using vault-wide search
Saves changes immediately after conversion
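
As an illustration of Phase 1, the sketch below rewrites Wiki image links with a regular expression and leaves a link untouched when its image cannot be resolved. The names WIKI_IMAGE, convert_wiki_links, and the resolve callable are hypothetical stand-ins for whatever the script uses internally.

```python
import re

# Matches ![[target]]; group 1 is the link target.
WIKI_IMAGE = re.compile(r"!\[\[([^\[\]]+)\]\]")


def convert_wiki_links(text, resolve):
    """Replace each ![[target]] with ![](resolved_path).

    `resolve` is a hypothetical callable that maps a link target to a path,
    or returns None when the image is not found.
    """
    def _substitute(match):
        path = resolve(match.group(1).strip())
        if path is None:
            # Image not found: keep the original Wiki link unchanged.
            return match.group(0)
        # Empty alt text; Phase 2 fills in the description later.
        return f"![]({path})"

    return WIKI_IMAGE.sub(_substitute, text)
```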

Phase 2 - AI Description Generation:

Identifies Markdown images with empty alt text
Extracts contextual text (500 chars before/after)
Generates AI descriptions using GPT or Gemini
Saves each description immediately after generation
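
The first two steps of Phase 2 might look like the sketch below, which finds Markdown images with empty alt text and slices out the surrounding context. The function name and the returned dictionary shape are assumptions, and the simple pattern assumes image paths contain no closing parenthesis.

```python
import re

CONTEXT_CHARS = 500  # Same default as the script constant.

# Matches ![](path) only, i.e. images whose alt text is empty.
EMPTY_ALT_IMAGE = re.compile(r"!\[\]\(([^)]+)\)")


def images_needing_description(text, context_chars=CONTEXT_CHARS):
    """Return path plus before/after context for each empty-alt image."""
    results = []
    for match in EMPTY_ALT_IMAGE.finditer(text):
        # Clamp the start so a match near the top of the file does not wrap around.
        before = text[max(0, match.start() - context_chars):match.start()]
        after = text[match.end():match.end() + context_chars]
        results.append({"path": match.group(1), "before": before, "after": after})
    return results
```

Images that already carry alt text are not matched, so re-running the script on a processed file should only pick up images that still lack a description.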

Error Handling

Image Not Found:

Detection: File not found in vault index
Behavior: Keeps original Wiki link unchanged
Output: Warning message displayed

API Failure:

Detection: Network or API errors
Recovery: Up to 3 retry attempts with 1-second delay
Behavior: Continues with remaining images
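
A retry policy like the one described could be sketched as below. It assumes "up to 3 retry attempts" means three attempts in total, and the names describe_with_retry and call are hypothetical; the real script may structure this differently.

```python
import asyncio

MAX_RETRIES = 3            # Same default as the script constant.
RETRY_DELAY_SECONDS = 1.0  # 1-second delay between attempts.


async def describe_with_retry(call, retries=MAX_RETRIES, delay=RETRY_DELAY_SECONDS):
    """Await `call()` up to `retries` times; return None if every attempt fails."""
    for attempt in range(1, retries + 1):
        try:
            return await call()
        except Exception as exc:  # Network or API error.
            print(f"Attempt {attempt}/{retries} failed: {exc}")
            if attempt < retries:
                await asyncio.sleep(delay)
    # Giving up on this image lets the caller continue with the remaining ones.
    return None
```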

Invalid Markdown:

Detection: Malformed image links
Behavior: Malformed links are handled gracefully and left as-is; well-formed links in the same file are still converted (partial conversion)

Concurrent Processing

The script uses asyncio for efficient parallel processing:

Semaphore limits concurrent API calls to 20
Lock ensures safe file writes from multiple tasks
Results are saved immediately as each image completes
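
The pattern described above, a semaphore bounding API calls plus a lock around writes, can be sketched as follows. The names process_all, describe, and save are hypothetical; describe stands in for the API call and save for the file write.

```python
import asyncio

CONCURRENT = 20  # Same default as the script constant.


async def process_all(items, describe, save):
    """Describe items in parallel, saving each result as soon as it is ready."""
    semaphore = asyncio.Semaphore(CONCURRENT)  # Caps in-flight API calls.
    write_lock = asyncio.Lock()                # Serializes file writes.
    results = {"success": 0, "failed": 0}

    async def worker(item):
        async with semaphore:
            try:
                description = await describe(item)
            except Exception:
                results["failed"] += 1
                return
        # The semaphore is released before writing, so a slow write
        # does not hold up other API calls.
        async with write_lock:
            save(item, description)
            results["success"] += 1

    await asyncio.gather(*(worker(item) for item in items))
    return results
```

Because each worker saves its own result, descriptions already written are kept even if a later image fails.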

Supported Image Formats

png, jpg, jpeg, gif, webp, svg, bmp, tiff, ico
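
One plausible way to build the image index from these extensions is sketched below. The function name build_image_index and the index shape (file name mapped to a list of paths) are assumptions, as is the case-insensitive extension match.

```python
from pathlib import Path

# Extensions from the supported formats list, with a leading dot.
IMAGE_EXTENSIONS = {
    ".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg", ".bmp", ".tiff", ".ico",
}


def build_image_index(vault_root):
    """Map each image file name to every path in the vault that carries it."""
    index = {}
    for path in Path(vault_root).rglob("*"):
        # Compare lowercased suffixes so PHOTO.JPG is indexed too (assumption).
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            index.setdefault(path.name, []).append(path)
    return index
```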

PPT Workflow Integration

This skill serves as a critical pre-processing step in the PPT generation workflow.

Integration Flow

Step 1: Image-Describer processes markdown file

Converts Wiki links to Markdown format
Generates AI descriptions for all images

Step 2: Planner-PPT reads pre-processed markdown

Understands image content through text descriptions
Creates slide outlines based on visual content

Step 3: Nano-Banana generates slide images

Uses Planner-PPT output to create final slides

Benefits

Context Overflow Prevention:

Visual content converted to text descriptions
Significantly reduces token usage in Planner-PPT

Enhanced Understanding:

Planner-PPT can make intelligent decisions about image placement
AI descriptions provide semantic understanding of visual content

Related Resources

Related Agent:

Planner-PPT: Consumes pre-processed markdown for slide planning

Related Skills:

Nano-Banana: Generates final PPT slide images
Prepare-Book: Can pre-process book chunks before conversion

Related Command:

/ppt: Orchestrates complete workflow including image pre-processing

Works Well With

/ppt command - Pre-processes markdown before slide generation workflow
Planner-PPT - Consumes pre-processed markdown with image descriptions
Prepare-Book - Can pre-process book chunks before PPT generation
Nano-Banana - Final step in PPT generation pipeline

Troubleshooting

API Key Not Found:

Ensure .env file exists at .claude/skills/Describe-Images/Scripts/.env
Verify OPENAI_API_KEY or GOOGLE_API_KEY is set correctly
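
To run both checks quickly, a small standard-library helper along these lines may help. It is a diagnostic sketch, not part of the skill: read_env_file is a simplified stand-in for python-dotenv, and check_api_key is a hypothetical name.

```python
from pathlib import Path

# Which key each -m value needs, per the prerequisites above.
KEY_FOR_MODEL = {"gpt": "OPENAI_API_KEY", "gemini": "GOOGLE_API_KEY"}


def read_env_file(env_path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    path = Path(env_path)
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_api_key(env_path, model="gpt"):
    """Return a problem description, or None if the key looks present."""
    name = KEY_FOR_MODEL[model]
    if not Path(env_path).is_file():
        return f"{env_path} not found"
    if not read_env_file(env_path).get(name):
        return f"{name} is missing or empty in {env_path}"
    return None
```

For example, check_api_key(".claude/skills/Describe-Images/Scripts/.env", "gemini") returns a short message when the file or the GOOGLE_API_KEY entry is missing.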