extract-images

Embassy-of-the-Free-Mind's avatarfrom Embassy-of-the-Free-Mind

Extract and catalog illustrations from historical books using AI vision. Generates rich metadata (subjects, figures, symbols, style, technique) and museum-style descriptions. Use when asked to extract images, run image detection, or process book illustrations.

0stars🔀0forks📁View on GitHub🕐Updated Jan 7, 2026

When & Why to Use This Skill

This Claude skill leverages advanced AI vision to automatically extract, classify, and catalog illustrations from historical book scans. It transforms unstructured visual data into structured digital galleries by generating rich metadata, including artistic techniques, symbolic elements, and museum-standard scholarly descriptions, significantly accelerating digital humanities research and archiving workflows.

Use Cases

  • Digital Archiving: Automating the identification and cataloging of woodcuts, engravings, and maps from massive digitized library collections to create searchable databases.
  • Art History Research: Analyzing the evolution of specific artistic styles, techniques (like Northern European Renaissance), or symbolic motifs across centuries of printed material.
  • Curated Content Creation: Using AI-driven quality scoring to efficiently select high-impact illustrations for educational guides, museum exhibits, or digital publications.
  • Academic Metadata Enrichment: Converting visual elements in historical texts into structured JSON/SQL data, enabling complex queries on subjects, figures, and symbols.
nameextract-images
descriptionExtract and catalog illustrations from historical books using AI vision. Generates rich metadata (subjects, figures, symbols, style, technique) and museum-style descriptions. Use when asked to extract images, run image detection, or process book illustrations.

Image Extraction Skill

Extract illustrations from historical book scans using Gemini vision AI. Creates a curated gallery with rich metadata for each image.

Invocation

/extract-images                      # Show extraction status across library
/extract-images BOOK_ID              # Extract images from specific book
/extract-images --all                # Extract from all books missing images
/extract-images --priority           # Extract from high-priority books first

What Gets Extracted

For each illustration found:

  • Bounding box - Precise crop coordinates
  • Type classification - emblem, woodcut, engraving, portrait, frontispiece, musical_score, diagram, symbol, decorative, map
  • Gallery quality score - 0.0-1.0 rating for curation
  • Museum description - 2-3 sentence scholarly label
  • Structured metadata:
    • subjects - Topics (alchemy, transformation, mythology)
    • figures - People/beings depicted (Mercury, serpent, old man)
    • symbols - Symbolic elements (ouroboros, athanor, philosophical egg)
    • style - Art historical style (Northern European Renaissance)
    • technique - Production method (woodcut, engraving)

API Endpoints

Endpoint Purpose
POST /api/extract-images Extract images from pages
GET /api/gallery List extracted images
GET /api/gallery/image/[id] Get single image with metadata
PATCH /api/gallery/image/[id] Update metadata/quality

Check Extraction Status

# Get books with extraction counts
curl -s "https://sourcelibrary.org/api/books" | jq '[.[] | {
  id,
  title: .title[0:40],
  pages: .pages_count,
  images: (.detected_images_count // 0)
}] | sort_by(-.pages)'

# Find books needing extraction (no detected_images)
curl -s "https://sourcelibrary.org/api/books" | jq '[.[] |
  select((.detected_images_count // 0) == 0) |
  {id, title: .title[0:40], pages: .pages_count}
] | sort_by(-.pages)'

Extract Images from a Book

Smart Extraction (Recommended)

Uses OCR <image-desc> tags to find pages with images, ranks them with AI, and only extracts the best 15-20:

# Extract top 20 images from a book
node scripts/smart-extract.mjs BOOK_ID --limit=20

# Extract top 15 images
node scripts/smart-extract.mjs BOOK_ID --limit=15

How it works:

  1. Finds pages with <image-desc> tags in OCR (from previous OCR pass)
  2. Uses AI to rank descriptions and pick the most interesting
  3. Only extracts those specific pages
  4. ~30 seconds for 20 images instead of scanning all pages

Full Extraction (All Pages)

For comprehensive extraction of every page (slow, use sparingly):

node scripts/evaluate-extraction.mjs BOOK_ID --clear

Via API (Individual Pages)

curl -s -X POST "https://sourcelibrary.org/api/extract-images" \
  -H "Content-Type: application/json" \
  -d '{
    "bookId": "BOOK_ID",
    "pageIds": ["PAGE_ID_1", "PAGE_ID_2"],
    "overwrite": false
  }'

Books with OCR Image Descriptions

Check which books have <image-desc> tags (best for smart extraction):

# Find books with image descriptions in OCR
node -e "
const { MongoClient } = require('mongodb');
require('dotenv').config({ path: '.env.local' });
// ... aggregation to find books with <image-desc> tags
"

Output Quality

Gallery Quality Scoring (0.0-1.0)

Score Description Examples
0.9-1.0 Exceptional Striking emblems, allegorical scenes with figures
0.8-0.9 High Images featuring people, well-composed scenes
0.6-0.8 Good Illustrations without people, interesting diagrams
0.4-0.6 Moderate Musical scores, simple diagrams
0.2-0.4 Low Page ornaments, generic borders
0.0-0.2 Minimal Marbled papers, blank frames

Key Rule: Images with people/figures always score 0.8+

Image Types

Type Description
emblem Symbolic/allegorical with motto
woodcut Bold relief print lines
engraving Fine detailed lines, crosshatching
portrait Depiction of a person
frontispiece Decorative title page
musical_score Sheet music, notation
diagram Technical/scientific illustration
symbol Alchemical, astrological symbols
decorative Ornaments, borders, initials
map Geographic representation

Viewing Results

Gallery Page

https://sourcelibrary.org/gallery?bookId=BOOK_ID

Book Guide Page (high-quality images only)

https://sourcelibrary.org/book/BOOK_ID/guide

Individual Image Detail

https://sourcelibrary.org/gallery/image/PAGE_ID:INDEX

Cost Estimation

Using Gemini 2.5 Flash:

Metric Cost
Per page ~$0.0004
Per high-quality image ~$0.0012
100-page book ~$0.04
500-page book ~$0.20

Monitoring Extraction

# Check extraction results for a book
curl -s "https://sourcelibrary.org/api/gallery?bookId=BOOK_ID" | jq '{
  total: .total,
  high_quality: [.images[] | select(.galleryQuality >= 0.75)] | length,
  types: [.images[].type] | group_by(.) | map({type: .[0], count: length})
}'

# Sample image metadata
curl -s "https://sourcelibrary.org/api/gallery/image/PAGE_ID:0" | jq '{
  type,
  galleryQuality,
  museumDescription,
  metadata
}'

Troubleshooting

Rate Limiting

The evaluation script has built-in 500ms delays. For higher throughput:

  • Use multiple API keys (rotate on 429)
  • Reduce delay for paid tiers

Missing Images

If extraction finds fewer images than expected:

  • Check page image quality (blurry scans)
  • Verify pages have cropped_photo or photo_original URLs
  • Re-run with --clear to retry

Wrong Classifications

If images are misclassified:

  • Edit via gallery UI at /gallery/image/PAGE_ID:INDEX
  • Use PATCH endpoint to update type/metadata