What is award-extractor?

This Claude skill automates the extraction of structured wine award data from PDF competition booklets, rating guides, and certification documents. By leveraging advanced parsing techniques, it identifies key information such as wine names, producers, vintages, medals, and scores, seamlessly importing them into a structured database to eliminate manual data entry and improve accuracy in wine inventory management.

When should I use award-extractor?

award-extractor is useful in the following scenarios: • Importing competition results: Automatically parse large PDF booklets from major events like the IWSC or Decanter World Wine Awards to update award databases. • Digitizing rating guides: Extract wine scores and tasting notes from PDF publications like Wine Spectator or Wine Enthusiast for cellar management apps. • Batch processing: Handle multiple award documents simultaneously to build a comprehensive historical record of wine performance across different vintages. • Database synchronization: Identify and validate extracted data against existing records to prevent duplicates and ensure high-quality data integrity in wine cellar systems.

name	award-extractor
description	Extracts wine awards from PDF documents. Use when importing competition results, processing wine ratings, or when user mentions "extract awards", "parse awards PDF", "import competition results", or "process wine ratings booklet".
allowed-tools	Read, Bash(node:), Bash(sqlite3:), mcp__pdf-reader__, mcp__sqlite__

Wine Award Extraction Skill

Overview

Extracts structured wine award data from PDF competition booklets, rating guides, and certification documents for import into the wine cellar app's awards database.

When to Use

Importing awards from competition PDFs (IWSC, Decanter World Wine Awards, etc.)
Processing wine rating booklets (Wine Spectator, Wine Enthusiast)
Batch-importing multiple award documents
User says: "extract awards", "import competition results", "process this awards PDF"

Database Schema

Awards are stored in data/awards.db with this structure:

CREATE TABLE awards (
  id INTEGER PRIMARY KEY,
  wine_name TEXT NOT NULL,
  producer TEXT,
  vintage INTEGER,
  country TEXT,
  region TEXT,
  grape_variety TEXT,
  award_name TEXT NOT NULL,        -- e.g., "IWSC 2024"
  medal TEXT,                       -- Gold, Silver, Bronze, Trophy
  score INTEGER,                    -- Points (if applicable)
  category TEXT,                    -- Competition category
  source_file TEXT,                 -- Original PDF filename
  extracted_at TEXT DEFAULT CURRENT_TIMESTAMP
);

Extraction Process

Step 1: Read the PDF

Use the PDF Reader MCP to extract text content:

Use mcp__pdf-reader__read_pdf tool with the PDF file path

Step 2: Identify Document Structure

Look for common patterns in wine competition PDFs:

Table format: Rows with Wine | Producer | Medal | Score columns
Category sections: Headers like "CABERNET SAUVIGNON", "SOUTH AFRICAN REDS"
Award indicators: Gold/Silver/Bronze, Trophy, points (90-100 scale)
Vintage patterns: 4-digit years (2018, 2019, 2020, etc.)

Step 3: Extract Award Data

For each wine entry, extract:

Field	Description	Examples
wine_name	Full wine name	"Kanonkop Paul Sauer"
producer	Winery/producer	"Kanonkop"
vintage	Year of wine	2019
country	Country of origin	"South Africa"
region	Wine region	"Stellenbosch"
grape_variety	Grape(s)	"Cabernet Sauvignon Blend"
award_name	Competition + year	"IWSC 2024"
medal	Medal type	"Gold", "Silver", "Bronze", "Trophy"
score	Points if given	95
category	Competition category	"Red Bordeaux Blends over $20"

Step 4: Validate and Match

Before importing:

Check for duplicate entries in awards.db
Cross-reference producer names with existing cellar entries
Normalize medal names (GOLD -> Gold, G -> Gold)
Verify vintage years are reasonable (1950-current year)

Step 5: Import to Database

Use the SQLite MCP to insert awards:

INSERT INTO awards (wine_name, producer, vintage, country, region,
                    grape_variety, award_name, medal, score, category, source_file)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);

Output Format

Return extracted awards as JSON array:

[
  {
    "wine_name": "Kanonkop Paul Sauer",
    "producer": "Kanonkop",
    "vintage": 2019,
    "country": "South Africa",
    "region": "Stellenbosch",
    "grape_variety": "Cabernet Sauvignon Blend",
    "award_name": "Decanter World Wine Awards 2024",
    "medal": "Gold",
    "score": 95,
    "category": "Red Bordeaux Blends - South Africa"
  }
]

Common Competition Formats

IWSC (International Wine & Spirit Competition)

Categories by grape variety and country
Medals: Trophy, Gold Outstanding, Gold, Silver, Bronze
No numeric scores

Decanter World Wine Awards

Regional categories
Medals: Best in Show, Platinum, Gold, Silver, Bronze
Points: 95-100 (Platinum), 90-94 (Gold), etc.

Tim Atkin South Africa Report

Wines rated on 100-point scale
Categories by region and style
First Growths, Wines of Origin designations

Platter's South African Wine Guide

5-star rating system
Wines organized by producer
Includes drinking windows

Tips for Accuracy

Table extraction: Look for consistent column spacing or delimiters
Multi-page handling: Track category headers across page breaks
OCR artifacts: Handle common OCR errors (0 vs O, l vs 1)
Partial data: Flag entries missing critical fields for review
Duplicate detection: Use wine_name + vintage + award_name as unique key

Example Usage

User: "Extract awards from this IWSC 2024 booklet"

Claude will:

Use pdf-reader MCP to extract text from the PDF
Parse the table structure to identify wine entries
Extract medal, producer, wine name, vintage for each entry
Validate data and check for duplicates
Insert into awards.db using sqlite MCP
Report summary: "Extracted 147 awards, 3 duplicates skipped"

award-extractor

When & Why to Use This Skill

Use Cases