Document Data Extraction Skills
Browse skills in the Document Data Extraction category.
detailed-design-parser
Parses detailed-design.md files to extract and format file-specific changes for easier copying. Use this skill when the user wants to generate a `detailed-design-by-file.md` from a `detailed-design.md` file.
award-extractor
Extracts wine awards from PDF documents. Use when importing competition results, processing wine ratings, or when user mentions "extract awards", "parse awards PDF", "import competition results", or "process wine ratings booklet".
extract-images
Extract and catalog illustrations from historical books using AI vision. Generates rich metadata (subjects, figures, symbols, style, technique) and museum-style descriptions. Use when asked to extract images, run image detection, or process book illustrations.
gemini-ocr
Document OCR integration using Google Gemini Flash API for extracting data from passports and licenses.
pdf-tools
Search and extract content from PDF files. Use when searching PDFs, finding text in documents, or extracting specific pages without reading the entire file.
receipt-parser-engineer
Create, fix, and refine deterministic receipt/invoice parsers in curlys-books, including vendor detection, OCR text extraction routing (pdfplumber for text PDFs → AWS Textract fallback; images → Textract), golden fixture creation, and updates to the vendor dispatcher/registry. Use when adding a new vendor parser, debugging mis-detections, improving totals/tax/date/line extraction, or deciding when to rely on Claude Vision fallback for vendors without a tested parser.
timeline-generator
Generates a chronological timeline of key events, decisions, and flashpoints from a collection of documents. Use when asked to create a timeline, understand sequence of events, see what happened when, or track how a situation evolved over time.
parse-bank-statement-pdf
Parse bank statement PDF text into structured transaction data with account information and transactions in consistent JSON format. Works with any bank format. Use when you need to extract or parse transactions from PDF bank statements.
nlp-pipeline-builder
Build natural language processing pipelines for text analysis and understanding
datasheet
Extract structured information from integrated circuit and component datasheets (PDF files or URLs) and generate consistent markdown summaries. Use when the user requests to extract, summarize, analyze, or document information from IC/component datasheets, or when they provide a datasheet and want structured documentation. Triggers on phrases like "extract this datasheet", "summarize this datasheet", "analyze [component name]", "document this IC", or when working with datasheets for hardware design.
entity-extractor
Extract named entities from text with high accuracy and customization
receipt-processing
Receipt OCR extraction, LLM fallback, and job pipeline patterns for TaxHelper. Use when working on receipt upload, extraction, inbox, or transaction creation from receipts.
ocr
Extracts text from image files and copies it to the clipboard. Use for requests such as "文字起こし" (transcription) or "OCR".
ingredient-scanner
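Scans skincare product ingredient lists, recognizes them with OCR, and uses AI to interpret each ingredient's benefits and risks. Use this skill when implementing ingredient-scanning functionality.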
genome-analyzer
Analyzes the user's genetic data from a VCF file. Use when the user asks about their genetics, hereditary traits, predispositions, metabolism of substances (caffeine, alcohol, medications), athletic abilities, disease risks, or gene-based nutrition.
look-at
This skill should be used when the user asks to 'look at', 'analyze', 'describe', 'extract from', or 'what's in' media files like PDFs, images, diagrams, screenshots, or charts. Triggers include: 'what does this image show', 'extract the table from this PDF', 'describe this diagram', 'what's in this screenshot', 'analyze this chart', 'read this image', 'get text from this PDF', 'summarize this document', or requests for specific data extraction from visual or document files. Use when analyzed/interpreted content is needed rather than literal file reading (which uses Read tool).
langextract
Extract structured information from unstructured text using LLMs with source grounding. Use when extracting entities from documents, medical notes, clinical reports, or any text requiring precise, traceable extraction. Supports Gemini, OpenAI, and local models (Ollama). Includes visualization and long document processing.
multimodal-looker
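Expert agent for multimodal content analysis, focused on interpreting and extracting information from visual content such as images, PDFs, and charts. Use when the user needs help with: (1) analyzing image content (2) extracting information from PDFs (3) interpreting charts and data visualizations (4) understanding architecture diagrams and flowcharts (5) extracting information from screenshots (6) analyzing design mockups. Trigger phrases include "看这个图" (look at this image), "分析这个 PDF" (analyze this PDF), "这个图表说明什么" (what does this chart show), "帮我看一下" (take a look at this for me), and similar requests for visual content analysis.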
docling
Document reading and conversion using Docling. Use this skill when the user asks to read, open, or process document files in these formats: PDF, DOCX, PPTX, XLSX, HTML, Markdown, AsciiDoc, or images (PNG, JPG, TIFF). Supports OCR for scanned documents. Trigger when: (1) User asks to read/open a document file (e.g., "このPDFを読んで" [read this PDF], "read this document", "ファイルの内容を確認して" [check the contents of this file]) (2) File extension is .pdf, .docx, .pptx, .xlsx, .html, .md, .adoc, .png, .jpg, .tiff (3) User wants to extract text from scanned documents with OCR (4) User wants to convert documents to Markdown/JSON/HTML (5) User wants to process documents with tables, figures, or photos (6) User wants to extract images/figures from documents
docx-reader
Reads Microsoft Word (.docx) files and extracts text content. Use when needing to read .docx documents. Requires python-docx package.
pdf-vision-reader
Converts PDF pages to images and uses vision analysis to extract content including diagrams, charts, and visual elements. Use for PDFs with rich visual content. Requires pdf2image and poppler-utils.
pdf-reader
Reads PDF files and extracts text content in Markdown format. Handles tables and multi-page documents. Use when needing to read PDF documents. Requires pdfplumber package.
land-reduction-trespass
Clerk for reserve reduction, trespass, survey errors, and railway takings; use when processing the Land_Reduction_Trespass queue.
fiduciary-duty-negligence
Clerk for Crown fiduciary breaches, fund mismanagement, conflicts of interest, and failure to protect reserve lands; use for Fiduciary_Duty_Negligence queue.
data-analysis
AI-powered data analysis for Empathy Ledger. Use when working with themes, quotes, story suggestions, transcript analysis, storyteller connections, or any feature requiring extracted insights. Ensures consistent analysis patterns across the platform.
coercion-duress
Clerk for forced surrenders, threats, procedural irregularities, and lack of informed consent; use for Coercion_Duress queue.
water-rights-fishing
Clerk for water licenses, irrigation, riparian rights, and fishing restrictions affecting Pukaist/Nlaka'pamux; use for Water_Rights_Fishing queue.
governance-sovereignty
Clerk for Chief/Council authority, assertions of title, self-government, and resistance to federal imposition; use for Governance_Sovereignty queue.
design-asset-parser
Parse Figma/PDF design exports to extract UI specs and draft design-spec.md and pending-questions.md. Use when analyzing design assets.
gemini-document-processing
Guide for implementing Google Gemini API document processing - analyze PDFs with native vision to extract text, images, diagrams, charts, and tables. Use when processing documents, extracting structured data, summarizing PDFs, answering questions about document content, or converting documents to structured formats.
docling
Convert documents (PPTX, PDF, DOCX, XLSX, images) to Markdown/JSON/HTML using IBM Docling. This skill should be used when the user asks to convert, parse, or extract content from documents. Triggers on "convert pptx", "parse pdf", "extract from document", "конвертуй презентацію" (convert the presentation), "витягни з pdf" (extract from pdf).
legal-ocr
Extracts text from scanned legal documents in PDF using OCR optimized for Brazilian legal language. Use when you need to convert scanned PDFs (judgments, petitions, appellate decisions) into editable text with high accuracy. Supports low-quality documents, multi-column layouts, tables, and specific legal terms.
document-ocr-processing
Process scanned documents and images containing Chuukese text using OCR with specialized post-processing for accent characters and traditional formatting. Use when working with scanned books, documents, or images that contain Chuukese text that needs to be digitized.
bill-processing
Extract data from bill/receipt images and return JSON for lunch-splitter app
airparser-api
Guide for integrating with the Airparser document parsing service via API, webhooks, and Make.com. Use when configuring inboxes, extraction schemas, or automation flows for receipt processing.
data-extraction
Use when extracting structured data from medical research PDFs, parsing study characteristics, patient demographics, outcomes, and results. Invoke for systematic review data collection from papers.
ai-training-data-generation
Generate high-quality training datasets from documents, text corpora, and structured content. Use when creating AI training data from dictionaries, documents, or when generating examples for machine learning models. Optimized for low-resource languages and domain-specific knowledge extraction.
document-parser
Parse large documents into structured sections with abstracts and metadata
legislative-flattener
Converts hierarchical legislative text from Word documents into a flat list of requirements. Use when processing regulatory documents, compliance frameworks, or legal text that needs to be extracted into individual, numbered requirements for analysis or mapping.
neurosurgical-book-parser
Extract structured knowledge from neurosurgical and spine surgery textbooks. Identifies anatomical structures, surgical procedures, complications, and clinical relationships. Use when processing medical PDFs, building surgical knowledge graphs, or creating clinical decision support content. Applies kaizen continuous improvement from prior extractions.
gemini-pdf
Process multimodal documents using Gemini CLI, leveraging Gemini's superior multimodal capabilities. Use for PDFs, scanned documents, image-heavy documents, or any file where visual understanding matters. Ideal for extracting content from complex layouts, tables, diagrams, handwritten notes, or mixed text/image documents. Triggers on PDF processing, document extraction, "use Gemini for this", or when document has visual complexity that benefits from multimodal understanding.
data-normalizer
Collects excavation survey materials (papers, reports, nearby archaeological sites) and normalizes their metadata.
nanonets-api
Guide for integrating with the Nanonets OCR service via API. Use when you need to extract data from documents, create OCR models, upload files for prediction, or train custom models.
pdf-reader
Extract text, tables, and images from PDF files using pdfplumber and PyMuPDF. Use when analyzing PDF documents, brand materials, reports, or any content that requires structured extraction from PDF format. Supports table detection, layout preservation, and high-quality image extraction.
sense
Diagrammatic Video Extraction with Subtitle Alignment
bsee-data-extractor
Extract and process BSEE (Bureau of Safety and Environmental Enforcement) production data. Use for querying oil/gas production data by API number, block, lease, or field with automatic data normalization and caching.
sodir-data-extractor
SODIR Data Extractor
ocr-super-surya
GPU-optimized OCR using Surya. Use when: (1) Extracting text from images/screenshots, (2) Processing PDFs with embedded images, (3) Multi-language document OCR, (4) Layout analysis and table detection. Supports 90+ languages with 2x accuracy over Tesseract.
extract-heal-text-from-powerpoint
Extract HEAL matrix content from PowerPoint slides and save as formatted text files matching N2 HEAL format
extract-epiroc-bev-weekly-report
Extract key sections from Epiroc BRMO BEV weekly PDF reports into structured markdown format for Week N