What is everything-to-markdown?

This Claude skill uses Microsoft's MarkItDown library to convert many file formats, including PDF, Word, Excel, PowerPoint, images (OCR), and audio (transcription), into structured Markdown. It is designed to prepare documents for LLMs and technical workflows, producing readable text while preserving the original document's structure (headings, lists, tables, links).

When should I use everything-to-markdown?

everything-to-markdown is useful in the following scenarios:
- RAG Pipeline Optimization: Convert complex PDFs and Office documents into clean Markdown to improve the retrieval and generation accuracy of AI systems.
- Automated OCR and Metadata Extraction: Process scanned images or photos to extract text and EXIF data into a searchable, structured format.
- Audio-to-Text Workflows: Transcribe audio files like MP3 or WAV directly into Markdown for quick meeting summaries or content documentation.
- Knowledge Base Standardization: Batch convert legacy documentation (DOCX, PPTX, HTML) into a unified Markdown format for easy integration into wikis or developer portals.

everything-to-markdown – AI Agent Skills

name	everything to Markdown
description	Convert a wide range of file formats to Markdown using markitdown

Profile

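language: Chinese
description: Uses Python and the markitdown library to convert a range of document formats (PDF, Office documents, images, audio, and more) to Markdown
background: Built on Microsoft's open-source markitdown library, which converts many file formats to structured Markdown text
expertise: Conversion of PDF, PowerPoint, Word, Excel, images (OCR), audio (transcription), HTML, CSV, JSON, XML, ZIP, YouTube, EPUB, and other formats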

Skills

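📄 Document conversion capabilities

PDF conversion: Converts PDF documents to Markdown while preserving document structure
Office documents: Supports PowerPoint (.pptx), Word (.docx), and Excel (.xlsx/.xls)
Image processing: Extracts EXIF metadata and runs OCR text recognition
Audio transcription: Extracts audio metadata and transcribes speech (supports .wav and .mp3)
Web pages: Converts HTML to Markdown
Structured data: Converts CSV, JSON, XML, and similar formats
Archives: Processes the contents of ZIP files
E-books: Supports EPUB conversion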

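🔧 Available tools

This skill provides the following Python scripts:

convert_file.py - Converts a single file to Markdown
batch_convert.py - Converts multiple files in a batch
convert_with_llm.py - Uses an LLM to enrich image descriptions (optional)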

Rules

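Environment requirements:
- Run setup.sh before each use to make sure the virtual environment has been created and activated
- All Python scripts must run inside the virtual environment
- Dependencies are defined in requirements.txt
- Audio transcription requires ffmpeg to be installed on the system (optional; macOS: brew install ffmpeg)
Supported file formats:
- Documents: PDF, DOCX, PPTX, XLSX, XLS
- Images: JPG, PNG, GIF, and others (OCR supported)
- Audio: WAV, MP3 (transcription supported)
- Web pages: HTML
- Structured data: CSV, JSON, XML
- Other: ZIP, EPUB
Optional features:
- Azure Document Intelligence: Uses the Azure service for higher-quality document conversion
- LLM image descriptions: Uses an LLM such as OpenAI's to generate detailed descriptions of images
- Plugin system: Supports third-party plugins that extend functionality
Output:
- Output goes to the terminal by default
- An output file path can be specified
- The original document's structure is preserved (headings, lists, tables, links, and so on)
- The Markdown output is suitable for both LLM processing and human reading
Error handling:
- Unsupported file formats return an error message
- Converting large files may take a long time
- Some features need extra configuration (audio transcription needs ffmpeg; the Azure service needs an endpoint)
- If you see an ffmpeg warning but are not using the audio features, you can ignore it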
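
The format and ffmpeg rules above can be checked before any conversion starts. The following is a minimal sketch, not the skill's actual implementation: the function name `preflight` and its `ffmpeg_path` parameter are hypothetical, and the extension set is copied from the supported-formats list, which is open-ended for images.

```python
import shutil
from pathlib import Path

# Extensions copied from the supported-formats list; the real scripts may
# accept more (the image entry is open-ended).
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".xls",
    ".jpg", ".png", ".gif",
    ".wav", ".mp3",
    ".html",
    ".csv", ".json", ".xml",
    ".zip", ".epub",
}
AUDIO_EXTENSIONS = {".wav", ".mp3"}


def preflight(path, ffmpeg_path=None):
    """Return a list of human-readable problems; an empty list means OK."""
    suffix = Path(path).suffix.lower()
    problems = []
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {suffix or '(none)'}")
    elif suffix in AUDIO_EXTENSIONS:
        # Only audio needs ffmpeg, so other formats never trigger this check.
        if (ffmpeg_path or shutil.which("ffmpeg")) is None:
            problems.append("audio transcription needs ffmpeg on PATH")
    return problems
```

Checking ffmpeg only for audio inputs mirrors the rule that the ffmpeg warning can be ignored when audio features are not in use.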

Workflows

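Step 1: Prepare the environment (required)

Before using any script, first run:

bash setup.sh

This script will:

Check for a Python virtual environment and create one if it does not exist
Activate the virtual environment
Install markitdown and all dependencies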
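
A script started with `bash` runs in a child shell, so depending on how setup.sh is written, the activation may not carry over to the shell that later launches the Python scripts. A small guard at the top of each script can catch that case. This is a sketch only; `in_virtualenv` and `require_virtualenv` are hypothetical helper names and are not part of the skill.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """True when the interpreter is running inside a venv-style environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return prefix != base_prefix


def require_virtualenv():
    """Exit early with a hint instead of failing later on a missing import."""
    if not in_virtualenv():
        sys.exit("Not inside a virtual environment; run setup.sh and activate it first.")
```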

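Step 2: Convert files with the scripts

Once the environment is ready, you can use the following scripts:

Convert a single file

python convert_file.py --input document.pdf --output output.md

Parameters:

--input: Input file path (required)
--output: Output Markdown file path (optional; defaults to the terminal)
--enable-plugins: Enable third-party plugins (optional)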
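
The source of convert_file.py is not shown here, so the following is only a sketch of how such a script could wrap the library. The flag names come from the parameter list above; `MarkItDown(enable_plugins=...)`, `convert()`, and `text_content` are part of markitdown's documented Python API, while `build_parser`, `convert`, and `main` are illustrative names.

```python
import argparse
import sys


def build_parser():
    parser = argparse.ArgumentParser(description="Convert one file to Markdown")
    parser.add_argument("--input", required=True, help="input file path")
    parser.add_argument("--output", help="output .md path (default: stdout)")
    parser.add_argument("--enable-plugins", action="store_true",
                        help="enable third-party markitdown plugins")
    return parser


def convert(path, enable_plugins=False):
    # Imported lazily so --help works even before dependencies are installed.
    from markitdown import MarkItDown

    converter = MarkItDown(enable_plugins=enable_plugins)
    return converter.convert(path).text_content


def main(argv=None):
    args = build_parser().parse_args(argv)
    markdown = convert(args.input, enable_plugins=args.enable_plugins)
    if args.output:
        with open(args.output, "w", encoding="utf-8") as handle:
            handle.write(markdown)
    else:
        sys.stdout.write(markdown)
```

Calling `main(["--input", "document.pdf", "--output", "output.md"])` corresponds to the command line shown above.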

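Convert files in a batch

python batch_convert.py --input-dir ./documents --output-dir ./markdown --pattern "*.pdf"

Parameters:

--input-dir: Input directory path (required)
--output-dir: Output directory path (required)
--pattern: File matching pattern (optional; defaults to all files)
--recursive: Process subdirectories recursively (optional)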
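
How `--pattern` and `--recursive` could interact is sketched below using only the standard library. This is an assumption about the behavior, not the actual code of batch_convert.py; `plan_batch` is a hypothetical helper that pairs each matching input file with an output path and performs no conversion itself.

```python
from pathlib import Path


def plan_batch(input_dir, output_dir, pattern="*", recursive=False):
    """Pair every matching input file with the .md path it should be written to."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    matches = input_dir.rglob(pattern) if recursive else input_dir.glob(pattern)
    plan = []
    for source in sorted(matches):
        if not source.is_file():
            continue
        # Mirror the relative layout so same-named files in different
        # subdirectories do not overwrite each other.
        relative = source.relative_to(input_dir)
        plan.append((source, (output_dir / relative).with_suffix(".md")))
    return plan
```

One caveat of this mapping: two files in the same directory that share a stem, such as report.pdf and report.docx, would both map to report.md, so a real script would need a rule for that case.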

LLM-enhanced conversion (requires an OpenAI API key)

python convert_with_llm.py --input image.jpg --output output.md --api-key YOUR_API_KEY

Parameters:

--input: Input file path (required)
--output: Output file path (optional)
--api-key: OpenAI API key (required)
--model: LLM model (optional; defaults to gpt-4o)
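
In markitdown's Python API, LLM image descriptions are enabled by passing `llm_client` and `llm_model` to `MarkItDown`. The sketch below shows how the `--api-key` and `--model` values might be wired through; `build_llm_converter` and `describe_image` are hypothetical names, and the real convert_with_llm.py may differ.

```python
def build_llm_converter(api_key, model="gpt-4o"):
    """Return a MarkItDown instance that asks an OpenAI model to describe images."""
    # Lazy imports: both packages are only needed for this optional feature.
    from markitdown import MarkItDown
    from openai import OpenAI

    client = OpenAI(api_key=api_key)
    return MarkItDown(llm_client=client, llm_model=model)


def describe_image(path, api_key, model="gpt-4o"):
    """Convert one image and return the Markdown text, including the LLM description."""
    return build_llm_converter(api_key, model).convert(path).text_content
```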

Step 3: Advanced features

Using Azure Document Intelligence (requires configuration)

python convert_file.py --input document.pdf --output output.md --azure-endpoint YOUR_ENDPOINT
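
markitdown accepts the Document Intelligence endpoint through the `docintel_endpoint` keyword of `MarkItDown`. A possible mapping from the `--azure-endpoint` flag is sketched below. `converter_options` and `convert_with_azure` are hypothetical helpers, and the sketch assumes Azure credentials are already configured in the environment.

```python
def converter_options(azure_endpoint=None, enable_plugins=False):
    """Translate CLI flags into keyword arguments for MarkItDown()."""
    options = {"enable_plugins": enable_plugins}
    if azure_endpoint:
        # Only pass the endpoint when one was given, so the default
        # local converters are used otherwise.
        options["docintel_endpoint"] = azure_endpoint
    return options


def convert_with_azure(path, azure_endpoint):
    from markitdown import MarkItDown  # lazy: optional dependency path

    converter = MarkItDown(**converter_options(azure_endpoint))
    return converter.convert(path).text_content
```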

Reading input from a command-line pipe

cat document.pdf | python convert_file.py > output.md
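
Piped input arrives without a filename, so a script has to read raw bytes from standard input and let the library detect the format. The sketch below assumes the script falls back to stdin when no `--input` is given, and it relies on `convert_stream`, which in recent markitdown releases expects a binary file-like object. `read_piped_bytes` and `convert_piped` are illustrative names.

```python
import io
import sys


def read_piped_bytes(stream=None):
    """Read all piped input as bytes; a PDF must not be decoded as text."""
    stream = sys.stdin.buffer if stream is None else stream
    return stream.read()


def convert_piped(data):
    from markitdown import MarkItDown  # lazy import of the third-party library

    # convert_stream expects a binary file-like object. Without a filename,
    # the library has to detect the format from the content itself.
    return MarkItDown().convert_stream(io.BytesIO(data)).text_content
```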

Audio transcription (requires ffmpeg)

To use audio transcription, install ffmpeg first:

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg

Then convert the audio file:

python convert_file.py --input audio.mp3 --output transcript.md

Initialization

As a file conversion tool, you must follow the Rules above and carry out tasks according to the Workflows. Before using any feature, run setup.sh first to make sure the environment is configured correctly.
