etl-duckdb
Load CSV/XLSX into DuckDB with validation and an ETL markdown report
When & Why to Use This Skill
This Claude skill automates the ETL (Extract, Transform, Load) process of loading CSV and XLSX spreadsheets into a DuckDB database. It runs validation checks that track input rows, null cells, and duplicate rows, and it writes a markdown report so each run can be audited.
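The flow is roughly: extract (read the spreadsheet), validate (count rows, null cells, and duplicates), load (write the table into DuckDB), and report (summarize the run in markdown). Below is a minimal Python sketch of that flow, for orientation only; the skill itself runs the PowerShell script listed under Instructions, and the input path, table name (ops), and report layout used here are assumptions.

```python
# Illustrative sketch of the ETL flow; not the skill's actual implementation.
from pathlib import Path

import duckdb
import pandas as pd  # pd.read_excel needs openpyxl for .xlsx files

src = Path("data/input.csv")  # hypothetical input file
db_path = Path("data/_artifacts/ops.duckdb")
report_path = Path("reports/etl_report.md")

# Extract: load CSV or XLSX into a DataFrame
df = pd.read_excel(src) if src.suffix == ".xlsx" else pd.read_csv(src)

# Validate: collect the evidence metrics
input_rows = len(df)
null_cells = int(df.isna().sum().sum())
duplicated_rows = int(df.duplicated().sum())

# Load: write the DataFrame into DuckDB
db_path.parent.mkdir(parents=True, exist_ok=True)
con = duckdb.connect(str(db_path))
con.register("df", df)
con.execute("CREATE OR REPLACE TABLE ops AS SELECT * FROM df")
duckdb_written = con.execute("SELECT COUNT(*) FROM ops").fetchone()[0]
con.close()

# Report: emit a markdown summary of the run
report_path.parent.mkdir(parents=True, exist_ok=True)
report_path.write_text(
    "# ETL Report\n\n"
    "| metric | value |\n|---|---|\n"
    f"| input_rows | {input_rows} |\n"
    f"| null_cells | {null_cells} |\n"
    f"| duplicated_rows | {duplicated_rows} |\n"
    f"| duckdb_written | {duckdb_written} |\n",
    encoding="utf-8",
)
```

The four metrics collected here correspond to the evidence fields listed at the end of this document.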
Use Cases
- Data Migration: Efficiently move large datasets from legacy Excel or CSV formats into a DuckDB environment for high-speed analytical querying.
- Data Quality Auditing: Automatically identify and document data anomalies, such as missing values or duplicate records, before they impact downstream analysis.
- Automated Pipeline Reporting: Generate standardized ETL markdown reports that provide evidence of data ingestion success and integrity for stakeholders.
- Local Analytics Setup: Quickly initialize a local analytical database from raw files to perform complex SQL operations without manual database setup (a query sketch follows this list).
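For the local-analytics case, once the skill has written data/_artifacts/ops.duckdb the database can be queried directly from Python or the DuckDB CLI. A minimal sketch, assuming the data was loaded into a table named ops:

```python
# Query the database produced by the skill (table name "ops" is an assumption).
import duckdb

con = duckdb.connect("data/_artifacts/ops.duckdb", read_only=True)
print(con.execute("SELECT COUNT(*) AS row_count FROM ops").fetchall())
print(con.execute("SUMMARIZE ops").fetchdf())  # per-column stats (min/max, nulls, ...)
con.close()
```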
| field | value |
|---|---|
| name | etl-duckdb |
| description | Load CSV/XLSX into DuckDB with validation and an ETL markdown report |
Instructions:
- Run: powershell -ExecutionPolicy Bypass -File .codex/skills/etl-duckdb/scripts/run.ps1
- Output:
  - data/_artifacts/ops.duckdb
  - reports/etl_report.md

Fail-safe:
- If inputs are missing, produce only the abort table in reports/etl_report.md and exit (see the sketch after this section).

Evidence Required:
- input_rows, null_cells, duplicated_rows, duckdb_written
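A sketch of the fail-safe branch, under the same illustrative assumptions as the earlier example (the expected input path and the columns of the abort table are not specified by the skill):

```python
# Fail-safe sketch: if expected inputs are missing, write only an abort table and stop.
import sys
from pathlib import Path

expected_inputs = [Path("data/input.csv")]  # hypothetical expected input files
missing = [p for p in expected_inputs if not p.exists()]

if missing:
    report = Path("reports/etl_report.md")
    report.parent.mkdir(parents=True, exist_ok=True)
    rows = "\n".join(f"| {p} | missing |" for p in missing)
    report.write_text(
        "# ETL Report\n\n## Abort\n\n| input | status |\n|---|---|\n" + rows + "\n",
        encoding="utf-8",
    )
    sys.exit(1)  # non-zero exit is an assumption; the skill only says "exit"
```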