etl-duckdb

from macho715

Load CSV/XLSX into DuckDB with validation and an ETL markdown report

0 stars · 0 forks · Updated Jan 9, 2026

When & Why to Use This Skill

This Claude skill automates an ETL (Extract, Transform, Load) step: it loads CSV and XLSX spreadsheets into a DuckDB database, runs validation checks (input row count, null cells, duplicated rows), and writes a markdown report so the load can be audited.

Use Cases

  • Data Migration: Efficiently move large datasets from legacy Excel or CSV formats into a DuckDB environment for high-speed analytical querying.
  • Data Quality Auditing: Automatically identify and document data anomalies, such as missing values or duplicate records, before they impact downstream analysis.
  • Automated Pipeline Reporting: Generate standardized ETL markdown reports that provide evidence of data ingestion success and integrity for stakeholders.
  • Local Analytics Setup: Quickly initialize a local analytical database from raw files to perform complex SQL operations without manual database setup.
name: etl-duckdb
description: Load CSV/XLSX into DuckDB with validation and an ETL markdown report

Instructions:

  • Run: powershell -ExecutionPolicy Bypass -File .codex/skills/etl-duckdb/scripts/run.ps1
  • Output:
    • data/_artifacts/ops.duckdb
    • reports/etl_report.md

Fail-safe:

  • If inputs are missing, produce only the '중단' (Aborted) table in reports/etl_report.md and exit.

Evidence Required:

  • input_rows, null_cells, duplicated_rows, duckdb_written
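
The evidence fields and the fail-safe can be illustrated with a minimal Python sketch. This is not the logic of run.ps1, which is not shown here. The function names (collect_evidence, render_report, run) are hypothetical, and the sketch assumes a single CSV input with a header row, treats an empty string as a null cell, counts exact repeats of an earlier row as duplicates, and invents the layout of the aborted table. It does not write to DuckDB, so duckdb_written is always reported as False.

```python
import csv
from pathlib import Path


def collect_evidence(csv_path):
    """Count data rows, empty cells, and repeated rows in one CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        rows = list(csv.reader(fh))
    data = rows[1:]  # assumption: the first row is a header

    # Assumption: a "null cell" is a cell that is empty after trimming.
    null_cells = sum(1 for row in data for cell in row if cell.strip() == "")

    # Assumption: a "duplicated row" is an exact repeat of an earlier row.
    seen = set()
    duplicated_rows = 0
    for row in data:
        key = tuple(row)
        if key in seen:
            duplicated_rows += 1
        else:
            seen.add(key)

    return {
        "input_rows": len(data),
        "null_cells": null_cells,
        "duplicated_rows": duplicated_rows,
        "duckdb_written": False,  # this sketch never writes to DuckDB
    }


def render_report(evidence):
    """Render the evidence as markdown, or only an aborted table if None."""
    if evidence is None:
        # Hypothetical layout for the fail-safe table.
        lines = [
            "# ETL report",
            "",
            "| status | reason |",
            "| --- | --- |",
            "| aborted | inputs missing |",
        ]
    else:
        lines = ["# ETL report", "", "| field | value |", "| --- | --- |"]
        lines += [f"| {name} | {value} |" for name, value in evidence.items()]
    return "\n".join(lines) + "\n"


def run(csv_path):
    """Apply the fail-safe: a missing input yields only the aborted table."""
    path = Path(csv_path)
    evidence = collect_evidence(path) if path.is_file() else None
    return render_report(evidence)
```

Calling run() with a path that does not exist returns a report containing only the aborted table. Calling it with a real CSV returns one table row per evidence field.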