nixtla-schema-mapper
Transform data sources to Nixtla schema (unique_id, ds, y) with column inference. Use when preparing data for forecasting. Trigger with 'map to Nixtla schema' or 'transform data'.
When & Why to Use This Skill
The Nixtla Schema Mapper is a Claude skill that streamlines time-series data preparation for forecasting. It transforms diverse data sources (CSV, SQL, Parquet, and dbt models) into the standardized Nixtla format (unique_id, ds, y). Using column inference and automated Python code generation, the skill reduces manual data wrangling, enforces schema consistency through validation contracts, and shortens the path from raw data to a forecast-ready dataset.
Use Cases
- Case 1: Converting raw sales spreadsheets or CSV files into the required Nixtla format by automatically detecting timestamps, target values, and series IDs.
- Case 2: Generating reusable Python transformation scripts for SQL databases to ensure data pipelines consistently output forecasting-ready datasets.
- Case 3: Establishing schema contracts and validation rules for data engineering teams to maintain high data quality and prevent 'no timestamp' or 'non-numeric target' errors during model training.
- Case 4: Preparing complex dbt models for integration with forecasting tools like TimeGPT or StatsForecast by automating column renaming and type casting.
| name | nixtla-schema-mapper |
|---|---|
| description | "Transform data sources to Nixtla schema (unique_id, ds, y) with column inference. Use when preparing data for forecasting. Trigger with 'map to Nixtla schema' or 'transform data'." |
| allowed-tools | "Read,Write,Glob,Grep,Edit" |
| version | "1.1.0" |
| author | "Jeremy Longshore <jeremy@intentsolutions.io>" |
| license | MIT |
Nixtla Schema Mapper
Transform data sources to Nixtla-compatible schema (unique_id, ds, y).
Overview
This skill automates data transformation:
- Column inference: Detects timestamp, target, and ID columns
- Code generation: Python modules for CSV/SQL/Parquet/dbt
- Schema contracts: Documentation with validation rules
- Quality checks: Validates transformed data
Prerequisites
Required:
- Python 3.8+
- pandas
Optional:
- pyarrow: for Parquet support
- sqlalchemy: for SQL sources
- dbt-core: for dbt models
Installation:
pip install pandas pyarrow sqlalchemy
Instructions
Step 1: Identify Data Source
Supported formats:
- CSV/Parquet files
- SQL tables or queries
- dbt models
Step 2: Analyze Schema
python {baseDir}/scripts/analyze_schema.py --input data/sales.csv
Output:
Detected columns:
Timestamp: 'date' (datetime64)
Target: 'sales' (float64)
Series ID: 'store_id' (object)
Exogenous: price, promotion
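The detection performed by analyze_schema.py is heuristic. A minimal sketch of how such inference might work is below; the function name and the exact heuristics (dtype checks, parse-as-date probing, cardinality for IDs) are illustrative assumptions, not the script's actual implementation:

```python
import pandas as pd

def infer_columns(df: pd.DataFrame) -> dict:
    """Heuristically classify columns as timestamp, target, series ID, or exogenous."""
    result = {"timestamp": None, "target": None, "id": None, "exogenous": []}
    for col in df.columns:
        series = df[col]
        # Native datetime dtype is the strongest timestamp signal
        if pd.api.types.is_datetime64_any_dtype(series):
            result["timestamp"] = result["timestamp"] or col
        # Object columns that parse cleanly as dates are timestamp candidates
        elif series.dtype == object and pd.to_datetime(series, errors="coerce").notna().all():
            result["timestamp"] = result["timestamp"] or col
        # First float column is a reasonable target candidate
        elif pd.api.types.is_float_dtype(series) and result["target"] is None:
            result["target"] = col
        # A repeating non-numeric column suggests a series ID
        elif series.dtype == object and series.nunique() < len(series):
            result["id"] = result["id"] or col
        else:
            result["exogenous"].append(col)
    return result
```

On a frame like the sales example above, this would flag 'date' as the timestamp, 'sales' as the target, and 'store_id' as the series ID, leaving remaining columns as exogenous candidates.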
Step 3: Generate Transformation
python {baseDir}/scripts/generate_transform.py \
--input data/sales.csv \
--id_col store_id \
--date_col date \
--target_col sales \
--output data/transform/to_nixtla_schema.py
Step 4: Create Schema Contract
python {baseDir}/scripts/create_contract.py \
--mapping mapping.json \
--output NIXTLA_SCHEMA_CONTRACT.md
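The exact schema of mapping.json is defined by create_contract.py, but a plausible mapping file mirroring the CLI flags used in the earlier steps might look like this (all keys here are illustrative assumptions):

```json
{
  "source": "data/sales.csv",
  "columns": {
    "store_id": "unique_id",
    "date": "ds",
    "sales": "y"
  },
  "date_format": "%Y-%m-%d"
}
```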
Step 5: Validate Transformation
python data/transform/to_nixtla_schema.py
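The quality checks behind this step can be expressed as a small set of assertions on the transformed frame. The following is a sketch of plausible contract checks (the function name and the specific rules are assumptions, chosen to match the errors this skill guards against):

```python
import pandas as pd

REQUIRED = ["unique_id", "ds", "y"]

def validate_nixtla_schema(df: pd.DataFrame) -> list:
    """Return a list of human-readable problems; an empty list means the frame is valid."""
    problems = []
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
        return problems  # further checks assume the columns exist
    if not pd.api.types.is_datetime64_any_dtype(df["ds"]):
        problems.append("'ds' is not a datetime column")
    if not pd.api.types.is_numeric_dtype(df["y"]):
        problems.append("'y' is not numeric")
    elif df["y"].isna().any():
        problems.append("'y' contains nulls")
    if df.duplicated(subset=["unique_id", "ds"]).any():
        problems.append("duplicate (unique_id, ds) pairs")
    return problems
```

Running this after the generated transform gives a fast pass/fail signal before the data ever reaches a forecasting model.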
Output
- data/transform/to_nixtla_schema.py: Transformation module
- NIXTLA_SCHEMA_CONTRACT.md: Schema documentation
- nixtla_data.csv: Transformed data (optional)
Error Handling
- Error: No timestamp column detected. Solution: specify manually with --date_col.
- Error: Multiple target candidates. Solution: specify manually with --target_col.
- Error: Date parsing failed. Solution: specify the format with --date_format "%Y-%m-%d".
- Error: Non-numeric target column. Solution: check for string values; use pd.to_numeric(errors='coerce').
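The non-numeric target case is worth a quick illustration. Coercion turns unparseable values into NaN instead of raising, so they can be inspected, dropped, or imputed before forecasting:

```python
import pandas as pd

# A target column contaminated with a string placeholder
raw = pd.Series(["10.5", "12.0", "N/A", "13.25"])

# errors="coerce" maps unparseable entries to NaN instead of raising
y = pd.to_numeric(raw, errors="coerce")

print(y.isna().sum())  # count of rows that failed to parse: 1
```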
Examples
Example 1: CSV Transformation
python {baseDir}/scripts/generate_transform.py \
--input sales.csv \
--id_col product_id \
--date_col timestamp \
--target_col revenue
Generated code:
import pandas as pd

def to_nixtla_schema(path="sales.csv"):
    df = pd.read_csv(path)
    df = df.rename(columns={
        'product_id': 'unique_id',
        'timestamp': 'ds',
        'revenue': 'y'
    })
    df['ds'] = pd.to_datetime(df['ds'])
    return df[['unique_id', 'ds', 'y']]
Example 2: SQL Source
python {baseDir}/scripts/generate_transform.py \
--sql "SELECT * FROM daily_sales" \
--connection postgresql://localhost/db \
--id_col store_id \
--date_col sale_date \
--target_col amount
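For a SQL source, the generated module might look roughly like the following. The column mapping mirrors the CLI flags above; the engine-based signature and sort step are assumptions for illustration, not the script's verbatim output:

```python
import pandas as pd
from sqlalchemy import create_engine  # noqa: F401 (used by callers to build the engine)

def to_nixtla_schema(engine, sql="SELECT * FROM daily_sales"):
    """Read a SQL source and map it to the Nixtla (unique_id, ds, y) schema."""
    df = pd.read_sql(sql, engine)
    df = df.rename(columns={"store_id": "unique_id",
                            "sale_date": "ds",
                            "amount": "y"})
    df["ds"] = pd.to_datetime(df["ds"])
    # Sorting by series and time is what most Nixtla pipelines expect downstream
    return df[["unique_id", "ds", "y"]].sort_values(["unique_id", "ds"])
```

Passing the engine in, rather than hard-coding the connection string, makes the module testable against an in-memory SQLite database while the production pipeline supplies the Postgres engine.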
Resources
- Scripts: {baseDir}/scripts/
- Templates: {baseDir}/assets/templates/
- Nixtla Schema Docs: https://nixtla.github.io/statsforecast/
Related Skills:
- nixtla-timegpt-lab: Use transformed data for forecasting
- nixtla-experiment-architect: Reference in experiments