date-normalizer
Use when asked to parse, normalize, standardize, or convert dates from various formats to consistent ISO 8601 or custom formats.
When & Why to Use This Skill
The Date Normalizer Claude skill is a robust utility designed to parse, standardize, and convert dates from over 100 different formats into consistent ISO 8601 or custom structures. It streamlines data preparation for ETL pipelines, database management, and international data harmonization by intelligently handling timezones, relative dates, and ambiguous formats to ensure data integrity.
Use Cases
- Standardizing inconsistent date formats across large CSV datasets to prepare them for seamless database imports and ETL processing.
- Parsing and normalizing timestamps in system log files from multiple sources to facilitate accurate time-series analysis and debugging.
- Resolving regional date format ambiguities (such as US MM/DD vs. EU DD/MM) in international datasets using smart detection and explicit configuration.
- Converting natural language relative date expressions like 'yesterday', 'last month', or '2 weeks ago' into structured, machine-readable calendar dates.
- Validating and cleaning messy user-submitted date strings in web forms or legacy systems to prevent downstream data errors.
| name | date-normalizer |
|---|---|
| description | Use when asked to parse, normalize, standardize, or convert dates from various formats to consistent ISO 8601 or custom formats. |
Date Normalizer
Parse and normalize dates from various formats into consistent, standardized formats for data cleaning and ETL pipelines.
Purpose
Date standardization for:
- Data cleaning and ETL pipelines
- Database imports with mixed date formats
- Log file parsing and analysis
- International data harmonization
- Report generation with consistent dating
Features
- Smart Parsing: Automatically detect and parse 100+ date formats
- Format Conversion: Convert to ISO 8601, US, EU, or custom formats
- Batch Processing: Normalize entire CSV columns
- Ambiguity Detection: Flag dates that could be interpreted multiple ways
- Timezone Handling: Convert and normalize timezones
- Relative Dates: Parse "today", "yesterday", "next week"
- Validation: Detect and report invalid dates
Quick Start
from date_normalizer import DateNormalizer
# Normalize single date
normalizer = DateNormalizer()
result = normalizer.normalize("03/14/2024")
print(result) # {'normalized': '2024-03-14', 'format': 'iso8601'}
# Normalize to specific format
result = normalizer.normalize("March 14, 2024", output_format="us")
print(result) # {'normalized': '03/14/2024', 'format': 'us'}
# Batch normalize CSV column
normalizer.normalize_csv(
'data.csv',
date_column='created_at',
output='normalized.csv',
output_format='iso8601'
)
CLI Usage
# Normalize single date
python date_normalizer.py --date "March 14, 2024"
# Convert to specific format
python date_normalizer.py --date "14/03/2024" --format us
# Normalize CSV column
python date_normalizer.py --csv data.csv --column date --format iso8601 --output normalized.csv
# Detect ambiguous dates
python date_normalizer.py --date "01/02/03" --detect-ambiguous
API Reference
DateNormalizer
class DateNormalizer:
def normalize(self, date_string: str, output_format: str = 'iso8601',
dayfirst: bool = False, yearfirst: bool = False) -> Dict
def normalize_batch(self, dates: List[str], **kwargs) -> List[Dict]
def normalize_csv(self, csv_path: str, date_column: str,
output: str = None, **kwargs) -> str
def detect_format(self, date_string: str) -> str
def is_valid(self, date_string: str) -> bool
def is_ambiguous(self, date_string: str) -> bool
def parse_relative(self, relative_string: str) -> datetime
Output Formats
ISO 8601 (default):
'2024-03-14' # Date only
'2024-03-14T15:30:00' # With time
'2024-03-14T15:30:00+00:00' # With timezone
US Format:
'03/14/2024' # MM/DD/YYYY
EU Format:
'14/03/2024' # DD/MM/YYYY
Long Format:
'March 14, 2024'
Custom Format:
normalizer.normalize(date, output_format='%Y%m%d') # '20240314'
Supported Input Formats
Numeric:
2024-03-14(ISO)03/14/2024(US)14/03/2024(EU)14.03.2024(German)2024/03/14(Japanese)20240314(Compact)
Textual:
March 14, 202414 March 2024Mar 14, 202414-Mar-2024
Relative:
today,yesterday,tomorrownext week,last month2 days ago,in 3 weeks
With Time:
2024-03-14 15:30:0003/14/2024 3:30 PM2024-03-14T15:30:00Z
Ambiguity Handling
Dates like 01/02/03 are ambiguous. Specify interpretation:
# Day first (EU)
normalizer.normalize("01/02/03", dayfirst=True)
# Result: 2003-02-01
# Month first (US)
normalizer.normalize("01/02/03", dayfirst=False)
# Result: 2003-01-02
# Year first
normalizer.normalize("01/02/03", yearfirst=True)
# Result: 2001-02-03
Use Cases
Clean Messy Data:
messy_dates = [
"March 14, 2024",
"2024-03-15",
"03/16/2024",
"17-Mar-2024"
]
normalized = normalizer.normalize_batch(messy_dates)
# All converted to: ['2024-03-14', '2024-03-15', '2024-03-16', '2024-03-17']
CSV Normalization:
# Input CSV with mixed date formats
# Convert all to ISO 8601
normalizer.normalize_csv(
'orders.csv',
date_column='order_date',
output='orders_normalized.csv',
output_format='iso8601'
)
Validation:
if not normalizer.is_valid("invalid date"):
print("Invalid date detected")
Timezone Conversion:
normalizer.normalize(
"2024-03-14 15:30:00+00:00",
output_timezone='America/New_York'
)
Limitations
- Cannot parse dates from images or PDFs (use OCR first)
- Ambiguous dates require manual specification of format
- Very old dates (<1900) may have limited support
- Non-Gregorian calendars not supported
- Some regional formats may need explicit configuration