time-series-analysis
Temporal pattern detection and forecasting. Use when analyzing trends over time, detecting seasonality, identifying anomalies in time series, or building simple forecasting models.
When & Why to Use This Skill
This Claude skill provides a comprehensive framework for temporal pattern detection and forecasting. It enables users to analyze historical data to identify underlying trends, seasonal cycles, and statistical anomalies. By leveraging advanced statistical models like ARIMA, it generates data-driven forecasts with confidence intervals, helping businesses make informed decisions for capacity planning, financial budgeting, and operational monitoring.
Use Cases
- Revenue and Financial Forecasting: Predict future sales and revenue trends to inform quarterly budget cycles and strategic planning.
- Operational Capacity Planning: Forecast infrastructure needs, such as web traffic or server load, to ensure resource availability during peak periods.
- Anomaly Detection and Monitoring: Identify unusual spikes or drops in business metrics to quickly investigate data quality issues or external incidents.
- Seasonal Trend Analysis: Decompose metrics to understand the impact of holidays, weekends, and annual cycles on user behavior and performance.
Time Series Analysis
Quick Start
Analyze temporal data to identify trends, seasonality, and anomalies, then create forecasts with confidence intervals for business planning.
Context Requirements
Before analyzing time series, I need:
- Time Series Data: Values measured over time
- Time Granularity: Daily, weekly, monthly, hourly, etc.
- Forecast Horizon: How far ahead to predict
- Known Events (optional): Holidays, campaigns, incidents that affect data
- Business Context: What drives changes in this metric
Context Gathering
For Time Series Data:
"I need historical data with timestamps:
Format:
date | metric_value
2024-01-01 | 1,250
2024-01-02 | 1,320
2024-01-03 | 1,180
...
Requirements:
- Regular intervals (no gaps preferred)
- At least two full seasonal cycles of data:
  - Daily data: 2+ years
  - Weekly data: 2+ years
  - Monthly data: 3+ years
Additional Variables (optional but helpful):
- External factors (marketing spend, weather, etc.)
- Holiday indicators
- Event flags
What time period do you have data for?"
For Granularity:
"What's the natural granularity of your data?
Options:
- Hourly: Web traffic, API calls
- Daily: User signups, revenue, sessions
- Weekly: Aggregate business metrics
- Monthly: Financial reporting, subscriptions
- Quarterly: Board-level metrics
Choose based on:
- How data is collected
- Decision-making frequency
- Forecast use case"
For Forecast Horizon:
"How far ahead do you need to forecast?
Common Horizons:
- Short-term: 1-7 days (operational planning)
- Medium-term: 1-3 months (budget cycles)
- Long-term: 6-12 months (strategic planning)
Note: Forecast accuracy decreases with longer horizons."
For Seasonal Patterns:
"Does this metric have known patterns?
Common Patterns:
- Day of week: Higher on weekdays vs weekends
- Monthly: Month-end spikes, seasonal trends
- Quarterly: Q4 surge for retail
- Annual: Holiday seasons, fiscal year cycles
- Multiple: Weekly + Annual (e.g., summer weekends)
Understanding patterns improves forecast accuracy."
For External Factors:
"What external factors influence this metric?
Examples:
- Marketing campaigns (start/end dates)
- Product launches
- Pricing changes
- Competitor actions
- Economic indicators
- Weather (for some businesses)
- Holidays
These can be included in advanced forecasting models."
Workflow
Step 1: Load and Visualize Time Series
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller, acf, pacf
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_absolute_error, mean_squared_error
import warnings
warnings.filterwarnings('ignore')
# Load data
df = pd.read_csv('time_series_data.csv')
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values('date')
df.set_index('date', inplace=True)
print(f"📊 Time Series Loaded:")
print(f" Start: {df.index.min()}")
print(f" End: {df.index.max()}")
print(f" Length: {len(df)} observations")
print(f" Frequency: {df.index.inferred_freq or 'Unknown'}")
# Plot raw time series
plt.figure(figsize=(14, 6))
plt.plot(df.index, df['value'], linewidth=1.5)
plt.title('Time Series Plot')
plt.xlabel('Date')
plt.ylabel('Value')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('time_series_plot.png', dpi=300, bbox_inches='tight')
plt.show()
Checkpoint: "Data loaded. Time range looks good? Any obvious issues?"
Step 2: Check for Stationarity
def test_stationarity(timeseries):
"""
Test if time series is stationary using Augmented Dickey-Fuller test
"""
# Perform ADF test
result = adfuller(timeseries.dropna())
print('\n📈 Stationarity Test (Augmented Dickey-Fuller):')
print(f' ADF Statistic: {result[0]:.4f}')
print(f' P-value: {result[1]:.4f}')
print(f' Critical Values:')
for key, value in result[4].items():
print(f' {key}: {value:.3f}')
if result[1] <= 0.05:
print(f' ✅ STATIONARY: Reject null hypothesis (p < 0.05)')
print(f' Time series has no unit root')
stationary = True
else:
print(f' ⚠️ NON-STATIONARY: Cannot reject null hypothesis (p > 0.05)')
print(f' Time series may have trend/seasonality')
print(f' Consider differencing or detrending')
stationary = False
return stationary
is_stationary = test_stationarity(df['value'])
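If the test flags non-stationarity, a common next step is to difference once and re-test. A minimal sketch, reusing df and test_stationarity() from above (a log transform before differencing is another option when variance grows with the level):
# Difference once and re-run the ADF test; repeat or transform if still non-stationary
if not is_stationary:
    diff_series = df['value'].diff().dropna()
    print("\nRe-testing after first differencing:")
    test_stationarity(diff_series)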
Step 3: Decompose Time Series
def decompose_time_series(df, period=None):
"""
Decompose time series into trend, seasonal, and residual components
"""
    # Determine seasonal period from the index frequency if not specified
    if period is None:
        freq = df.index.inferred_freq or ''
        if freq.startswith('D'):
            period = 7   # Weekly seasonality for daily data
        elif freq.startswith('W'):
            period = 52  # Annual seasonality for weekly data
        elif freq.startswith('M'):
            period = 12  # Annual seasonality for monthly data
        else:
            period = 12  # Default
# Perform decomposition
decomposition = seasonal_decompose(df['value'], model='additive', period=period)
# Plot components
fig, axes = plt.subplots(4, 1, figsize=(14, 10))
# Original
df['value'].plot(ax=axes[0], title='Original Time Series')
axes[0].set_ylabel('Value')
# Trend
decomposition.trend.plot(ax=axes[1], title='Trend Component')
axes[1].set_ylabel('Trend')
# Seasonal
decomposition.seasonal.plot(ax=axes[2], title='Seasonal Component')
axes[2].set_ylabel('Seasonal')
# Residual
decomposition.resid.plot(ax=axes[3], title='Residual Component')
axes[3].set_ylabel('Residual')
plt.tight_layout()
plt.savefig('decomposition.png', dpi=300, bbox_inches='tight')
plt.show()
# Calculate strength of components
trend_strength = 1 - (decomposition.resid.var() /
(decomposition.trend + decomposition.resid).var())
seasonal_strength = 1 - (decomposition.resid.var() /
(decomposition.seasonal + decomposition.resid).var())
print(f"\n📊 Decomposition Analysis:")
print(f" Trend Strength: {trend_strength:.1%}")
print(f" Seasonal Strength: {seasonal_strength:.1%}")
if trend_strength > 0.6:
print(f" ↗️ Strong trend detected")
if seasonal_strength > 0.6:
print(f" 🔄 Strong seasonality detected")
return decomposition
decomposition = decompose_time_series(df)
Step 4: Detect Anomalies
def detect_anomalies(df, method='rolling', window=30):
"""
Detect anomalies in time series using multiple methods
"""
anomalies = pd.DataFrame(index=df.index)
anomalies['value'] = df['value']
    # Method 1: Rolling z-score (deviation from a centered rolling median)
    if method in ['rolling', 'all']:
rolling_median = df['value'].rolling(window=window, center=True).median()
rolling_std = df['value'].rolling(window=window, center=True).std()
# Calculate z-score equivalent
z_score = np.abs((df['value'] - rolling_median) / rolling_std)
        anomalies['rolling_anomaly'] = z_score > 3
# Method 2: Statistical outliers (MAD - Median Absolute Deviation)
if method in ['mad', 'all']:
median = df['value'].median()
mad = np.median(np.abs(df['value'] - median))
modified_z_score = 0.6745 * (df['value'] - median) / mad
anomalies['mad_anomaly'] = np.abs(modified_z_score) > 3.5
# Method 3: Residual-based (using decomposition)
if method in ['residual', 'all']:
try:
decomp = seasonal_decompose(df['value'], model='additive', period=7)
resid_std = decomp.resid.std()
anomalies['residual_anomaly'] = np.abs(decomp.resid) > 3 * resid_std
        except Exception:
            anomalies['residual_anomaly'] = False
# Combine methods (any method flags as anomaly)
if method == 'all':
        anomalies['is_anomaly'] = (
            anomalies['rolling_anomaly'] |
            anomalies['mad_anomaly'] |
            anomalies['residual_anomaly']
        )
else:
anomalies['is_anomaly'] = anomalies[f'{method}_anomaly']
# Plot anomalies
plt.figure(figsize=(14, 6))
plt.plot(df.index, df['value'], label='Normal', alpha=0.7)
if anomalies['is_anomaly'].any():
anomaly_points = anomalies[anomalies['is_anomaly']]
plt.scatter(anomaly_points.index, anomaly_points['value'],
color='red', s=100, label='Anomaly', zorder=5)
plt.title('Anomaly Detection')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('anomalies.png', dpi=300, bbox_inches='tight')
plt.show()
# Report anomalies
anomaly_dates = anomalies[anomalies['is_anomaly']].index
print(f"\n🚨 Anomalies Detected: {len(anomaly_dates)}")
if len(anomaly_dates) > 0:
        print(f"\n  Top 5 Most Extreme Anomalies:")
        # Rank by absolute deviation from the median so both spikes and drops surface
        deviation = (anomalies.loc[anomalies['is_anomaly'], 'value'] - df['value'].median()).abs()
        top_anomalies = anomalies.loc[deviation.nlargest(5).index]
        for date, row in top_anomalies.iterrows():
            print(f"    {date.strftime('%Y-%m-%d')}: {row['value']:,.0f}")
return anomalies
anomalies = detect_anomalies(df, method='all')
Step 5: Build Forecast Model
def build_forecast_model(df, forecast_periods=30):
"""
Build ARIMA forecast model
"""
# Split into train/test
train_size = int(len(df) * 0.8)
train, test = df[:train_size], df[train_size:]
print(f"\n🔮 Building Forecast Model:")
print(f" Train size: {len(train)} observations")
print(f" Test size: {len(test)} observations")
print(f" Forecast horizon: {forecast_periods} periods")
    # Default to a simple ARIMA(1,1,1) baseline; see the order-selection sketch
    # after this step (or pmdarima.auto_arima) for a data-driven choice
model = ARIMA(train['value'], order=(1, 1, 1))
model_fit = model.fit()
print(f"\n Model: ARIMA(1,1,1)")
print(f" AIC: {model_fit.aic:.2f}")
# Forecast on test set
forecast_test = model_fit.forecast(steps=len(test))
# Calculate errors
mae = mean_absolute_error(test['value'], forecast_test)
rmse = np.sqrt(mean_squared_error(test['value'], forecast_test))
mape = np.mean(np.abs((test['value'] - forecast_test) / test['value'])) * 100
print(f"\n 📊 Test Set Performance:")
print(f" MAE: {mae:.2f}")
print(f" RMSE: {rmse:.2f}")
print(f" MAPE: {mape:.1f}%")
# Refit on full dataset
model_full = ARIMA(df['value'], order=(1,1,1))
model_full_fit = model_full.fit()
# Generate future forecast
forecast_result = model_full_fit.get_forecast(steps=forecast_periods)
forecast_mean = forecast_result.predicted_mean
forecast_ci = forecast_result.conf_int()
    # Create forecast dates that respect the series frequency
    last_date = df.index[-1]
    freq = df.index.inferred_freq or 'D'
    forecast_dates = pd.date_range(start=last_date, periods=forecast_periods + 1,
                                   freq=freq)[1:]
forecast_df = pd.DataFrame({
'forecast': forecast_mean.values,
'lower_ci': forecast_ci.iloc[:, 0].values,
'upper_ci': forecast_ci.iloc[:, 1].values
}, index=forecast_dates)
return model_full_fit, forecast_df, mae, rmse, mape
model, forecast, mae, rmse, mape = build_forecast_model(df, forecast_periods=30)
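The ARIMA(1,1,1) above is only a baseline. A minimal sketch of AIC-based order selection using the ARIMA class already imported; the (p, d, q) search ranges are assumptions to adjust per series:
import itertools

# Small AIC-based search over ARIMA orders; keep the lowest-AIC fit
best_aic, best_order = np.inf, None
for p, d, q in itertools.product(range(3), range(2), range(3)):
    try:
        candidate = ARIMA(df['value'], order=(p, d, q)).fit()
        if candidate.aic < best_aic:
            best_aic, best_order = candidate.aic, (p, d, q)
    except Exception:
        continue  # some orders fail to converge; skip them

print(f"Best order by AIC: ARIMA{best_order} (AIC: {best_aic:.2f})")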
Step 6: Visualize Forecast
def plot_forecast(df, forecast, title='Time Series Forecast'):
"""
Visualize historical data and forecast with confidence intervals
"""
fig, ax = plt.subplots(figsize=(14, 6))
# Plot historical data
ax.plot(df.index, df['value'], label='Historical', linewidth=2)
# Plot forecast
ax.plot(forecast.index, forecast['forecast'],
label='Forecast', linewidth=2, linestyle='--', color='red')
# Plot confidence interval
ax.fill_between(forecast.index,
forecast['lower_ci'],
forecast['upper_ci'],
alpha=0.2, color='red', label='95% Confidence Interval')
# Add vertical line at forecast start
ax.axvline(df.index[-1], color='black', linestyle=':', alpha=0.5)
ax.set_xlabel('Date')
ax.set_ylabel('Value')
ax.set_title(title)
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('forecast.png', dpi=300, bbox_inches='tight')
plt.show()
# Print forecast summary
print(f"\n📅 Forecast Summary:")
print(f"\n Next 7 Days:")
print(forecast.head(7).to_string())
plot_forecast(df, forecast)
Step 7: Analyze Forecast Uncertainty
def analyze_uncertainty(forecast):
"""
Quantify forecast uncertainty
"""
forecast['uncertainty_pct'] = (
(forecast['upper_ci'] - forecast['lower_ci']) / forecast['forecast'] * 100
)
print(f"\n📊 Forecast Uncertainty Analysis:")
print(f" Average uncertainty: {forecast['uncertainty_pct'].mean():.1f}%")
    print(f"   Near-term (Day 1-7): {forecast['uncertainty_pct'].iloc[:7].mean():.1f}%")
    print(f"   Medium-term (Day 8-14): {forecast['uncertainty_pct'].iloc[7:14].mean():.1f}%")
    print(f"   Long-term (Day 15+): {forecast['uncertainty_pct'].iloc[14:].mean():.1f}%")
if forecast['uncertainty_pct'].mean() < 10:
confidence = "High"
elif forecast['uncertainty_pct'].mean() < 20:
confidence = "Moderate"
else:
confidence = "Low"
print(f"\n Forecast Confidence: {confidence}")
return confidence
confidence = analyze_uncertainty(forecast)
Step 8: Generate Insights and Recommendations
def generate_insights(df, decomposition, anomalies, forecast, confidence):
"""
Synthesize findings into actionable insights
"""
print(f"\n{'='*70}")
print("💡 KEY INSIGHTS")
print('='*70)
# Trend insight
recent_trend = decomposition.trend.dropna().iloc[-30:].mean()
older_trend = decomposition.trend.dropna().iloc[-60:-30].mean()
trend_change = (recent_trend - older_trend) / older_trend * 100
print(f"\n📈 TREND:")
if abs(trend_change) < 5:
print(f" Status: Stable")
print(f" Recent vs Previous: {trend_change:+.1f}%")
elif trend_change > 5:
print(f" Status: ↗️ Growing")
print(f" Recent vs Previous: {trend_change:+.1f}%")
print(f" Action: Capitalize on growth momentum")
else:
print(f" Status: ↘️ Declining")
print(f" Recent vs Previous: {trend_change:+.1f}%")
print(f" Action: Investigate drivers of decline")
# Seasonality insight
seasonal_range = decomposition.seasonal.max() - decomposition.seasonal.min()
seasonal_pct = seasonal_range / df['value'].mean() * 100
print(f"\n🔄 SEASONALITY:")
print(f" Seasonal variation: {seasonal_pct:.1f}% of average")
if seasonal_pct > 20:
print(f" Status: Strong seasonal pattern")
print(f" Action: Plan inventory/staffing for seasonal peaks")
elif seasonal_pct > 10:
print(f" Status: Moderate seasonal pattern")
print(f" Action: Consider seasonal adjustments")
else:
print(f" Status: Minimal seasonality")
# Anomaly insight
anomaly_count = anomalies['is_anomaly'].sum()
anomaly_pct = anomaly_count / len(anomalies) * 100
print(f"\n🚨 ANOMALIES:")
print(f" Total detected: {anomaly_count} ({anomaly_pct:.1f}%)")
if anomaly_pct > 5:
print(f" Status: High anomaly rate")
print(f" Action: Investigate data quality or unusual events")
else:
print(f" Status: Normal anomaly rate")
# Forecast insight
    # Compare the average of the next 7 forecast days against the last observed value
    forecast_week = forecast['forecast'].iloc[:7].mean()
    last_value = df['value'].iloc[-1]
    forecast_direction = "increase" if forecast_week > last_value else "decrease"
    forecast_change = abs(forecast_week - last_value) / last_value * 100
print(f"\n🔮 FORECAST (Next 7 Days):")
print(f" Direction: {forecast_direction.title()}")
print(f" Expected change: {forecast_change:.1f}%")
print(f" Forecast confidence: {confidence}")
if confidence == "High":
print(f" Action: Use forecast for planning with confidence")
elif confidence == "Moderate":
print(f" Action: Use forecast but build in buffer")
else:
print(f" Action: Use forecast as directional guide only")
generate_insights(df, decomposition, anomalies, forecast, confidence)
# Save results
forecast.to_csv('forecast_results.csv')
anomalies.to_csv('detected_anomalies.csv')
Context Validation
Before proceeding, verify:
- Have sufficient historical data (2+ seasonal cycles)
- Time series has regular intervals (minimal gaps)
- Understand seasonal patterns and external factors
- Forecast horizon aligns with business planning needs
- Know how forecast will be used (decision context)
Output Template
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TIME SERIES ANALYSIS REPORT
Metric: Daily Active Users
Period: Jan 2023 - Dec 2024 (730 days)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 HISTORICAL ANALYSIS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Time Series Characteristics:
⚠️ Stationary: No (ADF p-value: 0.21); differenced once (d=1) for modeling
📈 Trend Strength: 75% (Strong upward trend)
🔄 Seasonal Strength: 62% (Clear weekly pattern)
📉 Noise Level: 15%
Decomposition Summary:
• Trend: +25% growth over period
• Seasonality: 30% variation peak-to-trough
• Pattern: Higher on weekdays, dips on weekends
Anomalies Detected: 15 (2.1% of days)
Top Anomalies:
• 2024-07-04: 45,000 users (holiday spike)
• 2024-11-28: 52,000 users (Black Friday)
• 2024-03-15: 18,000 users (outage)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔮 FORECAST (Next 30 Days)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Model: ARIMA(1,1,1)
AIC: 8,234.5
Test MAPE: 5.2% (Good accuracy)
Forecast Confidence: HIGH
Next 7 Days Forecast:
Day 1: 32,500 [31,800 - 33,200]
Day 2: 33,100 [32,300 - 33,900]
Day 3: 33,800 [32,900 - 34,700]
Day 4: 34,200 [33,200 - 35,200]
Day 5: 34,500 [33,400 - 35,600]
Day 6: 30,800 [29,700 - 31,900] (Weekend dip)
Day 7: 29,500 [28,300 - 30,700] (Weekend dip)
Month Ahead:
Average: 32,750 users/day
Range: [31,200 - 34,300]
Uncertainty: ±10%
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
💡 KEY INSIGHTS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. STRONG GROWTH TRAJECTORY
• 25% increase over 2-year period
• Accelerating in recent months
• Momentum: Positive
2. PREDICTABLE WEEKLY PATTERN
• 30% higher on weekdays vs weekends
• Stable pattern enables accurate forecasting
• Opportunity: Target weekend engagement
3. HOLIDAY IMPACT SIGNIFICANT
• Major holidays show +50% spikes
• Plan capacity for upcoming holidays
• Black Friday 2024: 52k users (record high)
4. SEASONAL TRENDS
• Q4 strongest quarter (holiday shopping)
• Summer months show slight dip
• Plan marketing around seasonal patterns
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📋 RECOMMENDATIONS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
IMMEDIATE:
1. Use forecast for next month's capacity planning
2. Prepare for 32-35k daily users range
3. Build 15% buffer for weekend dips
STRATEGIC:
4. Investigate weekend engagement opportunities
5. Prepare for holiday peaks (Q4 2025)
6. Monitor trend for any deceleration signals
MONITORING:
7. Set up automated anomaly alerts
8. Refresh forecast weekly
9. Track actual vs forecast accuracy
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📁 FILES GENERATED
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✓ time_series_plot.png (historical data)
✓ decomposition.png (trend/seasonal components)
✓ anomalies.png (anomaly detection)
✓ forecast.png (30-day forecast with CI)
✓ forecast_results.csv (detailed forecast)
✓ detected_anomalies.csv (anomaly list)
Common Scenarios
Scenario 1: "Forecast next quarter's revenue"
→ Build ARIMA or Prophet model → Include seasonality and trends → Generate point estimates + confidence intervals → Validate against historical accuracy → Provide business planning ranges
Scenario 2: "Detect anomalies in web traffic"
→ Establish baseline using historical patterns → Flag statistical outliers → Investigate anomaly causes → Set up automated monitoring → Create alert thresholds
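For the monitoring step, a minimal sketch of turning the historical baseline into alert thresholds, reusing df from the workflow; the 30-day window and 3-sigma band are assumptions to tune per metric:
# Derive alert thresholds from a rolling baseline
rolling_mean = df['value'].rolling(window=30).mean()
rolling_std = df['value'].rolling(window=30).std()
alert_bands = pd.DataFrame({
    'upper_alert': rolling_mean + 3 * rolling_std,
    'lower_alert': rolling_mean - 3 * rolling_std,
})
latest = alert_bands.dropna().iloc[-1]
print(f"Alert if value > {latest['upper_alert']:,.0f} or < {latest['lower_alert']:,.0f}")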
Scenario 3: "Understand what's driving metric changes"
→ Decompose into trend/seasonal/residual → Quantify strength of each component → Correlate with external events → Identify primary drivers → Provide actionable insights
Scenario 4: "Capacity planning for infrastructure"
→ Forecast peak usage periods → Add buffer for uncertainty → Include known upcoming events → Calculate required resources → Monitor actual vs forecast
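A minimal sketch of sizing capacity from the forecast, reusing the forecast dataframe from Step 5; the 20% buffer is an assumption to set from interval width and risk tolerance:
# Required capacity = forecast upper bound at peak, plus a safety buffer
peak_demand = forecast['upper_ci'].max()
required_capacity = peak_demand * 1.20  # 20% buffer is an illustrative assumption
print(f"Provision for roughly {required_capacity:,.0f} units at peak")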
Scenario 5: "Compare YoY growth trends"
→ Normalize for seasonality → Calculate year-over-year changes → Test statistical significance → Identify inflection points → Project future trajectories
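A minimal sketch of a year-over-year comparison on daily data, reusing df; the 365-row shift assumes a gap-free daily index and ignores leap days:
# Year-over-year percentage change on daily data
yoy = df[['value']].copy()
yoy['last_year'] = yoy['value'].shift(365)
yoy['yoy_pct'] = (yoy['value'] - yoy['last_year']) / yoy['last_year'] * 100
print(yoy['yoy_pct'].dropna().describe())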
Handling Missing Context
User has irregular time series: "Your data has gaps. Options:
- Impute missing values (forward fill, interpolation)
- Model irregular time series (different approach)
- Aggregate to a coarser time grain (daily → weekly; see the resampling sketch below)
Which makes most sense for your use case?"
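A minimal sketch of the imputation and aggregation options above, assuming daily granularity and short gaps:
# Option A: reindex to a regular daily grid and interpolate short gaps only
# ('D' frequency and the 3-day limit are assumptions)
df_regular = df.asfreq('D')
df_regular['value'] = df_regular['value'].interpolate(limit=3)

# Option B: aggregate to a coarser grain where small gaps average out
df_weekly = df['value'].resample('W').mean()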
User doesn't know forecast horizon: "Let me ask: What decisions will this forecast inform?
- Operational planning: 1-2 weeks
- Budget cycles: 1-3 months
- Strategic planning: 6-12 months
Typical sweet spot: 30-90 days ahead"
Limited historical data: "With only {X} observations, forecast uncertainty will be high. Can you:
- Provide more history from archive/logs?
- Use business assumptions to supplement?
- Start with a simple moving-average approach (see the baseline sketch below)?"
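A minimal sketch of the moving-average fallback, reusing df; the 7-period window is an assumption to widen for noisier series:
# Naive baseline: project the trailing mean forward
window = 7
baseline_forecast = df['value'].rolling(window).mean().iloc[-1]
print(f"Next-period baseline forecast: {baseline_forecast:,.0f}")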
Multiple seasonal patterns: "I see daily, weekly, AND annual patterns. We should:
- Model each separately
- Use a more sophisticated model (Prophet, TBATS; see the multi-seasonal sketch below)
- Focus on most important pattern
Which matters most for your use case?"
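A minimal sketch of handling multiple seasonal periods with MSTL, one option alongside Prophet or TBATS; it assumes daily data and a reasonably recent statsmodels (0.14+):
from statsmodels.tsa.seasonal import MSTL

# Decompose daily data with both weekly and annual cycles; periods are assumptions
mstl_result = MSTL(df['value'], periods=(7, 365)).fit()
print(mstl_result.seasonal.head())  # one seasonal column per period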
Advanced Options
After basic analysis, offer:
Prophet Model: "For complex seasonality + holidays, I can use Facebook Prophet for more robust forecasts."
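A minimal Prophet sketch, assuming the prophet package is installed and that df uses the date/value naming from the workflow; the US holiday calendar here is an assumption:
from prophet import Prophet

# Prophet expects a two-column frame named 'ds' (date) and 'y' (value)
df_prophet = df.reset_index().rename(columns={'date': 'ds', 'value': 'y'})
m = Prophet(weekly_seasonality=True, yearly_seasonality=True)
m.add_country_holidays(country_name='US')  # assumed holiday calendar
m.fit(df_prophet)
future = m.make_future_dataframe(periods=30)
prophet_forecast = m.predict(future)[['ds', 'yhat', 'yhat_lower', 'yhat_upper']]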
Multivariate Forecasting: "If you have external variables (marketing spend, weather), I can include them for better predictions."
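A minimal sketch of adding an external regressor via statsmodels SARIMAX; the marketing_spend column and the flat-spend future values are hypothetical placeholders:
from statsmodels.tsa.statespace.sarimax import SARIMAX

# ARIMA with an exogenous regressor; future regressor values must be supplied
exog = df[['marketing_spend']]  # hypothetical external variable
sarimax_fit = SARIMAX(df['value'], exog=exog, order=(1, 1, 1)).fit(disp=False)
future_exog = exog.tail(30).values  # placeholder: assumes spend stays at recent levels
exog_forecast = sarimax_fit.get_forecast(steps=30, exog=future_exog)
print(exog_forecast.predicted_mean.head())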
Ensemble Forecasting: "I can combine multiple models (ARIMA, Prophet, etc.) for more reliable forecasts."
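A minimal sketch of an equal-weight blend, assuming the ARIMA forecast from Step 5 and the hypothetical prophet_forecast above cover the same 30 future dates:
# Equal-weight average of two forecasts aligned on the forecast index
ensemble = pd.DataFrame({
    'arima': forecast['forecast'].values,
    'prophet': prophet_forecast['yhat'].tail(30).values,
}, index=forecast.index)
ensemble['blend'] = ensemble.mean(axis=1)
print(ensemble.head())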
Intervention Analysis: "I can quantify impact of specific events (product launches, outages) on the time series."
Automatic Monitoring: "I can set up automated forecast updates and anomaly alerts that run daily/weekly."
Confidence Interval Calibration: "I can validate whether the confidence intervals are well-calibrated against historical forecasts."
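A minimal sketch of a backtest-style coverage check, reusing df and the ARIMA import from the workflow; the 80/20 split mirrors Step 5 and the nominal interval coverage is 95%:
# How often do held-out actuals fall inside the forecast interval?
train_size = int(len(df) * 0.8)
train, test = df[:train_size], df[train_size:]
backtest = ARIMA(train['value'], order=(1, 1, 1)).fit().get_forecast(steps=len(test))
ci = backtest.conf_int()
covered = ((test['value'].values >= ci.iloc[:, 0].values) &
           (test['value'].values <= ci.iloc[:, 1].values)).mean()
print(f"Empirical coverage of the 95% interval: {covered:.0%}")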