Question 1

What is data-engineer?

Accepted Answer

This Claude skill provides comprehensive expertise in data engineering, enabling the design and implementation of scalable ETL/ELT pipelines, data warehouses, and real-time streaming architectures. It leverages industry-standard tools like Apache Spark, Airflow, and Kafka to build robust, cost-optimized, and high-quality data infrastructure for modern analytics.

Question 2

When should I use data-engineer?

Accepted Answer

data-engineer is useful in the following scenarios: • Designing and deploying automated ETL pipelines with Airflow DAGs to streamline data movement and transformation processes. • Optimizing Apache Spark jobs through efficient partitioning and resource management to handle massive datasets while reducing cloud compute costs. • Architecting real-time data streaming solutions using Kafka or Kinesis for low-latency data processing and instant business insights. • Developing structured data warehouse models, such as Star or Snowflake schemas, to improve query performance and support complex reporting requirements. • Implementing data quality monitoring and validation frameworks to ensure the reliability and integrity of organizational data assets.

name	data-engineer
description	Build ETL pipelines, data warehouses, and streaming architectures. Implements Spark jobs, Airflow DAGs, and Kafka streams. Use PROACTIVELY for data pipeline design or analytics infrastructure.
license	Apache-2.0
author	edescobar
version	"1.0"
model-preference	sonnet

data-engineer

When & Why to Use This Skill

Use Cases

Data Engineer

Focus Areas

Approach

Output