Data Pipeline Builder
advanced, data, Min 64K context
Designs and generates data pipeline configurations for ETL/ELT workflows. Supports Apache Airflow DAGs, dbt models, Spark jobs, and streaming pipelines with Kafka or Flink. Creates data quality checks, schema evolution strategies, and monitoring dashboards for pipeline health.
Use Cases
- Generating Airflow DAGs for complex ETL workflows
- Creating dbt models with proper staging, intermediate, and mart layers
- Designing Kafka streaming pipelines with schema registry
- Building data quality validation rules with Great Expectations
- Setting up incremental data loading patterns
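As an illustration of the last use case, an incremental load can be sketched with a high-watermark column. This is a minimal sketch, not output from the skill itself: the `orders` table layout, the `updated_at` column, and the `extract_incremental` helper are all assumptions made for the example, and an in-memory SQLite database stands in for a real source. A watermark query is a simpler technique than log-based CDC and will not see hard deletes.

```python
import sqlite3


def extract_incremental(conn, last_watermark):
    """Return rows changed after last_watermark, plus the new watermark.

    Hypothetical helper: assumes an `orders` table with an ISO-8601
    `updated_at` text column, so string comparison matches time order.
    """
    rows = conn.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Keep the old watermark when nothing changed, so the next run
    # starts from the same point.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


def demo():
    # In-memory database standing in for the real source system.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, "paid", "2024-01-01T00:00:00"),
            (2, "shipped", "2024-01-01T00:10:00"),
            (3, "paid", "2024-01-01T00:20:00"),
        ],
    )
    # First run picks up everything after the stored watermark.
    first, wm = extract_incremental(conn, "2024-01-01T00:05:00")
    # Second run with the advanced watermark finds nothing new.
    second, wm2 = extract_incremental(conn, wm)
    conn.close()
    return first, wm, second, wm2
```

In a scheduled pipeline the watermark would be persisted between runs (for example in a state table) instead of being passed around in memory.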
Example Prompt
Design a data pipeline to ingest e-commerce order data.

Source: PostgreSQL (orders, order_items, customers, products tables)
Destination: Snowflake data warehouse
Schedule: Every 15 minutes (near real-time)

Requirements:
1. Incremental extraction using CDC (Change Data Capture)
2. dbt transformation layer with:
   - Staging models (1:1 source mapping)
   - Intermediate models (joins, deduplication)
   - Mart models (fact_orders, dim_customers, dim_products)
3. Data quality checks after each layer
4. Schema evolution handling (new columns, type changes)
5. Alerting on pipeline failures or quality issues

Generate:
- Airflow DAG for orchestration
- dbt models (SQL + schema.yml)
- Data quality assertions
- Monitoring dashboard queries
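To give a sense of what the "data quality assertions" requested by this prompt might check, here is a minimal sketch in plain Python. It deliberately does not use the Great Expectations or dbt test APIs. The helper names (`check_not_null`, `check_unique`, `run_checks`), the column names, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical staging rows; one null customer_id and one duplicate order_id.
SAMPLE_ROWS = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": None},
    {"order_id": 2, "customer_id": 11},
]


def check_not_null(rows, column):
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows, column):
    """Return the values of `column` that appear more than once."""
    counts = Counter(row.get(column) for row in rows)
    return [value for value, n in counts.items() if n > 1]


def run_checks(rows):
    """Run both checks and return only the ones that failed."""
    failures = {}
    null_rows = check_not_null(rows, "customer_id")
    if null_rows:
        failures["customer_id_not_null"] = null_rows
    duplicates = check_unique(rows, "order_id")
    if duplicates:
        failures["order_id_unique"] = duplicates
    return failures
```

An orchestrator task could call `run_checks` after each layer and fail the run, or raise an alert, when the returned dictionary is non-empty.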
Compatible Tools
claude-code, cursor, kiro, any
Modalities
Input: text, code
Output: code, text
Author
OpenModels Community