Overview
We are looking for an experienced Senior ETL Developer with strong expertise in Apache Airflow, Amazon Redshift, and SQL-based data pipelines, with an upcoming transition to Snowflake. This is a contract role based in Coimbatore, ideal for professionals who can independently deliver high-quality ETL solutions in a cloud-native, fast-paced environment.
Candidate Specification
- 6+ years of hands-on experience in ETL development.
- Proven experience with Apache Airflow and Amazon Redshift, plus strong SQL skills.
- Strong understanding of data warehousing concepts and cloud-based data ecosystems.
- Familiarity with handling flat files, APIs, and external sources.
- Experience with job orchestration, error handling, and scalable transformation patterns.
- Ability to work independently and meet deadlines.
- Exposure to Snowflake, or experience supporting migrations to Snowflake.
- Experience in healthcare, life sciences, or regulated environments is a plus.
- Familiarity with Azure Data Factory, Power BI, or other cloud BI tools.
- Knowledge of Git, Azure DevOps, or other version control and CI/CD platforms.
Roles and Responsibilities
- ETL Design and Development: Design and develop scalable, modular ETL pipelines using Apache Airflow, with orchestration and monitoring capabilities. Translate business requirements into robust data transformation pipelines across cloud data platforms. Develop reusable ETL components to support a configuration-driven architecture.
- Data Integration and Transformation: Integrate data from multiple sources, including Redshift, flat files, APIs, Excel, and relational databases. Implement transformation logic such as cleansing, standardization, enrichment, and deduplication. Manage incremental and full loads, along with SCD handling strategies.
- SQL and Database Development: Write performant SQL queries for data staging and transformation within Redshift and Snowflake, using joins, window functions, and aggregations effectively. Ensure indexing and query tuning for high-performance workloads. (A staging and upsert sketch follows this section.)
- Performance Tuning: Apply best practices in distributed data processing and cloud-native optimization. Tune SQL queries and monitor execution plans. Optimize data pipelines and orchestrations for large-scale data volumes.
- Error Handling and Logging: Implement robust error handling and logging in Airflow DAGs. Enable retry logic, alerting mechanisms, and failure notifications. (A DAG sketch follows this section.)
- Testing and Quality Assurance: Conduct unit and integration testing of ETL jobs. Validate data outputs against business rules and source systems. Support QA during UAT cycles and help resolve data defects.
- Deployment and Scheduling: Deploy pipelines using Git-based CI/CD practices. Schedule and monitor DAGs using Apache Airflow and integrated tools. Troubleshoot failures and ensure data pipeline reliability.
- Documentation and Maintenance: Document data flows, DAG configurations, transformation logic, and operational procedures. Maintain change logs and update job dependency charts.
- Collaboration and Communication: Work closely with data architects, analysts, and BI teams to define and fulfill data needs. Participate in stand-ups, sprint planning, and post-deployment reviews.
- Compliance and Best Practices: Ensure ETL processes adhere to data security, governance, and privacy regulations (HIPAA, GDPR, etc.). Follow naming conventions, version control standards, and deployment protocols.
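To illustrate the Airflow orchestration, retry, and alerting expectations above, here is a minimal DAG sketch. It assumes Airflow 2.x; the DAG name, task names, and the `notify_failure` callback are hypothetical placeholders, not part of any existing codebase.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def notify_failure(context):
    # Hypothetical alerting hook: in practice this would push the failed
    # task's details to an ops channel (Slack, PagerDuty, email, etc.).
    ti = context["task_instance"]
    print(f"ALERT: {ti.dag_id}.{ti.task_id} failed for run {context['ds']}")


default_args = {
    "owner": "data-eng",
    "retries": 3,                           # retry transient failures
    "retry_delay": timedelta(minutes=5),    # back off between attempts
    "on_failure_callback": notify_failure,  # alert once retries are exhausted
}

with DAG(
    dag_id="example_daily_load",            # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:

    def extract(**_):
        print("pull from source (API, flat file, or database)")

    def transform(**_):
        print("cleanse, standardize, enrich, deduplicate")

    def load(**_):
        print("load into a Redshift/Snowflake staging table, then merge")

    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain; Airflow handles scheduling, retries, and alerts.
    t_extract >> t_transform >> t_load
```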
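Similarly, the SQL staging and deduplication duties above often come down to a window-function dedup followed by an upsert. A minimal sketch, assuming hypothetical `stg_orders` and `dim_orders` tables on Redshift and psycopg2 connectivity; it uses the classic delete-then-insert pattern, a common Redshift idiom for incremental loads.

```python
import psycopg2

# Keep only the latest row per business key, then upsert into the target.
# Table and column names are hypothetical placeholders.
UPSERT_SQL = """
CREATE TEMP TABLE stg_dedup AS
SELECT *
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (
               PARTITION BY s.order_id
               ORDER BY s.updated_at DESC
           ) AS rn
    FROM stg_orders s
) ranked
WHERE rn = 1;

-- Classic Redshift upsert: delete matching keys, then insert fresh rows.
DELETE FROM dim_orders
USING stg_dedup
WHERE dim_orders.order_id = stg_dedup.order_id;

INSERT INTO dim_orders (order_id, customer_id, amount, updated_at)
SELECT order_id, customer_id, amount, updated_at
FROM stg_dedup;
"""

if __name__ == "__main__":
    # Connection parameters are placeholders; use your cluster's settings.
    conn = psycopg2.connect(
        host="redshift-cluster.example.com",
        port=5439,
        dbname="analytics",
        user="etl_user",
        password="...",
    )
    # The connection context manager wraps the statements in one
    # transaction and commits on successful exit.
    with conn, conn.cursor() as cur:
        cur.execute(UPSERT_SQL)
    conn.close()
```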