Job Summary :
As a Senior Data Engineer, you will play a pivotal role in designing, building, and optimizing complex data pipelines and ETL processes, applying your expertise in Azure Synapse and PySpark to build advanced data analytics systems and data outputs for downstream consumption.
You will be responsible for architecting scalable and efficient data models to support the business processes for reporting and data integrations.
You will utilize cutting-edge Generative AI (GenAI) technologies to drive innovative data extraction and analysis solutions for our downstream consumers.
The ideal candidate will bring over 10 years of experience in data engineering, with a strong background in ETL, data modeling on Azure cloud solutions, and the integration of AI into data workflows.
The Role :
Key Responsibilities :
ETL Development & Maintenance : Lead the design, development, and optimization of complex ETL processes and pipelines that enable reliable data ingestion, transformation, and loading across a variety of sources.
Azure Synapse Analytics : Architect and develop scalable data solutions utilizing Azure Synapse with a deep focus on performance optimization.
Create optimized PySpark notebooks for advanced data transformations and analytical queries.
Data Modeling : Design and maintain logical and physical data models, ensuring that data structures align with business needs, scale well, and are optimized for data warehousing and analytics.
GenAI Integration for Data : Lead the application of Generative AI technologies to data analysis and data extraction, including predictive analytics, automated data transformation, and natural language query processing.
Data Pipeline Automation : Develop, implement, and manage automated data pipelines for continuous integration and deployment of data solutions.
Incorporate best practices for monitoring and error handling in production environments.
Collaboration : Work closely with other Data Engineers, analysts, and business stakeholders to understand their data requirements and provide innovative, scalable data solutions.
Performance Tuning & Optimization : Continuously monitor, evaluate, and optimize data pipelines and queries to enhance the performance of data systems, minimize latency, and ensure real-time data availability.
Cloud Engineering : Drive cloud-native engineering best practices on the Azure platform, including security, scalability, high availability, disaster recovery, and cost efficiency in data storage and processing.
Documentation & Best Practices : Create and maintain clear, concise documentation for data pipelines, models, and processes.
Promote best practices for data governance, quality, and security.