Overview
Job Title : Lead Data Engineer
Location : Greenfield, IN (Onsite)
Duration : Long term contract
We are seeking a highly skilled Data Engineering specialist to join our dynamic team focused on Azure cloud data engineering with DevOps practices. The role involves leading design discussions, mentoring engineers, and delivering scalable data lakehouse solutions in healthcare-focused environments.
Responsibilities
- Lead solution design discussions, mentor junior engineers, and ensure adherence to coding guidelines, design patterns, and peer review processes.
- Prepare technical design documents for development, including HLDs / LLDs and architecture diagrams, and guide the team technically.
- Collaborate with product owners, QA, and business analysts to translate requirements into deliverables.
- Develop modular, testable Python code for data transformations and package reusable components; write unit tests and integrate with CI / CD pipelines.
- Contribute to agile / scrum project execution and provide regular updates on progress and risks.
- Communicate effectively with internal and customer stakeholders, and build collaborative relationships across teams.
- Maintain data governance, security, monitoring, and cost management practices within the Azure ecosystem.
Qualifications
- 4+ years of experience in Azure Databricks with PySpark.
- 2+ years of experience in Databricks workflow & Unity Catalog.
- 3+ years of experience in Azure Data Factory (ADF).
- 3+ years of experience in ADLS Gen 2.
- 3+ years of experience in Azure SQL.
- 5+ years of experience in Azure Cloud platform.
- 2+ years of experience in packaging builds.
- Data management experience for analytics workloads, design, development, and maintenance of lakehouse solutions using Databricks / PySpark; handling data from ERP, API, relational, NoSQL, and on-prem sources with batch and near-real-time processing.
- Ability to optimize Spark jobs, partitioning strategies, file formats (Parquet / Delta), and Spark SQL tuning.
- Experience with Unity Catalog, Azure AD integration, data permissions, lineage, and audit trails.
- Experience building orchestration solutions using Azure Data Factory and Databricks Workflows; modular, reusable pipelines with triggers and dependencies.
- Familiarity with data lake storage architecture (bronze-silver-gold) using ADLS Gen2; lifecycle policies, RBAC / ACLs, and performance tuning.
- Experience with T-SQL queries, stored procedures, and Azure SQL metadata management.
- Exposure to DevOps tools for deployment automation (e.g., Azure DevOps, ARM / Bicep / Terraform).
- Experience writing modular, testable Python code; dependency management and packaging; unit testing with PyTest / unittest; CI / CD integration.
- Strong communication, stakeholder engagement, and ability to translate requirements into deliverables.
- Nice-to-have : Azure Entra / AD skills, GitHub Actions, orchestration with Airflow / Dagster / LogicApp, event-driven architectures (Kafka, Azure Event Hub), Google Cloud Pub / Sub, Debezium CDC, and experience with Azure Synapse and Databricks Lakehouse migrations.
Seniority level : Mid-Senior level
Employment type : Full-time
Industries : Software Development; Information Technology
We are committed to equal opportunity employment.