12-Month Contract: Azure Data Engineer (Azure Databricks & PySpark)
Job Description
We’re hiring a talented Azure Data Engineer to work on our platform and help
ensure that our data quality is flawless. As a company, we take in millions of new data
points every day. You will be working with a passionate team of engineers to solve
challenging problems and ensure that we deliver the best data to our customers,
on time. You will be using the latest cloud data lake technology to build robust and
reliable data pipelines.
Job Responsibilities
Develop expertise in the different upstream data stores and systems across the
company.
Design, develop, and maintain data integration pipelines for the organization’s growing
data sets and product offerings.
Build unit testing and QA plans for data processes.
Build data validation testing frameworks to ensure high data quality and integrity.
Write and maintain documentation on data processes.
Develop and maintain data models and schemas.
Apply strong analytical database skills: write complex queries and user-defined
functions, optimize and debug queries, and work with views, indexes, etc.
Write code that adheres to coding standards, procedures, and techniques. Maintain
the integrity of existing program logic according to specifications.
Actively participate in the code review process to ensure development work adheres
to standards and specifications (including peer review and code review external to
team).
Respond to all inquiries and issues in a timely manner as developed code moves
through the testing process.
Participate in scrum, sprints, and backlog grooming meetings.
Evaluate interrelationships between applications to determine whether a change in
one part of a project would impact or cause undesirable results in related applications,
and design effective interfaces between interrelated applications.
Improve the health of system assets by identifying enhancements to performance,
reliability, and resource consumption through tuning and monitoring.
Perform root-cause analysis for production issues and system failures;
determine corrective action(s) and propose improvements to prevent their
recurrence.
Maintain up-to-date business domain knowledge and technical skills in software
development technologies and methodologies.
Provide input into the selection, implementation, and use of development tools and best
practices.
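To illustrate the data validation responsibilities above: a data validation testing framework is, at its core, a set of declarative rules applied to incoming records. The sketch below is a minimal, hypothetical example in plain Python (not this team’s actual framework; all names are illustrative):

```python
# Minimal sketch of a rule-based data validation check of the kind that
# might underpin a data quality framework. All names are illustrative.

def validate_rows(rows, rules):
    """Return a list of (row_index, column, message) for every rule violation.

    rows  -- iterable of dict-like records
    rules -- list of (column, predicate, failure_message) tuples
    """
    failures = []
    for i, row in enumerate(rows):
        for column, check, message in rules:
            if not check(row.get(column)):
                failures.append((i, column, message))
    return failures

# Example rules: a non-null id and a non-negative amount.
rules = [
    ("id", lambda v: v is not None, "id must not be null"),
    ("amount", lambda v: v is not None and v >= 0, "amount must be >= 0"),
]

rows = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": -5.0},
]

failures = validate_rows(rows, rules)
# The second row violates both rules.
```

In a Spark-based pipeline the same idea would typically be expressed as DataFrame filters over each rule, with violation counts driving a pass/fail gate.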
Requirements
Technical:
BS or MS in Computer Science or equivalent experience.
4+ years of experience with Databricks / Apache Spark and Azure data storage
solutions, handling large datasets.
Expert in SQL and Spark SQL, including advanced analytical queries.
Proficiency in Python (data structures, algorithms, object-oriented programming,
using APIs) and familiarity with PySpark.
Experience with Databricks Delta tables, Unity Catalog, and the DataFrame API,
including reading from and writing to various data sources and formats.
Experience with both batch and streaming data pipelines.
Knowledge of Azure Data Factory, Azure Data Lake, Azure SQL DW, and Azure SQL is a
plus.
Nice to Haves
Understanding of PostgreSQL and MS SQL.
Experience working in a fast-paced environment.
Experience in an Agile software development environment.
Ability to work with large datasets and perform data analysis.
Experience on migration projects to build a unified data platform.
Experience working with Jira and Azure DevOps CI/CD pipelines.
Benefits
This is a 12-month contract position at 40 working hours per week, and it is not benefits-eligible. The base salary range for this position is $100,000-$120,000.
A final compensation offer will ultimately be based on the candidate's location, skill level and experience.
If you need assistance or an accommodation, you may contact us at [email protected]