Sr/Lead Data Engineer (Python/Spark/Jupyter Notebooks/Delta Lake/Data Vault 2.0)

Yoh, A Day & Zimmermann Company
Avenel, New Jersey
$77-$110 an hour
Full-time

Location : Remote (MST / CST / EST preferred)

Pay rate : $77-$110 / HR W2

Duration : 6-month increments; if going well, will extend yearly

Notes :

  • Client has built a modern data platform and needs senior data engineers to work on various projects supporting Client’s business: storage initiatives, data applications, code reviews, Azure Synapse Analytics, and Delta Lake initiatives
  • Data Group: 65-70 resources, including Data Engineers, Data Analysts, BAs, QA, Scrum Masters, etc., broken out into 6 teams
  • Do NOT use ETL tools; utilize Data Vault 2.0 methods for data transfer

  • Looking for VERY senior resources, up to hands-on lead level
  • Experienced with assertion-based architecture
  • Engineers vs coders
  • Coding is done in Jupyter Notebooks on Delta Lake
  • Need resources who can articulate design and build highly scalable solutions before jumping into coding
  • Do NOT want resources who need to be told what to do
  • Need critical thinkers who can troubleshoot and debug
  • Independent workers, self-starters, who speak up, raise impediments, and offer solutions
  • Required skills : Python; Jupyter Notebooks; Delta Lake; Spark, PySpark, Spark SQL; serverless data infrastructure; Data Vault 2.0 methodology experience; Great Expectations data quality validation; automated testing

  • Bonus skills : Kafka streaming (HUGE plus if they have a solid background here); Scala

Key Responsibilities :

  • Design, develop, and maintain data pipelines using Python, PySpark, and Spark SQL to process and transform large-scale datasets.
  • Implement Delta Lake architecture to ensure data reliability, consistency, and integrity for large, distributed datasets.
  • Utilize serverless data infrastructure (e.g., AWS Lambda, Azure Functions, Databricks) to build scalable and cost-efficient data solutions.
  • Collaborate with Data Scientists and Analysts by creating reusable Jupyter Notebooks for data exploration, analysis, and visualization.
  • Optimize and manage data storage and retrieval processes, ensuring high performance and low latency.
  • Implement best practices for data security, governance, and compliance within the data infrastructure.
  • Work closely with cross-functional teams to understand data requirements and deliver solutions aligned with business objectives.
  • Continuously monitor, troubleshoot, and improve the performance of data processing pipelines and infrastructure.

Qualifications :

  • 10-15+ years of experience in data engineering or related fields.
  • Strong programming skills in Python with experience in data processing frameworks like PySpark.
  • Extensive hands-on experience with Apache Spark and Spark SQL for processing and querying large datasets.
  • Expertise with Delta Lake for implementing scalable data lakehouse architectures.
  • Experience with Jupyter Notebooks for prototyping and collaboration with data teams.
  • Familiarity with serverless data technologies such as AWS Lambda, Azure Functions, or similar platforms.
  • Proficient in working with cloud platforms such as AWS, Azure, or Google Cloud.
  • Experience with data pipeline orchestration tools (e.g., Apache Airflow, Prefect, or similar).
  • Solid understanding of data warehousing, ETL / ELT pipelines, and modern data architectures.
  • Strong problem-solving skills and ability to work in a collaborative environment.
  • Experience with CI / CD pipelines and DevOps practices is a plus.

Preferred Qualifications :

  • Experience with Databricks for data engineering workflows.
  • Familiarity with modern data governance practices and tools like Apache Atlas or AWS Glue.
  • Knowledge of machine learning workflows and how data engineering supports AI / ML models.

Note : Any pay ranges displayed are estimations. Actual pay is determined by an applicant's experience, technical expertise, and other qualifications as listed in the job description.

All qualified applicants are welcome to apply.

Yoh, a Day & Zimmermann company, is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Visit to contact us if you are an individual with a disability and require accommodation in the application process.

For California applicants, qualified applicants with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act.

All of the material job duties described in this posting are job duties for which a criminal history may have a direct, adverse, and negative relationship potentially resulting in the withdrawal of a conditional offer of employment.
