Lead the migration of datasets and ETL workflows from Cloudera Hadoop (Hive, Impala, HDFS, etc.) to an Apache Iceberg-based architecture.
Analyze existing data pipelines and storage formats (e.g., Parquet, ORC) to plan and execute a smooth migration strategy.
Design and implement scalable data ingestion and transformation pipelines using Apache Spark, Flink, or equivalent tools.
Optimize data partitioning, schema evolution, compaction, and metadata management using Iceberg best practices.
Integrate Iceberg tables with query engines like Trino or Presto to support data analytics use cases.
Ensure compatibility and data quality throughout the migration through robust testing, validation, and lineage tracking.
Establish monitoring, logging, and performance tuning for migrated pipelines and Iceberg tables.
Seniority level
Mid-Senior level
Employment type
Contract
Job function
Information Technology
Industries
IT Services and IT Consulting
Data Lead • Strongsville, OH, US