Lead Data Engineer

Fremont, California, US
Full-time

Job Summary :

To be considered for an interview, please make sure your application is fully in line with the job specifications below.

We are looking for a highly skilled and experienced Lead Data Engineer with over 10 years of IT expertise in software analysis, design, development, testing, and implementation of Big Data, Hadoop, Java, ETL, and database technologies.

The ideal candidate should have a deep understanding of the application lifecycle, from initiation to deployment and support, with hands-on experience in designing and implementing complex data engineering solutions.

Key Responsibilities :

  • Architect and develop the most suitable business logic and application frameworks, including selecting the technical stack for data engineering projects.
  • Build and maintain ingestion frameworks that detect and read data from source folders using a Change Data Capture (CDC) strategy.
  • Convert Hive / SQL queries into Spark transformations using Spark RDDs, Spark SQL, and Scala.
  • Manage Spark applications and launch Spark clusters on GCP DataProc, using CI/CD for deployment.
  • Develop data ingestion pipelines using Kafka and Spark Streaming APIs.
  • Translate business analysis into Spark RDD transformations and actions for efficient data processing.
  • Integrate real-time data streams from GCP PubSub into Spark applications.
  • Develop and maintain Spark SQL tables and queries for ad-hoc data analysis.
  • Create Lambda workflow jobs for automation and schedule them with Airflow, passing configurations dynamically.
  • Migrate Hive queries into Spark transformations using DataFrames, SQL Context, and Scala.
  • Implement test scripts supporting test-driven development and continuous integration (CI).
  • Perform data processing with GCP Dataflow and load data into GCP BigQuery.
  • Write and execute Shell scripts for automating deployment processes.
  • Collaborate with cross-functional teams including clients, stakeholders, and business analysts to ensure seamless integration and delivery of data engineering projects.
  • Maintain and manage Hadoop infrastructure, including log files and security integrations, leveraging Cloudera Manager.

Required Skills and Qualifications :

  • 10+ years of experience in IT, focusing on Big Data, Hadoop, and related technologies.
  • Expertise in Hadoop ecosystem tools including HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Zookeeper, Oozie, and Sqoop.
  • Hands-on experience with Hadoop shell commands, Spark RDDs, Spark SQL, and Scala programming.
  • Proficiency in data analysis, transformation, validation, and cleansing.
  • Experience with Java, Scala, and Python, and familiarity with databases such as Oracle and MySQL.
  • Proficiency with version control tools such as Git and SVN, and with project management and collaboration tools such as JIRA and GitHub.
  • Experience with GCP technologies like DataProc, PubSub, BigQuery, and Dataflow.
  • Strong understanding of Agile methodology and Software Development Lifecycle (SDLC).
  • Excellent interpersonal, technical, and communication skills.
  • Ability to adapt to changing technologies and environments with a self-driven, quick-learning approach.

Skills :

  • Experience with CI/CD pipelines for deploying applications in GCP environments.
  • Experience in cloud security, including integration with Kerberos authentication and authorization.
  • Familiarity with tools like Airflow for job scheduling and orchestration.
  • Experience with VPNs, file transfer tools such as WinSCP and FileZilla, and the SFTP and FTP protocols.

Environment :

Hadoop, Scala, PySpark, Spark SQL, Hive, GCP DataProc, Storage, Secret Management, gsutil, BigQuery, MySQL, UNIX Shell Scripting, PubSub, Spring Boot API.
