Data Engineering Lead, Translational Genomics

Roche
South San Francisco, California, United States of America
Full-time

The Position

If you are a big data engineer and want to work on something that can truly change the world, this job is for you. Biology is approaching an inflection point at which we can directly leverage data to understand the cellular basis of human diseases and, from this, generate therapeutics that can treat them.

Our Translational Genomics initiative is spearheading this effort and bringing together data from human genetics, functional genomics, molecular biology, disease model engineering, and tissue and cellular profiling.

We need a Data Engineering Lead to help us create a next-generation data engine that scalably and rigorously ingests and transforms the data generated by this initiative so that they are ready for machine-driven analysis.

The Data Engineering Lead will act as an architect and engineering manager tasked with overseeing the construction and operation of this data engine.

This data engine will be used to help assemble an exabyte-scale, connected, and computable data universe composed of high-value internally and externally generated data and results, on which we can build our data science efforts.

Your efforts will therefore directly enable the computational discovery of disease targets and, from these, potentially life-saving therapies.

A person hired in this position will:

Manage a team that will architect and deliver a next-generation data engine that enables scalable, flexible, and rigorous data transformations using modern data management practices.

Help architect and deliver data infrastructure that will enable machines to crawl and compute on and across all our data.

Work with a cross-functional team of scientists and engineers to design and deliver these solutions.

Exert influence across the informatics organization via presentations and collaborations.

Successful candidates will meet the following requirements:

You have a BS in a computational discipline with 12 years of work experience or a Master's with 7 years of experience.

7+ years of experience architecting and developing scalable pipelines, frameworks, and platforms to power data science efforts in distributed cloud environments, 5 of which were on AWS.

Multiple years of experience leading a distributed team of engineers to deliver solutions.

Practical understanding of the data management practices required to power rigorous data science and enable advanced analytics like AI & ML.

Exceptional communication skills.

Experience leading projects focused on omics data.

Hands-on experience working with the following technologies, frameworks, and languages: Java, Scala, Python, Spark, Airflow, RabbitMQ, Spring.

What to expect from us

A highly collaborative and dynamic research environment where we aim to advance the rate of scientific discovery using purposefully built solutions.

Access to samples, compute resources, and large multimodal omics datasets focused on disease biology.

Access to state-of-the-art technologies and pioneering research.

Participation in seminar series featuring academic and industry scientists.

Campus-like lifestyle with a healthy work-life balance.

Mentored opportunities to further develop professional skills.
