Principal DevOps Engineer - ML / AI Algorithms

F. Hoffmann-La Roche Gruppe • Santa Clara, CA, United States
Job type: Full-time

At Roche you can show up as yourself, embraced for the unique qualities you bring. Our culture encourages personal expression, open dialogue, and genuine connections, where you are valued, accepted and respected for who you are, allowing you to thrive both personally and professionally. This is how we aim to prevent, stop and cure diseases and ensure everyone has access to healthcare today and for generations to come. Join Roche, where every voice matters.

The Position

Principal DevOps Engineer - ML / AI Algorithms

Developing software is great, but developing software with a purpose is even better! As a Principal DevOps Engineer - ML / AI Algorithms, you will work on products that help people with the most precious thing they have — their health. You will be part of the RIS Research & Development team contributing to digital health products touching Imaging, ML / AI, and computational science.

The Opportunity

As Principal DevOps Engineer, you will collaborate with important stakeholders on the development of the build, release, and deploy toolchain for DevOps, paving the way for seamless and efficient software delivery processes.

This role can be based in Santa Clara (primary location) or in secondary locations (Mississauga, Canada or Basel, Switzerland).

Key responsibilities

  • Lead the initiative to set up, manage, and meticulously maintain parity across development, staging, and production application environments in cutting-edge cloud infrastructure, ensuring a robust and consistent deployment pipeline.
  • Champion the development of advanced monitoring infrastructure, empowering the team with real-time insights and ensuring the highest levels of system reliability and performance.
  • Provide dedicated on-call support for production operations, ensuring the uninterrupted delivery of critical services and swift resolution of any operational issues.
  • Interface with software developers, product managers, test engineers and administrators on projects to design and develop the build, release, and deploy toolchain for DevOps while providing on-call support.
  • Identify, troubleshoot and resolve issues quickly and effectively, sometimes under pressure.
  • Take an active role in planning, high availability engineering, performance tuning, and automation/tools development.
  • Manage multiple releases with focus on system reliability, scalability, and efficiency.
  • Implement and manage the full lifecycle of machine learning models, including versioning, deployment strategies (e.g., canary, A/B testing), monitoring for drift and performance, and decommissioning.
  • Provide leadership to improve DevOps technology and processes, and mentor other DevOps engineers on the team.

Who You Are

  • Bachelor's degree in Computer Science, Engineering, or a related field with a minimum of 8 years of experience in a DevOps role, or an equivalent combination of education and experience to perform at this level.
  • 8+ years of experience with container technology, including Kubernetes, AWS EKS, Helm Charts, Splunk, and Docker, along with provisioning infrastructure through IaC using Terraform and cloud automation principles.
  • Proficiency in Unix/Linux administration, shell scripting, and internals, with a preference for Ubuntu.
  • Deep working experience and extensive knowledge in building and deploying infrastructure using IaC frameworks such as Terraform and AWS CloudFormation/SAM.
  • Experience building and automating scalable data pipelines for ingestion, transformation, distributed computation, and versioning of large-scale image datasets.
  • Familiarity with DevOps practices and proficiency in log analysis and monitoring tools are essential for effective troubleshooting and system optimization.
  • Proficiency in Python for automating production systems and with Git, GitLab, GitHub Actions, and GitHub CI/CD, plus familiarity with common ML libraries such as TensorFlow, PyTorch, and scikit-learn to understand the engineering needs of the ML models you will be deploying.
  • Strong working knowledge of AWS Cloud infrastructure, including EC2, S3, API Gateway, Kubernetes, RDS, VPC peering, Route 53, IAM, Batch, Lambda, AWS Config, and Auto Scaling.

Preferred

  • MLOps experience, with a demonstrated record of supporting machine learning or computer vision teams.
  • Deep experience with container orchestration for ML workloads using Kubernetes, including frameworks like Kubeflow or KubeRay to manage distributed training jobs.
  • Familiarity with data versioning tools like DVC.
  • Familiarity with common ML libraries such as TensorFlow, PyTorch, and scikit-learn to understand the engineering needs of the ML models.
  • Familiarity with other languages such as Java, R, and C/C++.
  • Experience with AWS services for machine learning, such as Amazon SageMaker, and experience managing GPU-accelerated compute instances (e.g., EC2 P and G series) for model training and inference.

The expected salary range for this position based on the primary location of Santa Clara, CA is between $162,600 and $302,000. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. A discretionary annual bonus may be available based on individual and Company performance. This position also qualifies for the benefits detailed at the link provided below.

Benefits

Relocation benefits are not available for this position.

Who we are

A healthier future drives us to innovate. Together, more than 100,000 employees across the globe are dedicated to advancing science, ensuring everyone has access to healthcare today and for generations to come. Our efforts result in more than 26 million people treated with our medicines and over 30 billion tests conducted using our Diagnostics products. We empower each other to explore new possibilities, foster creativity, and keep our ambitions high, so we can deliver life-changing healthcare solutions that make a global impact.

Let’s build a healthier future, together.

Roche is an equal opportunity employer. It is our policy and practice to employ, promote, and otherwise treat any and all employees and applicants on the basis of merit, qualifications, and competence. The company's policy prohibits unlawful discrimination, including but not limited to, discrimination on the basis of Protected Veteran status, individuals with disabilities status, and consistent with all federal, state, or local laws.

If you have a disability and need an accommodation in relation to the online application process, please contact us by completing this form: Accommodations for Applicants.
