Talent.com
Machine Learning Systems Platform Engineer

Blue Signal • San Francisco, CA, United States
Job type: Full-time

Confidential Opening: Machine Learning Systems Platform Engineer

Location: San Francisco, CA (Hybrid Preferred)

Overview

A stealth-mode innovator at the forefront of AI infrastructure is seeking a dynamic Machine Learning Systems Platform Engineer to build the backbone of their next-generation ML ecosystem. This team is leading the charge in developing tools and platforms that empower world-class ML teams to experiment, scale, and deploy faster than ever before.

In this key engineering role, you will architect and optimize the systems that make high-performance AI development possible. From training and tuning to inference and monitoring, your work will enable cutting-edge ML initiatives across the organization. You will work closely with ML scientists and engineers to ensure seamless integration of models into production environments.

Key Responsibilities

  • Build and maintain robust infrastructure to support machine learning workloads at scale, including training pipelines, tuning environments, and deployment frameworks.
  • Develop and automate MLOps pipelines for reproducibility, experiment tracking, model versioning, and validation.
  • Optimize cloud and on-prem GPU compute utilization across orchestration platforms.
  • Lead the implementation of tools for model rollback, observability, and system health monitoring.
  • Collaborate with cross-functional teams to ensure reliability, scalability, and maintainability of ML systems.

Qualifications

  • 3+ years of experience designing and deploying ML infrastructure or production-grade MLOps tools.
  • Fluency in backend development and infrastructure engineering, especially with Python, Go, Bash, Terraform, or Helm.
  • Experience with ML orchestration tools such as Kubeflow, Airflow, MLflow, Ray, or Metaflow.
  • Proficiency in containerization and cloud-native technologies, including Docker, Kubernetes, Argo, or managed ML platforms like SageMaker.
  • Deep understanding of cloud environments (AWS, GCP, or Azure) and GPU-accelerated workloads.

Preferred Skills

  • Exposure to distributed training techniques (FSDP, DeepSpeed, Horovod).
  • Knowledge of CI/CD strategies for ML and data drift detection methods.
  • Awareness of privacy, compliance, and security practices in ML systems.
  • Prior experience in infrastructure-first or developer-oriented AI organizations.

Compensation and Benefits

  • Base salary range: $160,000 to $230,000 DOE
  • Significant equity package and comprehensive benefits
  • Opportunity to work at the core of transformative AI innovation

Why Apply?

This is a rare opportunity to own and shape the ML platform behind AI that will define the next era. If you thrive in system-level problem solving and want to leave your mark on how machine learning is built at scale, this role is for you.

Apply today to learn more about this confidential opportunity and how you can play a part in the future of AI engineering.
