Talent.com
Research Engineer, Training Infrastructure Lead

Menlo Ventures • San Francisco, CA, United States
Job type: Full-time

About Goodfire

Behind our name: Like fire, AI holds the potential for both immense benefit and significant risk. Just as mastering fire transformed human history, we believe the safe and intentional development of AI will shape the future of our species. Our goal is to tame this new fire.

Goodfire is an AI interpretability research company focused on understanding and designing AI systems that people can trust. Our mission is to advance humanity's understanding of AI to build safe and powerful AI systems. We believe that deep research breakthroughs are necessary to make this possible.

Goodfire is a public benefit corporation headquartered in San Francisco with a team of the world’s top interpretability researchers and engineers from organizations like OpenAI and DeepMind. We’ve raised $59M from investors like Menlo, Lightspeed and Anthropic and work with customers including Arc Institute, Mayo Clinic, and Rakuten.

About the role

We're seeking a senior engineering leader to own and evolve our research platform and training infrastructure. You'll define both the technical vision and the implementation strategy for the systems that power our research breakthroughs.

Key Responsibilities:

  • Design and build customizable training pipelines that scale from experimentation to production
  • Architect and implement large-scale model serving infrastructure for interpretability (reference: NDIF, Garcon)
  • Identify and execute on opportunities to dramatically accelerate research velocity
  • Lead technical decision‑making for infrastructure that supports cutting‑edge AI research

What you’ll bring

Required experience

  • 5+ years of experience in ML infrastructure, research engineering, and/or systems programming
  • Leadership experience as senior architect, tech lead, and/or engineering manager
  • Cross‑functional expertise bridging research and engineering domains
  • Technical proficiency in Python, PyTorch/JAX, and distributed systems
  • Production experience deploying and maintaining ML systems at scale
  • Mission alignment with advancing AI safety and interpretability

Core competencies

High-ownership leadership

  • Owns broad areas with autonomy, driving architectural and strategic decisions even amid uncertainty
  • Balances technical depth with speed, adapting as priorities evolve

Research‑to‑production mindset

  • Bridges fast research iteration with reliable, scalable production systems
  • Designs abstractions that preserve flexibility while ensuring robustness

Modern ML & infrastructure expertise

  • Deep experience in Python, PyTorch, and large-scale training strategies
  • Hands‑on with end‑to‑end ML infrastructure: from experiments to serving
  • Strong track record of scaling systems and debugging complex runs

Preferred qualifications

  • Contributions to open‑source ML infrastructure projects
  • Experience in fast‑paced startup or research lab environments

Our values

Goodfire is looking for individuals who embody our values and share our deep commitment to making interpretability accessible. We are building a team first and foremost.

Put mission and team first

All we do is in service of our mission. We trust each other, deeply care about the success of the organization, and choose to put our team above ourselves.

Improve constantly

We are constantly looking to improve every piece of the business. We proactively critique ourselves and others in a kind and thoughtful way that translates to practical improvements in the organization. We are pragmatic and consistently implement the obvious fixes that work.

Take ownership and initiative

There are no bystanders here. We proactively identify problems and take full responsibility for getting a strong result. We are self‑driven, own our mistakes, and feel deep responsibility for what we’re building.

Action today

We have a small amount of time to do something incredibly hard and meaningful. The pace and intensity of the organization are high. If we can take action today or tomorrow, we will choose to do it today.

What we offer

This role offers a market-competitive salary, equity, and competitive benefits.

The expected salary range for this position is $200,000 - $400,000 USD.

Most importantly, you'll have the opportunity to join a vital mission at an important point in its trajectory: we are developing groundbreaking technology with a world‑class team on the critical path to ensuring a safe and beneficial future for humanity. If you want to do your life’s work with us, even if you believe you do not meet every single requirement, apply now.
