Talent.com
Machine Learning Engineer, Training Infrastructure

Hedra
San Francisco, CA, United States
Job type: Full-time

About Hedra

Hedra is a pioneering generative media company backed by top investors at Index, A16Z, and Abstract Ventures. We're building Hedra Studio, a multimodal creation platform capable of control, emotion, and creative intelligence.

At the core of Hedra Studio is our Character-3 foundation model, the first omnimodal model in production. Character-3 jointly reasons across image, text, and audio for more intelligent video generation — it’s the next evolution of AI-driven content creation.

At Hedra, we’re a team of hard-working, passionate individuals seeking to fundamentally change content creation and build a generational company together. We value startup energy, initiative, and the ability to turn bold ideas into real products. Our team is fully in-person in SF / NY with a shared love for whiteboard problem-solving.

Overview

We are looking for an ML Engineer with 3+ years of experience in high-performance computing systems to manage and optimize the computational infrastructure we use to train and deploy our machine learning models. The ideal candidate has diverse experience managing ML workloads at scale and will support our 3DVAE and video diffusion models. We encourage you to apply even if you don't meet every requirement; we value curiosity, creativity, and the drive to solve hard problems.

Responsibilities

Design, implement, and maintain scalable computing solutions for training and deploying ML models, ensuring infrastructure can handle large video datasets.

Manage and optimize the performance of our computing clusters or cloud instances (for example, on AWS or Google Cloud) to support distributed training.

Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.

Monitor system performance and implement improvements to maximize efficiency and utilization, using tools like Airflow for orchestration.

Collaborate across research teams to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.

Qualifications

Bachelor’s degree in Computer Science, Information Technology, or a related field, with a focus on system administration.

Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, essential for managing large-scale ML workloads.

Values sound engineering processes, including version control and CI/CD.

Knowledge of containerization and orchestration technologies such as Docker and Kubernetes, required for deployments at scale.

Understanding of distributed training techniques and how to scale models across multi-node clusters to meet video generation needs.

Strong problem-solving and communication skills, given the need to collaborate with diverse teams.

This role is vital for ensuring the computational backbone supports the company’s ML efforts, focusing on deployment and scalability.

Benefits

Competitive compensation + equity

401k (no match)

Healthcare (Silver PPO Medical, Vision, Dental)

Lunch and snacks at the office
