Talent.com
Senior ML Training Engineer
AION • Seattle, WA, US
Job type: Full-time

Job Description

AION is building the next generation of AI cloud platforms, transforming the future of high-performance computing (HPC) through its decentralized AI cloud. Purpose-built for bare-metal performance, AION democratizes access to compute power for AI training, fine-tuning, inference, data labeling, and beyond.

By leveraging underutilized resources such as idle GPUs and data centers, AION provides a scalable, cost-effective, and sustainable solution tailored for developers, researchers, and enterprises.

Led by high-pedigree founders with previous exits, AION is well funded by major VCs and has strategic global partnerships. Headquartered in the US with a global presence, the company is building its initial core team in India, London, and Seattle.

Who You Are

You're an ML systems engineer who's passionate about building high-performance inference infrastructure. You don't need to be an expert in everything - this field is evolving too rapidly for that - but you have strong fundamentals and the curiosity to dive deep into optimization challenges. You thrive in early-stage environments where you'll learn cutting-edge techniques while building production systems. You think systematically about performance bottlenecks and are excited to push the boundaries of what's possible in AI infrastructure.

Requirements

Key Responsibilities

  • Architect and implement distributed training solutions for customers running pre-training, fine-tuning, and RL workloads on AION infrastructure.
  • Guide customers through large-scale training implementations including data parallelism, model parallelism, and pipeline parallelism strategies.
  • Design and optimize multi-GPU training setups with proper gradient synchronization, communication strategies, and scaling configurations.
  • Optimize and develop POCs for customer training accelerators including efficient data loading pipelines, gradient checkpointing, and memory optimization techniques.
  • Create comprehensive monitoring and debugging frameworks for distributed training jobs with performance tracking and bottleneck resolution.
  • Conduct technical workshops and training sessions on distributed training, reasoning techniques, and post-training optimization methodologies.
  • Support customers with advanced fine-tuning workflows including reward model training, constitutional AI, and alignment techniques.
  • Troubleshoot and resolve customer training bottlenecks including scaling inefficiencies and optimization challenges.
  • Collaborate with tech and product teams to translate customer needs into platform improvements and feature requirements.

Skills & Experience

  • High-agency individual looking to own customer success and influence training platform architecture.
  • 4+ years of ML engineering experience with focus on training large-scale models and distributed systems.
  • Expert-level PyTorch experience including distributed training, DDP implementation, and multi-GPU optimization.
  • Production experience with distributed training techniques including data parallelism, model parallelism, and pipeline parallelism.
  • Strong understanding of gradient synchronization and communication strategies for multi-node training.
  • Hands-on experience with large dataset handling and efficient data loading at scale.
  • Proficiency in training infrastructure tools such as Megatron-LM, DeepSpeed, FairScale, or similar frameworks.
  • Excellent communication and teaching skills with ability to explain complex technical concepts to diverse audiences.
  • Customer-facing experience in technical consulting, solutions engineering, or developer relations roles.
  • Experience with RLHF and fine-tuning pipelines including reward model training and post-training optimization.
  • Understanding of reasoning techniques including Chain-of-Thought prompting and advanced reasoning workflows.

Nice to have

  • Large-scale pre-training experience (7B+ parameters).
  • Advanced reasoning implementation (Tree-of-Thought, self-consistency).
  • DPO and constitutional AI expertise.
  • Open-source contributions to training frameworks.
  • Conference speaking or technical evangelism experience.

Benefits

  • Join the ground floor of a mission-driven AI startup revolutionizing compute infrastructure.
  • Work with a high-caliber, globally distributed team backed by major VCs.
  • Competitive compensation and benefits.
  • Fast-paced, flexible work environment with room for ownership and impact.
  • Hybrid model: 3 days in-office, 2 days remote, with flexibility to work remotely for part of the year.

If you have any questions about the role, please reach out to the hiring manager on LinkedIn or X.
