Talent.com
Senior ML Inference Platform Engineer

AION • Seattle, WA, US
Job type: Full-time

About AION

AION is building a next-generation AI cloud platform, transforming high-performance computing (HPC) through its decentralized AI cloud. Purpose-built for bare-metal performance, AION democratizes access to compute power for AI training, fine-tuning, inference, data labeling, and the full-stack AI/ML lifecycle.

Led by high-pedigree founders with previous exits, AION is well funded by major VCs and has strategic global partnerships. Headquartered in the US with a global presence, the company is building its initial core team across India, London, and Seattle.

Who You Are

You're an ML systems engineer who's passionate about building high-performance inference infrastructure. You don't need to be an expert in everything - this field is evolving too rapidly for that - but you have strong fundamentals and the curiosity to dive deep into optimization challenges. You thrive in early-stage environments where you'll learn cutting-edge techniques while building production systems. You think systematically about performance bottlenecks and are excited to push the boundaries of what's possible in AI infrastructure.

Requirements

Key Responsibilities

  • Build and optimize LLM inference systems working towards 2-4x performance improvements over standard frameworks like vLLM and TensorRT-LLM.
  • Implement modern inference optimizations including KV-cache management, dynamic batching, speculative decoding, compression and quantization strategies.
  • Develop GPU optimization solutions using CUDA, with opportunities to learn advanced techniques like Triton kernel development and CUDA graphs.
  • Design model evaluation and benchmarking systems to assess performance across reasoning, coding, and safety metrics.
  • Research and integrate trending open-source models (DeepSeek R1, Qwen 3, Llama 4, Mistral variants) with optimized configurations.
  • Build performance monitoring and profiling tools for GPU cluster analysis, bottleneck identification, and cost optimization.
  • Create cost-performance optimization strategies that balance throughput, latency, and infrastructure costs.
  • Explore agent orchestration capabilities for multi-step reasoning and tool integration workflows.
  • Collaborate with tech and product teams to identify optimization opportunities and translate them into production improvements.

Skills & Experience

  • High-agency individual looking to own and influence product architecture and company direction.
  • 3+ years of software engineering experience with a focus on performance-critical systems and production deployments.
  • Strong Python expertise and working knowledge of C++ for performance optimization.
  • Working understanding of deep learning fundamentals, including transformer architectures, attention mechanisms, and neural network training and inference.
  • Hands-on experience with model serving and deployment techniques.
  • Experience with at least one modern inference framework (vLLM, TensorRT-LLM, SGLang or similar) in a production setting.
  • Hands-on experience with PyTorch including model development, training loops, and basic distributed computing concepts.
  • Understanding of distributed systems concepts including load balancing, auto-scaling, and fault tolerance.
  • Basic GPU programming experience with CUDA or willingness to quickly learn GPU optimization techniques.
  • Strong debugging and performance profiling skills for identifying and resolving system bottlenecks.

Benefits

  • Join the ground floor of a mission-driven AI startup revolutionizing compute infrastructure.
  • Work with a high-caliber, globally distributed team backed by major VCs.
  • Competitive compensation and benefits.
  • Fast-paced, flexible work environment with room for ownership and impact.
  • Hybrid model: 3 days in-office, 2 days remote, with flexibility to work remotely for part of the year.
  • If you have any questions about the role, please reach out to the hiring manager on LinkedIn or X.
