Talent.com
Engineering Manager - AI DevOps
NVIDIA • California, MO, US
Job type: Full-time

Overview

NVIDIA is looking for an outstanding AI DevOps Engineering Manager to lead and expand our next-gen inference operations infrastructure. Join us in transforming AI inference delivery, supporting NVIDIA's innovative products like Dynamo, Triton, NIXL, and our growing range of AI inference solutions. This role is essential for our GitHub First initiative, enabling public CI / CD infrastructure with GPU and Kubernetes capabilities to deliver high-throughput, low-latency inferencing solutions in distributed environments. Lead a team ensuring our AI products achieve outstanding performance and reliability worldwide.

Responsibilities

  • Supervise a team of DevOps engineers with expertise in AI inference infrastructure, test automation (SDET), and Infrastructure as Code (IaC).
  • Architect and implement scalable test automation strategies for AI inference workloads, including performance benchmarking and automated quality gates.
  • Lead the maintenance of our GitHub First public CI infrastructure, focusing on single / multi-GPU testing, Kubernetes multi-node GPU testing, and CSP validation.
  • Drive Infrastructure as Code efforts using Terraform, Ansible, and Kubernetes to support scaling across multiple clouds and manage GPU clusters effectively.
  • Maintain operational excellence, including 24x7 on-call rotations, SRE practices, automated monitoring, and self-healing systems, to keep uptime above 99.9%.
  • Lead release coordination, cost optimization, and management of multi-cloud deployments.

Qualifications

  • Bachelor's / Master's degree in Computer Science, Engineering, or equivalent experience.
  • 4+ years leading DevOps / SRE organizations with direct SDET leadership experience.
  • 8+ years hands-on experience in software development, test automation, or infrastructure engineering with AI / ML or GPU-intensive workloads.
  • Proficiency in Infrastructure as Code (IaC) platforms: Terraform, Ansible, or CloudFormation, with exposure to multiple cloud environments (AWS, GCP, Azure, OCI).
  • Strong technical leadership in test automation frameworks, CI / CD pipeline development, and quality engineering practices.
  • Familiarity with containerization and orchestration tools such as Docker and Kubernetes for managing AI / ML workloads and GPU resources.
  • Proven success building and scaling teams in fast-paced, high-growth environments.
  • Effective interpersonal skills to collaborate with remote teams and build consensus.
  • Proficiency in Python, Rust, or related programming languages and the ability to engage in architecture conversations.
  • Demonstrated history of operational excellence, including 24x7 on-call management, SRE practices, and robust high-availability infrastructure.

Ways To Stand Out

  • Experience with CI / CD (specifically GitHub Actions) for releasing open-source AI software.
  • Deep AI / ML infrastructure expertise in NVIDIA technologies such as CUDA, TensorRT, Dynamo, and Triton Inference Server, including managing GPU cluster operations and benchmarking GPU workload performance.
  • Background in DevOps, system software testing, and previous experience leading teams on inference engines, model serving platforms, or AI acceleration frameworks.
  • Track record with monitoring tools (Prometheus, Grafana), security scanning, static / dynamic analysis tools, and license compliance automation for critical AI inferencing frameworks.

Compensation & Benefits

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD for Level 3, and 272,000 USD - 425,500 USD for Level 4. You will also be eligible for equity and benefits.

EEO Statement

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. We value diversity in our current and future employees and do not discriminate in hiring or promotion on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.

Applications for this job will be accepted at least until September 29, 2025.

