Talent.com
Principal AI Infrastructure Abstraction Engineer

Cisco Systems, Inc. • San Jose, CA, United States
Job type
  • Full-time
Job description

This position requires a hybrid working schedule in the San Jose or Milpitas office.

Meet the Team

We are an innovation team on a mission to transform how enterprises harness AI. Operating with the agility of a startup and the focus of an incubator, we're building a tight-knit group of AI and infrastructure experts driven by bold ideas and a shared goal: to rethink systems from the ground up and deliver breakthrough solutions that redefine what's possible - faster, leaner, and smarter.

We thrive in a fast-paced, experimentation-rich environment where new technologies aren't just welcome - they're expected. Here, you'll work side-by-side with seasoned engineers, architects, and thinkers to craft the kind of iconic products that can reshape industries and unlock entirely new models of operation for the enterprise.

If you're energized by the challenge of solving hard problems, love working at the edge of what's possible, and want to help shape the future of AI infrastructure - we'd love to meet you.

Your Impact

As an AI Infrastructure Abstraction Engineer, you will help shape the next generation of AI compute platforms by designing systems that abstract away hardware complexity and expose logical, scalable, and secure interfaces for AI workloads. Your work will enable multi-tenancy, resource isolation, and dynamic scheduling of GPUs and accelerators at scale - making infrastructure programmable, elastic, and developer-friendly.

You will bridge the gap between raw compute resources and AI/ML frameworks, allowing infrastructure teams and model developers to consume shared GPU resources with the performance and reliability of bare metal, but with the flexibility of cloud-native systems. Your contributions will empower internal and external users to run AI workloads securely, efficiently, and predictably - regardless of the underlying hardware topology.

This role is critical to enabling AI infrastructure that is multi-tenant by design, scalable in practice, and abstracted for portability across diverse platforms.

KEY RESPONSIBILITIES

  • Design and implement infrastructure abstractions that cleanly separate logical compute units (vGPUs, GPU pods, AI queues) from physical hardware (nodes, devices, interconnects).
  • Develop runtime services, APIs, and control planes to expose GPU and accelerator resources to users and frameworks with multi-tenant isolation and QoS guarantees.
  • Architect systems for secure GPU sharing, including time-slicing, memory partitioning, and namespace isolation across tenants or jobs.
  • Collaborate with platform, orchestration, and scheduling teams to map logical resources to physical devices based on utilization, priority, and topology.
  • Define and enforce resource usage policies, including fair sharing, quota management, and oversubscription strategies.
  • Integrate with model training and serving frameworks (e.g., PyTorch, TensorFlow, Triton) to ensure smooth and predictable resource consumption.
  • Build observability and telemetry pipelines to trace logical-to-physical mappings, usage patterns, and performance anomalies.
  • Partner with infrastructure security teams to ensure secure onboarding, access control, and workload isolation in shared environments.
  • Support internal developers in adopting abstraction APIs, ensuring high performance while abstracting away low-level details.
  • Contribute to the evolution of internal compute platform architecture, with a focus on abstraction, modularity, and scalability.

Minimum Qualifications:

  • Bachelor's + 15 years of related experience, or Master's + 12 years of related experience, or PhD + 8 years of related experience.
  • Experience building scalable, production-grade infrastructure components or control planes using Go, Python, and C++.
  • Experience with Kubernetes, Docker, or KubeVirt for virtualization, containerization, and orchestration.
  • Experience designing or implementing logical resource abstractions for compute, storage, or networking, with a focus on multi-tenant environments.
  • Experience integrating with AI/ML platforms or pipelines (e.g., PyTorch, TensorFlow, Triton Inference Server, MLflow).

Preferred Qualifications:

  • Experience with GPU sharing, scheduling, or isolation techniques (e.g., MPS, MIG, time-slicing, device plugin frameworks, or vGPU technologies).
  • Solid grasp of resource management concepts including quotas, fairness, prioritization, and elasticity.

#WeAreCisco

#WeAreCisco where every individual brings their unique skills and perspectives together to pursue our purpose of powering an inclusive future for all.

Our passion is connection: we celebrate our employees' diverse set of backgrounds and focus on unlocking potential. Cisconians often experience one company, many careers where learning and development are encouraged and supported at every stage. Our technology, tools, and culture pioneered hybrid work trends, allowing all to not only give their best, but be their best.

We understand our outstanding opportunity to bring communities together and at the heart of that is our people. One-third of Cisconians collaborate in our 30 employee resource organizations, called Inclusive Communities, to connect, foster belonging, learn to be informed allies, and make a difference. Dedicated paid time off to volunteer (80 hours each year) allows us to give back to causes we are passionate about, and nearly 86% do!

Our purpose, driven by our people, is what makes us the worldwide leader in technology that powers the internet. Helping our customers reimagine their applications, secure their enterprise, transform their infrastructure, and meet their sustainability goals is what we do best. We ensure that every step we take is a step towards a more inclusive future for all. Take your next step and be you, with us!
