Talent.com
Principal Software Development Engineer - AI Group

Advanced Micro Devices, San Jose, CA, United States
Job type: Full-time

THE ROLE

AMD is looking for an AI solutions validation engineer who is passionate about complex AI solutions, AI infrastructure, MLOps, and building cluster-scale automation for distributed training and inference workloads. You will be a member of a core team of incredibly talented industry specialists and will work with the very latest hardware and software technology.

THE PERSON

The ideal candidate is passionate about software engineering, system design, validation, and automation, and has the leadership skills to drive sophisticated issues to resolution. They communicate effectively and work well with different teams across AMD.

KEY RESPONSIBILITIES

  • Work with AMD’s architecture specialists to validate AI solutions for distributed training and inference workloads with AMD's ROCm software
  • Build cluster-scale automation for distributed training and inference workloads
  • Publish reference designs and benchmark numbers for AI workloads
  • Apply a data-driven approach to target optimization efforts
  • Design and develop new groundbreaking AMD technologies
  • Participate in new ASIC and hardware bring-ups
  • Develop technical relationships with peers and partners

PREFERRED EXPERIENCE

  • Strong experience with complex compute systems used in AI and HPC deployments, and with backend network designs in RDMA clusters
  • Experience validating complex AI infrastructure (GPUs, networking, RoCEv2, UEC) and running benchmark tests such as IBPerf and RCCL / NCCL
  • Experience training LLMs, MoE models, image generation models, and recommendation models with frameworks such as PyTorch, TensorFlow, Megatron-LM, and JAX, and running training performance benchmarks
  • Experience running inference workloads in AI clusters with inference frameworks such as vLLM and SGLang, and running inference performance benchmarks
  • Experience with distributed systems and schedulers such as Kubernetes and Slurm
  • Ability to write high-quality automation frameworks and scripts using Python or Golang
  • Experience with performance profiling of CPUs and GPUs and with debugging complex compute, network, and storage problems
  • Experience with AMD ROCm would be an added advantage
  • Experience with Linux and Windows operating systems
  • Effective communication and problem-solving skills

ACADEMIC CREDENTIALS

  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent

Benefits offered are described in: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and / or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
