SoC Device Driver Team Lead, Annapurna Labs Machine Learning Accelerators

Annapurna Labs (U.S.) Inc.

Cupertino, California, USA

$151.3K a year

Full-time

Custom silicon chips live at the heart of AWS Machine Learning servers, and our team builds the backend software to run these servers.

We’re looking for someone to lead our system-on-chip (SoC) driver software team and help us deliver at scale, as we build the next generation of driver software.

This is a hands-on, in-the-trenches software engineering leadership position.

As the lead for the SoC driver team, you will:

Build and manage a small, strong team of 3-5 developers
Work with hardware designers to write drivers for newly developed SoC IPs
Work with system software teams to solve SoC and system-level architectural issues, drive debug, and innovate on solutions
Refactor and maintain existing codebases throughout the device lifecycle
Continuously test and deploy your software stack to multiple internal customers
Innovate on the tooling you provide to customers, making it easier for them to use and debug our SoCs

Annapurna Labs, our organization within AWS, designs and deploys some of the largest custom silicon in the world, with many subsystems that must all be managed, tested, and monitored.

The SoC drivers are a critical piece of the AWS infrastructure management software stack that ensures the chip is functional, performant, and secure.

You will thrive in this role if you:

Enjoy building, managing, and leading small teams
Love solving complex system-level issues
Are proficient in C++ and familiar with Python
Know how to build effective abstractions over low-level SoC details
Are familiar with modular driver architectures (such as the Linux or Windows driver stacks)
Have strong opinions about software architecture, and are able to apply them effectively
Enjoy learning new technologies, building software at scale, moving fast, and working closely with colleagues as part of a small team within a large organization

Although we build and deploy machine learning chips, no machine learning background is needed for this role. Your team (and your software) won’t be doing machine learning.

Our driver stack lives at the lowest level of the backend AWS infrastructure responsible for managing our ML servers. You and your team will develop drivers for components used by machine learning (for example, PCIe, HBM, etc.) but won’t need to deeply understand ML yourselves.

This role can be based in either Cupertino, CA or Austin, TX. The team is split between the two sites, with no preference for one over the other.

This is a fast-paced role where you'll work with thought-leaders in multiple technology areas. You'll have high standards for yourself and everyone you work with, and you'll be constantly looking for ways to improve your software, as well as our products' overall performance, quality, and cost.

We're changing an industry. We're searching for individuals who are ready for this challenge, who want to reach beyond what is possible today.

Come join us and build the future of machine learning!

BASIC QUALIFICATIONS

6+ years of programming experience with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, or Ruby
6+ years of non-internship professional software development experience
4+ years of experience designing or architecting (design patterns, reliability, and scaling) new and existing systems
Experience leading the design, build and deployment of complex and performant (reliable and scalable) software solutions in production
C++ development experience
Experience developing low-level software for hardware (SoC, ASIC, GPU, CPU, etc.)

PREFERRED QUALIFICATIONS

Knowledge of engineering practices and patterns for the full software / hardware / networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations
Experience taking a leading role in building complex software or computing infrastructure that has been successfully delivered to customers
Experience managing a small team of developers, including, but not limited to: scheduling, prioritizing, recruiting, and coaching

30+ days ago
