SoC Device Driver Team Lead, Annapurna Labs Machine Learning Accelerators

Annapurna Labs (U.S.) Inc.
Cupertino, California, USA
$151.3K a year
Full-time

Custom silicon chips live at the heart of AWS Machine Learning servers, and our team builds the backend software to run these servers.

We’re looking for someone to lead our system-on-chip (SoC) driver software team and help us deliver at scale, as we build the next generation of driver software.

This is a hands-on, in-the-trenches software engineering leadership position.

As the lead for the SoC driver team, you will:

  • Build and manage a small, strong team of 3-5 developers
  • Work with hardware designers to write drivers for newly developed SoC IPs
  • Work with system software teams to solve SoC and system-level architectural issues, drive debug, and innovate on solutions
  • Refactor and maintain existing codebases throughout the device lifecycle
  • Continuously test and deploy your software stack to multiple internal customers
  • Innovate on the tooling you provide to customers, making it easier for them to use and debug our SoCs

Annapurna Labs, our organization within AWS, designs and deploys some of the largest custom silicon in the world, with many subsystems that must all be managed, tested, and monitored.

The SoC drivers are a critical piece of the AWS infrastructure management software stack that ensures the chip is functional, performant, and secure.

You will thrive in this role if you:

  • Enjoy building, managing, and leading small teams
  • Love solving complex system-level issues
  • Are proficient in C++ and familiar with Python
  • Know how to build effective abstractions over low-level SoC details
  • Are familiar with modular driver architectures (such as the Linux or Windows driver stacks)
  • Have strong opinions about software architecture, and are able to apply them effectively
  • Enjoy learning new technologies, building software at scale, moving fast, and working closely with colleagues as part of a small team within a large organization

Although we build and deploy machine learning chips, no machine learning background is needed for this role. Your team (and your software) won’t be doing machine learning.

Our driver stack lives at the lowest level of the backend AWS infrastructure responsible for managing our ML servers. You and your team will develop drivers for components used by machine learning (for example, PCIe and HBM) but won’t need to deeply understand ML yourselves.

This role can be based in either Cupertino, CA or Austin, TX. The team is split between the two sites, with no preference for one over the other.

This is a fast-paced role where you'll work with thought-leaders in multiple technology areas. You'll have high standards for yourself and everyone you work with, and you'll be constantly looking for ways to improve your software, as well as our products' overall performance, quality, and cost.

We're changing an industry. We're searching for individuals who are ready for this challenge, who want to reach beyond what is possible today.

Come join us and build the future of machine learning!

BASIC QUALIFICATIONS

  • 6+ years of programming experience with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, or Ruby
  • 6+ years of non-internship professional software development experience
  • 4+ years of experience designing or architecting (design patterns, reliability, and scaling) new and existing systems
  • Experience leading the design, build and deployment of complex and performant (reliable and scalable) software solutions in production
  • C++ development experience
  • Experience developing low-level software for hardware (SoC, ASIC, GPU, CPU, etc.)

PREFERRED QUALIFICATIONS

  • Knowledge of engineering practices and patterns for the full software / hardware / networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations
  • Experience taking a leading role in building complex software or computing infrastructure that has been successfully delivered to customers
  • Experience managing a small team of developers, including but not limited to scheduling, prioritizing, recruiting, and coaching