SoC Device Driver Team Lead, Annapurna Labs Machine Learning Accelerators

Annapurna Labs (U.S.) Inc.
Cupertino, California, USA
$151.3K a year
Full-time

Custom silicon chips live at the heart of AWS Machine Learning servers, and our team builds the backend software to run these servers.

We’re looking for someone to lead our system-on-chip (SoC) driver software team and help us deliver at scale, as we build the next generation of driver software.

This is a hands-on, in-the-trenches software engineering leadership position.

As the lead for the SoC driver team, you will:

  • Build and manage a small, strong team of 3-5 developers
  • Work with hardware designers to write drivers for newly developed SoC IPs
  • Work with system software teams to solve SoC and system-level architectural issues, drive debug, and innovate on solutions
  • Refactor and maintain existing codebases throughout the device lifecycle
  • Continuously test and deploy your software stack to multiple internal customers
  • Innovate on the tooling you provide to customers, making it easier for them to use and debug our SoCs

Annapurna Labs, our organization within AWS, designs and deploys some of the largest custom silicon in the world, with many subsystems that must all be managed, tested, and monitored.

The SoC drivers are a critical piece of the AWS infrastructure management software stack that ensures the chip is functional, performant, and secure.

You will thrive in this role if you:

  • Enjoy building, managing, and leading small teams
  • Love solving complex system-level issues
  • Are proficient in C++ and familiar with Python
  • Know how to build effective abstractions over low-level SoC details
  • Are familiar with modular driver architectures (such as the Linux or Windows driver stacks)
  • Have strong opinions about software architecture, and are able to apply them effectively
  • Enjoy learning new technologies, building software at scale, moving fast, and working closely with colleagues as part of a small team within a large organization

Although we build and deploy machine learning chips, no machine learning background is needed for this role. Your team (and your software) won’t be doing machine learning.

Our driver stack lives at the lowest level of the backend AWS infrastructure responsible for managing our ML servers. You and your team will develop drivers for components used by machine learning (for example, PCIe and HBM) but won’t need to deeply understand ML yourselves.

This role can be based in either Cupertino, CA or Austin, TX. The team is split between the two sites, with no preference for one over the other.

This is a fast-paced role where you'll work with thought-leaders in multiple technology areas. You'll have high standards for yourself and everyone you work with, and you'll be constantly looking for ways to improve your software, as well as our products' overall performance, quality, and cost.

We're changing an industry. We're searching for individuals who are ready for this challenge, who want to reach beyond what is possible today.

Come join us and build the future of machine learning!

BASIC QUALIFICATIONS

  • 6+ years of programming experience with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, or Ruby
  • 6+ years of non-internship professional software development experience
  • 4+ years of experience designing or architecting (design patterns, reliability, and scaling) new and existing systems
  • Experience leading the design, build and deployment of complex and performant (reliable and scalable) software solutions in production
  • C++ development experience
  • Experience developing low-level software for hardware (SoC, ASIC, GPU, CPU, etc.)

PREFERRED QUALIFICATIONS

  • Knowledge of engineering practices and patterns for the full software / hardware / networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations
  • Experience taking a leading role in building complex software or computing infrastructure that has been successfully delivered to customers
  • Experience managing a small team of developers, including, but not limited to: scheduling, prioritizing, recruiting, and coaching