Sr. Software Engineer - AI/ML, AWS Neuron Distributed Training

Amazon Web Services (aws)
Seattle, Washington, US
$151.3K a year
Internship

Description

AWS Neuron is the complete software stack for AWS Inferentia (Inf1 / Inf2) and Trainium (Trn1), our cloud-scale machine learning accelerators.

This role is for a senior machine learning engineer on the Distributed Training team for AWS Neuron. The team is responsible for the development, enablement and performance tuning of a wide variety of ML model families, including massive-scale Large Language Models (LLMs) such as GPT and Llama, as well as Stable Diffusion, Vision Transformers (ViT) and many more.

The ML Distributed Training team works side by side with chip architects, compiler engineers and runtime engineers to create, build and tune distributed training solutions with Trainium instances.

Experience training these large models in Python is a must. FSDP (Fully Sharded Data Parallel), DeepSpeed and other distributed training libraries are central to this work, and extending them for Neuron-based systems is key.

Key job responsibilities

You will help lead the effort to build distributed training support into PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks.

You will help tune these models for the highest performance and efficiency on the custom AWS Trainium and Inferentia silicon and on the Trn1, Inf1 and Inf2 servers.

Strong software development and Machine Learning knowledge are both critical to this role.

Basic Qualifications

  • Bachelor's degree in computer science or equivalent
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience with at least one programming language
  • 5+ years of experience leading the design or architecture (design patterns, reliability and scaling) of new and existing systems
  • 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Experience as a mentor, tech lead or leading an engineering team
  • Experience in machine learning, data mining, information retrieval, statistics or natural language processing

Preferred Qualifications

  • Master's degree in computer science or equivalent
  • Experience in computer architecture
  • Previous software engineering expertise with PyTorch / JAX / TensorFlow, distributed libraries and frameworks, and end-to-end model training

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $151,300 / year in our lowest geographic market up to $261,500 / year in our highest geographic market.

Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience.

Company - Annapurna Labs (U.S.) Inc.

Job ID : A2668304

13 days ago