Senior High-Performance LLM Training Engineer

Nvidia Corporation
Santa Clara, California, US
Internship

We are now looking for a Senior High-Performance LLM Training Engineer!

Applying for this role is straightforward: scroll down and click Apply to be considered for this position.

NVIDIA is seeking experienced engineers specializing in performance analysis and optimization to improve the efficiency of LLM training workloads, which are shaping the world's most advanced computing systems.

This position focuses on optimizing NVIDIA’s LLM software stack in frameworks like PyTorch and JAX for high-performance training on thousands of GPUs, while also helping shape hardware roadmaps for the next generation of GPUs powering the AI revolution.

What you will be doing:

  • Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.
  • Understand the big picture of training performance on GPUs, prioritizing and then solving problems across all state-of-the-art neural networks.
  • Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to DL frameworks.
  • Build and support NVIDIA submissions to the MLPerf Training benchmark suite.
  • Implement key DL training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.
  • Build tools to automate workload analysis, workload optimization, and other critical workflows.

What we want to see:

  • PhD in Computer Science, Electrical Engineering, or Computer Engineering with 5+ years of meaningful work experience; or MS (or equivalent experience) with 8+ years of meaningful work experience.
  • Strong background in deep learning and neural networks, in particular training.
  • A deep background in computer architecture and familiarity with the fundamentals of GPU architecture.
  • Proven experience analyzing and tuning application performance, and in processor- and system-level performance modeling.
  • Programming skills in C++, Python, and CUDA.

GPU computing is the most productive and pervasive platform for deep learning and AI. It begins with the most advanced GPUs and the systems and software we build on top of them.

We integrate and optimize every deep learning framework. We work with the major systems companies and every major cloud service provider to make GPUs available in data centers and in the cloud.

We craft computers and software to bring AI to edge devices, such as self-driving cars and autonomous robots. AI has the potential to spur a wave of social progress unmatched since the industrial revolution.

Widely considered to be one of tech's most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package.

Additionally, this opportunity offers you the ability to collaborate with some of the most forward-thinking and hard-working people in the world, shaping the future of AI in a creative and autonomous work environment that encourages innovation.

If you're excited to work across the full hardware and software stack, from GPU architecture to application code, to achieve optimal performance, we want to hear from you!

The base salary range is 180,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

5 days ago