Machine Learning Software Platform Architect

NVIDIA
Santa Clara, CA, US
$148K-$339.3K a year
Full-time

Widely considered to be one of the technology world’s most desirable employers, NVIDIA is an industry leader with groundbreaking developments in High-Performance Computing, Artificial Intelligence and Visualization.

The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services.

Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars.

NVIDIA is seeking a highly skilled and experienced Large Language Model (LLM) based Application Infrastructure engineer to join our growing team.

The successful candidate will work at the intersection of GPU chip design and AI. You will be responsible for the design, development, and maintenance of the infrastructure around NVIDIA's internal large language model aimed at facilitating chip design.

What you'll be doing:

Develop and maintain the infrastructure for managing large language model (LLM) based applications specifically adapted for the chip design and hardware domain.

Develop and maintain LLM-based applications that serve hardware engineers, such as an LLM-based QA bot, a code generator, etc.

Collaborate with HW chip designers and LLM research teams to understand the specific needs and challenges of GPU design and ensure the LLM infrastructure is well-suited to these needs.

Collaborate with LLM research teams to collect and organize training and fine-tuning data for a hardware-specific language model.

Optimize the infrastructure for performance, scalability, and reliability, and ensure the secure and efficient management of data.

Stay updated with the latest industry trends in AI and machine learning, and continuously look for opportunities to apply these advancements to improve the LLM infrastructure.

What we need to see:

BS in computer science or a related field, or equivalent experience.

5+ years of experience.

Experience in developing and maintaining AI or machine learning infrastructure, preferably in the context of large language models.

Strong proficiency in Python and web development, and familiarity with LLM-related tools and techniques, e.g., LangChain, vector databases, prompt engineering, etc.

Understanding of chip design and related computational and data challenges.

Experience with data management, including document cleaning, transformation, and secure storage.

Excellent problem-solving skills and the ability to work effectively in a team.

In-depth understanding of Machine Learning / Deep Learning / NLP concepts.

Ways to stand out from the crowd:

You have crafted and developed production-quality microservices.

Strong technical background in cloud / distributed infrastructure.

Familiarity with front-end development using React or Vue.js is an excellent plus.

Strong understanding of SQL and NoSQL data platforms.

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our exclusive engineering teams are rapidly growing.

Are you creative and passionate about applying Machine Learning to solve remarkably interesting problems? Are you interested in being involved in state-of-the-art development in the field of AI, and do you love a challenge?

If so, we want to hear from you!

The base salary range is 148,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

30+ days ago