Software Engineer, Model Inference

OpenAI
San Francisco, California, US
Full-time

About The Team

Our team brings OpenAI’s most capable technology to the world through our products. Most recently, we released ChatGPT, GPT-4, the Whisper API, and DALL-E.

We empower consumers and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to before.

Across all product lines, we ensure that these powerful tools are used responsibly. This is a key part of OpenAI’s path towards safely deploying broadly beneficial Artificial General Intelligence (AGI).

Safety is more important to us than unfettered growth.

About The Role

We're looking for an engineer to join our team at OpenAI to help us scale up our critical inference infrastructure, which efficiently serves every customer request to use our state-of-the-art AI models, including GPT-4 and DALL-E.

In This Role, You Will

  • Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.
  • Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our deployed models.
  • Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.
  • Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware.

You Might Thrive In This Role If You

  • Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.
  • Own problems end-to-end, and are willing to pick up whatever knowledge you're missing to get the job done.
  • Have at least 3 years of professional software engineering experience.
  • Are an expert in core HPC technologies: InfiniBand, MPI, CUDA.
  • Understand how to overlap compute and communication to maximize utilization of scarce compute, memory, and bandwidth resources.
  • Have experience architecting, observing, and debugging production distributed systems.
  • Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.
  • Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.
  • Are self-directed and enjoy figuring out the most important problem to work on.
  • Have a good intuition for when off-the-shelf solutions will work, and quickly build tools to accelerate your own workflow when they won’t.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.

We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products.

AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

For US-based candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared.

Join us in shaping the future of technology.
