Machine Learning Researcher - Inference

Acceler8 Talent

CA, United States

Full-time

What We're Developing

As we enter a new phase of expansion, our focus is on partnering with commercial clients to tailor and enhance our advanced models to meet their specific business needs.

Our achievements in creating, aligning, and deploying state-of-the-art models in our highly empathetic consumer-facing chatbot have laid a solid foundation for success.

With strong financial backing and abundant H100 resources, we have established a resilient infrastructure and efficient workflows to support top-tier finetuning.

By joining our team, you'll have the opportunity to leverage your skills while being part of a vibrant organization that values innovation and teamwork.

About Us

We are a small, interdisciplinary AI studio. We have trained several state-of-the-art language models, including multiple versions, and developed a personal assistant.

Currently, our studio is dedicated to finetuning and deploying models for specific applications for our commercial clients.

We believe that artificial intelligence marks the beginning of a period of exponential transformation. Our name reflects this moment of change, and our status as a public benefit corporation provides us with the legal framework to prioritize the well-being and happiness of our partners, users, and broader stakeholders above all else.

About the Position

Research Engineer, Member of Technical Staff (Inference)

As part of our commitment to deploying high-performance models for enterprise applications, our inference team ensures that these models operate efficiently and effectively in real-world situations.

Research engineers in this position focus on optimizing model inference processes, reducing response times, and enhancing throughput without sacrificing model performance, ensuring reliable deployment in corporate settings.

This role is ideal for you if you:

Have experience deploying and optimizing large language models for inference in both cloud and on-premises environments.
Are skilled in using tools and frameworks for model optimization and acceleration, such as ONNX, TensorRT, or TVM.
Enjoy diagnosing and resolving complex issues related to model performance and scalability.
Have a strong understanding of the trade-offs involved in model inference, including hardware limitations and real-time processing requirements.
Are proficient with PyTorch and familiar with infrastructure management tools like Docker and Kubernetes for deploying inference pipelines.

We do not require a specific educational background or a set number of years of experience. We are eager to see what you have been building.

Please send us examples of your best work, including but not limited to links to open-source contributions, personal projects, or a cover letter describing past projects that you are proud of.

Keywords: Advanced models, Efficient workflows, Finetuning, Innovation, Language models, LLMs, Inference, High-performance models, Enterprise applications, Optimizing model inference, Enhancing throughput, Reliable deployment, Large language models, Cloud environments, Model optimization, Model acceleration, ONNX, TensorRT, TVM, Scalability, Real-time processing, PyTorch, Docker, Kubernetes, Inference pipelines

30+ days ago
