Software Engineer, Machine Learning Infrastructure

Scale AI, Inc.
Long Island City, New York, US
Full-time

Scale is looking for an AI / ML Infrastructure Engineer to join our Machine Learning Infrastructure team to build out our Training Platform.

You will partner closely with Machine Learning researchers to understand their requirements and apply your own domain expertise and our compute resources to accelerate experimentation throughput.

The ideal candidate has strong fundamentals in machine learning and backend system design, along with prior ML infrastructure experience.

You should also be comfortable with infrastructure and large-scale system design, as well as diagnosing both model performance issues and system failures.

You will:

  • Build highly available, observable, performant, and cost-effective APIs for model training.
  • Participate in our team's on-call process to ensure the availability of our services.
  • Own projects end-to-end, from requirements, scoping, and design through implementation, in a highly collaborative and cross-functional environment.
  • Exercise good taste in building systems and tools and know when to make build vs. buy tradeoffs, with an eye for cost efficiency.

Ideally you'd have:

  • 4+ years of experience building machine learning training pipelines or inference services in a production setting.
  • Experience with distributed training tools and techniques such as DeepSpeed and FSDP.
  • Experience building, deploying, and monitoring complex microservice architectures.
  • Experience with Python, Docker, Kubernetes, and infrastructure as code (e.g. Terraform).

Nice to haves:

  • Experience with LLM inference latency optimization techniques, e.g. kernel fusion, quantization, and dynamic batching.
  • Experience working with a cloud technology stack (e.g. AWS or GCP).

4 days ago