Data Infrastructure Engineer

Acceler8 Talent
CA, United States
Full-time

Join Us as a Data Infrastructure Engineer

Our mission is to deepen the partnership between humans and computers, unlocking collaborative capabilities that far exceed what could be achieved today.

We believe that building delightful end-user experiences requires innovating across the stack - from the UX all the way down to models that achieve the best user value per FLOP.

We believe that a small, focused team of motivated individuals can create outsized breakthroughs. We are building a world-class, multi-disciplinary team of people who are excited to solve hard real-world AI problems.

We are well-capitalized and supported by March Capital and Thrive Capital, with participation from AMD, Franklin Venture Partners, Google, KB Investment, and NVIDIA.

About the Role: Data Infrastructure Engineer

As a Data Infrastructure Engineer, you will design, implement, and optimize a scalable infrastructure to prepare the data that powers our AI training.

This infrastructure must be reliable and capable of efficiently processing petabytes of data. You will collaborate closely with the data research team and data crawling team when designing this system.

What You Will Be Working On:

  • Building petabyte-scale, high-throughput data processing systems for preparing and curating datasets for AI training.
  • Orchestrating workloads across large clusters, and architecting and maintaining distributed computing environments.
  • Working directly with our data research team on implementing new methods of data preparation.
  • Troubleshooting and resolving infrastructure-related issues in a timely manner.

What We Are Looking For:

  • At least 6 years of experience in data-intensive applications and software development.
  • Proficiency with Kubernetes and containerization, and with building cloud services on providers such as AWS or GCP.
  • Ability to write, debug, and optimize distributed systems, plus an understanding of data orchestration and automation tools (or a strong willingness to learn them).
  • Proficiency in a high-performance programming language such as Go, Rust, or C++.
  • Previous experience creating and maintaining infrastructure for processing datasets for ML model training and/or serving.

We encourage you to apply for this position even if you don’t meet all of the above requirements but want to spend time pushing on these techniques.

We work in person in San Francisco. We offer relocation assistance to new employees.
