Data Infrastructure Engineer

OpenAI
San Francisco
Full-time

About the Team

You’ll join the team behind OpenAI’s data infrastructure, which powers the engineering, product, and alignment teams at the core of our work.

The systems we support include our data warehouse, batch compute infrastructure, streaming infrastructure, data orchestration system, data lake, vector databases, critical integrations, and more.

About the Role

The Applied Data Platform team designs, builds, and operates the foundational data infrastructure that enables products and teams at OpenAI.

You are comfortable with work such as scaling Kubernetes services and OLAP systems, debugging Kafka consumer lag, diagnosing distributed key-value store failures, and designing a system to retrieve image vectors with low latency.

You are well versed in infrastructure tooling such as Terraform, have worked with Kubernetes, and have an SRE skill set.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, and streaming infrastructure, while ensuring scalability, reliability, and security

Ensure our data platform can scale reliably to the next several orders of magnitude

Accelerate company productivity by empowering your fellow engineers and teammates with excellent data tooling and systems, providing a best-in-class experience

Bring new features and capabilities to the world by partnering with product engineers, trust & safety, and other teams to build the technical foundations

Like all other teams, we are responsible for the reliability of the systems we build. This includes an on-call rotation to respond to critical incidents as needed

You might thrive in this role if you:

Have 4+ years in data infrastructure engineering OR

Have 4+ years in infrastructure engineering with a strong interest in data

Take pride in building and operating scalable, reliable, secure systems

Are comfortable with ambiguity and rapid change

Have a voracious, intrinsic desire to learn and fill in missing skills, and an equally strong talent for sharing what you learn clearly and concisely with others

Some of the technologies you’ll be working with include Apache Spark, ClickHouse, Python, Terraform, Kafka, Azure Event Hubs, and vector databases.
