AI Systems & Data Engineer

HyperFi • San Francisco, California, United States, 94102

About HyperFi

We're building the kind of platform we always wanted to use: fast, flexible, and built for making sense of real-world complexity. Behind the scenes is a robust, event-driven architecture that connects systems, abstracts messy workflows, and leaves room for smart automation. The surface is clean and simple. The interactions are seamless and intuitive. The machinery underneath is anything but. That's where you come in.

We're a well-networked founding team with strong execution roots and a clear roadmap. We're backed, focused, and delivering fast.

We are seeking an AI Systems & Data Engineer to join our team. This role requires expertise in building data pipelines in the Databricks environment, specifically for ingesting unstructured data, and in using that data to build AI agents.

What You'll Do

Design and build data pipelines in Databricks for ingesting unstructured data.
Construct retrieval-augmented generation (RAG) systems from scratch using ingested data.
Build agentic LLM pipelines using tools such as LangChain, LangGraph, and LangSmith.
Own orchestration of PySpark and Databricks workflows to prepare inputs and track outputs for AI models.
Instrument evaluation metrics and telemetry to guide the evolution of prompt strategies.
Work alongside product, frontend, and backend engineers to tightly integrate AI into user-facing flows.
Use Databricks Auto Loader to automatically detect new files in cloud storage and handle schema changes.
Use Delta Lake for reliability, security, and performance across streaming and batch operations on the data lake.
Use Databricks Workflows to orchestrate data integration tasks.
Implement Delta Live Tables to build reliable data pipelines with a declarative approach.

Tech Stack (So Far)

Python (primary language for all LLM + orchestration work)

LangChain + LangGraph + LangSmith

Databricks + PySpark for processing, labeling, and training context

Gemini + model routing logic

Postgres and custom orchestration via MCP

GitHub Actions, GCP

You'll play a crucial role in rolling out products that will have immediate impact.

How We Build

Engineers come first: your time, focus, and judgment are respected

Deep work > chaos: fixed cycles & cooldowns protect focus and keep context switching low

Autonomy is the default: trusted builders who own outcomes, no babysitters

Ship daily, safely: merge early, integrate vertically, ship often, use feature flags, and keep momentum

Outcomes over optics: solve real problems, not ticket soup

Voice matters: from week one, contribute, improve something, and shape how we build

Senior peers, no ego: collaborate in a high-trust, async-friendly environment

Bold problems, cool tech: work on complex challenges that actually move the needle

Fun is part of it: we move fast, but we also celebrate wins and laugh together

What We're Looking For

5-7 years of experience building production-grade ML, data, or AI systems.

Strong grasp of prompt engineering, context construction, and retrieval design.

Comfortable working in LangChain and building agents.

Experience with PySpark and Databricks to handle real-world data scale.

Ability to write testable, maintainable Python with clear structure.

Understanding of model evaluation, observability, and feedback loops.

Excited to push from prototype to production to iteration.

Familiarity with the Databricks Data Intelligence Platform, which unifies data warehousing and AI use cases on a single platform.

Knowledge of Unity Catalog for open and unified governance of data, analytics, and AI on the lakehouse.

Understanding of data security concerns related to AI and how to mitigate them using the Databricks AI Security Framework (DASF).

Confident English skills to collaborate clearly and effectively with teammates.

Bonus If You:

Have built scalable agent-like workflows on the Databricks platform.

Have worked on semantic chunking, vector search, or hybrid retrieval strategies.

Can walk us through a real-world prompt failure and how you fixed it.

Have contributed to OSS tools or internal AI platforms.

Think of yourself as both an engineer and a systems designer.

Are familiar with the concept of a data lakehouse architecture.

Location & Compensation

Must be based in San Francisco, Las Vegas, or Tel Aviv

Full-time role with competitive comp

Flexible hours, async-friendly culture, engineering-led environment
