Site Reliability Engineer

Primer • San Francisco, CA, United States
Job type: Full-time

Primer helps B2B products break out of the B2C-centric marketing box. Our platform turns consumer ad channels, data streams, and emerging AI workflows into measurable growth engines for go-to-market teams. We ingest billions of rows from first- and third-party sources, map them to rich company context, and surface hyper-targeted audiences and real-time performance alerts—all without vendor lock-in.

That only works if the lights stay on, queries stay fast, and incidents stay rare. That’s where you come in.

As our first dedicated Site Reliability Engineer, you’ll be the force multiplier who designs, builds, and operates the infrastructure that powers everything: petabyte-scale data pipelines, LLM-backed services, and the APIs our customers (and engineers!) rely on every day. You’ll pair hard-won ops experience with a mentor’s mindset, levelling up the whole team while keeping us four steps ahead of failure.

YOUR MISSION

Own reliability from design to customer.

  • Define and uphold SLOs/SLIs, manage error budgets, and lead blameless post-mortems.
  • Automate toil out of existence: CI/CD, infra-as-code, capacity planning, and chaos testing.
  • Drive incident response end-to-end: detection, mitigation, root-cause analysis, and long-term fixes.
  • Scale multi-cloud data pipelines (Prefect, ClickHouse, Iceberg) and GPU/LLM workloads.
  • Teach best practices, review designs, and coach engineers so reliability becomes a team sport.

WHAT YOU’LL DO

  • Design, implement, and tune distributed systems that handle high-throughput B2B traffic.
  • Harden our AWS stack with IaC (e.g. Terraform).
  • Instrument everything—logs, traces, metrics, and AI-powered anomaly detection.
  • Champion security, cost optimization, and disaster-recovery strategies.
  • Jump into the weeds when something breaks, fix it fast, then automate it away.

WHAT YOU’LL BRING

Must-Haves

  • 5+ years owning production systems at meaningful scale (sub-second latency, “four-nines” targets).
  • Mastery of SRE fundamentals: SLO/SLI design, error budgets, incident playbooks.
  • Deep hands-on experience with Linux, networking, containers/K8s, and at least one major cloud (AWS/GCP/Azure).
  • Proven track record automating infra with Terraform, Helm, or similar IaC tooling.
  • Fluency in at least one systems/scripting language (Go, Python, Rust, etc.).
  • Experience operating complex data pipelines (Prefect, Airflow, Temporal) or real-time streaming systems.
  • History of mentoring engineers and embedding reliability culture across teams.
  • Pragmatic decision-maker—balances uptime, velocity, and cost for startup reality.
  • Curiosity for AI-augmented ops (LLM chat-ops, anomaly detection, self-healing).

Nice-to-Haves

  • Managed GPU clusters and ML inference workloads.
  • Operated data lakes/lakehouses at scale (Iceberg, Delta, etc.).
  • Meaningful open-source contributions in SRE, DevOps, or data-infra projects.

WHY PRIMER

  • Mission with impact – We’re unlocking new growth channels for thousands of B2B marketers.
  • High-trust, low-ego culture – Fully distributed team, meeting-light weeks, Friday focus days.
  • Work & life, balanced – Five weeks PTO, generous parental leave, and flexibility for families.
  • Career rocket-fuel – Small team, huge problems, real ownership. Shape the future with bold innovators, driving impact that redefines industries.
  • Diverse & global – Teammates span six countries—and counting.

INTERVIEW PROCESS

  • Intro Call with Engineering Manager – 30 min
  • System Design – 60 min
  • Operational Excellence Drill-down – 60 min
  • Strategic Pragmatism Chat with CTO – 45 min
  • Technical Coding / Systems Deep Dive – 30 min
  • Culture & Values with CEO – 45 min
  • Decision typically within 24-48 hrs of final conversation.

READY TO LEVEL UP B2B MARKETING INFRASTRUCTURE?

Email careers@sayprimer.com with your résumé, GitHub, or anything that showcases your reliability superpowers. Let’s build the future, without the fire-drills.
