Talent.com
Site Reliability Engineer (SRE)

Air Apps, San Francisco, CA, United States
Posted 30 days ago
Job type: Full-time

About Air Apps

At Air Apps, we believe in thinking bigger—and moving faster. We’re a family-founded company on a mission to create the world’s first AI-powered Personal & Entrepreneurial Resource Planner (PRP), and we need your passion and ambition to help us change how people plan, work, and live. Born in Lisbon, Portugal, in 2018—and now with offices in both Lisbon and San Francisco—we’ve remained self-funded while reaching over 100 million downloads worldwide.

Our long-term focus drives us to challenge the status quo every day, pushing the boundaries of AI-driven solutions that truly make a difference. Here, you’ll be a creative force, shaping products that empower people across the globe.

Join us on this journey to redefine resource management—and change lives along the way.

The Role

As a Site Reliability Engineer (SRE) at Air Apps, you will be responsible for ensuring the reliability, availability, and scalability of our systems. You will work at the intersection of software development and operations, implementing automation, monitoring, and performance optimization strategies to minimize downtime and improve system resilience.

Responsibilities

Design and implement scalable, reliable, and fault-tolerant systems across cloud environments.

Develop and maintain observability tools, including monitoring, logging, and alerting (e.g., Prometheus, Grafana, Datadog, ELK).

Automate infrastructure provisioning, deployment, and incident response using Infrastructure as Code (IaC) tools like Terraform or CloudFormation.

Optimize system performance, scalability, and incident response workflows to improve uptime.

Work closely with development and DevOps teams to improve system design for reliability.

Conduct root cause analysis (RCA) and implement preventative measures to minimize failures.

Ensure high availability by designing and maintaining load balancing, failover, and disaster recovery strategies.

Improve CI/CD pipelines to enhance deployment speed while maintaining stability.

Optimize cloud cost and resource utilization for AWS, Azure, or Google Cloud Platform (GCP).

Participate in on-call rotations to quickly address system failures and minimize downtime.

Requirements

4+ years of experience in Site Reliability Engineering (SRE), DevOps, or Systems Engineering.

Strong knowledge of cloud platforms (AWS, Azure, or GCP) and cloud-native architectures.

Experience with observability and monitoring tools (Prometheus, Grafana, ELK, Datadog, New Relic).

Proficiency in Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Pulumi.

Hands-on experience with containerization and orchestration (Docker, Kubernetes, Helm).

Strong Linux system administration and networking fundamentals.

Experience with incident management, debugging, and root cause analysis.

Proficiency in scripting (Bash, Python, or Go) for automation and system monitoring.

Knowledge of load balancing, failover strategies, and distributed systems.

Understanding of security best practices, access control, and compliance requirements.

Strong communication skills and the ability to collaborate with cross-functional teams.

What benefits are we offering?

Apple hardware ecosystem for work.

Annual bonus.

Medical Insurance (including vision & dental).

Short- and long-term disability insurance.

401(k) with up to 4% contribution.

Air Stipend of $3,120/year, paid in 12 monthly installments (for home office, learning, wellness, etc.).

Air Conference 2025 in Las Vegas – an opportunity to meet the team, collaborate, and grow together.

Diversity & Inclusion

At Air Apps, we are committed to fostering a diverse, inclusive, and equitable workplace. We enthusiastically welcome applicants from all backgrounds, experiences, and perspectives. We celebrate diversity in all its forms and believe that varied voices and experiences make us stronger.

Application Disclaimer

At Air Apps, we value transparency and integrity in our hiring process. Applicants must submit their own work without any AI-generated assistance. Any use of AI in application materials, assessments, or interviews will result in disqualification.
