Staff Site Reliability Engineer - Incident Response

Zscaler
Washington, District of Columbia, US
Full-time

Our Engineering team built the world's largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today's cloud security leader, with more than 15 million users in 185 countries.

Bring your vision and passion to our team of cloud architects, software engineers, security experts, and more who are enabling organizations worldwide to harness speed and agility with a cloud-first strategy.

Is this the next step in your career? Find out if you are the right candidate by reading through the complete overview below.

NOTE: U.S. citizenship is required for this position due to the nature of the customers assigned to this role.

We're looking for an experienced Staff Site Reliability Engineer - Incident Response to join our Shared Platform Engineer team.

Reporting to the Director of Cloud Operations and Incident Management, you will:

  • Lead and advocate for the transformation to a world-leading SRE organization, promoting SRE principles within the Engineering Department.
  • Provide expert leadership during critical outages, coordinating multiple teams to ensure streamlined decision-making and quick resolution.
  • Promote a customer-focused approach by addressing and mitigating global customer environment issues, and fostering a culture of continuous learning and technical excellence within the SRE team.
  • Develop and implement scalable process frameworks and observability strategies to ensure rapid problem diagnosis, response, and service reliability.
  • Collaborate with product teams to thoroughly analyze failures and integrate insights to improve service reliability, scalability, and operational efficiency.

What We're Looking for (Minimum Qualifications)

  • 5+ years of experience as a Site Reliability Engineer, with relevant experience in an Operations or Engineering environment.
  • Hands-on experience troubleshooting Linux-based systems
  • Networking knowledge, including the ability to troubleshoot TCP/IP, SSL/TLS, DNSSEC, IPsec, and BGP issues.
  • Coding experience (preferably Python) building tools, scripting, or automation
  • Bachelor's degree in Computer Science, a related technical field involving computer systems engineering, or equivalent practical experience.

What Will Make You Stand Out (Preferred Qualifications)

  • Experience supporting High / Moderate FedRAMP environments
  • Understanding of observability practices and tools such as Grafana, Datadog, and Splunk
  • Experience leading major incidents in large-scale, high-uptime environments

This role offers a remote work option.
