Senior Site Reliability Engineer, FedRAMP

DaVita Inc.
San Francisco, California, US
Full-time

Who We Are

Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own.

Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences.

ThousandEyes is deeply integrated across the entire Cisco technology portfolio and beyond, helping customers deploy at scale while also delivering AI-powered assurance insights within Cisco's leading Networking, Security, Collaboration, and Observability portfolios.

About The Role

The FedRAMP SRE team is focused on our Federal region's platform. The team is responsible for all aspects of the Federal region's infrastructure and operations, such as availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning, with a strong focus on security.

The job is to handle the Federal region's core infrastructure services, maintaining a constantly growing infrastructure capable of handling a very high volume of incoming data per day.

We believe in operations / infrastructure / everything as code, which makes our distributed team efficient, functional and very effective.

We're looking for talented engineers with a software or operations background, experienced in designing and operating large-scale highly available distributed systems in the cloud.

You must be willing to work closely with our application development teams to ensure the reliability, performance and security of our infrastructure.

What You'll Do

  • Join forces with the software engineers to ensure that the ThousandEyes platform's Federal region infrastructure and services are designed and optimized for availability, latency, and performance.
  • Design, implement, and manage FedRAMP-compliant infrastructure and systems.
  • Establish and maintain processes for continuous monitoring, logging, and auditing of systems to ensure compliance with FedRAMP controls.
  • Collaborate and partner with security teams to identify and remediate vulnerabilities, conduct security assessments, and implement necessary security controls.
  • Design and implement dynamic infrastructure solutions to run our platform's infrastructure as we grow and continue scaling (think multi-region scale).
  • Drive and build automation enabling our infrastructure and platforms to scale effortlessly, with a special focus on FedRAMP systems.
  • Know the latest industry best practices, evolving security threats, and updates to FedRAMP guidelines, and apply this knowledge to improve the security posture of our systems.
  • Design, deploy, and maintain cloud-native services in AWS that are elastic and resilient to failure.
  • Participate in and contribute to improving our 24x7 incident response and on-call rotation.
  • Plan capacity for the infrastructure and platform, and help teams prepare for growth.

Qualifications

  • 5+ years of experience.
  • Experience building and / or operating FedRAMP environments.
  • Experience identifying and analyzing cyber security risks.
  • Solid understanding of the FedRAMP framework, its controls, and compliance requirements.
  • Familiarity with security standard processes, vulnerability management, and incident response processes.
  • Ability to write high-quality code in Python, Go, or equivalent languages.
  • Ability to build and implement scalable and well-tested solutions.
  • Good understanding of Unix / Linux systems, the kernel, system libraries, file systems, and client-server protocols.
  • Knowledge of cloud providers, ideally AWS.
  • Infrastructure as Code skills, ideally with Terraform, Puppet, and Kubernetes.
  • Good communication and documentation skills.
  • Solid sense of ownership, drive, and enthusiastic attention to detail.

The successful applicant will be performing work in FedRAMP environments and therefore must be a U.S. Person (i.e., U.S. citizen, U.S. national, lawful permanent resident, asylee, or refugee). This position may also perform work that the U.S. government has specified can only be performed by a U.S. citizen on U.S. soil.

2 days ago