Senior Site Reliability Engineer, FedRAMP

Cisco
San Francisco, California, US
Full-time

Who We Are

Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network, even the ones they don’t own.

Powered by AI and an unmatched set of cloud, internet, and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues before they impact end-user experiences.

ThousandEyes is deeply integrated across the entire Cisco technology portfolio and beyond, helping customers deploy at scale while also delivering AI-powered assurance insights within Cisco’s leading Networking, Security, Collaboration, and Observability portfolios.

About The Role

The FedRAMP SRE team is focused on our Federal region’s platform. The team is responsible for all aspects of the Federal region’s infrastructure and operations, such as availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning, with a strong focus on security.

The job is to handle the Federal region’s core infrastructure services, maintaining a constantly growing infrastructure capable of handling a very high volume of incoming data per day.

We believe in operations / infrastructure / everything as code, which makes our distributed team efficient, functional, and very effective.

We’re looking for talented engineers with a software or operations background, experienced in designing and operating large-scale highly available distributed systems in the cloud.

You must be willing to work closely with our application development teams to ensure the reliability, performance and security of our infrastructure.

What You’ll Do

  • Join forces with the software engineers to ensure that the ThousandEyes platform’s Federal region infrastructure and services are designed and optimized for availability, latency, and performance.
  • Design, implement, and manage FedRAMP-compliant infrastructure and systems.
  • Establish and maintain processes for continuous monitoring, logging, and auditing of systems to ensure compliance with FedRAMP controls.
  • Collaborate and partner with security teams to identify and remediate vulnerabilities, conduct security assessments, and implement necessary security controls.
  • Design and implement dynamic infrastructure solutions to run our platform as we grow and continue scaling (think multi-region scale).
  • Drive and build automation enabling our infrastructure and platforms to scale effortlessly, with a special focus on FedRAMP systems.
  • Stay current on industry best practices, evolving security threats, and updates to FedRAMP guidelines, and apply this knowledge to improve the security posture of our systems.
  • Design, deploy, and maintain cloud-native services in AWS that are elastic and resilient to failure.
  • Participate in and contribute to improving our 24x7 incident response and on-call rotation.
  • Plan capacity for the infrastructure and platform, and help teams prepare for growth.

Qualifications

  • 5+ years of experience.
  • Experience building and / or operating FedRAMP environments.
  • Experience identifying and analyzing cyber security risks.
  • Solid understanding of the FedRAMP framework, its controls, and compliance requirements.
  • Familiarity with security standard processes, vulnerability management, and incident response processes.
  • Ability to write high-quality code in Python, Go, or equivalent languages.
  • Ability to build and implement scalable and well-tested solutions.
  • Good understanding of Unix / Linux systems, the kernel, system libraries, file systems, and client-server protocols.
  • Knowledge of cloud providers, ideally AWS.
  • Infrastructure as Code skills, ideally with Terraform, Puppet, and Kubernetes.
  • Good communication and documentation skills.
  • Solid sense of ownership, drive, and enthusiastic attention to detail.

The successful applicant will be performing work in FedRAMP environments, and therefore, must be a U.S. Person (i.e. U.S. citizen, U.S. national, lawful permanent resident, asylee, or refugee). This position may also perform work that the U.S. government has specified can only be performed by a U.S. citizen on U.S. soil.
