Senior Site Reliability Engineer II, FedRAMP - ThousandEyes

Cisco
San Francisco, California, US
Full-time

Who We Are

The name ThousandEyes was born from two big ideas: the power to see things not ordinarily possible and the ability to collect insights from a multitude of vantage points.

As organizations rely more on cloud services and the Internet, the network has become a black box they can't understand.

Our Internet and cloud intelligence platform delivers the only collectively powered view of the Internet, cloud, and SaaS platforms, helping enterprises and service providers work together to identify problems before they impact revenue, damage brand reputation, or halt employee productivity.

In August 2020, Cisco Systems completed the acquisition of ThousandEyes, which now forms the ThousandEyes Business Unit within Cisco’s Network Services Business Group and is a foundational component of Cisco’s growing Observability business.

About The Role

The FedRAMP SRE team is focused on our Federal region’s platform. The team is responsible for all aspects of the Federal region’s infrastructure and operations, such as availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning, with a strong focus on security.

The job is to handle the Federal region’s core infrastructure services, maintaining a constantly growing infrastructure capable of handling a very high volume of incoming data per day.

We believe in operations, infrastructure, and everything as code, which makes our distributed team efficient, functional, and very effective.

We’re looking for talented engineers with a software or operations background, experienced in designing and operating large-scale highly available distributed systems in the cloud.

You must be willing to work closely with our application development teams to ensure the reliability, performance and security of our infrastructure.

What You’ll Do

  • Join forces with the software engineers to ensure that the ThousandEyes platform’s Federal region infrastructure and services are designed and optimized for availability, latency, and performance.
  • Design, implement, and manage FedRAMP-compliant infrastructure and systems.
  • Establish and maintain processes for continuous monitoring, logging, and auditing of systems to ensure compliance with FedRAMP controls.
  • Collaborate and partner with security teams to identify and remediate vulnerabilities, conduct security assessments, and implement necessary security controls.
  • Design and implement dynamic infrastructure solutions to run our platform’s infrastructure as we grow and continue scaling (think multi-region scale).
  • Drive and build automation enabling our infrastructure and platforms to scale effortlessly, with a special focus on FedRAMP systems.
  • Stay current with industry best practices, evolving security threats, and updates to FedRAMP guidelines, and apply this knowledge to improve the security posture of our systems.
  • Design, deploy, and maintain cloud-native services in AWS that are elastic and resilient to failure.
  • Participate in and contribute to improving our 24x7 incident response and on-call rotation.
  • Perform capacity planning for the infrastructure and platform, and help teams prepare for growth.

Qualifications

  • 5+ years of experience.
  • Experience building and / or operating FedRAMP environments.
  • Experience identifying and analyzing cyber security risks.
  • Solid understanding of the FedRAMP framework, its controls, and compliance requirements.
  • Familiarity with security standard processes, vulnerability management, and incident response processes.
  • Ability to write high-quality code in Python, Go, or equivalent languages.
  • Ability to build and implement scalable and well-tested solutions.
  • Good understanding of Unix / Linux systems, the kernel, system libraries, file systems, and client-server protocols.
  • Knowledge of cloud providers, ideally AWS.
  • Infrastructure as Code skills, ideally with Terraform, Puppet, and Kubernetes.
  • Good communication and documentation skills.
  • Solid sense of ownership, drive, and enthusiastic attention to detail.

The successful applicant will be performing work in FedRAMP environments and therefore must be a U.S. Person (i.e., U.S. citizen, U.S. national, lawful permanent resident, asylee, or refugee). This position may also perform work that the U.S. government has specified can only be performed by a U.S. citizen on U.S. soil.
