Site Reliability Engineer, Cloud Agent Operations

ThousandEyes
San Francisco, CA
$137.9K-$203.1K a year
Full-time

Who We Are

The name ThousandEyes was born from two big ideas: the power to see what’s not ordinarily possible, and the ability to collect intelligence from vantage points as diverse and global as the Internet.

As organizations depend on cloud services, the Internet has become their de facto network, connecting cloud applications to users.

Our Internet and cloud intelligence platform is like a ‘Google Maps of the Internet’, providing the only collectively powered view of digital experiences end-to-end.

We enable our customers, made up of the world’s largest and fastest-growing brands, to identify problems before they impact revenue, brand reputation, or employee productivity.

In August 2020, Cisco Systems completed the acquisition of ThousandEyes, which now forms the ThousandEyes Business Unit within Cisco’s Network Services Business Group, and is a foundational component of Cisco’s growing Observability business.

About You

We are looking for dedicated and creative engineers with a mix of software and operations backgrounds and good knowledge of applicable technologies such as networking protocols, Kubernetes, IaC, distributed computing, and software development.

We don't expect everyone will have everything listed here. Don't hesitate to apply if the role looks interesting, you have a good foundation, and are enthusiastic about advancing your understanding of these and other technologies.

We are searching for engineers who...

  • Can design and implement scalable, well-tested solutions with a focus on Kubernetes and Terraform deployments.
  • Enjoy writing high-quality code in Python, Go, or similar programming languages.
  • Understand how to apply Infrastructure as Code (IaC) with technologies such as Terraform, Ansible, Puppet, and Kubernetes to build complicated systems while working to limit complexity.
  • Have experience with the various cloud provider managed services, especially, but not limited to, AWS.
  • Possess a solid understanding of Unix / Linux systems, the kernel, system libraries, file systems, and general Linux administrative knowledge.
  • Have knowledge of standard network protocols such as IPv4, IPv6, TCP, UDP, DNS, HTTP, TLS.
  • Are able to communicate and document complicated topics in concise and consumable language for readers of diverse levels of understanding.
  • Have a strong sense of ownership, drive, and dedicated attention to detail.
  • Have a dedication to excellence in both operations and development.

What You'll Do

  • Collaborate with software engineers across engineering to quickly and accurately identify any software bugs and provide pointers on performance or architecture improvements.
  • Design, build, deploy, and maintain a custom infrastructure deployment model built from the ground up that includes bare metal servers and virtualized infrastructure from all major and mid-level cloud providers.
  • Drive and build automation wherever possible, enabling the fleet to scale itself as needed.
  • Participate in the on-call rotation and collaborate with the team to improve our 24x7 incident response.
  • Be passionate about giving our customers the best possible experience.
  • Analyze, debug, and solve issues across our infrastructure and platform services.

Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis.

Cisco will consider for employment, on a case-by-case basis, qualified applicants with arrest and conviction records.

US COMPENSATION RANGE

137,900 USD - 203,100 USD

30+ days ago