Staff Site Reliability Engineer Cloud Platform

Zilliz
California, Missouri, US
Full-time

What you will do:

  • Work at the intersection of development and site reliability, creating SRE tools and systems while supporting existing infrastructure and platforms.
  • Ensure the reliability, availability, and performance of Zilliz’s distributed database systems.
  • Develop and implement strategies for monitoring, incident management, and disaster recovery.
  • Automate system operations and maintenance tasks to improve efficiency and reduce manual intervention.
  • Design and build tools to manage and monitor infrastructure, ensuring scalability and robustness.
  • Collaborate with software engineers to enhance system reliability, scalability, and performance.
  • Maintain and improve the CI/CD pipeline to ensure smooth and rapid deployment of changes.
  • Actively contribute to the Milvus open-source community, focusing on improving reliability and operational efficiency.

What we are looking for:

  • 4+ years of experience in site reliability engineering or similar roles with a focus on cloud-native systems.
  • Proficiency in programming languages such as Python, Go, or Java.
  • Strong knowledge of container and orchestration technologies such as Docker and Kubernetes.
  • Expertise with cloud platforms such as AWS, GCP, or Azure, and their respective monitoring and management tools.
  • Experience with infrastructure as code tools such as Terraform or Ansible.
  • Familiarity with CI/CD tools such as Jenkins, GitLab CI, or Argo.
  • Proven ability to troubleshoot complex distributed systems and resolve issues promptly.
  • Bachelor’s degree or above in computer science, software engineering, or other relevant disciplines.
  • Ability to thrive in a fast-paced startup environment and handle multiple projects simultaneously.

