What you'll do:
- Establish service level indicators and data-driven objectives, and develop SRE standards and processes to uphold and improve uptime, latency, and system health.
- Define and execute initiatives to continuously improve our deployed cloud footprint in areas such as observability and monitoring, risk detection and mitigation, disaster recovery, and cost optimization.
- Collaborate across engineering and other stakeholders to ensure that key stability and maintainability requirements are understood and maintained.
- Create automation for monitoring, alerting, deployment, and similar areas to enable scale and efficiency.
- Be part of the SRE on-call rotation, including responsibility for incident response.
- Implement best practices for incident management and root cause analysis.
- Provide mentorship to junior site reliability engineers on best practices.
To be successful in this role, you'll need:
- Bachelor's degree in Computer Science, Management Information Systems, or equivalent practical experience.
- 4+ years of experience in site reliability engineering focused on maintaining production-grade cloud infrastructure.
- Familiarity with a wide range of cloud-based infrastructure technologies, such as those used in container orchestration, data orchestration, business middleware, security, and governance.
This includes AWS (S3, EC2, RDS, and more), Kubernetes, Docker, Kafka, Jenkins, and Grafana.
- Demonstrated track record in effectively analyzing and troubleshooting large-scale distributed systems.
- Systematic problem-solving approach, coupled with effective communication skills and a sense of drive.
Pay Transparency Statement
This is a hybrid position based out of our offices in San Francisco, CA; Plano, TX; or Lehi, UT. Hybrid employees are expected to be in the office three days per week (Plano, TX) or two days per week (all other locations).
The actual pay rate offered within the range will depend on factors including geographic location, qualifications, experience, and internal equity.
In addition to salary, you will be eligible for stock options and benefits such as health insurance, a 401(k), and paid time off.