A company is looking for a Staff Site Reliability Engineer.
Key Responsibilities
Define and drive the strategic direction for SRE practices and reliability engineering
Architect and implement complex systems and solutions for scalability and reliability
Lead major incident response efforts and postmortem analyses to improve system resilience
Required Qualifications
8+ years of experience as an SRE in AWS environments at medium- to large-scale organizations
8+ years of hands-on experience with observability tools like Prometheus and Grafana
Exceptional proficiency in programming, particularly in Python, Go, and Bash
5+ years of experience designing and building infrastructure deployment pipelines with tools such as Terraform and Git
Advanced expertise in managing AWS production environments and deep knowledge of Linux systems
Site Reliability Engineer • Fort Lauderdale, Florida, United States