Talent.com
Site Reliability Engineer

The Giant Bullseye • Saint Louis, MO, US
Posted: 30 days ago
Job type: Full-time

Job Description

About the Role:

We are seeking a skilled and proactive Site Reliability Engineer (SRE) to join our infrastructure team. As an SRE, you will bridge the gap between development and operations by applying software engineering principles to infrastructure and operations problems. Your mission will be to build scalable and highly reliable systems, ensuring uptime, performance, and automation at every layer.

Responsibilities:

Design, implement, and maintain scalable, resilient infrastructure.

Develop and maintain tools and automation to support infrastructure and deployment.

Monitor systems for performance, reliability, and availability using observability tools.

Participate in incident response, root cause analysis, and post-mortems.

Implement and advocate for SLOs/SLIs to ensure service quality and reliability.

Work closely with development teams to improve CI/CD pipelines and deployment strategies.

Enhance security, compliance, and infrastructure governance.

Requirements:

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).

10+ years of experience in DevOps, SRE, or related roles.

Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.).

Experience with cloud platforms (AWS, GCP, or Azure).

Hands-on experience with containerization (Docker, Kubernetes).

Strong knowledge of monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK, Datadog).

Familiarity with Infrastructure as Code (Terraform, CloudFormation, or similar).

Excellent problem-solving skills and a collaborative mindset.
