Talent.com
Senior Site Reliability Engineer

VirtualVocations • Des Moines, Iowa, United States
Posted 30 days ago
Job type: Full-time

Job Description

A company is looking for a Senior Cluster Site Reliability Engineer.

Key Responsibilities

Respond to and resolve urgent cluster outages or issues

Ensure high cluster uptime and track SLAs for reliability

Diagnose recurring problems and collaborate on engineering solutions

Required Qualifications

5+ years of experience in SRE or DevOps roles

Knowledge of HPC / batch compute frameworks and machine learning training systems

Ability to develop scripts in a common scripting language

Familiarity with infrastructure-as-code and cloud infrastructure

Bachelor's degree in computer science or equivalent experience
