Talent.com
Senior Cloud Site Reliability Engineer

Radicle Health, Austin, TX, US
Job type: Full-time

Job Description

Radicle Health is a collection of human services software products designed to foster collaboration and innovation, helping organizations better serve their communities. We believe technology plays a crucial role in the success of the human services sector, but no single system can meet the diverse needs of every agency. That's why we've built Radicle Health as a home for mission-driven products that support organizations in delivering essential services. Under one roof, our teams learn from each other, test ideas faster, and think holistically about the individuals and communities we serve.

About the Job :

Join the Radicle Health shared SRE team to help build and evolve a unified platform supporting the SaraWorks, Link2Feed, and Foothold Care Management applications. Our environment spans AWS, Azure, and GCP; this role will focus primarily on AWS (commercial and some GovCloud) and GCP initially, while contributing patterns and tooling usable across all clouds. You'll partner closely with individual product pods while shaping shared standards, automation, and platform capabilities.

Who you are :

  • 5+ years in SRE / DevOps / Platform / Infrastructure Engineering.
  • Eligibility for (or prior possession of) a PIV credential (Tier 1 background investigation).
  • Strong AWS foundations (networking, IAM, compute, storage, managed databases, VPC design); exposure to multi-cloud concepts.
  • Linux systems administration and production troubleshooting proficiency.
  • Production container experience (Docker plus ECS, Fargate, EKS, or Kubernetes).
  • Hands-on IaC (Terraform, Pulumi, or CloudFormation) with willingness to adopt Pulumi.
  • Scripting or programming in at least one of : Python, Bash, TypeScript, Go, Ruby, or similar.
  • CI / CD pipeline design and maintenance (GitLab CI or equivalent).
  • Practical observability (metrics, logs, tracing, alert strategy design); we're invested in Datadog here.
  • Incident response / on‑call participation with follow‑through on remediation.
  • Clear written and verbal communication; able to tailor depth to audience.
  • Availability during core US Eastern collaboration hours.

Preferred Experience :

  • Pulumi (via Cloud or Self-Hosted).
  • AWS GovCloud experience; familiarity with compliance frameworks (HIPAA, FedRAMP, SOC 2).
  • GCP services (GKE, Cloud SQL, IAM, networking) and foundational Azure awareness.
  • Advanced container orchestration (autoscaling strategies, service mesh, workload isolation).
  • Performance tuning & optimization for PostgreSQL or other relational databases.
  • Application ecosystem familiarity : Ruby and / or .NET.
  • Disaster recovery strategy, resilience / chaos engineering practice.
  • AI-assisted DevOps / AIOps tooling, e.g., GitHub Copilot, incident automation, AI-driven runbook generation.
  • Experience applying LLMs or automation to infra workflows (e.g., generating IaC modules, intelligent alert tuning, predictive scaling).
  • Familiarity with AI transformation initiatives : governance, data sensitivity considerations, and secure integration of AI into engineering workflows.

What you'll be responsible for :

1. Infrastructure as Code & Cloud Engineering

  • Design, build, and evolve AWS (and, in the initial scope, GCP) infrastructure using IaC (Pulumi preferred; Terraform / CloudFormation experience transferable).

2. Container & Runtime Platform

  • Advance containerization (ECS / Fargate, EKS / Kubernetes, or equivalent) and establish secure, observable runtime patterns.

3. CI / CD & Release Engineering

  • Enhance pipelines (GitLab CI or similar) for reliable builds, automated testing, artifact / version management, and progressive delivery.

4. Collaboration & Enablement

  • Partner with engineering pods on hands-on implementation, architecture, incident response readiness, and post‑incident improvement.

5. Observability & Operational Excellence

  • Implement actionable metrics, tracing, structured logging, and intelligent alerting; refine SLOs and reduce MTTR.

6. Reliability & Performance

  • Lead capacity planning, resilience reviews, failover / DR exercises, and performance tuning aligned to SLIs / SLOs.

7. Security & Compliance

  • Embed least‑privilege IAM, secrets management, and hardened configurations, and support compliance needs (e.g., GovCloud, healthcare).

8. Automation & Tooling

  • Eliminate toil via scripting, reusable service templates, policy-as-code, and self‑service operational workflows.

9. Documentation & Runbooks

  • Maintain clear architecture diagrams, decision records, playbooks, and onboarding guides.

10. Incident & On‑Call

  • Participate in a humane rotation; drive blameless retros and ensure remediation actions are implemented.

What we offer :

  • Unlimited PTO policy
  • Competitive medical, dental, and vision healthcare coverage
  • 401k matching
  • Paid holidays
  • Volunteer time off
  • Paid parental leave
  • Remote work stipend
  • Compensation : $110,000 - $140,000
  • Location : Remote

Salary ranges depend on a variety of factors, including qualifications, experience, and geographic location. More information about the salary range specific to your working location and other factors will be shared during the hiring process.

    Radicle Health is an Equal Employment Opportunity employer that proudly pursues and hires a diverse workforce. Radicle Health does not make hiring or employment decisions on the basis of race, color, religion or religious belief, ethnic or national origin, nationality, sex, gender, gender-identity, sexual orientation, disability, age, military or veteran status, or any other basis protected by applicable local, state, or federal laws or prohibited by Company policy.
