Job title : Site Reliability Engineer
Job type : Full time
Rate : Competitive, based on experience
Role Location : On-Site, Palo Alto
About the Role :
We are a technology startup advancing healthcare with a safety-focused AI platform that assists medical professionals by managing patient communications, including check-ins, reminders, and follow-up care. We are seeking a highly skilled Senior Site Reliability Engineer to join our team. In this role, you will design and implement infrastructure automation and continuous integration and delivery pipelines, and you will monitor and scale the infrastructure that powers our healthcare AI platform. You will work closely with software engineers, research scientists, and other cross-functional teams to develop and maintain reliable, scalable infrastructure that enables rapid iteration and deployment of our products.
Key Responsibilities :
- Design and implement infrastructure automation and deployment pipelines using tools such as Terraform
- Implement and maintain monitoring and logging systems to ensure the reliability and performance of our healthcare AI platform
- Work closely with software engineers to design and deploy scalable, fault-tolerant, and secure production systems on cloud platforms such as AWS, GCP, or Azure
- Develop and maintain security and compliance policies and procedures for our healthcare AI platform
- Collaborate with cross-functional teams to troubleshoot and resolve complex issues related to infrastructure, deployment, and operations
- Implement and maintain disaster recovery and business continuity plans
- Develop and maintain documentation related to infrastructure, deployment, and operations
- Mentor and provide technical guidance to junior engineers
Qualifications :
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field
- At least 5 years of professional experience as an SRE
- Strong skills in building cloud infrastructure orchestration systems (Operators) using Python, ideally with some expertise in Go
- Expertise in infrastructure automation and deployment tools such as Terraform or GitLab CI/CD
- Experience with cloud platforms such as AWS, GCP, or Azure
- Strong knowledge of containerization technologies such as Docker and Kubernetes
- Experience with monitoring and logging tools such as ELK, Grafana, or Datadog
- Familiarity with security and compliance best practices and tools such as HashiCorp Vault, AWS KMS, or Azure Key Vault
- Strong problem-solving skills and ability to work independently and collaboratively in a team environment
- Excellent communication and interpersonal skills
Preferred :
- Experience implementing HIPAA and SOC2 compliance is a plus
- Experience working in an HPC environment is a plus
Accessibility Statement :
Read and apply for this role in the way that works for you by using our Recite Me assistive technology tool. Click the circle at the bottom right side of the screen and select your preferences. We make an active choice to be inclusive towards everyone every day. Please let us know if you require any accessibility adjustments through the application or interview process.
Our Commitment to Diversity, Equity, and Inclusion :
Signify’s mission is to empower every person, regardless of their background or circumstances, with an equitable chance to achieve the careers they deserve. Building a diverse future, one placement at a time.