Staff Site Reliability Engineer (SRE)

Augmedix

San Francisco, CA, United States

$170K-$200K a year

Full-time

About Augmedix:

Augmedix (Nasdaq: AUGX) delivers industry-leading, ambient medical documentation and data solutions to healthcare systems, physician practices, hospitals, and telemedicine practitioners.

Augmedix is on a mission to help clinicians and patients form a human connection by seamlessly integrating our technology at the point of care.

Augmedix’s proprietary platform digitizes natural clinician-patient conversations, which are converted into comprehensive medical notes and structured data in real time.

The company’s platform uses automatic speech recognition and natural language processing, including large language models, to generate accurate and timely medical notes that are transferred into the EHR.

Augmedix’s products relieve clinicians of administrative burden, which in turn reduces burnout, increases clinician efficiency, and improves patient access.

Through Augmedix’s proprietary platform and bi-directional communication channel, Augmedix is ideally suited to serve as the vehicle for change at the point of care.

Augmedix is headquartered in San Francisco, CA, with offices around the world. To learn more, visit www.augmedix.com.

About the Role:

We are seeking a highly skilled and experienced Staff SRE to join our growing team. You will play a central role in ensuring the reliability, scalability, and performance of our critical infrastructure and applications.

Beyond core SRE responsibilities, you will also serve as a key liaison across various teams, fostering collaboration and ensuring seamless operations.

Responsibilities:

Proactively identify and mitigate potential issues impacting infrastructure and applications
Partner with development teams to implement best practices for building reliable and scalable systems
Stay up to date on the latest SRE trends and technologies

Monitoring and Observability:

Design, implement, and maintain robust monitoring solutions using tools like Prometheus and Grafana
Develop and configure alerts within tools like PagerDuty to ensure timely notification of potential issues
Analyze and troubleshoot issues using collected application and infrastructure metrics

Incident Management:

Lead incident response, ensuring timely resolution and minimizing downtime
Document and communicate incident details effectively to stakeholders
Conduct post-incident reviews to identify root causes and implement preventative measures

Service Level Agreements (SLAs):

Collaborate with product and engineering teams to define clear and measurable SLAs for our SaaS offerings
Establish Service Level Objectives (SLOs) for key metrics based on SLA requirements
Define Service Level Indicators (SLIs) to track progress towards achieving SLOs
Monitor SLO compliance and proactively identify potential SLA breaches

Automation:

Identify opportunities for automation to improve efficiency and reliability
Develop and implement automation scripts using tools like Python or Bash
Automate routine tasks and incident response workflows

Cross-Team Collaboration:

Act as a liaison between SRE, Product, Security, Application Engineering, and Customer Operations teams
Facilitate communication and information sharing across teams to ensure smooth operations
Work collaboratively to define and implement solutions that meet the needs of all stakeholders

Mentorship and Knowledge Sharing:

Mentor and collaborate with junior SREs
Share knowledge and best practices within the team
Contribute to the development and documentation of internal SRE processes

Requirements:

5-8 years of experience as a Site Reliability Engineer (SRE) or in a related role
Proven experience with monitoring tools like Prometheus and Grafana
Strong understanding of incident management best practices
Experience with alerting tools like PagerDuty
Experience with scripting languages like Python or Bash for automation
Excellent communication and collaboration skills
Ability to work independently and as part of a team
Strong problem-solving and analytical skills
Passion for building reliable and scalable systems

Bonus:

Experience with cloud platforms like AWS, GCP, or Azure
Experience with container orchestration platforms like Kubernetes
Experience with chaos engineering principles
Experience with configuration management tools like Ansible or Chef

$170,000 - $200,000 a year

Salary range is listed above. There are several factors that determine final pay for a position including location and experience.

Total compensation will typically include salary + performance bonus + equity.

Augmedix is an equal opportunity employer. We are committed to providing equal employment opportunities regardless of sex, gender identity, race, religious creed, color, ancestry, age, disability, marital status, sexual orientation including being transgender and/or any other protected bases.
