Site Reliability Engineering (SRE) Lead

IDEXX
US, MI, Virtual
Full-time

Are you interested in working on a fast-paced Agile team, building a modern, global LIMS platform? Do you want to work on a product that makes a difference in the day-to-day life of lab operations, veterinarians, and pet owners?

Are you a self-starter? We are looking for a motivated engineer to join the Site Reliability Engineering Team to help drive performance, stability, and customer satisfaction with the product and the team.

The right lab information management system (LIMS) is critical to operations, clinical outcomes, client relationships, and more.

Applying systems thinking to these challenges supports our desire for change while also creating pull-through innovations that are led by our commercial needs and the needs of our customers.

IDEXX is looking for an SRE Technical Lead to work on a complex distributed platform that scales globally across the entire Reference Lab operations ecosystem.

You will participate in everything from high-level design to executing product strategies and overseeing the management and optimization of our vendor software products and solutions.

You will be working on a platform that is composed of numerous technologies and leverages PaaS infrastructure in both AWS and GCP.

This opportunity offers the chance to tackle complex challenges and build a new solution from the ground up.

As the SRE Lead, you will be responsible for ensuring the reliability, scalability, and performance of our services and solutions.

The ideal candidate should have a strong technical background, excellent problem-solving skills, and a passion for building resilient systems.

In this role...

You will provide Leadership:

  • Mentor a team of Site Reliability Engineers.
  • Foster a culture of collaboration, innovation, and continuous improvement within the team.
  • Identify business needs, assess available technologies, and develop and present solutions.

You will be performing Reliability Engineering responsibilities:

  • Design, implement, and maintain systems and processes to ensure high availability and performance of services.
  • Develop and manage SLAs, SLOs, and SLIs to monitor and improve service reliability.
  • Proactively identify and address potential reliability issues and risks.
  • Provide a high level of customer service, partnering with end users to resolve problems or deploy new applications.

You will drive Automation, Tooling and Development:

  • Drive automation efforts to reduce manual work and improve operational efficiency.
  • Develop and maintain infrastructure as code and configuration management tools.
  • Implement monitoring, logging, and alerting systems to ensure timely detection and resolution of issues.
  • Design, code, test, debug, and document software applications according to technical specifications developed by analysts and project teams. Create modules that adhere to these specifications, ensuring efficient and reliable system operation.

You will lead Incident Management:

  • Lead incident response efforts, including root cause analysis and post-mortem reviews.
  • Implement processes to prevent recurrence and improve incident response times.
  • Collaborate with cross-functional teams to ensure effective communication and coordination during incidents.

You will be responsible for Continuous Improvement:

  • Identify opportunities for process improvements and implement best practices for SRE.
  • Stay up to date with industry trends and emerging technologies to drive innovation.
  • Foster a culture of learning and knowledge sharing within the team and across the organization.

You will collaborate:

  • Work closely with development, operations, and product teams to ensure seamless integration and delivery of services.
  • Provide technical guidance and support to other teams as needed.
  • Participate in architecture and design discussions to ensure reliability and scalability considerations are addressed.

What You Will Need to Succeed:

  • You have a solid track record of leading technical initiatives that meet timelines and the expectations of various stakeholders.
  • You have a strong understanding of cloud infrastructure (AWS, GCP, Azure) and containerization technologies (Docker, Kubernetes).
  • You have 7+ years of experience in site reliability engineering, software development, or a related role.
  • You have proficiency with languages including Java / Kotlin.
  • Bachelor’s degree in Computer Science, Engineering, or a related field. Master’s degree preferred.
  • Proven experience leading and mentoring technical teams.
  • Proficiency in scripting and automation using languages such as Python, Go, or similar.
  • Experience with monitoring and logging tools (Splunk, ELK, CloudWatch, etc.).
  • Experience working with RESTful APIs, front-end JS frameworks, and Jenkins.
  • Excellent problem-solving and analytical skills.
  • Experience with relational and NoSQL databases.
  • Strong communication and interpersonal skills.
  • Ability to work effectively in a fast-paced, dynamic environment.

Location: This position is remote and you can be based anywhere in the US, with a preference for the CST and EST time zones.

Why IDEXX?

We’re proud of the work we do, because our work matters. An innovation leader in every industry we serve, we follow our Purpose and Guiding Principles to help pet owners worldwide keep their companion animals healthy and happy, to ensure safe drinking water for billions, and to help farmers protect livestock and poultry from diseases.

We have customers in over 175 countries and a global workforce of over 10,000 talented people.

So, what does that mean for you? We enrich the livelihoods of our employees with a positive and respectful work culture that embraces challenges and encourages learning and discovery.

At IDEXX, you will be supported by competitive compensation, incentives, and benefits while enjoying purposeful work that drives improvement.

Let’s pursue what matters together.

