Site Reliability Engineering (SRE) Lead

IDEXX
Virtual Ohio
Full-time

Are you interested in working on a fast-paced Agile team, building a modern, global LIMS platform? Do you want to work on a product that makes a difference in the day-to-day life of lab operations, veterinarians, and pet owners?

Are you a self-starter? We are looking for a motivated engineer to join the Site Reliability Engineering Team to help drive performance, stability, and customer satisfaction with the product and the team.

The right lab information management system (LIMS) is critical to operations, clinical outcomes, client relationships, and more.

Applying system thinking to these challenges supports our desire for change while also creating pull-through innovations that are led by our commercial needs, and the needs of our customers.

IDEXX is looking for an SRE Technical Lead to work on a complex distributed platform that scales globally across the entire Reference Lab operations ecosystem.

You will participate in everything from high-level design to executing product strategies and overseeing the management and optimization of our vendor software products and solutions.

You will be working on a platform that is composed of numerous technologies and leverages PaaS infrastructure in both AWS and GCP.

This opportunity offers the chance to tackle complex challenges and build a new solution from the ground up.

As the SRE Lead, you will be responsible for ensuring the reliability, scalability, and performance of our services and solutions.

The ideal candidate should have a strong technical background, excellent problem-solving skills, and a passion for building resilient systems.

In this role...

You will provide Leadership:

  • Mentor a team of Site Reliability Engineers.
  • Foster a culture of collaboration, innovation, and continuous improvement within the team.
  • Identify business needs, assess available technologies, and develop and present solutions.

You will be performing Reliability Engineering responsibilities:

  • Design, implement, and maintain systems and processes to ensure high availability and performance of services.
  • Develop and manage SLAs, SLOs, and SLIs to monitor and improve service reliability.
  • Proactively identify and address potential reliability issues and risks.
  • Provide a high level of customer service, partnering with end users to resolve problems or deploy new applications.

You will drive Automation, Tooling and Development:

  • Drive automation efforts to reduce manual work and improve operational efficiency.
  • Develop and maintain infrastructure as code and configuration management tools.
  • Implement monitoring, logging, and alerting systems to ensure timely detection and resolution of issues.
  • Design, code, test, debug, and document software applications according to technical specifications developed by analysts and project teams. Create modules that adhere to these specifications, ensuring efficient and reliable system operation.

You will lead Incident Management:

  • Lead incident response efforts, including root cause analysis and post-mortem reviews.
  • Implement processes to prevent recurrence and improve incident response times.
  • Collaborate with cross-functional teams to ensure effective communication and coordination during incidents.

You will be responsible for Continuous Improvement:

  • Identify opportunities for process improvements and implement best practices for SRE.
  • Stay up-to-date with industry trends and emerging technologies to drive innovation.
  • Foster a culture of learning and knowledge sharing within the team and across the organization.

You will collaborate:

  • Work closely with development, operations, and product teams to ensure seamless integration and delivery of services.
  • Provide technical guidance and support to other teams as needed.
  • Participate in architecture and design discussions to ensure reliability and scalability considerations are addressed.

What You Will Need to Succeed:

  • You have a solid track record of leading technical initiatives that meet timelines and the expectations of various stakeholders.
  • You have a strong understanding of cloud infrastructure (AWS, GCP, Azure) and containerization technologies (Docker, Kubernetes).
  • You have 7+ years of experience in site reliability engineering, software development, or a related role.
  • You are proficient in languages including Java / Kotlin.
  • Bachelor’s degree in Computer Science, Engineering, or a related field. Master’s degree preferred.
  • Proven experience leading and mentoring technical teams.
  • Proficiency in scripting and automation using languages such as Python, Go, or similar.
  • Experience with monitoring and logging tools (Splunk, ELK, CloudWatch, etc.).
  • Experience working with RESTful APIs, front-end JS frameworks, and Jenkins.
  • Excellent problem-solving and analytical skills.
  • Experience with relational and NoSQL databases.
  • Strong communication and interpersonal skills.
  • Ability to work effectively in a fast-paced, dynamic environment.

Location: This position is remote, and you can be based anywhere in the US, with a preference for the CST and EST time zones.

Why IDEXX?

We’re proud of the work we do, because our work matters. An innovation leader in every industry we serve, we follow our Purpose and Guiding Principles to help pet owners worldwide keep their companion animals healthy and happy, to ensure safe drinking water for billions, and to help farmers protect livestock and poultry from diseases.

We have customers in over 175 countries and a global workforce of over 10,000 talented people.

So, what does that mean for you? We enrich the livelihoods of our employees with a positive and respectful work culture that embraces challenges and encourages learning and discovery.

At IDEXX, you will be supported by competitive compensation, incentives, and benefits while enjoying purposeful work that drives improvement.

Let’s pursue what matters together.
