Site Reliability Engineering (SRE) Lead

IDEXX
US, TN, Virtual
Full-time

Are you interested in working on a fast-paced Agile team, building a modern, global LIMS platform? Do you want to work on a product that makes a difference in the day-to-day life of lab operations, veterinarians, and pet owners?

Are you a self-starter? We are looking for a motivated engineer to join the Site Reliability Engineering Team to help drive performance, stability, and customer satisfaction with the product and the team.

The right lab information management system (LIMS) is critical to operations, clinical outcomes, client relationships, and more.

Applying systems thinking to these challenges supports our desire for change while also creating pull-through innovations that are led by our commercial needs and the needs of our customers.

IDEXX is looking for an SRE Technical Lead to work on a complex distributed platform that scales globally across the entire Reference Lab operations ecosystem.

You will participate in high-level design, execute product strategies, and oversee the management and optimization of our vendor software products and solutions.

You will be working on a platform that is composed of numerous technologies and leverages PaaS infrastructure in both AWS and GCP.

This opportunity offers the chance to tackle complex challenges and build a new solution from the ground up.

As the SRE Lead, you will be responsible for ensuring the reliability, scalability, and performance of our services and solutions.

The ideal candidate should have a strong technical background, excellent problem-solving skills, and a passion for building resilient systems.

In this role...

You will provide Leadership:

  • Mentor a team of Site Reliability Engineers.
  • Foster a culture of collaboration, innovation, and continuous improvement within the team.
  • Identify business needs, assess available technologies, and develop and present solutions.

You will perform Reliability Engineering responsibilities:

  • Design, implement, and maintain systems and processes to ensure high availability and performance of services.
  • Develop and manage SLAs, SLOs, and SLIs to monitor and improve service reliability.
  • Proactively identify and address potential reliability issues and risks.
  • Provide a high level of customer service and partner with end users to resolve problems and deploy new applications.

You will drive Automation, Tooling, and Development:

  • Drive automation efforts to reduce manual work and improve operational efficiency.
  • Develop and maintain infrastructure as code and configuration management tools.
  • Implement monitoring, logging, and alerting systems to ensure timely detection and resolution of issues.
  • Design, code, test, debug, and document software applications according to technical specifications developed by analysts and project teams. Create modules that adhere to these specifications, ensuring efficient and reliable system operation.

You will lead Incident Management:

  • Lead incident response efforts, including root cause analysis and post-mortem reviews.
  • Implement processes to prevent recurrence and improve incident response times.
  • Collaborate with cross-functional teams to ensure effective communication and coordination during incidents.

You will be responsible for Continuous Improvement:

  • Identify opportunities for process improvements and implement best practices for SRE.
  • Stay up to date with industry trends and emerging technologies to drive innovation.
  • Foster a culture of learning and knowledge sharing within the team and across the organization.

You will collaborate:

  • Work closely with development, operations, and product teams to ensure seamless integration and delivery of services.
  • Provide technical guidance and support to other teams as needed.
  • Participate in architecture and design discussions to ensure reliability and scalability considerations are addressed.

What You Will Need to Succeed:

  • You have a solid track record of leading technical initiatives that meet timelines and the expectations of various stakeholders.
  • You have a strong understanding of cloud infrastructure (AWS, GCP, Azure) and containerization technologies (Docker, Kubernetes).
  • You have 7+ years of experience in site reliability engineering, software development, or a related role.
  • You are proficient in languages including Java/Kotlin.
  • Bachelor’s degree in Computer Science, Engineering, or a related field. Master’s degree preferred.
  • Proven experience leading and mentoring technical teams.
  • Proficiency in scripting and automation using languages such as Python, Go, or similar.
  • Experience with monitoring and logging tools (Splunk, ELK, CloudWatch, etc.).
  • Experience working with RESTful APIs, front-end JS frameworks, and Jenkins.
  • Excellent problem-solving and analytical skills.
  • Experience with relational and NoSQL databases.
  • Strong communication and interpersonal skills.
  • Ability to work effectively in a fast-paced, dynamic environment.

Location: This position is remote, and you can be based anywhere in the US, with preference for the CST and EST time zones.

Why IDEXX?

We’re proud of the work we do, because our work matters. An innovation leader in every industry we serve, we follow our Purpose and Guiding Principles to help pet owners worldwide keep their companion animals healthy and happy, to ensure safe drinking water for billions, and to help farmers protect livestock and poultry from diseases.

We have customers in over 175 countries and a global workforce of over 10,000 talented people.

So, what does that mean for you? We enrich the livelihoods of our employees with a positive and respectful work culture that embraces challenges and encourages learning and discovery.

At IDEXX, you will be supported by competitive compensation, incentives, and benefits while enjoying purposeful work that drives improvement.

Let’s pursue what matters together.
