Talent.com
Lead Site Reliability Engineer (M365)
Jobgether • US
Job type
  • Full-time
  • Remote

Job description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Lead Site Reliability Engineer (M365) in the United States.

This role provides the opportunity to lead and enhance the reliability, performance, and scalability of a large Microsoft 365 environment supporting a national organization. You will design and implement monitoring and observability dashboards, automate processes with PowerShell and Graph APIs, and optimize workflows with Power Apps / Automate. The position requires hands-on technical expertise combined with leadership skills to guide teams, manage incidents, and drive continuous improvement. You will work with multiple stakeholders, ensuring systems remain secure, performant, and highly available while mentoring team members and shaping best practices for cloud and on-premises M365 services.

Accountabilities

  • Lead the development of monitoring and observability dashboards in Splunk, Dynatrace, and other platforms.
  • Drive incident management, post-incident reviews, and root cause analysis for improved system reliability.
  • Develop and maintain automation scripts using PowerShell and integrations with Graph APIs.
  • Optimize and maintain workflows and applications using Power Apps / Automate.
  • Guide teams in deploying new services, performing system validations, and managing service performance.
  • Coach and mentor team members, design key performance indicators, and implement best practices.
  • Ensure compliance with organizational policies and drive continuous improvements in system reliability and SDLC processes.

Requirements

  • Bachelor’s degree in a quantitative or technical field (e.g., Computer Science, Engineering, Statistics) or equivalent experience.
  • 5–7 years of site reliability engineering experience, focused on Microsoft 365 environments.
  • Advanced proficiency in PowerShell scripting and Graph APIs; intermediate skills in Power Apps / Automate.
  • Strong experience with monitoring and observability tools such as Splunk and Dynatrace.
  • Solid understanding of incident management processes and cloud / enterprise system administration.
  • Demonstrated analytical skills, project management abilities, and technical aptitude.
  • Excellent judgment, decision-making, and communication skills to effectively guide teams and influence upper management.

Benefits

  • Competitive salary: $100,900 – $186,800 per year.
  • Comprehensive medical, dental, and vision insurance.
  • 401(k) retirement plan and stock purchase opportunities.
  • Tuition reimbursement and professional development programs.
  • Paid time off plus holidays, with flexible remote, hybrid, or office work schedules.
  • Additional performance incentives may be available.
  • Supportive and inclusive workplace culture, emphasizing diversity and professional growth.

Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.

When you apply, your profile goes through our AI-powered screening process, designed to identify top talent efficiently and fairly.

🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.

📊 It compares your profile to the job’s core requirements and past success factors to determine your match score.

🎯 Based on this analysis, we automatically shortlist the 3 candidates with the highest match to the role.

🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.

This process is transparent, skills-based, and free of bias, focusing solely on your fit for the role. Once the shortlist is completed, we share it directly with the company that owns the job opening. The final decision and next steps (such as interviews or additional assessments) are then made by their internal hiring team.

Thank you for your interest!
