Senior Manager, Site Reliability Engineering (SRE) for Hybrid Cloud - Infrastructure as a Service (IaaS) (REMOTE)

GEICO
Seattle, WA
$115K-$261.5K a year
Remote
Full-time

GEICO is seeking an experienced and visionary SRE Senior Manager to join the organization and help establish and grow the Site Reliability Engineering (SRE) practice for Hybrid Cloud - Infrastructure as a Service (IaaS).

As an SRE Leader, you will be responsible for leading and driving data center and network engineering initiatives to enhance the reliability, availability, performance, and security of GEICO’s private and public cloud infrastructure.

You will collaborate closely with cross-functional teams, including Data Center and Network Engineering, Security, and PaaS / Application Software Development, to ensure robust architecture and software solutions for the data center and network, along with seamless operations.

This role combines technical expertise, leadership, and strategic thinking to drive continuous improvement in reliability and scalability.

Key Responsibilities:

Technical Leadership:

Provide strategic direction and technical leadership in the design, development, and deployment of a robust, reliable, and scalable digital infrastructure.

Drive the architecture, design, and optimization of highly available, scalable, and fault-tolerant systems and services supporting the digital engineering team.

Team Management:

Build and nurture a high-performing SRE team, providing mentorship, coaching, and guidance to foster a culture of continuous learning and improvement.

Work closely with all GEICO Tech products and platforms to manage, innovate and create new programs, software and analytics that improve the availability, scalability, latency and effectiveness of GEICO products and services.

Collaborate with cross-functional leaders including product area leads to guide product engineering to build reliable and durable production systems and contribute to the strategic direction of the Tech organization.

Collaboration and Communication:

Present a reliability vision and strategic recommendations clearly and concisely to stakeholders with varying degrees of SRE fluency.

Develop and own relationships with technology and business partners.

Foster effective collaboration and communication across cross-functional teams to align priorities, share best practices, and ensure smooth coordination for incident response, system maintenance, and upgrades.

Manage department budgets, resource allocation, and vendor relationships to optimize costs and maintain high-quality outcomes.

Qualifications:

Bachelor's degree in Computer Science, Information Technology, or a related field (Master's degree preferred).

Proven experience in a leadership role focused on software-defined and software-driven data center and network engineering within complex, large-scale production environments.

Deep knowledge of SRE practices, methodologies, and principles, along with a solid understanding of on-premises and public cloud-based network, compute, and storage technologies.

In-depth knowledge of hybrid cloud architecture, IaaS technologies, container orchestration platforms (e.g., Kubernetes), and cloud efficiency and observability.

Strong background in incident management, performance tuning, and capacity planning, including creating incident response playbooks, incident triaging strategies, and post-incident analysis to drive continuous improvement in system reliability and availability.

Experience with open source management and monitoring tools (e.g., Cacti, Zabbix, Splunk, Prometheus, Grafana).

Experience with infrastructure automation, tooling, and configuration management frameworks (e.g., Puppet, Chef, Ansible, Terraform).

Familiarity with cloud security best practices and compliance standards.

Excellent leadership and team management skills with a passion for mentoring and fostering professional growth.

Strong problem-solving and analytical abilities, with a keen eye for detail and a passion for driving operational efficiency.

Experience in budget management, resource allocation, and vendor collaboration.

Certifications such as AWS Certified DevOps Engineer, Google Professional DevOps Engineer, or relevant cloud provider certifications are a plus.

Annual Salary

$115,000.00 - $261,500.00

The above annual salary range is a general guideline. Multiple factors are taken into consideration to arrive at the final hourly rate / annual salary to be offered to the selected candidate.

Factors include, but are not limited to, the scope and responsibilities of the role, the selected candidate’s work experience, education and training, the work location as well as market and business considerations.

At this time, GEICO will not sponsor a new applicant for employment authorization for this position.

Benefits:

As an Associate, you’ll enjoy our benefits to help secure your financial future and preserve your health and well-being, including:

  • Premier Medical, Dental and Vision Insurance with no waiting period
  • Paid Vacation, Sick and Parental Leave
  • 401(k) Plan
  • Tuition Reimbursement
  • Paid Training and Licensures

Benefits may be different by location. Benefit eligibility requirements vary and may include length of service.

Coverage begins on the date of hire. Must enroll in New Hire Benefits within 30 days of the date of hire for coverage to take effect.

30+ days ago