Lead Site Reliability Engineer

Bose
GA - Atlanta, US
Full-time

You know the moment. It’s the first notes of that song you love, the intro to your favorite movie, or simply the sound of someone you love saying hello.

It’s in these moments that sound matters most.

At Bose, we believe sound is the most powerful force on earth. We’ve dedicated ourselves to improving it for nearly 60 years.

And we’re passionate down to our bones about making whatever you’re listening to a little more magical.

The Information Technology team at Bose exists to deliver valuable and reliable business and technology solutions with an innovative, engaged, and collaborative team focused on contributing to our corporate vision.

Job Description

Lead, mentor, and manage a team of Site Reliability Engineers, providing guidance, support, and performance evaluations.

Foster a culture of collaboration, continuous improvement, and innovation within the team.

Define and communicate clear goals and objectives for the SRE team, aligning with overall business objectives.

Develop and execute strategies to improve system reliability, availability, and performance.

Drive the adoption of best practices and standards for SRE across the organization.

Participate in and lead strategic planning for capacity management, disaster recovery, and infrastructure investments.

Lead post-incident reviews to identify root causes and implement preventive measures.

Develop and enforce incident response procedures and runbooks.

Collaborate with engineering and architecture teams to design scalable and resilient system architectures.

Optimize system performance and reliability through proactive monitoring, tuning, and enhancements.

Evaluate and implement new technologies and tools to improve system capabilities and efficiency.

Drive the automation of operational processes to improve efficiency and reduce manual intervention.

Oversee the development and maintenance of tools for deployment, monitoring, and configuration management.

Promote the use of Infrastructure-as-Code (IaC) and Continuous Integration/Continuous Deployment (CI/CD) practices.

Lead efforts in capacity planning to ensure infrastructure can support current and future business needs.

Design and implement scaling strategies to handle variations in demand and growth.

Monitor and optimize resource utilization to balance performance and cost-effectiveness.

Work closely with cross-functional teams, including development, operations, and product management, to ensure alignment on reliability and performance goals.

Communicate effectively about system status, performance metrics, and ongoing improvements to stakeholders.

Provide technical guidance and support to other teams as needed.

Ensure thorough documentation of systems, processes, and procedures.

Create and maintain operational runbooks, knowledge base articles, and training materials.

Share knowledge and best practices with the team and organization through training sessions and workshops.

Required Competencies:

Advanced proficiency in scripting and programming languages such as Python, Go, Bash, or Java.

Extensive experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog).

In-depth knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes).

Strong familiarity with cloud platforms (e.g., AWS, Azure, Google Cloud).

Expertise in configuration management and Infrastructure-as-Code tools (e.g., Terraform, Ansible).

Strong understanding of networking, distributed systems, and databases.

Proven ability to lead and manage technical teams effectively.

Excellent problem-solving, analytical, and communication skills.

Experience Requirements:

Education / Certification Requirements:

Education: Bachelor’s degree in Computer Science, Engineering, or a related field. Advanced degree or relevant certifications (e.g., AWS Certified DevOps Engineer, Google Professional DevOps Engineer) preferred.

Bose is an equal opportunity employer that is committed to inclusion and diversity. We evaluate qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, genetic information, national origin, age, disability, veteran status, or any other legally protected characteristics.

30+ days ago