Job Description
We are seeking a highly skilled and experienced Staff DevOps Engineer to join our team. This role will focus on Site Reliability Engineering (SRE), enhancing our developer platform, and ensuring robust security practices.
The ideal candidate will have a strong background in SRE principles, platform engineering, and security, with a proven ability to drive improvements in system reliability, performance, and security.
Key Responsibilities:
Site Reliability Engineering (SRE): Implement and manage SRE practices to ensure high availability, reliability, and performance of our systems and services.
Develop and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs).
Monitor and analyze system performance, identifying and addressing reliability issues proactively.
Automate operational tasks to reduce manual intervention and improve system efficiency.
Developer Platform: Enhance and maintain the developer platform to support efficient and scalable software development.
Collaborate with engineering teams to improve CI/CD pipelines, streamline development workflows, and optimize deployment processes.
Ensure that development tools and environments are up to date, reliable, and scalable.
Security: Implement and manage security practices to protect our systems and data from threats.
Conduct regular security assessments and vulnerability scans to identify and mitigate risks.
Collaborate with security teams to enforce security policies and make compliance the default.
Collaboration and Leadership: Lead with a product mindset, building tools that developers find intuitive and easy to use.
Work closely with cross-functional teams to support and improve system reliability, performance, and security.
Mentor and provide technical guidance to junior team members.
Stay updated with industry trends and best practices, applying them to improve our systems and processes.