Search jobs > Cupertino, CA > Site reliability engineer
Please Note : This is a W2 opportunity.
Primary Focus : This role involves close collaboration with the customer's
tech leads and teams from development, infrastructure, cloud deployment,
and DevOps. The SRE Tech Lead presents findings, works through
challenges, and provides solutions for reliability improvements.
Required Skills :
o Communication & Collaboration : Strong ability to work directly with
customer teams, understand their design challenges, and
communicate complex technical concepts clearly.
o DevOps & Cloud : Hands-on experience in DevOps practices and
cloud infrastructure (AWS, Kubernetes, Rancher, etc.), with the ability
to recommend reliability solutions.
o Performance Tuning & Reliability : Strong knowledge of
performance bottlenecks, capacity planning, and reliability
engineering best practices.
o Problem Solving : Analytical skills to work through customer
challenges and limitations, and to co-develop short-term and long-term
solutions.
o Dev Engineering & Infra Design : Background in software
development engineering and infrastructure design, enabling
effective discussions with customer teams on the technical aspects.
o CI/CD Tools : Experience with Jenkins, ArgoCD, and other CI/CD
tools to discuss automation and deployment pipelines with customers.
Educational Qualification :
o Bachelor’s or Master’s degree in Computer Science, Information
Systems, or a related field.
Years of Experience :
o 8-10 years of experience in Site Reliability Engineering, DevOps, or
related fields, with a significant focus on cloud and infrastructure
performance.
o Proven experience working in a customer-facing role where direct
engagement and collaborative problem-solving are key.
Senior Site Reliability Engineer - Observability and Telemetry Platform
Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and availability using the combination of software and systems engineering practices. Senior Site Reliability Engineer - Observability and Telem...
Site Reliability Engineer - USDS
Site Reliability Engineering at TikTok combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems. Site Reliability Engineering (SRE). TikTok is the leading destination for short-form mobile video. Scale systems sustainably through mecha...
Site Reliability Engineer
Site Reliability Engineer, Production Engineer, Platform Engineer). Collaborate, partner, advise, review and mentor engineering teams on Reliability topics like high reliability architecture, observability, safe change management. Experience leading and driving company-wide reliability efforts and e...
Site Reliability Engineer - Infra and DevOps
As Secure Development Factory (SDF) Site Reliability Engineer - DevOps, you will be at the heart of Western Digital’s engineering process, delivering the software development tools and infrastructure that empowers engineering teams to develop and deliver high quality products quickly. You will lead ...
Site Reliability Engineer - Data Infrastructure (San Jose)
Our data infrastructure Site Reliability Engineering (SRE) team is a pioneer in innovation. Establish sustainable mechanisms for scaling systems, such as automation, to drive enhancements in reliability, efficiency, and velocity. ...
Sr. Site Reliability Engineer
CENTRL’s clients include leading companies around the world including several Fortune 500 firms. CENTRL is led by a highly experienced management team with a proven track record and is backed by some of the leading investors such as Providence Strategy Growth and Susquehanna Growth Equity. In this l...
Site Reliability Engineer - Product Resilience
You will also produce designs and lead more junior team members through their implementation and deployment to production. You will be the tech lead for production resilience and disaster recovery, and you will define the roadmap for improvements in this area. Be familiar with chaos engineering and ...
Site Reliability Engineer, Infrastructure and Assurance Services - USDS
TikTok is the leading destination for short-form mobile video. The teams within USDS that deliver on this commitment daily span across Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions and more. Initiate and lead scripting/tooling/automation to streamline proce...
Site Reliability Engineer
Design, implement, and maintain complex data systems supporting millions of customers with Cloud Native principles and best practices to ensure highly available, secure, performant and scalable database systems. Build and maintain CI/CD pipelines in Jenkins. Build and deploy services in Kubernetes clu...
Staff Cloud DevOps/Site Reliability Engineer (SRE) - USA
DevOps, Infrastructure, Operations, or Site Reliability Engineer (or as a software engineer with relevant experience). We are looking for a Staff Cloud DevOps/Site Reliability Engineer to join our team. Our Technical Operations team manages the infrastructure, DevOps, and Site Reliability of our pla...