Site Reliability Engineer/Lead

Diamondpick

Cupertino, CA, United States

Full-time

Note: This is a W2 opportunity.

Primary Focus: This role involves close collaboration with the customer's tech leads and teams from development, infrastructure, cloud deployment, and DevOps. The SRE Tech Lead presents findings, works through challenges, and provides solutions for reliability improvements.

Required Skills:

o Communication & Collaboration: Strong ability to work directly with customer teams, understand their design challenges, and communicate complex technical concepts clearly.

o DevOps & Cloud: Hands-on experience in DevOps practices and cloud infrastructure (AWS, Kubernetes, Rancher, etc.), with the ability to recommend reliability solutions.

o Performance Tuning & Reliability: Strong knowledge of performance bottlenecks, capacity planning, and reliability engineering best practices.

o Problem Solving: Analytical skills to work through customer challenges and limitations, and to co-develop short-term and long-term solutions.

o Dev Engineering & Infra Design: Background in software development engineering and infrastructure design, enabling effective discussions with customer teams on the technical aspects.

o CI/CD Tools: Experience with Jenkins, ArgoCD, and other CI/CD tools to discuss automation and deployment pipelines with customers.

Educational Qualification:

o Bachelor’s or Master’s Degree in Computer Science, Information Systems, or related fields.

Years of Experience:

o 8-10 years of experience in Site Reliability Engineering, DevOps, or related fields, with a significant focus on cloud and infrastructure performance.

o Proven experience working in a customer-facing role where direct engagement and collaborative problem-solving are key.

3 days ago
