Site Reliability Engineer

Santa Clara, California, US
Full-time

Job Description:

Partner with teams to ensure security and compliance requirements are met.

Work with development teams to ensure that applications have scalability and reliability built in from day one - Agile is second nature to you, and you're excited to work in scrum teams and represent the SRE perspective.

Design and enhance software architecture to improve scalability, service reliability, cost, and performance - You've helped create services that are critical to customers' success.

Deploy automation for provisioning and operating infrastructure at large scale - You are experienced in Infrastructure as Code concepts and have put them into production.

Partner with teams to improve CI/CD processes and technology - Helping teams deliver value early is what you strive for.

Drive the adoption of observability practices and a data-driven mindset - You love metrics, graphs, and gaining a deep understanding of why things happen in a system, helping others gain visibility into the things they build.

Participate in the occasional on-call rotation supporting the infrastructure owned by the SRE team - Finding ways to reduce the time to resolution and improve the reliability of services is key to running a trusted platform.

Skills:

  • 5+ years of total experience with Unix / Linux (shell, tools, kernel, networking, storage)
  • 2+ years of working with microservice architectures running on Kubernetes and containers
  • CI/CD pipelines using GitLab and ArgoCD
  • Terraform
  • Ansible
  • Artifactory or equivalent experience
  • Vault or equivalent experience
  • Demonstrated experience in building tools and automation
  • Experience with public cloud (GCP highly preferred) at medium to large scale
  • Vulnerability Management for containers and VMs
  • Go or Python
  • GitLab with some GCP


2 days ago