Site Reliability Engineer

Agile Datapro
CA, United States
Full-time

About the Job

  • Role : SRE
  • Type of Engagement : Hybrid, 2 days in the Mountain View office
  • Location : Mountain View
  • Employment Type : W2 or Full-time

Job Description / Requirement :

Design, implement, and maintain complex data systems that support millions of customers, applying Cloud Native principles and best practices to ensure highly available, secure, performant, and scalable database systems.

  • Build and maintain CI/CD pipelines in Jenkins
  • Build and deploy services to Kubernetes clusters using Helm, Kustomize, etc.
  • Contribute to AWS infrastructure changes, drawing on a deep understanding of AWS services
  • Take part in on-call rotations for pre-production and production systems supporting millions of users
  • Write and review RCA documents to prevent incidents from recurring, and share the lessons learned
  • Contribute to major system upgrades, deployment automation, monitoring enhancements, and production changes
  • Create operational playbooks, contribute to how-to articles, and gain domain knowledge to drive changes in the team
  • Participate in and contribute to FMEA / chaos testing, security remediations, etc.
  • Share best practices and patterns for operational excellence and cost optimization
  • Reduce or eliminate manual steps by automating as much as possible
  • Continuously look for opportunities to increase developer velocity and productivity

Qualifications :

  • Bachelor’s or master’s degree in computer science or a related technical field. Equivalent experience will be considered
  • 4+ years of hands-on development and operations experience building and maintaining infrastructure in AWS
  • Extensive performance monitoring, troubleshooting & tuning experience
  • Experience with AWS services and hands-on knowledge of hosting in the cloud
  • Experience with scripting languages for DevOps automation
  • Experience with at least one of the following programming languages : Java / Python / Ruby
  • Knowledge of Docker, Kubernetes, and ArgoCD
  • Experience with monitoring and observability using Splunk, Wavefront, AppDynamics, Prometheus, tracing, etc.

Education :

Bachelor’s degree in computer science, software engineering, or a related field.

If you are interested in pursuing this opportunity, please send your updated resume to [email protected] along with your rate / salary information.
