Talent.com
Site Reliability Engineer
Amicis Global • Alpharetta, GA, United States
Posted: 1 day ago
Job type: Full-time

Title: Senior Site Reliability Engineer

Location: Alpharetta, GA

Duration: 6-12+ Months

About the Role

We're seeking an experienced Senior Site Reliability Engineer to join our team and play a critical role in ensuring the reliability, scalability, and performance of our cloud infrastructure. You'll be a technical leader who combines deep operational expertise with strong automation skills to build and maintain highly available systems. As a Kubernetes expert, you'll drive our container orchestration strategy and serve as a technical authority for our platform teams.

Key Responsibilities:

Infrastructure & Automation

Design, deploy, and manage cloud infrastructure across AWS and Azure using Terraform and infrastructure-as-code principles

Architect, deploy, and maintain production-grade Kubernetes clusters with a focus on reliability, security, and performance

Serve as the subject matter expert on Kubernetes, providing guidance and best practices to engineering teams

Build and maintain automated provisioning pipelines to ensure consistent, repeatable deployments

Implement and maintain HashiCorp Vault on AWS for secrets management and security, including Vault integration with Kubernetes

Design and implement automated High Availability and Disaster Recovery (HA/DR) capabilities through CI/CD pipelines

Optimize cloud resources and Kubernetes workloads for performance, cost efficiency, and reliability

Observability & Monitoring

Architect and implement comprehensive observability solutions using Datadog for cloud-native applications and Kubernetes infrastructure

Build monitoring, logging, and alerting frameworks for containerized workloads that provide actionable insights into system health

Implement Kubernetes-native monitoring patterns and troubleshoot complex container orchestration issues

Integrate Datadog with PagerDuty and other incident management platforms

Define and track SLIs, SLOs, and error budgets to drive reliability improvements

Create custom dashboards and monitors to track infrastructure, application, and Kubernetes cluster performance

CI/CD & Pipeline Management

Design, build, and maintain robust CI/CD pipelines that enable rapid, safe deployments to Kubernetes

Implement GitOps workflows and automated deployment strategies for containerized applications

Implement automated testing, security scanning, and quality gates within pipelines

Drive solutions through test, QA, and production environments with appropriate controls and safeguards

Automate deployment strategies including blue-green, canary, and rolling deployments in Kubernetes

Security & Vulnerability Management

Identify, assess, and remediate security vulnerabilities in infrastructure, applications, and Kubernetes clusters

Implement Kubernetes security best practices including RBAC, pod security policies/standards, and network policies

Collaborate with security teams to implement and maintain security best practices

Manage and maintain HashiCorp Vault infrastructure for secure secrets management

Ensure compliance with security policies and industry standards across all environments

Incident Management & Response

Participate in 24/7 on-call rotation to respond to critical production incidents

Serve as Incident Commander, coordinating cross-functional response teams during major outages

Lead post-incident reviews and drive thorough root cause analysis across engineering teams

Troubleshoot complex Kubernetes and distributed systems issues under pressure

Develop and refine incident response procedures and runbooks

Collaboration & Leadership

Partner with engineering teams to improve system reliability and performance

Mentor junior SREs and promote SRE best practices across the organization

Lead Kubernetes adoption efforts and educate teams on container orchestration best practices

Drive initiatives to reduce toil through automation and process improvement

Contribute to architectural decisions with a reliability and operability lens

Required Qualifications:

5+ years of experience in Site Reliability Engineering, DevOps, or similar roles

Expert-level knowledge of Kubernetes, including architecture, operations, and troubleshooting in production environments

Proven track record as a go-to Kubernetes resource and technical authority

Deep understanding of container technologies (Docker, containerd) and orchestration patterns

Strong hands-on experience with AWS and Azure cloud platforms

Proficiency in Terraform for infrastructure automation and management

Expert-level knowledge of Datadog for monitoring, logging, and observability

Experience with HashiCorp Vault, including deployment and management on AWS and Kubernetes integration

Deep understanding of CI/CD pipelines, including design, implementation, and optimization for containerized workloads

Proven ability to implement automated HA/DR solutions through CI/CD workflows

Strong programming skills in Python for automation, tooling, and analysis

Proven experience building observability solutions for distributed cloud applications

Experience configuring monitoring and alerting systems and integrating with paging platforms like PagerDuty

Demonstrated experience identifying and remediating security vulnerabilities

Experience driving deployments through multiple environments (test/QA/production) with proper gates and controls

Demonstrated experience participating in on-call rotations and responding to production incidents

Experience serving as Incident Commander or leading incident response efforts

Track record of conducting root cause analysis and driving systemic improvements

Strong understanding of networking, security, and cloud architecture principles

Excellent communication skills with ability to work across multiple teams and explain complex Kubernetes concepts

Preferred Qualifications:

Experience with Google Cloud Platform (GCP) and GKE

Certified Kubernetes Administrator (CKA) or Certified Kubernetes Security Specialist (CKS)

Experience with service mesh technologies (Istio, Linkerd, Consul)

Knowledge of Helm, Kustomize, and other Kubernetes tooling

Experience with GitOps tools (ArgoCD, Flux)

Familiarity with additional CI/CD tools (Jenkins, GitLab CI, GitHub Actions, CircleCI)

Experience with configuration management tools (Ansible, Chef, Puppet)

Background in software engineering or systems programming

Understanding of chaos engineering and reliability testing methodologies

Experience with cost optimization strategies in cloud and Kubernetes environments

Security certifications (AWS Security Specialty, CISSP, CKS, etc.)

Experience with compliance frameworks (SOC 2, ISO 27001, etc.)

Contributions to open-source Kubernetes projects or active participation in the Kubernetes community

What We Offer

Competitive salary and equity compensation

Comprehensive health, dental, and vision insurance

Flexible work arrangements

Professional development opportunities and certification support

Collaborative and inclusive team culture

Our Commitment

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
