Azure Site Reliability Engineer II / Sandy Springs, GA / Hybrid

Motion Recruitment
Sandy Springs, Georgia, United States
Full-time

Exciting opportunity in Sandy Springs, GA! This company sells software for an e-commerce website focused on the retail industry.

They are seeking an experienced Azure Site Reliability Engineer to join their team. This is a full-time, hybrid position.

In this role, you'll work with cutting-edge technologies such as Azure Services and Datadog!

Our client is looking for hard-working individuals who work well on a team. Here, you will have the chance to grow your skills, work on meaningful projects, and enjoy a supportive work-life balance.

If you are ready to grow your skills, then this is the place for you!

Required Skills & Experience

  • Proficiency with Azure services
  • Strong experience with Datadog
  • 5+ years of experience in site reliability engineering
  • Proficient in scripting languages like Python, PowerShell, or Bash
  • Strong skills in diagnosing and troubleshooting performance issues and in optimizing system performance across large-scale environments

Desired Skills & Experience

  • Knowledge of Datadog integrations for Azure services, Kubernetes, and CI/CD pipeline monitoring.
  • Familiarity with managing and optimizing databases such as Azure SQL, Cosmos DB, or MySQL.
  • Knowledge of SRE principles such as error budgets, automation, and incident postmortems.
  • Familiarity with IaC (Terraform and Ansible)
  • Understanding of compliance standards (ISO, SOC 2, GDPR) and security practices specific to cloud environments.

What You Will Be Doing

Tech Breakdown:

  • Core services: Azure Kubernetes Service (AKS), Azure Functions, Azure App Services.
  • Set up Datadog to monitor Azure resources, including Virtual Machines, AKS clusters, and storage accounts.
  • Use Datadog’s dashboards and anomaly detection features to proactively detect and resolve system issues before they impact users.
  • Monitor deployments through Datadog to detect any application errors or performance issues introduced during updates.
  • Develop and optimize CI/CD pipelines for efficient, reliable application deployment.

Daily Responsibilities

  • Automate provisioning, updates, and management of cloud infrastructure with Infrastructure as Code (IaC) tools like Terraform, Ansible, or ARM templates.
  • Continuously monitor Azure infrastructure and applications using Datadog for performance, uptime, and resource utilization.
  • Develop, maintain, and improve CI/CD pipelines to automate Docker image builds and Kubernetes deployments.
  • Respond to system alerts, production issues, and incidents. Work to resolve outages quickly and perform root cause analysis to prevent future incidents.

The Offer

Bonus OR Commission eligible

You will receive the following benefits:

  • Medical, dental, and vision insurance
  • Vacation time, PTO, and paid holidays
  • 401(k) with a company match
  • Commuter benefits, quarterly bonuses, and more

Applicants must be currently authorized to work in the US on a full-time basis now and in the future.
