Site Reliability Engineer

Amtex Systems Inc.
Plano, TX, United States
Full-time

Title : Site Reliability Engineer

Location : Plano, TX

Duration : 6+ months

Local candidates only

Experience Level : 10+ years

  • Should be a strong SRE with experience in Java, AWS, DevOps, deployment strategy, and monitoring tools. Candidates should have substantial hands-on experience with Dynatrace, Splunk, CI/CD, Grafana, etc.
  • Looking for a candidate with very good application troubleshooting experience and a solid grasp of core SRE metrics and concepts before going to production: uptime vs. availability, monitoring vs. observability, incident and outage, etc.

  • Should be familiar with SLO, SLA, SLI, and other SRE terminology.
  • Experience deploying through CI/CD pipelines, debugging and troubleshooting issues, and coordinating with application teams (Java, Spring Boot, Python, .NET, etc.).
  • Ability to perform API performance testing using tools such as JMeter or BlazeMeter.
  • Experience identifying the root cause (RCA) of production issues in AWS environments with multiple microservices.
  • Expertise in Terraform for managing infrastructure as code is highly desirable.

Job responsibilities :

  • Demonstrates and champions site reliability culture and practices and exerts technical influence throughout your team.
  • Leads initiatives to improve the reliability and stability of your team’s applications and platforms using data-driven analytics to improve service levels.
  • Collaborates with team members to identify comprehensive service level indicators, and with stakeholders to establish reasonable service level objectives and error budgets with customers.
  • Demonstrates a high level of technical expertise within one or more technical domains and proactively identifies and solves technology-related bottlenecks in your areas of expertise.
  • Acts as the main point of contact during major incidents for your application and demonstrates the skills to identify and solve issues quickly to avoid financial losses.
  • Documents and shares knowledge within your organization via internal forums and communities of practice.

Required Qualifications, Capabilities, and Skills :

  • Formal training or certification on software engineering concepts and 5+ years of applied experience.
  • Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices with the ability to implement these practices within an application or platform.
  • Fluency in Java programming.
  • Proficiency and experience in observability, such as white-box and black-box monitoring, SLO alerting, and telemetry collection using tools such as Splunk, Grafana, Dynatrace, Prometheus, and Datadog.
  • Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform).
  • Experience with containers and container orchestration (e.g., ECS, Kubernetes, Docker).

Preferred Qualifications, Capabilities, and Skills :

  • Experience with infrastructure as code tools such as Terraform.
  • Experience managing and supporting cloud-based applications, AWS preferred.
  • Excellent communication skills desired.

Posted 9 days ago