job summary :
Location: Merrimack, NH, or Westlake, TX.
Required Skills: EKS (experience administering Kubernetes clusters and troubleshooting Kubernetes), Python and API development, Docker, Shell, AWS, Jenkins, Ansible.
Site Reliability Engineering (SRE) is an engineering field that combines software engineering and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.
responsibilities :
The Expertise and Skills We're Looking For
- 3+ years of experience in systems and platform operations and technology management
- EKS: experience administering Kubernetes clusters and strong Kubernetes troubleshooting skills.
- Strong Linux and shell scripting skills
- Strong Python programming skills
- Strong understanding of application networking
- Hands-on experience with log aggregation, monitoring, and data visualization tools for dashboarding, monitoring, and alerting (Datadog, Splunk, ELK, Prometheus, and Grafana)
- Proven experience working with APIs and API frameworks.
- Proven experience implementing AWS products and services.
- Hands-on experience working with infrastructure as code (CFT, Terraform, etc.).
- Understanding of container technologies (Docker)
- Hands-on experience implementing CI/CD pipelines (Ansible, JenkinsCore, ArgoCD).
- Understanding of AWS cloud security and account management
- Passionate about working in a DevOps culture.
- A passion for learning new technologies, practices, and ways of working.
- Comfortable in an agile environment
- Strong written and oral communication skills.
- Experience working in globally distributed teams.
qualifications :
- Experience level : Experienced
- Minimum 3 years of experience
- Education : Bachelor's
skills :
- Professional Engineer