Senior DevOps Engineer with SRE
eTek IT Services, Inc.
Los Angeles, CA, US
Full-time
Job Description
Overview:
The Senior DevOps Engineer with SRE plays a critical role in our organization and is responsible for designing and implementing solutions that improve our software development and operations processes.
This position is vital to ensuring the reliability, scalability, and performance of our systems.
Key Responsibilities:
- Collaborate with development teams to create and maintain CI/CD pipelines
- Design, implement, and maintain infrastructure as code using tools like Terraform
- Monitor and improve system stability, performance, and reliability
- Automate manual processes to improve efficiency and reduce human error
- Implement and maintain security measures for the infrastructure
- Troubleshoot and resolve issues in development, testing, and production environments
- Collaborate with cross-functional teams to ensure smooth deployment and operation of systems
- Participate in the on-call rotation and respond to incidents as needed
- Implement and maintain best practices in areas such as logging, monitoring, and alerting
- Train and mentor junior team members
- Contribute to the continuous improvement of DevOps processes and procedures
- Evaluate new tools and technologies to improve DevOps processes
- Manage and optimize cloud resources
- Develop and maintain disaster recovery and business continuity plans
- Participate in capacity planning and scalability assessments
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, or related field
- 5+ years of experience in a DevOps or SRE role
- Strong understanding of cloud platforms such as AWS, Azure, or GCP
- Proficiency in at least one scripting language such as Python, Ruby, or Bash
- Experience with containerization and orchestration tools like Docker and Kubernetes
- Deep knowledge of version control systems such as Git
- Expertise in automation and configuration management tools like Ansible, Chef, or Puppet
- Ability to troubleshoot and optimize software and infrastructure performance
- Experience with monitoring and logging tools such as Prometheus or the ELK stack
- Strong understanding of networking and security principles
- Excellent communication and collaboration skills
- Ability to work effectively in a fast-paced, dynamic environment
- Relevant certifications such as AWS Certified DevOps Engineer or Certified Kubernetes Administrator are a plus
- Experience with agile and Scrum methodologies
- Proven track record of implementing and managing scalable, secure, and highly available systems