SITE RELIABILITY ENGINEER
DESCRIPTION: Duties: Build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems.
Assist the JPMC Advanced Data Enterprise (JADE) team with production support in the public cloud. Work with AI / ML and cloud engineers to build the platform, pipeline, and monitoring systems to ensure the application landscape is designed to take full advantage of JPMC's global cloud solution.
Implement Site Reliability Engineering (SRE) frameworks to support global multi-cloud environments, and ensure the highest level of SLA compliance through operational excellence.
Provide failure analysis / root cause analysis when required. Develop and improve the quality of technical engineering documentation.
Drive the maturity of the software development lifecycle. Provide quality control of engineering deliverables. Provide technical consultation to product management.
Perform deployment, administration, management, configuration, testing, and integration tasks related to big data platforms in a cloud environment.
Develop new cloud engineering strategies and implementations for the firm. Champion a DevOps model so that services are automated and elastic across all platforms.
Write operational documentation and maintain a knowledge base of known issues with solutions. Participate in 24x7 SRE on-call rotations and escalation workflows.
QUALIFICATIONS: Minimum education and experience required: Master's degree in Computer Science, Computer Engineering, Information Technology, or related field of study plus 3 years of experience in the job offered or as Site Reliability Engineer, Software Systems Release Engineer, or related occupation.
Skills Required: Requires experience in the following: Linux; Unix; Windows; Agile SDLC; Application Architecture Disciplines, such as container orchestration, Docker, ECS or Kubernetes;
Observability and Monitoring tools, including Grafana and Prometheus; Infrastructure Architecture Disciplines, including Infrastructure as Code (IaC) and Networking;
Continuous Integration / Continuous Development tools, including Jenkins and Git; Python; Shell Scripting; SQL; AWS Cloud Services;
Automated Testing; System Integration Testing; Unit Testing; and User Acceptance Testing.
Job Location: 1111 Polaris Pkwy, Columbus, OH 43240.
Telecommuting permitted up to 40% of the week.