Seeking a Leader to expand the AWS Region Reliability team in Denver!
We are looking for an Engineering and Operations Leader to help grow the Region Reliability organization. Region Reliability is the cornerstone of continuous engineering improvement by driving efficiencies across AWS through innovation, automation, and operational excellence.
This new organization was created to drive consistency in operational practices across AWS, reducing duplicative efforts and refining and accelerating operational excellence and best practices for AWS Amazon Dedicated Cloud Regions to provide a simpler experience for our customers and AWS Builders.
Do you like helping U.S. Intelligence Community and Defense agencies implement innovative cloud computing solutions and solve technical problems?
Would you like to do this using the latest cloud computing technologies? Are you the type of person who works with all teams to make operations better by simplifying processes, creating automation, or building scrappy tools?
Would you like to drive the systems operations of the world’s largest scale cloud compute platform? Then this is the job for you.
Joining Region Reliability empowers you to drive operational improvements across AWS to delight customers.
Our services operate at large scale with workloads critical to national security. These mission-critical cloud computing solutions require a relentless focus on operational excellence.
Given the national security implications of our services, a deep passion for delivering reliable, secure, and high-performing infrastructure is essential.
You will share big ideas and execute to deliver the next big innovations at a rapid pace. You are a believer in the DevOps model of service engineering, and you are excited to run operations in an environment where developers don’t toss problems over the wall but solve them.
And you are happiest when you are working with empowered, world-class engineers to meet world-class challenges. Finally, with your strong ownership bias, you have an infectious desire to continually improve how things are done.
At Amazon, we hire hands-on managers at all levels. This leader must be able to dive deep into the details on business, operations, and engineering and identify how to deliver outcomes where the solution isn’t understood yet.
As a Region Reliability Engineering Manager, you will be familiar with the technical implementation of creating, securing, deploying, maintaining, simplifying, and safely deprecating software through development and production environments.
With this perspective, you will identify operational excellence improvements and drive their adoption across the organization.
You will excel at hiring Amazon Dedicated Cloud Engineers, developing their technical skills, and reviewing and improving their work.
You will grow leaders in your organization to identify manual actions and relentlessly drive improvements through elimination of process, creation of automation, or escalation through owning teams to drive a better experience for our Builders.
We need a technical, detail-oriented operations leader focused on operational excellence to drive best operational practices through the business.
You will excel at hiring and developing systems-level engineers. You will grow leaders in your organization and lead a charter that grows as your organization grows.
This position requires that the selected candidate currently possess and maintain an active TS / SCI security clearance.
The position further requires that, after start, the selected candidate obtain and maintain an active TS / SCI security clearance with polygraph or commensurate clearance for each government agency for which they perform AWS work.
For inquiries, please reach out to Josh Sacks at [email protected]
Key job responsibilities
In this role you will have the opportunity to:
- Own the continual improvement of operations at AWS.
- Identify Builder operational pain points from disparate data and drive actions to resolve them. Examples include creation, improvement, simplification, and elimination of processes.
- Know when to create a new solution vs. improve existing tools.
- Hire and Develop the Best leaders at AWS by leveraging your technical ability to identify outcomes, design project plans, and drive the right actions to build the technical acumen on the team.
- Collaborate with and learn from world-class leaders to meet world-class challenges, every day.
- Work across Region Reliability to root cause operational challenges and
- Hire, train, and grow new Region Reliability Engineers.
- Drive continual improvement in systems operations through tool building and automation.
- Drive root cause analyses in collaboration with software development teams, and influence local development to improve operational performance.
- Report on the health of these services at an executive level.
- Meet with internal and external customers to develop relationships and clarify requirements and schedules.
- Work in an environment where operational excellence is truly the first priority, and where the degree of automation is above bar.
A day in the life
RRE line managers manage a team of Region Reliability Engineers. A typical day includes working with the team in the SCIF to assign and unblock workflows, working with partner teams to reduce manual tasks through automation, driving root cause analysis for complex issues, and ensuring team SLAs are consistently met.
You will mentor and grow your team, and drive strategic initiatives to generate efficiencies across operational practices.
We are open to hiring candidates to work out of one of the following locations:
Denver, CO, USA
BASIC QUALIFICATIONS
- Associate's degree, or Cloud+ or GICSP (Global Industrial Cyber Security Professional) or GSEC (GIAC Security Essentials) or SSCP (Systems Security Certified Practitioner)
- 7+ years of relevant hands-on systems engineering and administrative experience in networking, storage systems, operating systems
- 3+ years of experience as the systems engineering and operations leader for an Internet service or leading edge IT organization operating in a 24x7 environment
- Current, active US Government Security Clearance of TS / SCI
PREFERRED QUALIFICATIONS
- Demonstrated success building and leading teams
- Strong systems engineering fundamentals (networking, storage, operating systems)
- Development experience with a high-level language such as Python, Ruby, or Java
- Leading development life cycle processes and best practices, esp. in the areas of deployment automation and monitoring
- Agile engineering practices (Kanban, continuous delivery, etc.)
- Mentoring / training systems engineers and systems development engineers
- Experience with distributed systems at scale
- AWS service usage on commercial or government cloud