Sr. Availability Engineer

Amazon Data Services, Inc.
Herndon, Virginia, USA
$128.6K a year
Full-time

Availability Engineers provide consultative and peer review of designs across all disciplines within Amazon data centers (DCs) worldwide.

Beyond design focus, we work directly with operations, security teams, field engineering, and construction management to implement processes and procedures that are functional, practical, and innovative, with a primary focus on improving system availability.

As an Availability Engineer, you will evaluate the impact of data center products and features to meet ever-evolving customer needs as we continue expanding our fleet to hyper-scale. As an ideal candidate, you:

  • Possess strong engineering judgment and are able to provide recommendations despite uncertainty
  • Are detail- and data-oriented
  • Have experience managing engineering projects and consultants
  • Build trust and relationships with different stakeholders (e.g., Operations, Commissioning, Construction, and Design)
  • Are inclined to get into the field to see things up close

Each day you will interact with different teams responsible for all aspects of the data centers. You will prioritize your activities to support data center availability, focusing on the actions that are most impactful.

You will have the responsibility to think globally about all initiatives.

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running.

We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on.

We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help.

You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles.

You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers.

And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

Key job responsibilities

Availability Engineers provide the following deliverables:

  • Auditing and peer reviewing DC Infrastructure Engineering designs with focus on Availability
  • Performing engineering-oriented analysis of past availability events
  • Providing technical oversight for global COE action items
  • Developing availability forecast models
  • Overseeing and reviewing availability performance metrics
  • Providing multi-root-cause categorical analysis for availability events
  • Overseeing and reviewing Failure Mode and Effect Analysis studies
  • Conceiving, initiating, and leading availability projects with widespread impact on infrastructure design, innovation, and implementation
  • Setting the standard for customer-serving coordination and repeatable processes as they relate to engineering, test, construction, commissioning, operations, and best practices
  • Driving process improvements across organizations to increase availability and meet customer expectations
  • Providing technical oversight of the Regional Electrical and Mechanical Basis of Design (BOD), Construction Documentation, SoWs, procurement initiatives, supplier management, commissioning scripts, operation and maintenance manuals and procedures, and all relevant products and processes that drive and impact availability.
  • Serving as a technical advisor for AWS data center electrical, mechanical, structural, site, civil, security, network, fire detection and suppression and all design disciplines as they relate to and drive increased availability.
  • Working with internal teams to understand customer availability requirements
  • Providing technical oversight and review for LSE / CSE COE

About the team

Why AWS

Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating. That’s why customers, from the most successful startups to Global 500 companies, trust our robust suite of products and services to power their businesses.

Diverse Experiences

Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply.

If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Work / Life Balance

We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture.

When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

Inclusive Team Culture

Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences.

Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

Mentorship and Career Growth

We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

BASIC QUALIFICATIONS

  • Bachelor’s Degree in Electrical or Mechanical Engineering or equivalent experience.
  • Cumulative 3+ years partnering with cross-functional teams working in critical facilities
  • Cumulative 10+ years of experience with mission critical facilities, including the following:

1. Understanding of uninterruptible power sources, diesel generators, electrical switchgear, power distribution units, and automatic / static transfer switches.

2. Understanding of chillers, cooling towers, direct and indirect evaporative cooling, variable speed drives, and fan systems.

3. Knowledge of building codes and regulations including Life Safety, IBC, NFPA, NEC, NESC and OSHA.

4. Direct experience with the design, construction, operation, or maintenance of data centers.

5. Ability to research new designs, technologies, construction methods, and innovative operations procedures of data center equipment and facilities.

6. Ability to critically audit and provide customer-representative feedback on design concepts through exploration, development, deployment / construction, and through operations.

7. Ability and willingness to think outside of the box to find creative and innovative solutions to improve availability through improved quality, reliability, and maintainability.

8. Ability to perform complex business case analysis to justify technical decisions and present the justification to management in a high-level review.

9. Excellent communication skills, attention to detail, and the ability to maintain high quality standards.

PREFERRED QUALIFICATIONS

  • Strong organizational skills, with the ability to set priorities and meet deadlines and budgets
  • Experience using a variety of web-based and other software tools for data analysis and visualization.
  • Direct experience with the design, construction, operation, and maintenance of mission critical facilities, especially data centers.
  • Experience as resident engineer or hands-on (in the field) design consultant.
  • Knowledge of building codes and regulations including Life Safety, IBC, NFPA, NEC, NESC and OSHA.
  • Experience reading, interpreting, and creating construction drawings, specifications, and submittal documents.
  • Ability to carry design concepts through exploration, development, and into deployment / mass production
  • Ability to research new designs, technologies, construction methods, and innovative operations procedures of data center equipment and facilities.
  • Ability to critically audit and provide customer-representative feedback on design concepts through exploration, development, deployment / construction, and through operations.
  • Ability and willingness to think outside of the box to find creative and innovative solutions to improve availability through improved quality, reliability, and maintainability.
  • Ability to perform complex business case analysis to justify technical decisions and present the justification to management in a high-level review.
  • Excellent communication and writing skills, attention to detail, and the ability to maintain high quality standards
  • Detailed understanding of both mechanical and electrical equipment and design related to data centers (including but not limited to: uninterruptible power supplies, diesel generators, electrical switchgear, power distribution units, variable frequency drives, automatic / static transfer switches, air-cooled and water-cooled chillers, pumps, cooling towers, heat exchangers, air handlers, economizers, etc.)
  • EPMS / SCADA / BMS Controls system experience (software and / or hardware)
  • Registered Professional Engineer
  • Advanced degree in engineering, business, or related field
  • Experience with large scale technical operations or large-scale compute facilities