Senior Software Development Engineer, Stores Incident Monitoring

Amazon.com Services LLC
New York, New York, USA
$151.3K a year
Full-time

We’re hiring a Senior Software Development Engineer to help shape and drive Incident Monitoring tooling and engineering efforts as part of the incident response program for the worldwide Amazon retail websites.

We are re-imagining incident management and response for Amazon’s retail operations. Amazon is evolving faster than our incident management and response programs can keep up. It’s time to change that.

As an L6 Software Development Engineer on the Monitoring team, you will play a pivotal role in the design and implementation of a strategic monitoring platform for the central incident response team.

When Amazon is under duress, every single minute matters, and your technical contributions will have a direct impact on the decisions made by Amazon executives and the teams that rely on our centralized control centers and outage management capabilities.

You will be required to dive deep into the intricacies of post-incident analysis, uncovering what went wrong, identifying opportunities for improvement, and ensuring that blind spots are addressed in the future.

Amazon incidents are inherently complex, fast-paced, and highly nuanced, presenting a unique and challenging environment for technical problem-solving.

Key job responsibilities

As an L6 SDE on the Monitoring team, you will play a crucial role in defining, building, and integrating key performance indicators for various website experiences into our product.

This will require you to navigate the complex architectural landscape of Amazon and work collaboratively with experience owners across the organization.

Your technical expertise and insightful architectural design instincts will be instrumental in developing simple, elegant, and scalable solutions that can support the monitoring of thousands of unique retail website experiences.

You will be expected to take initiative and thrive in a relatively unstructured environment, leveraging your problem-solving skills to deliver innovative technical solutions.

A deep passion for understanding the retail business and providing real-time visibility into Amazon's operational health will be a key requirement for this role.

You will need to enjoy working within the Amazon ecosystem, collaborating with sister teams and retail experience owners, and building foundational solutions that will empower the central response team.

Mentoring and supporting junior engineers will be a crucial aspect of your role, as you work to foster a culture of continuous learning and improvement within the team.

Maintaining a deep understanding of the broader incident management ecosystem and its interdependencies will be essential.

A day in the life

The challenges you will face will not be easy. The sheer scale of Amazon's operations and the semi-connected nature of its systems will present unique technical problems that require creative problem-solving and persistence.

However, these are the types of big challenges that will have a substantial impact on the Central Reliability and Response organization, contributing to its ongoing efforts to improve operational resilience and responsiveness.

By embracing these challenges and leveraging your technical expertise, you will play a vital role in enhancing the monitoring capabilities that are crucial for safeguarding the seamless operation of Amazon's retail experiences.

About the team

The Incident Command Systems team at Amazon is responsible for envisioning and building programs that consistently improve remediation times for outages.

This group consists of multiple 2-pizza teams (teams of 6-10 engineers) that each own software components for monitoring, for detecting anomalies that degrade the website, and for the incident management software used during these outages.

BASIC QUALIFICATIONS

  • 5+ years of non-internship professional software development experience
  • 5+ years of experience programming in at least one software programming language
  • 5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience as a mentor, tech lead or leading an engineering team

PREFERRED QUALIFICATIONS

  • 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Experience contributing to the architecture and design (design patterns, reliability, and scaling) of new and current systems