Sr. Platform & Reliability Engineer (Remote)

Designer Shoe Warehouse

Columbus, OH, US

Remote

Full-time

As a Sr. Platform & Reliability Engineer you will live, eat, and breathe the principles of availability, performance, reliability, and automation.

You will be constantly presented with new challenges of sizable scope and variety. You will maintain a close partnership with development teams, helping them architect and implement their applications and environments via new and ground-breaking methods that break the traditional infrastructure model.

This position, under the direction of the Sr. Manager, Platform & Reliability Engineering, is responsible for applying knowledge and experience in the DevOps and SRE domains, including production support, cloud service delivery, and CI / CD.

Successful candidates will be humble, yet passionate and self-motivated. They will be strong leaders who can prioritize well, communicate clearly, and have a consistent track record of identifying opportunities and creating efficiencies.

We welcome those who see things differently, aren’t afraid to experiment, practice the fail fast / fail forward philosophy, believe that if you have to do something more than once, you automate it, and are comfortable having healthy discussions and debates with teammates and peers to drive these principles.

Reports To : Sr. Manager, Platform & Reliability Engineering

Essential Duties and Responsibilities :

Remain curious! Meaning you research and present new technology trends, influencing peers and leadership toward adoption, while always questioning the industry standards or status quo.
Collaborate closely with other Solution Centers to understand workload / technical requirements and guide them to the best leverage of infrastructure cloud services, optimizing for performance, cost and architectural flexibility
You are never satisfied with the performance you are seeing and always know you can get a little bit more if you pull this lever.

You consistently improve developer experience, availability, performance, and reliability via automation, observability, and related efficient tooling.

Design, implement and roll out solutions that leverage integration of home-grown, open source and 3rd party solutions to provide a high-performing continuous delivery pipeline that fits with the development teams’ needs as well as Designer Brands’ long-term strategy
Define reusable components, frameworks, common schemas, standards, and tools, influencing their usage across teams
Assist in building world-class, multi-cloud capable, state-of-the-art products by :
  - Automating build and deployment processes
  - Automating verification, rollback, and scaling bi-directionally
  - Including A / B, Canary, Blue / Green deployment patterns
  - Building highly resilient cloud eco-systems capable of high availability and scale
  - Using Docker containers, Kubernetes as an orchestrator, Small Function Sets, or as full VMs with base images
  - Mastering Layer-7 Traffic Management Technologies as code for Efficient Delivery
  - Implementing observability as code (Metrics, Logging, Tracing, Alerting)
Influence, implement, and continuously refine operational processes, ensuring a balance between speed, agility, and adherence to policy
Utilize the combination of above-mentioned items to create a Next-Generation Platform for DBI Application Delivery
Evolve infrastructure, server, deployment, and testing strategies to support our goal of 100% uptime and quick turnaround of deployments for the application development organization
Mentor and provide technical oversight and guidance to team members and cross-functional partners, improving their skills, knowledge of our systems, and their ability to get things done!
Possess the ability to troubleshoot technology you know and technology you don’t know. Sometimes you will have to lead issues where you may not be versed in all the technology under the covers, and you will need to work with your team to bring resources together to fill the gaps.

Participate in industry groups to gain visibility to trends and influence future direction

Required Skills :

Subject matter expertise in a wide range of infrastructure-related domains, with a track record of large, production-grade service deployment and IT operations in a 24 / 7 setting
Ability to take technical and / or business requirements and translate them into detailed infrastructure solution designs
Expert knowledge of container solutions and their management (Kubernetes, Docker, OpenShift)
Expert knowledge of Infrastructure as Code and GitOps frameworks such as Puppet, Chef, Ansible, Terraform, ArgoCD, and Flux
Knowledge of one or more Layer 7 Traffic Management applications such as F5, Pulse Secure vATM, AVI, Envoy, or Nginx (Plus)
Demonstrated Programming / Scripting skills or the ability to read and modify : Bash, Python, Ruby, C, or Golang.
Excellent communication, presentation and leadership skills

Competencies :

SETTING GOALS : Creates and follows effective plans. Anticipates risks, creates contingency plans. Aligns plans with goals. Allocates adequate resources. Accepts and supports change. Willing to take risks and suggests new ideas, approaches. Takes initiative. Seeks out learning activities.

WORKING WITH OTHERS : Clearly articulates own and others’ goals. Promotes a team atmosphere by demonstrating humility and respect. Builds effective relationships, relates well to others. Delivers and responds to feedback in a constructive manner. Considers multiple perspectives. Handles conflict, pressure, uncertainty and adapts independently. Meets commitments. Dedicated to working with business partners on their expectations.

GETTING RESULTS : Personally accountable for work performance targets and achieving results. Prioritizes well. Anticipates and handles obstacles effectively. Makes good, timely decisions. Can simplify and process complex problems. Understands underlying issues and addresses root causes. Meets deadlines, works until finished.

Qualifications :

Experience :

5-7+ years’ experience as part of large-scale engineering teams or commerce environments where downtime is not acceptable
3+ years’ experience supporting container runtimes and orchestration such as Docker, Docker Swarm, Kubernetes / K8s, Mesos, and Nomad in production
In-depth understanding of cloud native design patterns (Infrastructure as Code, Microservices)
Experience with Content Delivery Networks and Related Offerings (Akamai, Cloudflare, Fastly)
Strong aptitude for learning new technologies and understanding how, when, and where to best utilize them
Experience with offerings for cloud (Azure, AWS, GCP) and on-prem (VMware, OpenShift) solutions
Experience utilizing best-of-breed processes to improve day-to-day operations
Experience with modern development tools such as Git, Jenkins, Azure DevOps, Jira, etc.
Admin-level experience supporting and developing Linux / Unix based environments
Admin-level experience in infrastructure and network (DNS, DHCP, IPAM, NTP, LB,

Preferred Qualifications :

Experience in Retail preferred, but not required

Education :

Bachelor’s degree in relevant field or equivalent work experience.

