Sr. Platform & Reliability Engineer (Remote)

Designer Shoe Warehouse
Columbus, OH, US
Remote
Full-time

As a Sr. Platform & Reliability Engineer you will live, eat, and breathe the principles of availability, performance, reliability, and automation.

You will be constantly presented with new challenges of sizable scope and variety. You will maintain a close partnership with development teams, helping them architect and implement their applications and environments via new and ground-breaking methods that break the traditional infrastructure model.

This position, under the direction of the Sr. Manager, Platform & Reliability Engineering, will be responsible for applying knowledge and experience in the DevOps and SRE domains, including production support, cloud service delivery, and CI / CD.

Successful candidates will be humble, yet passionate and self-motivated. They will be strong leaders who can prioritize well, communicate clearly, and have a consistent track record of identifying opportunities and creating efficiencies.

We welcome those who see things differently, aren’t afraid to experiment, practice the fail fast / fail forward philosophy, believe that if you have to do it more than once, you automate it, and are comfortable having healthy discussions / debates with teammates and peers to drive these principles.

Reports To : Sr. Manager, Platform & Reliability Engineering

Essential Duties and Responsibilities :

  • Remain curious! You research and present new technology trends, influencing peers and leadership toward adoption, while always questioning industry standards and the status quo.
  • Collaborate closely with other Solution Centers to understand workload / technical requirements and guide them toward the best use of cloud infrastructure services, optimizing for performance, cost, and architectural flexibility.
  • You are never satisfied with the performance you are seeing and always know you can get a little bit more if you pull this lever.
  • You consistently improve developer experience, availability, performance, and reliability via automation, observability, and related efficient tooling.
  • Design, implement and roll out solutions that leverage integration of home-grown, open source and 3rd party solutions to provide a high-performing continuous delivery pipeline that fits with the development teams’ needs as well as Designer Brands’ long-term strategy
  • Define reusable components, frameworks, common schemas, standards, and tools, influencing their usage across teams
  • Assist in building world-class, multi-cloud capable, state-of-the-art products by :
      • Automating build and deployment processes
      • Automating verification, rollback, and scaling bi-directionally
      • Including A / B, Canary, Blue / Green deployment patterns
      • Building highly resilient cloud eco-systems capable of high availability and scale
      • Using Docker containers, Kubernetes as an orchestrator, Small Function Sets, or as full VMs with base images
      • Mastering Layer-7 Traffic Management Technologies as code for Efficient Delivery
      • Implementing observability as code (Metrics, Logging, Tracing, Alerting)
  • Influence, implement, and continuously refine operational processes, ensuring a balance between speed, agility, and adherence to policy
  • Combine the items above to create a Next-Generation Platform for DBI Application Delivery
  • Evolve infrastructure, server, deployment strategies and testing to support our goal of 100% uptime and quick turnaround of deployments for the application development organization
  • Mentor and provide technical oversight and guidance to team members and cross-functional partners, improving their skills, knowledge of our systems, and their ability to get things done!
  • Possess the ability to troubleshoot technology you know, and technology you don’t know. Sometimes you will have to lead issues where you may not be versed in all the technology under the covers. You will need to get with your team to bring resources together to fill the gaps.
  • Participate in industry groups to gain visibility into trends and influence future direction

Required Skills :

  • Subject matter expertise in a wide range of infrastructure-related domains, with a track record of large, production-grade service deployment and IT operations in a 24 / 7 setting
  • Ability to take technical and / or business requirements and translate them into detailed infrastructure solution designs
  • Expert knowledge of container solutions and their management (Kubernetes, Docker, OpenShift)
  • Expert knowledge of Infrastructure as Code frameworks such as Puppet, Chef, Ansible, Terraform, ArgoCD, and Flux
  • Knowledge of one or more Layer 7 Traffic Management applications such as F5, Pulse Secure vATM, AVI, Envoy, or Nginx (Plus)
  • Demonstrated programming / scripting skills, or the ability to read and modify code, in Bash, Python, Ruby, C, or Golang
  • Excellent communication, presentation and leadership skills

Competencies :

SETTING GOALS : Creates and follows effective plans. Anticipates risks, creates contingency plans. Aligns plans with goals. Allocates adequate resources. Accepts and supports change. Willing to take risks and suggests new ideas, approaches. Takes initiative. Seeks out learning activities.

WORKING WITH OTHERS : Clearly articulates own and others’ goals. Promotes a team atmosphere by demonstrating humility and respect. Builds effective relationships, relates well to others. Delivers and responds to feedback in a constructive manner. Considers multiple perspectives. Handles conflict, pressure, uncertainty and adapts independently. Meets commitments. Dedicated to working with business partners on their expectations.

GETTING RESULTS : Personally accountable for work performance targets and achieving results. Prioritizes well. Anticipates and handles obstacles effectively. Makes good, timely decisions. Can simplify and process complex problems. Understands underlying issues and addresses root causes. Meets deadlines, works until finished.

Qualifications :

Experience :

  • 5-7+ years’ experience as part of large-scale engineering teams or commerce environments where downtime is not acceptable
  • 3+ years’ experience supporting container runtimes and orchestration such as Docker, Docker Swarm, Kubernetes / K8S, Mesos, and Nomad IN PRODUCTION
  • In-depth understanding of cloud native design patterns (Infrastructure as Code, Microservices)
  • Experience with Content Delivery Networks and Related Offerings (Akamai, Cloudflare, Fastly)
  • Strong aptitude for learning new technologies and understanding how, when, and where to best utilize them
  • Experience with offerings for cloud (Azure, AWS, GCP) and on-prem (VMware, OpenShift) solutions
  • Experience utilizing best-of-breed processes to improve day-to-day operations
  • Experience with modern development tools such as Git, Jenkins, Azure DevOps, Jira, etc.
  • Admin-level experience supporting and developing Linux / Unix based environments
  • Admin-level experience in infrastructure and network (DNS, DHCP, IPAM, NTP, LB,

Preferred Qualifications :

Experience in Retail preferred, but not required

Education :

Bachelor’s degree in relevant field or equivalent work experience.
