Lead Site Reliability Engineer - Cloud Operations

Tendencys Innovatios
Littleton, Colorado, US
Full-time

CyberSource, a Visa company, is a global leader in eCommerce payment management. CyberSource was one of the world's first payment gateways, connecting online merchants to payment networks, including Visa.

Today, CyberSource offers a full-service payment management platform for eCommerce merchants, combining global payment processing, fraud management and payment security systems.

CyberSource is looking for a bright, passionate and dedicated employee to join our Operations 2nd Level Applications team.

In this role, you will be responsible for designing, optimizing, implementing, and maintaining highly available and scalable infrastructure on AWS, GCP, and our on-premises platform.

These systems are truly the backbone of our business and process millions of transactions daily for some of the most prestigious companies in the US.

Essential Functions :

  • Lead the design, implementation, and management of highly available and scalable infrastructure on AWS and GCP.
  • Collaborate with internal stakeholders to understand business requirements and translate them into effective cloud-based operational architectures.
  • Design, implement, and optimize infrastructure components and services on GCP / AWS, including compute, storage, networking, and security.
  • Develop and implement best practices and standards for deploying and managing applications, databases, and other resources on GCP / AWS.
  • Perform system and application performance tuning and optimization to maximize efficiency and scalability.
  • Implement application performance monitoring (e.g., memory, logging, latency) and proactively identify monitoring gaps on an ongoing basis.
  • Perform weekly code deployment to the Production / Customer Certification environment and automate CI / CD pipelines to build images and deployments.
  • Evaluate all infrastructure changes / maintenance and determine potential for platform impact and identify mitigating steps.
  • Act as the single point of contact, trainer, and champion for the Enterprise platform initiatives that impact the service line.
  • Assist Customer Support with merchant-escalated issues specific to applications (e.g., increased latency).
  • Provide technical guidance and mentorship to team members, promoting knowledge sharing and professional development.

This is a hybrid position. Hybrid employees can alternate between remote and office work. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership / site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Qualifications

Basic Qualifications :

10+ years of relevant work experience with a Bachelor’s Degree or at least 7 years of work experience with an Advanced degree (e.g., Masters, MBA, JD, MD) or 4 years of work experience with a PhD, OR 13+ years of relevant work experience.

Preferred Qualifications :

  • 12 or more years of work experience with a Bachelor’s Degree or 8-10 years of experience with an Advanced Degree (e.g., Masters, MBA, JD, MD) or 6+ years of work experience with a PhD.
  • Masters in Computer Science or related engineering field and 5+ years of experience in a highly-available Linux environment.
  • Extensive experience architecting and implementing operational infrastructure on Google Cloud Platform / Amazon Web Services.
  • Strong proficiency in infrastructure as code (IaC) concepts and tools, such as Terraform or CloudFormation, for automating infrastructure deployment.
  • Proven experience designing and optimizing scalable and highly available cloud-based solutions.
  • Strong knowledge of cloud services, including compute, storage, networking, databases, and security.
  • Extensive Shell / Ruby / ReactJS / Python scripting knowledge.
  • Proficiency with Swarm and Kubernetes internal architecture, networking, and container microservice architectural patterns.
  • Deep-rooted understanding of Linux systems, databases, and networking concepts.
  • Experience with Enterprise monitoring solutions.
  • Strong work ethic, self-starter, and able to work in a fast-paced, team-oriented environment.

Additional Information

Work Hours : Vary based on the needs of the department.

Travel Requirements : This position requires travel 5-10% of the time.

Mental / Physical Requirements : This position will be performed in an office setting. The position will require the incumbent to sit and stand at a desk, communicate in person and by telephone, and frequently operate standard office equipment, such as telephones and computers.

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status.
