Site Reliability Engineer - Infra and DevOps

Western Digital Capital
San Jose, California, US
Full-time

  • Full-time
  • Job Type (exemption status) : Exempt position
  • Salary Range : 117,300.00-166,200.00
  • Business Function : Software Development (Sys)

At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible.

Review the requirements below to see whether you are a fit for this opportunity and, if so, apply as soon as possible.

As a Secure Development Factory (SDF) Site Reliability Engineer - DevOps, you will be at the heart of Western Digital’s engineering process, delivering the software development tools and infrastructure that empower engineering teams to develop and deliver high-quality products quickly.

You will play a pivotal role in ensuring the reliability, scalability, and performance of our IT infrastructure and DevOps tools.

Key Responsibilities

  • Observability and Monitoring : Design, implement, and continuously improve monitoring and observability solutions to ensure effective and real-time visibility into system performance.
  • Best Practices : Advocate for and implement best practices in SRE, DevOps, and automation, with a focus on enhancing platform stability and performance.
  • Automation : Lead automation efforts to streamline processes, reduce manual tasks, and improve operational efficiency.
  • Architecting and Designing : Contribute to the architecture and design of systems and applications, aligning them with reliability and scalability goals.
  • Technical Accountability : Provide technical ownership in the SRE team, fostering a collaborative and growth-oriented environment.
  • Ownership : Take ownership of system reliability, meet Service Level Objectives (SLOs), and ensure customer satisfaction.
  • Collaboration : Work closely with Engineering teams to understand customer requirements and collaborate on solutions.
  • Adaptability : Stay updated with emerging technologies and adapt quickly to evolving requirements and challenges.
  • Upskilling : Continuously upskill in newer technologies and share knowledge within the team.
  • Team Player : Collaborate effectively with team members and contribute to a positive team culture.
  • Professional Behaviour : Demonstrate professionalism, integrity, and a commitment to the highest ethical standards.
  • Documentation : Maintain thorough and well-organized documentation of systems and processes.

Required Skills and Qualifications

  • Candidates MUST POSSESS a B.S. in C.S., I.T., E.E., or M.E., plus 6 to 10 years of hands-on experience with DevOps tools and SRE practices.
  • MUST POSSESS administration experience with DevOps tools such as Artifactory, Jenkins, Git, Black Duck, and SAST / DAST tools.
  • MUST POSSESS a very good understanding of infrastructure at the server, VMware, storage, and networking levels.
  • Exceptional analytical, problem solving, and troubleshooting skills to manage complex process and technology issues.
  • Extensive experience in Ansible automation (researching, writing, maintaining, and optimizing roles / playbooks / modules).
  • Expertise in shell scripting and Python, as well as configuration management tools such as Terraform.
  • Experience developing and customizing CI/CD pipelines and onboarding applications with varying requirements.
  • Experience in monitoring enhancements and metrics dashboarding using tools such as Icinga, Splunk, Prometheus, and Grafana.
  • Experience with containerization technologies such as Docker and Kubernetes is good to have.
  • Focus on embedding security posture into systems.
  • Working experience with HAProxy, load balancers, LDAP / SSO integration, and security endpoint configurations.
  • Knowledge of cloud computing platforms (e.g., AWS, Azure, GCP) is a plus.
  • Excellent communication and collaboration skills.

Western Digital thrives on the power and potential of diversity. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.
