Site Reliability Engineer

NVIDIA

Santa Clara, California, US

$160K-$247.3K a year

Full-time

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and outstanding people.

Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world.

Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, encouraging environment where everyone is inspired to do their best work.

Come join the team and see how you can make a lasting impact on the world.

Do you have the right skills and experience for this role? Read on to find out, and make your application.

We are looking for a Staff Site Reliability Engineer to join our team. You should have experience supporting and working with teams across the company to improve the usability, reliability, and performance of enterprise applications.

What You'll Be Doing

Design, develop, and evolve the Site Reliability Engineering practice.
Deploy and support tools from a system engineering perspective and troubleshoot issues in depth.
Help the SRE teams define technology and business strategies that deliver iterative enhancements to the tools and processes that improve availability, observability, and scalability.
Recognize, validate, and publish emerging technologies and architectures that align with business objectives.
Lead and build the proven foundation for the Infrastructure and Application lifecycle on installation, monitoring, observability, and user experience.
Build tooling to lower the barrier to entry for engineering teams to plug in and enjoy the benefits of Reliability.
Document institutional knowledge.
Build software to help operations and support teams.

What We Need To See

Bachelor’s and/or Master’s degree in computer science or a related field of study (or equivalent experience).
8+ years of demonstrable experience deploying and supporting applications in a Cloud environment.
Having Confluence, Jira, and Service Desk experience is a plus.
Excellent Windows and Linux system skills.
Good understanding of security components like SSL, load balancers, firewalls, etc.
Extensive experience supporting applications in high-availability environments.
Scripting skills to automate repetitive and basic tasks.
Experience in documenting processes and procedures.
Strong interpersonal skills with the ability to understand and explain technical issues to a non-technical audience.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package.

As you plan your future, see what we can offer to you and your family at www.nvidiabenefits.com.

The base salary range is 160,000 USD - 247,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

5 days ago
