Technical Support Engineer, Linux and HPC Admin - DGX Cloud

NVIDIA
Santa Clara, California, US
$104K-$195.5K a year
Full-time

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation fueled by great technology and dynamic people.

Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world.

Doing what’s never been done before takes vision, innovation, and the world’s best talent. NVIDIANS immerse themselves in a diverse, supportive environment that encourages everyone to do their best work.

Join the team and see how you can make a lasting impact on the world.

NVIDIA Base Command Manager powers thousands of clusters worldwide, ranging from a few nodes to several thousand, and streamlines cluster provisioning, workload management, and infrastructure monitoring.

It provides all the tools you need to deploy and run an AI data center. We take great pride in providing excellent, comprehensive support to our customers! The Technical Support Engineer in this role will significantly impact and contribute to the overall success of our customers running their clusters with the NVIDIA solution.

What You’ll Be Doing

  • Provide support to our customers for our Linux-based cluster management software product, ensuring customers get the help they require to support their clusters.
  • Collaborate with the development team to gather the right information and escalate support tickets to the appropriate developers.
  • Become and serve as a subject-matter expert in any one of a number of areas.
  • Carry out research and development tasks for customers or for internal use by our development team.
  • Work with the latest hardware (e.g. GPUs, FPGAs, AI accelerators, high-speed interconnects such as InfiniBand, Omni Path, and Gig-E) and software technologies such as parallel filesystems (e.g. Lustre, GPFS, BeeGFS, WekaIO), Jupyter, various ML frameworks and tools, Spark, Kubernetes, and Ceph.

What We Need To See

  • BS degree or equivalent experience in Electrical Engineering or related field.
  • 5 years of relevant experience, ideally in a customer-facing role.
  • Proven research skills and interest in assisting customers to achieve their goals.
  • Experience in a technical customer-facing role.
  • Eagerness to learn and become an authority on our product.
  • Excellent written communication skills, with the ability to distill complex technical information into consumable summaries.
  • In-depth knowledge of Linux.
  • Familiarity with typical Linux installations and their most common software elements.

Ways To Stand Out From The Crowd

  • Experience with high-performance computing and system administration would be an asset.

The base salary range is 104,000 USD - 195,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
