AI System Failure Analyst Engineer

ZT Systems
Secaucus, NJ
Full-time

What We Do

ZT engineers hyperscale compute and storage solutions that are tailored to the unique workloads and business needs of our global data center customers.

With the proven ability to deliver these solutions, ZT Systems is well-positioned as the design, manufacturing, and logistics partner of choice for hyperscale computing and storage customers.

What You’ll Do

Evaluate, design, and implement product reliability test regimens to ensure products of the highest quality are delivered to our customers

Execute server hardware reliability system testing, reliability stresses, failure analysis, and statistical analysis through all phases of the product life cycle, working with cross-functional teams that include hardware developers and system engineers

Perform hands-on hardware reliability system testing, reliability stresses, failure analysis, and statistical analysis

Work with engineering and other cross-functional team management to define operational project requirements, solutions, and schedules

Develop innovative techniques / approaches to accelerate failure identification and mechanism understanding and support technology transfer to high-volume manufacturing

Conduct root cause analysis on issues, then recommend and manage the implementation of appropriate solutions

Concisely and effectively communicate progress, status, and issues to management

Participate in product design and reliability reviews during new product development to ensure the robustness of product design and manufacturing processes

Define problems, collect data, establish facts, and draw valid conclusions

Domestic and international travel may be required after 6-12 months on the job.

What You’ll Bring

Bachelor’s Degree in a STEM discipline (Electrical Engineering, Computer Engineering, Systems Engineering preferred) and 0+ years of experience

Hands-on computer / server hardware repair and troubleshooting experience preferred

Knowledge of servers and network technologies

Understanding of server component installation / uninstallation, connection, and basic networking preferred

Knowledge of batch scripting, Windows PowerShell, and Python is strongly preferred

Knowledge of test methodologies, writing test plans, creating test cases, and debugging

Experience analyzing statistical data

Good analytical and hands-on skills

Must be a US Citizen or US Permanent Resident


About ZT Systems

At ZT Systems, you’ll get to do work you are proud of alongside smart, passionate people. Every day, there are opportunities to collaborate with the best in the industry to design, build, and deliver impactful solutions to world-class customers.

Along the way, you will gain hands-on experience in a fast-paced environment that’s challenging, rewarding, and career-defining.

Our culture is built around our values: we work hard and think fast. We view challenges as opportunities to do better, push harder, and be faster than we were the day before.

When we fail, we learn from it and move on together. And when we succeed, we use the momentum to go even further. We create value with everything we do, building the foundation of today and transforming the future of tomorrow.

Join ZT Systems and help us build technology infrastructure that connects the world.

What We Offer

When you join ZT, you’ll enjoy a range of world-class, inclusive employee benefits designed to grow with you and our company.

From competitive compensation to 401(k) matching to comprehensive health & wellness programs and tuition reimbursement, ZT Systems offers industry-leading benefits packages for eligible employees, designed to help you get the most out of life.

ZT Group Int’l. is an Equal Opportunity Employer and prohibits discrimination and harassment of any kind. ZT Systems provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws.
