Overview
Empire AI is establishing New York as the national leader in responsible artificial intelligence. The initiative is backed by a consortium of top academic and research institutions, including Columbia, Cornell, NYU, CUNY, RPI, SUNY, Rochester schools, Mount Sinai, the Simons Foundation, and the Flatiron Institute. By leveraging the state's rich academic resources and research institutions, Empire AI drives innovation in fields such as medicine, education, energy, and climate change, while giving New York's researchers access to computing resources that are often prohibitively expensive and available only to big tech companies. In doing so, it fuels statewide innovation, drives economic growth, and prepares a future-ready AI workforce to tackle society's most complex challenges. Empire AI is funded by more than $500 million in public and private investment, including a state capital grant and contributions from the academic institutions, the Simons Foundation, the Flatiron Institute, and Tom Secunda (co-founder of Bloomberg).
The base pay range is provided for context; actual pay will be based on skills and experience. Talk with your recruiter to learn more.
Position Summary
The AI / ML Systems Administrator will help build and maintain the shared computing infrastructure that underpins New York State's most ambitious AI research initiative. This position will support the operations of Empire AI's high-performance GPU clusters, multi-petabyte storage systems, and high-speed networks across multiple university partners. Reporting to the Manager, AI / ML Systems Administration, the AI / ML Systems Administrator will manage system health, software environments, and user support for AI / ML and data-intensive scientific research.
Duties and Responsibilities
- Maintain and support Linux-based HPC and AI cluster infrastructure, including nodes, interconnects, and parallel file systems
- Apply software patches, security updates, firmware upgrades, and system tuning
- Implement monitoring, logging, and alerting systems to ensure uptime and performance
- Deploy and maintain scientific software stacks including AI / ML frameworks (e.g., PyTorch, TensorFlow, JAX) and libraries for GPU acceleration
- Assist users in debugging, optimizing, and containerizing AI workloads (e.g., Apptainer, Docker)
- Support the integration of workflow tools (e.g., Slurm, Kubernetes, Nextflow, Snakemake)
- Provide Tier II / III support for faculty, students, and research staff across Empire AI institutions
- Troubleshoot performance issues, job failures, and environment configuration conflicts
- Contribute to onboarding documentation and assist in user training activities
- Implement and maintain access control, audit logging, encryption, and other safeguards aligned with HIPAA, NIST 800-171, and institutional security policies
- Support secure enclaves and trusted execution environments for regulated or sensitive research
- Administer large-scale parallel file systems (e.g., Lustre, GPFS) and distributed storage systems
- Support automated data movement, archival, and replication between sites
- Contribute to storage performance tuning and capacity planning
- Develop scripts and tools to automate system tasks (e.g., provisioning, monitoring, reporting)
- Maintain clear documentation of system configurations, procedures, and architecture diagrams
- Participate in cross-institutional working groups to support system consistency and scalability
- Support special projects or pilots in collaboration with research teams or state partners
Minimum Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field
- 3+ years of experience in Linux systems administration in HPC, research computing, or enterprise environments
- Experience managing Slurm or similar job schedulers, GPU resources, and distributed software environments
- Familiarity with common research software stacks, container technologies, and scripting languages (e.g., Bash, Python)
Preferred Qualifications
- Master's degree in Computer Science, Engineering, or a related technical field, or equivalent professional experience
- Experience with HPC technologies including the Slurm workload manager, the Lustre parallel file system, and NVIDIA GPU driver installation and maintenance
- Familiarity with academic or research computing environments, including support for faculty, student researchers, or large-scale research projects
- Technical certifications such as NVIDIA DLI, Red Hat Certified Engineer (RHCE), or equivalent
- Proficiency with container technologies (e.g., Apptainer / Singularity, Docker) and research software stacks (e.g., Python, R, MATLAB)
- Experience supporting AI / ML research workflows and scientific computing applications
- Working knowledge of infrastructure automation tools (e.g., Ansible, Terraform) and system monitoring frameworks (e.g., Prometheus, Grafana)
Compensation
Our compensation reflects the cost of labor across several US geographic markets. The base pay and target total cash for this position range from $50,000 to $150,000. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience.