Data Center Site Manager
We are seeking a talented Data Center Site Manager to join our Data Center Operations team and lead the management of our flagship AI infrastructure site housing hundreds of thousands of GPUs. You will play a critical role in overseeing the operations and maintenance of our data center infrastructure, ensuring maximum uptime and performance for our world-class GPU supercomputers.
Focus
Manage and prioritize tasks via internal tools to ensure efficient operations
Collaborate with internal teams to troubleshoot issues and perform root cause analysis and corrective action
Liaise with local colocation partners to fully understand site topology and articulate issues as needed
Lead a team of technicians and engineers to maintain 99.99% uptime for critical AI infrastructure
Oversee power, cooling, and networking systems supporting large-scale GPU deployments
About You
7+ years of experience managing large-scale data center operations, preferably with HPC or AI infrastructure
Strong technical background in data center mechanical and electrical systems (power distribution, cooling, fire suppression)
Proven track record of managing teams and multimillion-dollar facility budgets
Experience with GPU clusters and understanding of AI workload requirements
Excellent communication skills to interface with technical teams, vendors, and executive stakeholders
Nice to Haves
Background in managing facilities for AI/ML workloads
Experience with sustainability initiatives and PUE optimization
Knowledge of compliance frameworks (SOC 2, ISO 27001)
Benefits
Competitive total compensation package (cash + equity).
Retirement or pension plan, in line with local norms.
Health, dental, and vision insurance.
Generous PTO policy, in line with local norms.
Data Center Manager • New York, NY, US