Senior Director of Engineering, AI Workload Orchestration
Oracle
Bismarck, North Dakota
Here at OCI, we’re building the world’s largest AI clusters and we’re the fastest at bringing them to market. The AI Infrastructure organization at OCI is leading this effort.
As part of this focus on AI workloads and customers, we’re building platforms for AI job management services and AI workload management, from reinforcement learning to deep learning to tuning and model serving.
These platforms will give AI researchers simple, easy-to-use tools that manage their GPU clusters across the full model lifecycle.
These platforms will eliminate DevOps efforts and costs in cluster management, scheduling, and observability, significantly lowering the bar for infrastructure management expertise for our AI customers.
They will make our AI capabilities easily accessible to more customers and will enable our largest customers to focus on improving and monetizing their AI models rather than managing the AI infrastructure.
In this role, you would lead the software development organization building out and operating these platforms and work with some of the largest players in the AI space, building systems that operate at unprecedented speed, scale, and reliability.
You should be a distributed systems generalist, able to architect broad systems interactions while being very hands-on, able to dive deep into any part of the stack and lower-level system interactions.
You should value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.
Responsibilities
The candidate will provide leadership, direction, and strategy, and will establish and develop the organization to execute on that strategy.
The candidate will also work with geographically distributed teams and contribute to the success of their own and other related teams in delivering large-scale projects on time and with high quality.
Required Qualifications
- MS or BS in Computer Science, or equivalent experience
- 5+ years of experience managing Software Engineering teams
- 12+ years of software engineering experience
- Strong communication skills, analytical skills, and project management skills
Preferred Qualifications
- 7-10+ years of experience delivering and operating large-scale, highly available distributed systems
- Strong knowledge of data structures, algorithms, operating systems, and distributed systems fundamentals
- Working familiarity with networking protocols (TCP/IP, HTTP) and standard network architectures
- Strong experience and detailed technical knowledge in distributed systems, high performance computing, and GPU systems