Principal Machine Learning Engineer (JoinOCI-AI Services)

Oracle
Seattle, WA, United States
$94.2K-$223.5K a year
Full-time

At Oracle Cloud Infrastructure (OCI), we build the future of the cloud for Enterprises as a diverse team of fellow creators and inventors.

We act with the speed and attitude of a start-up, with the scale and customer-focus of the leading enterprise software company in the world.

Values are OCI’s foundation and how we deliver excellence. We strive for equity, inclusion, and respect for all. We are committed to the greater good in our products and our actions.

We are constantly learning and taking opportunities to grow our careers and ourselves. We challenge each other to stretch beyond our past to build our future.

You are the builder here. You will be part of a team of really smart, motivated, and diverse people and given the autonomy and support to do your best work.

It is a dynamic and flexible workplace where you’ll belong and be encouraged.

This role is in the OCI AI Services and Data org. We are addressing exciting challenges at the intersection of artificial intelligence and cutting-edge cloud infrastructure.

We are building a state-of-the-art data processing, model training, and benchmarking platform. As a Software Engineer on our team, you will build services and tools to manage the model lifecycle, model provenance, the model catalog, and model training.

We build a shared GPU supercluster that enables customers to easily onboard, run, monitor, and manage AI models at scale with OCI.

You will have the opportunity to work on LLM accelerators, set up ML training and benchmarking platforms, and work on runtimes and libraries in open-source projects that enable low-friction, performance-optimized, large-scale training and inference of the world’s most advanced AI models.

Basic Qualifications

  • Bachelor’s degree in computer science, engineering, or an equivalent highly technical field
  • 6+ years of software engineering experience and a proven track record of successfully architecting and shipping high-performance, low-latency AI / ML-enabled products and services
  • Strong technical understanding of building complex, scalable, low-latency streaming / batch processing AI / ML cloud services
  • Proven track record of running operations for a cloud service
  • Deep knowledge of large-scale compute, network, and storage systems
  • Experience working with Distributed Systems

Preferred Qualifications

  • PhD or MS in Computer Science or a related technical field (Statistics, Mathematics, AI / ML, Operations Research)
  • Experience designing scalable, distributed backend services with cloud-native technologies
  • Demonstrated knowledge and experience with machine learning platforms from major providers (AWS, Microsoft Azure and Google Cloud)
  • Experience in leading multiple geographically distributed teams
  • Experience working with compliance frameworks and healthcare data

Additional Details

  • Required to have worked in at least one of the following areas: healthcare data processing, large-scale GPU infrastructure for ML training and inference, building pipelines from ML models, etc.
  • Familiarity with recent ML / AI-based approaches for building and packaging LLMs and large ML models would be highly preferred

Career Level - IC4

  • Build new services from scratch, both built on OCI and OCI-native
  • Inspire a culture of 'Always On' Service Operations in the team
  • Present service escalations and statistics, along with corrective and preventive actions, to upper management weekly
  • Interface with architects and technical leads to steer them toward continuous feature improvements
  • Directly and indirectly manage a fast-growing, globally distributed engineering team
  • Allocate resources, set priorities, and manage schedules for the team. Work across the platform organization to define and provide inputs to the technology strategy, infrastructure and architecture vision that supports the successful execution of the product roadmap and business strategy.
  • Feed Service Operations requirements and challenges into service development teams for continuous improvement
  • Responsible for hiring outstanding SDEs and SREs in a competitive environment. Proven ability to motivate, align, and manage high-performing, happy, and empowered developers.

Disclaimer:

Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.

Range and benefit information provided in this posting are specific to the stated locations only.

US: Hiring Range: from $94,200 to $223,500 per annum. May be eligible for bonus and equity.

Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle’s differing products, industries and lines of business.

Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following:

1. Medical, dental, and vision insurance, including expert medical opinion

2. Short term disability and long term disability

3. Life insurance and AD&D

4. Supplemental life insurance (Employee / Spouse / Child)

5. Health care and dependent care Flexible Spending Accounts

6. Pre-tax commuter and parking benefits

7. 401(k) Savings and Investment Plan with company match

8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position.

Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment.

Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.

9. 11 paid holidays

10. Paid sick leave: 72 hours of paid sick leave upon date of hire, refreshed each calendar year. Unused balance carries over each year up to a maximum cap of 112 hours.

11. Paid parental leave

12. Adoption assistance

13. Employee Stock Purchase Plan

14. Financial planning and group legal

15. Voluntary benefits including auto, homeowner and pet insurance

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.

30+ days ago