Member of Technical Staff, Research Engineer (Inference)

Inflection AI
Palo Alto, CA, United States
$175K-$325K a year
Full-time

As Inflection embarks on a new stage of growth, we are focusing on collaborating with commercial partners to adapt and fine-tune our cutting-edge models for their unique business requirements.

Our accomplishments in developing, aligning, and deploying state-of-the-art models in our high EQ consumer-facing chatbot, Pi, have established a strong foundation for success.

Well-funded and equipped with ample H100 resources, we have built a robust infrastructure and efficient processes to support best-in-class finetuning.

By joining our team, you'll have the opportunity to contribute your expertise while being part of a dynamic organization that values innovation and collaboration.

About Inflection

Inflection is a small, interdisciplinary AI studio. We have trained several state-of-the-art language models, including Inflection 1 and Inflection 2.5, and built a personal assistant named Pi.

As a studio, we are currently focused on finetuning and deploying models for specific use cases for our commercial partners.

We believe that artificial intelligence represents the beginning of an era of exponential change. Our name Inflection embraces this moment of transformation, whilst our status as a public benefit corporation provides us with the legal mandate to prioritize the well-being and happiness of our partners, users, and wider stakeholders above all else.

About The Role

As part of Inflection’s commitment to deploying high-performance models for enterprise applications, our inference team ensures that these models run efficiently and effectively in real-world scenarios.

Research engineers in this role focus on optimizing model inference processes, reducing latency, and improving throughput without compromising model performance, ensuring robust deployment in enterprise environments.

This is a good role for you if you:

  • Have experience with deploying and optimizing LLMs for inference, both in cloud and on-prem environments.
  • Are adept at using tools and frameworks for model optimization and acceleration, such as ONNX, TensorRT, or TVM.
  • Enjoy troubleshooting and solving complex problems related to model performance and scaling.
  • Have a deep understanding of the trade-offs involved in model inference, including hardware constraints and real-time processing requirements.
  • Are proficient with PyTorch and familiar with infrastructure management tools like Docker and Kubernetes for deploying inference pipelines.

We do not require a specific university degree or years of experience. Instead, we're excited to see what you've been building.

Please send us examples of your best work, including but not limited to links to open source contributions, personal projects or a cover letter describing past projects that you are proud of.

Employee Pay Disclosures

At Inflection AI, we aim to attract and retain the best employees and compensate them in a way that appropriately and fairly values their individual contributions to the company.

For this role, Inflection AI estimates that the starting annual base salary will fall in the range of approximately $175,000 to $325,000, depending on experience.

This estimate can vary based on the factors described above, so the actual starting annual base salary may be above or below this range.

How We Work

We value excellence and ownership. Our organizational structure focuses on individual responsibilities rather than management hierarchies.

Everyone is expected to lead by doing. We are big believers in the unreasonable effectiveness of highly talented Individual Contributors who are given all the resources, space and ownership to move fast and deliver outstanding results.

Teamwork and generosity are at our core. Our culture celebrates positive challenges, asking questions, learning and actively supporting one another.

This mentality of shared respect and purposeful teamwork is key to our success. We equally value all technical and non-technical contributions.

Constructive disagreement is essential. We appreciate when team members challenge assumptions, put forward new ideas, or encourage us to move faster or slower.

Openness, honesty and kindness make us great.

Feedback is our ground truth. We have a tight feedback loop between the user experience and our AI creation process. Quantitative and qualitative data drive our priorities.

This goes for internal culture too. Everyone has ownership and visibility into key decisions and progress.

Writing creates accountability. Whether on internal communication tools or in team memos, we are strong communicators with a special focus on the written word.

We deeply value time to reset outside of work. We encourage one another to constantly take time to recharge and always focus on maintaining a healthy work-life balance.

Engineering at Inflection

We are a vertically integrated AI studio. This means that our entire technology stack from large foundational model pre-training to the user interface is built in-house, with each of the components co-optimized to deliver the best AI experiences.

We have built one of the most advanced large language models in the world, based on multiple novel and proprietary innovations.

We believe in scale as the engine of progress in AI, and we are building one of the largest supercomputers in the world to develop and deploy the new generation of AIs.

We wear multiple hats and don’t distinguish between engineering and research. We continuously explore and exploit, creating new and perfecting existing techniques and solutions.

User feedback is our North Star.

Our Benefits

We offer generous benefits to ensure a positive, safe, inclusive and inspiring work environment for all Inflectioneers.

  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Generous medical, dental and vision plans for US employees
  • Compliance with country-specific benefits for non-US employees
  • Visa sponsorship for new hires
  • Avenues for personal growth such as coaching, conference attendance, or specific trainings

Diversity & Inclusion

We are building personal AIs that we hope will serve everyone. We are deeply committed to representing the full extent of the human experience inside our AI Studio.

This means that everyone from any walk of life is welcome if they have the right skills.
