Talent.com
Research Engineer, Agentic AI Evals

HUD • San Francisco, CA, United States
Job type: Full-time, Part-time

About HUD

HUD (YC W25) is developing agentic evals for Computer Use Agents (CUAs) that browse the web. Our CUA Evals framework is the first comprehensive evaluation tool for CUAs.

Our Mission: People don't actually know if AI agents are working. To make AI agents work in the real world, we need detailed evals for a huge range of tasks.

We're backed by Y Combinator, and work closely with frontier AI labs to provide agent evaluation infrastructure at scale.

About the role

We're looking for a research engineer to help build out task configs and environments for evaluation datasets on HUD's CUA evaluation framework.

Responsibilities

Build out environments for HUD's CUA evaluation datasets, including evals for safety red-teaming, general business tasks, and long-horizon agentic tasks.

Create custom CUA datasets and evaluation pipelines. This will likely come later, as we're focusing on existing evals in the short term.

Experience

Technical Skills

Proficiency in Python, Docker, and Linux environments

React experience for frontend development

Production-level software development experience preferred

Strong technical aptitude and demonstrated problem-solving ability

You may be a good fit if you:

Have hands-on experience with LLM evaluation frameworks and methodologies

Have contributed to evaluation harnesses (EleutherAI, Inspect, or similar)

Have built custom evaluation pipelines or datasets

Have worked with agentic or multimodal AI evaluation systems

We prioritize contributions that show both quality and quantity, such as building out large, high-quality eval datasets.

Strong candidates may have:

Startup experience at early-stage technology companies, with the ability to work independently in fast-paced environments

Strong communication skills for remote collaboration across time zones

Familiarity with current AI tools and LLM capabilities

Understanding of safety and alignment considerations in AI systems

Evidence of rapid learning and adaptability in technical environments

We prioritize technical aptitude and learning potential over years of experience. Motivated candidates are encouraged to apply even if they don't meet all criteria.

Team & Company Details

Team Size: ~5-10 people currently. We're looking to hire 2-3 more, though we judge case by case, so it could be zero or many more depending on candidates.

Our Team: Includes 4 international Olympiad medallists (IOI, ILO, IPhO), serial AI startup founders, and researchers with publications at ICLR, NeurIPS, etc.

Logistics

Employment: Full-time preferred, but we're willing to consider internships.

Location: Remote-friendly, but if you’re in the San Francisco Bay Area, we have an office where you can work alongside the team. We prefer applicants who can attend meetings in Pacific Time (UTC-7:00 / UTC-8:00) or China / Singapore Time (UTC+8:00).

Visa Sponsorship: We provide relocation and visa support for strong full-time candidates. Part-time, contract, and internship arrangements are fully remote (which makes things simpler anyway).

Timeline: Applications are rolling. The process should involve 1-2 interviews and take less than a week.

Due to high volume, we may not respond to every application, but feel free to contact us at recruiting@hud.so or elsewhere if we missed yours!
