Talent.com
Data Infrastructure Engineer

Meshy • Sunnyvale, CA, US
Job type: Full-time

Job Description

About Meshy

Headquartered in Silicon Valley, Meshy is the leading 3D generative AI company on a mission to Unleash 3D Creativity. Meshy makes it effortless for both professional artists and hobbyists to create unique 3D assets, turning text and images into stunning 3D models in just minutes. What once took weeks and $1,000 now takes 2 minutes and $1.

Our global team of top experts in computer graphics, AI, and art includes alumni from MIT, Stanford, and Berkeley, as well as veterans from Nvidia and Microsoft. With 3 million users (and growing), Meshy is trusted by top developers and backed by premier venture capital firms like Sequoia and GGV.

  • No. 1 in popularity among 3D AI tools, according to A16Z Games
  • No. 1 in website traffic among 3D AI tools, according to SimilarWeb (2M monthly visits)
  • Leading 3D foundation model, with delightful textures and fine geometry
  • $52M in funding from top VCs
  • 2.5M users and 20M models generated!

Ethan Yuanming Hu serves as the founder and CEO. Ethan earned his Ph.D. in graphics and AI from MIT, where he developed the Taichi GPU programming language (27K stars on GitHub, used by 300+ institutes). His Ph.D. thesis received an honorable mention for the SIGGRAPH 2022 Outstanding Doctoral Dissertation Award, and his research has been cited over 2,700 times. His favorite animal is the llama.

About the Role:

We are seeking a Data Infrastructure Engineer to join our growing team. In this role, you will design, build, and operate distributed data systems that power large-scale ingestion, processing, and transformation of datasets used for AI model training. These datasets span traditional structured data as well as unstructured assets such as images and 3D models, which often require specialized preprocessing for pretraining and fine-tuning workflows.

This is a versatile role: you'll own end-to-end pipelines (from ingestion to transformation), ensure data quality and scalability, and collaborate closely with ML researchers to prepare diverse datasets for cutting-edge model training. You'll thrive in our fast-paced startup environment, where problem-solving, adaptability, and wearing multiple hats are the norm.

What You'll Do:

Core Data Pipelines

Design, implement, and maintain distributed ingestion pipelines for structured and unstructured data (images, 3D/2D assets, binaries).

Build scalable ETL/ELT workflows to transform, validate, and enrich datasets for AI/ML model training and analytics.

Distributed Systems & Storage

Architect pipelines across cloud object storage (S3, GCS, Azure Blob), data lakes, and metadata catalogs.

Optimize large-scale processing with distributed frameworks (Spark, Dask, Ray, Flink, or equivalents).

Implement partitioning, sharding, caching strategies, and observability (monitoring, logging, alerting) for reliable pipelines.

Pretrain Data Processing

Support preprocessing of unstructured assets (e.g., images, 3D/2D models, video) for training pipelines, including format conversion, normalization, augmentation, and metadata extraction.

Implement validation and quality checks to ensure datasets meet ML training requirements.

Collaborate with ML researchers to quickly adapt pipelines to evolving pretraining and evaluation needs.

Infrastructure & DevOps

Use infrastructure-as-code (Terraform, Kubernetes, etc.) to manage scalable and reproducible environments.

Integrate CI/CD best practices for data workflows.

Data Governance & Collaboration

Maintain data lineage, reproducibility, and governance for datasets used in AI/ML pipelines.

Work cross-functionally with ML researchers, graphics/vision engineers, and platform teams.

Embrace versatility: switch between infrastructure-level challenges and asset/data-level problem solving.

Contribute to a culture of fast iteration, pragmatic trade-offs, and collaborative ownership.

What We're Looking For:

Technical Background

5+ years of experience in data engineering, distributed systems, or similar.

Strong programming skills in Python (Scala, Java, or C++ a plus).

Solid skills in SQL for analytics, transformations, and warehouse/lakehouse integration.

Proficiency with distributed frameworks (Spark, Dask, Ray, Flink).

Familiarity with cloud platforms (AWS/GCP/Azure) and storage systems (S3, Parquet, Delta Lake, etc.).

Experience with workflow orchestration tools (Airflow, Prefect, Dagster).

Domain Skills (Preferred)

Experience handling large-scale unstructured datasets (images, video, binaries, or 3D/2D assets).

Familiarity with AI/ML training data pipelines, including dataset versioning, augmentation, and sharding.

Exposure to computer graphics or 3D/2D data processing is strongly preferred.

Mindset

Comfortable in a startup environment: versatile, self-directed, pragmatic, and adaptive.

Strong problem solver who enjoys tackling ambiguous challenges.

Commitment to building robust, maintainable, and observable systems.

Nice to Have:

Kubernetes for distributed workloads and orchestration.

Data warehouses or lakehouse platforms (Snowflake, BigQuery, Databricks, Redshift).

Familiarity with GPU-accelerated computing and HPC clusters.

Experience with 3D/2D asset processing (geometry transformations, rendering pipelines, texture handling).

Rendering engines (Blender, Unity, Unreal) for synthetic data generation.

Open-source contributions in ML infrastructure, distributed systems, or data platforms.

Familiarity with secure data handling and compliance.

Our Values:

Brain: We value intelligence and the pursuit of knowledge. Our team is composed of some of the brightest minds in the industry.

Heart: We care deeply about our work, our users, and each other. Empathy and passion drive us forward.

Gut: We trust our instincts and are not afraid to take bold risks. Innovation requires courage.

Taste: We have a keen eye for quality and aesthetics. Our products are not just functional but also beautiful.

Why Join Meshy?

Competitive salary, equity, and benefits package.

Opportunity to work with a talented and passionate team at the forefront of AI and 3D technology.

Flexible work environment, with options for remote and on-site work.

Opportunities for fast professional growth and development.

An inclusive culture that values creativity, innovation, and collaboration.

Unlimited, flexible time off.

Benefits:

  • Competitive salary, benefits and stock options.
  • 401(k) plan for employees.
  • Comprehensive health, dental, and vision insurance.
  • The latest and best office equipment.