Speech Data Project Manager

42dot, San Francisco, CA, United States
Job type: Full-time

Overview

At 42dot, we're building performance evaluation systems for in-service speech recognition and developing comprehensive training and evaluation datasets for our LLM modules through meticulous data annotation. We strategically collect TTS voice data to ensure a diverse range of authentic, high-quality audio samples. Additionally, we are at the forefront of defining our philosophy on voice design in automotive environments, integrating robust acoustic and user experience principles tailored specifically for vehicle settings. This role will participate in dataset collection and validation to mitigate issues such as data bias and errors.

Responsibilities

Verification of Speech Data — Validate speech data related to STT, TTS and wake-up word detection to ensure accuracy and consistency.

TTS Data Collection Strategy & Execution — Design and implement data collection strategies that reflect North American linguistic and cultural characteristics. Secure high-quality English, Spanish, and French text and speech data from diverse sources (e.g., online media, audio archives, user interviews).

Data Quality Control — Review collected data for pronunciation, intonation, grammar, and vocabulary accuracy to ensure suitability for model training. Perform outlier detection and data cleaning tasks (e.g., noise removal, audio clipping, text normalization).

Process Automation & Optimization — Develop scripts and tools (using Python, R, etc.) to automate repetitive tasks in data collection and verification. Build and manage data pipelines and propose workflow improvements to optimize the process.

Outsourcing Management — Oversee and manage outsourcing agencies responsible for speech data labeling, ensuring adherence to quality standards and deadlines.

Collaboration & Communication — Work closely with development teams, speech engineers, and language experts to set data quality standards and project objectives. Provide regular reports on project progress, challenges, and improvement measures.

Market & User Analysis — Analyze language usage trends, dialects, and intonation patterns in North America to continuously refine data collection strategies. Incorporate user feedback and emerging research trends to update and improve the datasets.

Qualifications

Experience — Over 3 years (or equivalent experience) in voice signal-related roles, including speech data verification, labeling, and managing outsourcing agencies. Proven experience in collecting and validating speech data for various audio signal tasks.

Educational Background — A Master’s or Doctoral degree in Linguistics, Speech Signal Processing, Computer Science, Data Science, or a related field.

Language & Communication Skills — Native-level proficiency in at least two of the following languages: English, French, and Spanish (proficiency in all three is a plus). Strong understanding of North American dialects and cultural nuances. Excellent documentation, presentation, and teamwork skills. Professional-level Korean language proficiency is an asset for research collaboration.

Project Management & Problem-Solving — Strong analytical, problem-solving, and project management skills, with the ability to handle multiple tasks and set priorities effectively.

Preferred Qualifications

Specialized Industry Experience — Quality control and management of audio data labeling projects; strong understanding of STT, TTS and wake-up word detection; project experience with deep learning frameworks such as TensorFlow and PyTorch.

Data Management Expertise — Experience in building and managing large-scale multi-modal (text + speech) datasets and optimizing data cleaning processes.

Sound Engineering & Narration Directing Expertise — Experience in sound engineering and narration directing, including acoustic environment design, audio processing, voice talent management, and use of industry-standard tools (e.g., Pro Tools, Adobe Audition).

Professional & Academic Engagement — Active participation in industry events, with contributions to patents, publications, or open-source projects.

Certifications — Certifications in cloud services, data engineering, or machine learning (e.g., AWS Solutions Architect, Google Cloud Professional Data Engineer) are a plus.

Interview Process

Application Review - 1st Interview - 2nd Interview - Offer

The process may vary by position and is subject to change.

Schedule and results will be communicated via the email provided in your application.

Review the following information before applying: How to work in 42dot, About 42dot Way.

Base Salary: $75,000 - $257,236
