Site Reliability Engineer, Data Engineering - USDS

TikTok
Mountain View
Full-time

About the team

The TikTok video system is a world-leading video platform that provides multimedia storage, delivery, and transcoding. As part of the US Tech Service department, we are responsible for building the next-generation video processing platform, which provides excellent experiences for billions of users around the world.

In order to enhance collaboration and cross-functional partnerships, among other things, at this time, our organization follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager / department.

We regularly review our hybrid work model, and the specific requirements may change at any time.

About the role

This is a Site Reliability Engineer role, focusing on data pipeline reliability for the Video Platform team in USDS.

Data SREs monitor data and keep production batch and realtime processing jobs up and running with the highest level of availability, ensuring our users have the freshest, most complete, and most correct data possible.

Responsibilities

- Manage day-to-day operations of data service, realtime / batch data pipelines, such as Service Level Agreement management, pipeline deployment, performance tuning and troubleshooting
- Proactively monitor and troubleshoot data pipelines and systems for performance issues, errors, or anomalies
- Create tools, build alarms and dashboards, drive internal process improvements, and automation to monitor and improve data engineering operations
- Improve systems reliability, efficiency, and velocity through scaling, optimization of both resources and data processing workflows, potentially refactoring code or implementing new solutions
- Develop and deploy new reliable and scalable data pipelines and infrastructure components as required by business needs
- Work closely with data engineering and various vertical teams within the Video Architecture platform

Minimum Qualifications

- Bachelor's in Computer Science or a related technical background involving software / system engineering, or equivalent working experience
- Good programming experience with SQL and at least one of the following languages: Java, Python, Go, or Scala
- Experience in data engineering, with a focus on data systems reliability, scalability, and performance

Preferred Qualifications

- Solid experience with big data technologies (e.g., Hadoop, Spark, Flink, YARN) and databases (SQL, NoSQL)
- Knowledge of data pipeline and workflow management tools (e.g., Airflow, Luigi)
- Demonstrated independent thinking capabilities and troubleshooting skills in large scale distributed systems
- Good communication and coordination skills
- Experience in building data solutions with AWS, Azure and other cloud services is a plus

Candidates for this position must be legally authorized to work in the United States.

This position is not eligible for visa sponsorship or support.

30+ days ago