Site Reliability Engineer, TikTok Ads - USDS

TikTok
Mountain View, CA
Full-time

Responsibilities

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. U.S. Data Security ("USDS") is a subsidiary of TikTok in the U.S. This new, security-first division was created to bring heightened focus and governance to our data protection policies and content assurance protocols to keep U.S. users safe.

Our focus is on providing oversight and protection of the TikTok platform and U.S. user data, so millions of Americans can continue turning to TikTok to learn something new, earn a living, express themselves creatively, or be entertained.

The teams within USDS that deliver on this commitment daily span across Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions and more.

Why Join Us

Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.

Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.

To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always.

At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve.

Join us.

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed services and infrastructures.

As a site reliability engineer in the Ads data platform area, you will have the opportunity to manage the services and infrastructure of one of the largest data platforms in the world, which directly supports the TikTok Ads ecosystem.

You'll need to ensure the data, services and infrastructures are reliable, fault-tolerant, efficiently scalable and cost-effective.

In order to enhance collaboration and cross-functional partnerships, among other things, at this time, our organization follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager / department.

We regularly review our hybrid work model, and the specific requirements may change at any time.

Responsibilities:

  • Perform SRE duties and operations across advertising services in production, including but not limited to: on-call rotations, maintenance, change management, monitoring, incident response, capacity planning, and disaster recovery.
  • Maximize system uptime, availability and stability to meet functional and performance SLAs.
  • Work alongside Software Engineering teams to co-develop through automation, tool-building and capacity planning.
  • Develop a strong understanding of the business to work with product teams in aligning service reliability with business metrics.
  • Contribute to existing documentation and build effective documentation such as operational runbooks, SOPs, and SLAs / SLOs.
  • Work cross functionally and regionally with SRE / Dev / QA / PM teams to handle incidents and improve processes.
  • Manage and prioritize tasks / projects for high productivity and precise deliveries.

Qualifications

Minimum Qualifications:

  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
  • 5+ years of demonstrated experience in software development with one or more programming languages.
  • 5+ years of experience in Linux, distributed architectures, networking, data concepts and service reliability.
  • Strong analytical ability, problem solving and critical thinking skills.
  • Strong communication skills and the willingness to understand business and infrastructure as a team player.

Preferred Qualifications:

  • Master's degree in Computer Science, Engineering or a related field.
  • Proficient in any of the following languages: Python, Go, C++.
  • Experience with Ads systems in either a SWE / SRE or Product role.

Candidates for this position must be legally authorized to work in the United States. This position is not eligible for visa sponsorship or support.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives.

Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy.

To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach.

We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws.

If you need assistance or a reasonable accommodation, please reach out to us at https://shorturl.at/ktJP6

This role requires the ability to work with and support systems designed to protect sensitive data and information. As such, this role will be subject to strict national security-related screening.

30+ days ago