Site Reliability Engineer - USDS

TikTok

Mountain View

Full-time

About TikTok U.S. Data Security

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy.

U.S. Data Security (USDS) is a subsidiary of TikTok in the U.S. This new, security-first division was created to bring heightened focus and governance to our data protection policies and content assurance protocols to keep U.S. users safe. Our focus is on providing oversight and protection of the TikTok platform and U.S. user data, so millions of Americans can continue turning to TikTok to learn something new, earn a living, express themselves creatively, or be entertained.

The teams within USDS that deliver on this commitment daily span across Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions and more.

Why Join Us

Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.

Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day. To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve. Join us.

Site Reliability Engineering (SRE) at TikTok combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems.

In our team, you’ll have the opportunity to manage the complex challenges of scale, while using expertise in coding, algorithms, complexity analysis, and large-scale system design.

We embrace a culture of diversity, intellectual curiosity, openness, and problem-solving. We encourage close collaboration while promoting self-direction.

In order to enhance collaboration and cross-functional partnerships, among other things, at this time, our organization follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager / department.

We regularly review our hybrid work model, and the specific requirements may change at any time.

Responsibilities

- Develop and maintain automation procedures to maximize system efficiency and minimize human intervention.
- Work closely with software engineering teams to design, deploy and operate elements to ensure that systems are functionally robust.
- Ensure system scalability to handle growth in web traffic and data.
- Implement monitoring tools and set up metrics to keep track of system health and performance.
- Participate in on-call rotations, assist with incident management, and diagnose, resolve, and prevent production issues.
- Conduct performance tests to find and address system bottlenecks.
- Collaborate with teams across the organization to define Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs).
- Practice sustainable user support, incident response, and blameless postmortems.

Qualifications

- Bachelor's degree in Computer Science, Information Technology, or a related field with 3+ years of experience.
- Proven work experience as a Site Reliability Engineer, Systems Engineer, or similar software engineering role.
- Proficient knowledge of high-level programming languages (e.g., Python, Go, Java, and Shell script).
- Experience in network architecture, database modeling, cloud systems and large-scale distributed systems.
- Strong understanding of Linux operating systems and open-source technologies.
- Preferred: experience in MySQL, Redis, Nginx, Kubernetes, Docker, OpenStack, Hadoop, Spark, etc.
- Preferred: knowledge of monitoring tools and methodologies (such as Prometheus, Grafana).
- Excellent problem-solving skills, strategic thinking, and a strong ability to debug complex systems.
- Exceptional communication skills and the ability to effectively collaborate with cross-functional teams.

30+ days ago

Related jobs

Promoted

Site Reliability Engineer

Lawrence Harvey

Sunnyvale, California

Our North American Technology team is seeking a talented, creative, and passionate Site Reliability Engineer to help build an innovative payment system that addresses merchant and consumer needs. They are a team of engineers dedicated to building cutting-edge, highly reliable, scalable, and high-thr...

Promoted

Sr Principal Software Engineer, Site Reliability (Access Edge Platform)

Palo Alto Networks

Santa Clara, California

DevOps Engineer (or equal role) with a passion for technology and strong motivation and responsibility for high reliability and service level. We are seeking experienced senior level Software Engineers to develop and deliver next-generation technologies within our Prisma Access Edge Platform team. W...

Promoted

Cloud Site Reliability Engineer, Cloud and System

TikTok

San Jose, California

Master's degree (or Bachelor's degree with 3+) years of experience in Computer Engineering, Electrical Engineering, Computer Science, or related major. Our Infrastructure Engineering team supports the company's fast growth by building and operating hyper-scale datacenters, managing the life cycle of...

Site Reliability Engineer

Insight Global

Redwood City, California

Insight Global is looking for a skilled Site Reliability Engineer (SRE) to work remotely in Peru or Guatemala for a large AAA game employer on a 9-12 month contract. As a Site Reliability Engineer, your role covers the entire life cycle of a product, from helping developers with architecture and del...

Embedded Site Reliability Engineer (Samsung Ads)

SAMSUNG

Mountain View, California

We are looking for a passionate Embedded Site Reliability Engineer who will lead the technical strategy and vision for our underpinning infrastructure, alerting & monitoring, infrastructure provisioning, networking, and development tooling in collaboration with other engineering teams and leadership...

Principal Site Reliability Engineer (XDR Cloud)

Palo Alto Networks

Santa Clara, California

Expert level experience as a DevOps/SRE engineer with a passion for technology and strong motivation and responsibility for high reliability and service level. Work closely and in full coordination with the DevOps and the RND team to develop new features and maintain high reliability for our SAAS Pr...

Site Reliability Engineer (Kubernetes)

NetApp

San Jose, California

Title: Site Reliability Engineer (Kubernetes). As a Cloud Infrastructure/Site Reliability Engineer, you will be operating at the intersection of development and operations. Team Collaboration and Influence: Work in tandem with other Cloud Infrastructure Engineers and developers to ensure maximum per...

Staff Site Reliability Engineer

General Motors

Mountain View, California

Chaos engineering implementation and experience a big plus. BS/MS in Computer Science/Engineering preferred. This means the successful candidate is expected to report onsite three times per week at minimum. ...

Senior Site Reliability Engineer

Tarana Wireless

Milpitas, California

As a Senior Site Reliability Engineer, you will help us manage software that runs on the cloud and remotely manages millions of radio devices. Automate the monitoring and auto-scaling of the production environment, to support millions of connected devices Monitoring of all live systems Troubleshoot ...

Senior Site Reliability Engineer - Automation / Containers

Oracle

Redwood City, California

As a Site Reliability Engineer, you will solve interesting technical challenges by defining, designing, deploying, and solving key Oracle Cloud services, platforms, and infrastructure, always thinking about reliability, scalability, resilience, security, and performance. We are unencumbered and will...
