Site Reliability Engineer, Global E-Commerce

TikTok
San Jose, CA
Full-time

Responsibilities

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.

Why Join Us

At TikTok, our people are humble, intelligent, compassionate and creative. We create to inspire - for you, for us, and for more than 1 billion users on our platform.

We lead with curiosity and aim for the highest, never shying away from taking calculated risks and embracing ambiguity as it comes.

Here, the opportunities are limitless for those who dare to pursue bold ideas that exist just beyond the boundary of possibility.

Join us and make impact happen with a career at TikTok.

The e-commerce industry has seen tremendous growth in recent years and has become a hotly contested space among leading Internet companies, and its future growth should not be underestimated.

With millions of loyal users globally, we believe TikTok is an ideal platform to deliver a brand new and better e-commerce experience to our users.

Our product engineering team is responsible for building an e-commerce ecosystem that is innovative, secure and intuitive for our users.

We are looking for passionate and talented people to join us as we drive the future of e-commerce here at TikTok.

Responsibilities:

1. Be part of the global SRE on-call rotation and be responsible for Tier-1 online incident response and DevOps support.

2. Be responsible for the service levels of a mission-critical, revenue-generating e-commerce platform as well as all supporting infrastructure and services.

This role will focus on service reliability, highly-scalable design, and release management in a cloud-native environment.

3. Define service level indicators and data-driven objectives, and develop DevOps / SRE standards, processes, and methodologies to uphold and improve the uptime, latency, and system health of a core global e-commerce production platform.

4. Collaborate across teams with engineering and product to ensure that key stability and maintainability requirements, such as capacity planning and launch reviews, are met to enable transparent service delivery to customers.

5. Design strategies for risk detection and mitigation, disaster recovery and simulation, release management, cost optimisation, engineering quality, etc.

6. Build automation geared towards infrastructure-as-code, scalability, and service resiliency.

7. Implement best practices around incident management and post-mortems while participating in on-call rotations.

Qualifications

1. Bachelor's or higher degree in Computer Science, similar technical field of study, or equivalent practical experience.

2. 3+ years of experience developing, provisioning, or maintaining production-grade, large-scale distributed systems.

3. High level of proficiency in Linux OS internals, networking, microservices, databases, caches, etc. in cloud-native environments.

4. Demonstrable familiarity with programming or scripting languages (Go / Python / Bash / C++, etc.).

5. Demonstrable experience in the development and implementation of DevOps and SRE methodologies.

6. Experience in designing, analyzing, and troubleshooting large-scale distributed systems.

7. Systematic problem-solving approach, coupled with effective communication skills and a sense of drive.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives.

Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy.

To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach.

We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations during our recruitment process. If you need assistance or accommodation, please reach out to us at redacted.

4 days ago