Site Reliability Engineer - Data Infrastructure

TikTok
San Jose, California, US
$136.8K-$205K a year
Full-time

Responsibilities

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul, and Tokyo.

Our data infrastructure Site Reliability Engineering (SRE) team is a pioneer in innovation. We seamlessly merge software development and infrastructure operations to design, build, and manage large-scale, highly distributed systems.

Responsibilities:

  • Participate in and enhance the complete service lifecycle, from inception and design, through development, capacity planning, launch reviews, deployment, operation, and refinement.
  • Design and implement software platforms and monitoring frameworks to govern service-oriented architecture (SOA) efficiently, automatically, and intelligently.
  • Develop and manage components of cloud-managed data infrastructure, encompassing technologies such as Kubernetes, Redis, MySQL, Flink, and more.
  • Establish sustainable mechanisms for scaling systems, such as automation, to drive enhancements in reliability, efficiency, and velocity.
  • Provide sustainable user support, manage incident responses, and conduct blameless postmortems as part of our ongoing efforts to improve our systems.

Qualifications

  • Bachelor's degree in Computer Science or a related technical field with 2+ years of experience
  • Experience programming in at least one of the following languages: C, C++, Java, Python, Go, or Rust
  • Familiarity with Unix/Linux system internals, networking, and distributed systems
  • Preferred: experience with MySQL, Redis, Nginx, Kubernetes, Docker, OpenStack, Hadoop, Spark, Flink, etc.
  • Preferred: experience in designing and analyzing large-scale distributed systems
  • Preferred: strong problem-solving and communication skills

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives.

Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy.

To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs, or other reasons protected by applicable laws.

If you need assistance or a reasonable accommodation, please reach out to us at [email protected]

Job Information:

Pay Transparency Compensation Description (Annual)

The base salary range for this position in the selected city is $136,800 - $205,000 annually.

Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies and experience, and location.

Our company benefits are designed to convey company culture and values, to create an efficient and inspiring work environment, and to support our employees to give their best in both work and life.

We offer the following benefits to eligible employees:

We cover 100% of the premium for employee medical insurance and approximately 75% of the premium for dependents, and we offer a Health Savings Account (HSA) with a company match.

We also offer dental, vision, short- and long-term disability, basic life, voluntary life, and AD&D insurance plans.

Our time off and leave plans include 10 paid holidays per year, 17 days of Paid Personal Time Off (PPTO) (prorated upon hire and increased by tenure), and 10 paid sick days per year, as well as 12 weeks of paid parental leave and 8 weeks of paid supplemental disability.

We also provide generous benefits such as mental and emotional health support through our EAP and Lyra, a 401(k) company match, and gym and cellphone service reimbursements.

The Company reserves the right to modify or change these benefits programs at any time, with or without notice.
