Site Reliability Engineer - AML

ByteDance
San Jose
Full-time

Responsibilities

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Helo, and Resso, as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join Us

At ByteDance, our people are humble, intelligent, compassionate and creative. We create to inspire - for you, for us, and for millions of users across all of our products.

We lead with curiosity and aim for the highest, never shying away from taking calculated risks and embracing ambiguity as it comes.

Here, the opportunities are limitless for those who dare to pursue bold ideas that exist just beyond the boundary of possibility.

Join us and make impact happen with a career at ByteDance. The mission of our AML team is to advance next-generation recommendation algorithms and platforms for the company.

We also drive substantial impact for core businesses of the company. Currently, we are looking for Site Reliability Engineers to join our team to support and advance that mission.

What You'll Do

The Site Reliability Engineering (SRE) group of the AML (Applied Machine Learning) team combines systems engineering and the art of machine learning to develop and run massively distributed AI / recommendation systems around the world.

On the SRE team, you'll have the opportunity to sharpen your expertise in coding, performance analysis and large system operation, and get heavily involved in the process of hardware / capacity decision-making.

SRE ensures that the core machine learning services at ByteDance have the highest level of availability, and builds highly automated systems and pipelines.

Qualifications

1. Expertise in analyzing and troubleshooting distributed systems.
2. Bachelor's or Master's degree in Computer Science or a related technical field involving software development or systems engineering.
3. Experience programming in at least one of the following languages: Python, C / C++ or Go.
4. Solid background in algorithms and data structures.

Preferred qualifications

1. Ability to design and maintain large-scale systems.
2. Strong understanding of code optimization and automation of routine tasks.
3. SRE experience with large-scale distributed systems.

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives.

Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

ByteDance Inc. is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws.

If you need assistance or a reasonable accommodation,

30+ days ago