Staff Site Reliability Engineer - Data Engineering, Platform

Ellation, Inc.
San Francisco, CA, United States
Full-time

Who We Are

We're a cast of characters working to shine a spotlight on anime. Crunchyroll is an international business focused on creating both online and offline experiences for fans through content (licensed, co-produced, originals, distribution), merchandise, events, gaming, news, and more.

Visit our About Us pages for more information about our collection of brands.

About the Team

The Site Reliability Engineering (SRE) team is dedicated to ensuring the reliability, scalability, and performance of our data infrastructure.

We focus on standardizing and implementing monitoring and alerting across all datastores to track key metrics like errors, latency, and throughput, and to ensure critical systems are covered.

Our team also leads efforts to keep databases up-to-date, implements Infrastructure as Code (IaC) for high availability and performance, and automates key processes to enhance operational efficiency.

We lead and evangelize the principle of 100% automation. Additionally, we define and document operational requirements, develop incident response processes, and automate monitoring and compliance checks to maintain a secure and reliable data environment.

By continuously improving load testing and optimizing data governance practices, we support the overall health and efficiency of our data systems.

About the Role

Crunchyroll is growing and changing, presenting unique challenges and opportunities to support millions of anime fans around the world.

The Data Engineering team provides seamless help to our internal stakeholders, ensuring an exceptional experience for all Crunchyroll fans.

As a Staff Site Reliability Engineer for the Data Engineering team, you will be responsible for maintaining and enhancing the reliability of our data infrastructure.

Your work will directly impact the availability and performance of our data services, enabling the organization to make better decisions.

You will collaborate closely with data engineers and software engineers to drive 100% automation and develop best practices for deep monitoring and alerting.

This role will report to our Director of Data Engineering. While it is preferred for this role to sit in one of our offices, fully remote is also an option in the United States.

About You

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • 12+ years of experience in site reliability engineering, database operations, or a related role with a focus on data platforms, data stores, and data operations.
  • Extensive experience with the AWS cloud platform and its data-related services.
  • Proficiency in monitoring tools (e.g., Datadog, CloudWatch, DevOps Guru, DB Performance Insights).
  • Proficiency in one or more programming languages (e.g., Python, Java).
  • Proficiency in automation frameworks (e.g., Terraform, CloudFormation).
  • Strong understanding of performance metrics at both a high level and a low level, such as disk and I/O saturation.
  • Experience identifying and eliminating system bottlenecks.
  • Strong understanding of database internals such as index types, schemas, and query plans.
  • Strong understanding of database systems (e.g., SQL, NoSQL) and experience in managing large-scale data infrastructures.
  • Strong understanding and hands-on implementation of CI/CD pipelines and DataOps practices.
  • Experience with data governance, compliance, and lifecycle management.
  • Ability to own and execute projects while effectively collaborating with the team to influence and shape the vision of the data engineering organization.

Why you will love working at Crunchyroll

Not only will you get to work with fun, passionate and inspired colleagues, you will also...

  • Receive a great compensation package including salary plus performance bonus earning potential, paid annually.
  • Enjoy flexible PTO and time off policies allowing you to take the time you need to be your whole self.
  • Appreciate the generous medical, dental, vision, STD, LTD, and life insurance options for you and your family.
  • Take advantage of our health savings account (HSA) program plus health care and dependent care FSA programs.
  • Love that we offer an employer match on our 401(k) plan.
  • Receive an employer-paid commuter benefit (for eligible employees).
  • Appreciate the generous support program for new parents.
  • Obtain pet insurance, and enjoy pet-friendly offices at some of our locations!

#LifeAtCrunchyroll #LI-Remote
