Senior Site Reliability Engineer

Grindr
Palo Alto, California, US
$148.1K-$174.2K a year
Full-time

This is a hybrid role based in our Chicago, Palo Alto, or San Francisco office and will require you to be in the office on Tuesdays and Thursdays.

What’s so interesting about this role?

As we enter our second year as a public company, Grindr is building on the success we’ve had over our 15-year history in connecting, supporting, and improving the lives of the LGBTQ+ community globally.

We are hiring a Site Reliability Engineer to join our newly established SRE team. You will work closely with our cloud engineering and software development teams to design, implement, and maintain systems that ensure the high availability, performance, and security of our platform.

This is a unique opportunity to shape the SRE culture and practices from the ground up, influencing the way we deliver and manage our services.

What’s the job?

  • Monitoring and Alerting: Set up and maintain monitoring systems to track the health and performance of applications and infrastructure. Create and manage alerting mechanisms to detect and respond to issues quickly.
  • Incident Response: Handle incidents and outages, working to resolve them swiftly and minimize downtime. Perform root cause analysis to prevent future occurrences and improve system resilience.
  • Automation: Develop tools and scripts to automate repetitive tasks, such as deployments, monitoring, and scaling, to increase efficiency and reduce human error.
  • Performance Optimization: Analyze system performance and identify bottlenecks or areas for improvement. Work with development teams to optimize code and infrastructure for better performance and resource utilization.
  • Capacity Planning: Plan for future growth by analyzing current usage trends and forecasting resource needs. Ensure that systems can handle increased load without compromising performance or reliability.
  • Service Level Objectives (SLOs) and Service Level Agreements (SLAs): Define and measure SLOs and SLAs to set expectations for system reliability and performance. Track these metrics and work to maintain or exceed the defined standards.
  • Incident Management and Postmortems: After incidents, conduct postmortems to document what went wrong, what was done to fix it, and how to prevent similar incidents in the future. This process supports continuous improvement and learning from failures.
  • Collaboration with Development Teams: Work closely with software developers to integrate reliability and performance into the development process. Provide guidance on best practices and assist with designing resilient systems.
  • Security and Compliance: Ensure that systems are secure and compliant with relevant regulations and standards. Implement security measures, monitor for vulnerabilities, and respond to security incidents.
  • Continuous Improvement: Continuously look for ways to improve system reliability, performance, and efficiency. Stay updated with industry trends and advancements to implement the best practices and technologies.
  • Participate in an on-call rotation.

What we'll love about you:

  • Technical Expertise:
      • Proficient in at least one programming language (e.g., Python, Go, Java).
      • Strong knowledge of Linux/Unix systems.
      • Experience with cloud platforms (e.g., AWS, GCP, Azure).
      • Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
      • Understanding of networking concepts and protocols.
  • Reliability Engineering:
      • Experience with monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK stack).
      • Ability to implement and manage CI/CD pipelines.
      • Knowledge of infrastructure as code (e.g., Terraform, Ansible).
      • Proficiency in automated testing and deployment practices.
      • Understanding of SRE principles and practices, including SLAs, SLOs, and SLIs.
  • Security:
      • Knowledge of security best practices and compliance standards.
      • Experience with vulnerability assessment and mitigation.
  • Operational Excellence:
      • Proven track record of maintaining high availability and performance in production environments.
      • Experience with incident management and post-mortem analysis.
      • Ability to optimize system performance and resource utilization.

Basic Qualifications:

  • 5+ years of experience in site reliability, including incident response, incident management, automation, and performance optimization
  • 5+ years of experience with cloud platforms (AWS preferred)
  • 4+ years of experience working with DevOps technologies such as Docker, Kubernetes, Helm, and Terraform
  • 4+ years of experience developing and maintaining CI/CD pipelines
  • 4+ years of experience using a scripting language such as Python or Bash
  • Experience coding in Kotlin or another JVM language is a plus

What You'll Love About Us

  • Mission and Impact: Grindr is building the global gayborhood in your pocket. Your role will impact the lives of millions of LGBTQ+ people around the world. Through our success, we are making a world where the lives of our community are free, equal, and just.
  • Family Insurance: Insurance premium coverage for health, dental, and vision for you and partial coverage for your dependents.
  • Retirement Savings: Generous 401(k) plan with a 6% match and immediate vesting in the U.S.
  • Compensation: Industry-competitive compensation and eligibility for company bonus and equity programs.
  • Queer-Inclusive Benefits: Industry-leading gender-affirming offerings with up to 90% cost coverage, access to Included Health, monthly stipends for HRT, and more.
  • Additional Benefits: Flexible vacation policy; monthly stipends for cell phone, internet, wellness, food, and commuting; breakfast and lunch provided onsite; and a yearly travel and leisure stipend.

About Grindr

Grindr is building the global gayborhood in your pocket. With more than 13.5 million monthly active users, Grindr has become a fundamental part of the LGBTQ+ community and is charting a path to make the world more free, equal, and just.

Since 2015, Grindr for Equality has advanced safety, health, and human rights for millions of Grindr users and the global LGBTQ+ community in partnership with more than 100 community organizations in every region of the world.

Our next evolution is underway as a public company that continues to grow and build meaningful experiences for our users.

From social issues to product innovations, we're setting audacious goals for our community and the business, and leveraging the latest tech stacks and a culture of engineering excellence to make it happen.

At the heart of our work in this new chapter is a shared set of operating principles centered around cultivating curiosity, thinking big, setting and expediting our ambitious goals, and growing through iteration, all while keeping our users #1.

Grindr is headquartered in West Hollywood, California, with offices in the Bay Area, Chicago, New York, and Washington, D.C. With a track record of strong financial performance and plans for continued headcount growth, we’re building a team of talented, passionate, and open-minded people who want to disrupt the dating app space, innovate products, and advance LGBTQ+ culture.

Come be a part of this exciting journey with us.

Grindr is an equal-opportunity employer.

To learn more about how we handle the personal data of applicants, visit our Employee and Candidate Privacy Policy.

Grindr is committed to fair and equitable compensation practices. This base pay range is for the U.S. and is not applicable to locations outside of the U.S. The actual base pay is dependent upon many factors, such as training, transferable skills, work experience, business needs, location, and market demands.

The base pay range is subject to change and may be modified in the future. This role will also be eligible for equity, benefits, and a company bonus program.

Chicago Base Pay Range

$125,843-$148,028 USD

Bay Area Base Pay Range

$148,050-$174,150 USD
