Site Reliability Engineer

Fractal • San Francisco, CA, United States

Posted 30 days ago

Job type

Full-time

Job description

This range is provided by Fractal. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base pay range

$110,000.00 / yr - $160,000.00 / yr

Site Reliability Engineer

Fractal Analytics is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets. An ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite empowers imagination with intelligence, and that such Fractalites will continue to build the company for the next 100 years.

Please note: This role is located in the San Francisco Bay Area. You will need to work onsite Monday through Friday. We offer paid relocation.

Role Overview

As a Site Reliability Engineer with Fractal, you will be dedicated to ensuring the highest levels of system availability and performance. This role involves comprehensive monitoring, addressing complex technical issues, automating solutions to recurring problems, and contributing to the development of resilient system architectures and deployment strategies. You will work closely with our Services and Engineering teams, playing a crucial role in optimizing our platforms and infrastructure.

Responsibilities

Ensure maximum uptime and system availability to meet or exceed functional and performance SLAs.

Implement thorough end-to-end monitoring and alerting on all critical components to ensure quick detection and response.

Tackle complex challenges affecting critical services, focusing on automating problem resolution to prevent future occurrences.

Drive the development of innovative designs, architectures, standards, and methodologies to support and enhance our platform.

Lead scripting and automation efforts to refine system update and upgrade processes.

Design and configure essential infrastructure, tools, and frameworks to enhance the deployment lifecycle.

Collaborate effectively with cross-functional teams within Services and Engineering.

Qualifications

Interest in and ability to become certified on the end client's AI platform (we will provide all necessary training and support).

Bachelor’s or master’s degree in computer science, a related field, or equivalent professional experience.

Minimum of 10 years of relevant experience.

Proven experience in deploying, managing, and optimizing scalable, fault-tolerant Linux / Kubernetes / JVM infrastructure across various cloud platforms like AWS, GCP, and Azure.

Deep expertise in Linux Operating Systems, Networking principles, and Database management.

Practical experience with Cassandra or similar NoSQL technologies.

Proficiency with major cloud services providers, notably AWS, Azure, and GCP.

Familiarity with configuration management tools such as Ansible or Terraform.

Proficiency in programming languages like Ruby or Python, particularly for system automation and monitoring.

Strong problem-solving abilities, critical thinking skills, and effective communication capabilities.

Prior experience in a DevOps or system administration role, ideally supporting commercial SaaS solutions.

Pay:

The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. A reasonable estimate of the current range is: $110,000 - $160,000. In addition, you may be eligible for a discretionary bonus for the current performance period.

As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.

Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Seniority level

Mid-Senior level

Employment type

Full-time

Job function

Information Technology, Consulting, and Engineering

Industries

Technology, Information and Media, IT Services and IT Consulting, and Business Consulting and Services
