Talent.com
Senior Site Reliability Engineer
Mango • Los Angeles, CA, United States
Job type: Full-time
Job description

We are seeking a Senior Site Reliability Engineer to own and evolve the infrastructure that supports our on-premise instruments, data systems, and machine learning pipelines. This role combines systems-level engineering with software craftsmanship, requiring a deep understanding of how compute, storage, and networking layers interact under real workloads.

You will be the go-to expert for diagnosing performance issues in our on-prem systems, from kernel-level I/O bottlenecks to distributed service latency, and for building robust automation that keeps our systems consistent and observable.

Key Responsibilities

Infrastructure Design & Reliability
Design, deploy, and maintain our on-premise and hybrid infrastructure, which includes Dell PowerEdge and PowerVault servers, prosumer NAS units, and high-throughput data processing clusters. Implement fault-tolerant systems with reproducible deployments and clear observability.

Performance & Systems Analysis
Investigate complex performance issues across hardware, OS, and software boundaries. You will use Linux tooling in addition to in-house application-level metrics to uncover root causes in filesystems, caching layers, or I/O scheduling.

Automation & Tooling
Build automation for system provisioning, configuration management, and software deployment using Python, Go, Ansible, or similar frameworks. Develop lightweight services and tools that make reliability visible and maintainable.

Collaboration
Work closely with our software and hardware teams to co-design systems that meet the needs of high-resolution imaging and ML inference workloads. Translate hardware realities into software reliability guarantees.

Observability & Incident Response
Develop and maintain monitoring, alerting, and logging systems to ensure early detection of issues. Lead incident response and post-mortem efforts with a focus on learning and prevention.

Documentation & Communication
Produce clear documentation and communicate findings effectively to the broader team, from network topology diagrams to kernel tuning rationales.

General Qualifications
  • Deep understanding of Linux systems and performance (I/O schedulers, RAID, caching, NUMA, kernel parameters).
  • Hands-on experience designing and managing on-premise servers, storage arrays, or HPC clusters.
  • Comfort with automation and software development (Python, Go, Bash, or similar).
  • Strong diagnostic and analytical skills: ability to decompose performance problems across multiple layers.
  • Proven track record of improving system reliability, throughput, and maintainability in a fast-paced environment.
  • Excellent written and verbal communication skills for cross-disciplinary collaboration.
  • Self-driven, curious, and motivated by understanding systems deeply rather than just maintaining them.

Bonus Qualities (Not Required)
  • 5-10 years of relevant industry experience in systems engineering, SRE, or infrastructure software roles.
  • Experience tuning Linux filesystems (ext4, btrfs) and software RAID (mdadm).
  • Familiarity with containerization and orchestration (Docker, Compose, Kubernetes).
  • Knowledge of networking fundamentals (VLANs, bonding, LACP, 10 GbE / 40 GbE).
  • Experience supporting data-heavy scientific or ML workloads.
  • Demonstrated technical leadership: mentoring others in debugging, reliability, or performance analysis.

Related jobs
Site Reliability Engineer
VirtualVocations • Pasadena, California, United States
Full-time
A company is looking for a Site Reliability Engineer (SRE). Key Responsibilities: Design, build, and maintain scalable and reliable infrastructure using cloud platforms and automation tools. Implem...

Principal Site Reliability Engineer
VirtualVocations • Pasadena, California, United States
Full-time
A company is looking for a Principal Site Reliability Engineer. Key Responsibilities: Lead the technical direction of the team while contributing to the design and implementation of self-service to...

Senior Software Engineer, Site Reliability
ZipRecruiter • Los Angeles, CA, US
Full-time
We offer a hybrid work environment. Most US-based positions can also be performed remotely (any exceptions will be noted in the Minimum Qualifications belo...

Senior Site Reliability Engineer
VirtualVocations • Van Nuys, California, United States
Full-time
A company is looking for a Senior Site Reliability Engineer to help scale its platform and ensure system reliability. Key Responsibilities: Act as a first responder for system incidents and outages...

Senior Site Reliability Engineer / Los Angeles, CA / Hybrid
Motion Recruitment • Los Angeles, CA, US
Full-time
Motion Recruitment is seeking a Senior Site Reliability Engineer in Los Angeles, CA, with a hybrid work arrangement. A large gaming company is lo...

Senior Reliability Engineer
SiLC Technologies, Inc • Monrovia, CA, US
Full-time
SiLC, you will own SiLC's reliability program for FMCW LiDAR modules and assemblies and lead hands-on failure analysis of prototypes and field returns. This senior role combines proactive reliabilit...

Lead Plumbing Engineer
ACCO Engineered Systems • Costa Mesa, CA, United States
Full-time
This position is responsible for independently delivering engineering services, from conceptual design through construction completion. Essential Duties & Responsibilities: Complete project planning,...

Senior Plumbing Engineer
ACCO Engineered Systems • Pasadena, CA, United States
Full-time
This position is responsible for independently delivering engineering services, from conceptual design through construction completion. Essential Duties & Responsibilities: Expert in project planning...

Cloud Site Reliability Engineer
VirtualVocations • Glendale, California, United States
Full-time
A company is looking for a Cloud Site Reliability Engineer (AWS). Key Responsibilities: Design, deploy, and maintain AWS cloud infrastructure for high availability and fault tolerance. Administer M...

Engineer, Reliability
AES Corporation • Long Beach, CA, United States
Full-time
Are you ready to be part of a company that's not just talking about the future, but actively shaping it? Join The AES Corporation (NYSE: AES), a. AES is committed to shaping a future through innovat...

Senior Geotechnical Engineer
Jobot • Santa Clarita, CA, US
Full-time
This Jobot Job is hosted by: Brian Perkins. Are you a fit? Easy Apply now by clicking the "Apply Now" button and sending us your resume. Salary: $100,000 - $160,000 per year. We offer a wide ...

Senior Manager, Site Reliability Engineering
Motion Recruitment • Los Angeles, CA, US
Full-time
An established company operating within the financial cloud space is looking for a Senior Manager, Site Rel...

Senior Vulnerability Management Engineer
VirtualVocations • Norwalk, California, United States
Full-time
A company is looking for a Senior Vulnerability Management Engineer to lead the identification, assessment, and remediation of security vulnerabilities across enterprise systems. Key Responsibilitie...

Plumbing Design Engineer - Healthcare - P2S
P2S Inc. • Long Beach, CA, United States
Full-time
Our specialties include electrical, mechanical, plumbing, fire protection, and technology integration. Our offered services range from engineering and commissioning to construction management. With o...

DevOps Site Reliability Engineer
VirtualVocations • Long Beach, California, United States
Full-time
A company is looking for a DevOps / Site Reliability Engineer (Remote). Key Responsibilities: Configure, manage, and improve CI/CD pipelines for application deployments. Monitor application perform...

Senior Site Reliability / Gitops Engineer
Canonical • Los Angeles, CA, US
Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is wide...

Lead Quality Engineer
Skylimit Systems • Simi Valley, CA, US
Full-time
We are seeking a highly skilled Lead Quality Engineer to oversee and drive quality assurance initiatives within our aerospace and defense manufacturing operations. The ideal candidate will...

Senior Production Engineer
VirtualVocations • North Hollywood, California, United States
Full-time
A company is looking for a Senior Production Engineer to join their infrastructure and reliability engineering team. Key Responsibilities: Design, automate, scale, and support production systems in...

Senior Site Reliability Engineer
K2 Space Corporation • Los Angeles, CA, US
Full-time +1
K2 Space is building large, high-powered spacecraft for the next generation of space development. B...

Senior Systems Engineer
VirtualVocations • Whittier, California, United States
Full-time
A company is looking for a Staff Advanced Concepts Systems Engineer. Key Responsibilities: Lead the creation of mission concepts, reference architectures, and CONOPS for future space missions. Inte...