Principal Site Reliability Engineer
Fortinet • Santa Clara, CA, United States
Job type: Full-time

At Fortinet, we strive to provide a supportive, collaborative environment where people are empowered to do the best work of their careers.

Our team members enjoy solving complex problems and obsess over getting the details right. We love what we do and are proud of our work to secure cloud and container environments for thousands of B2B customers worldwide.

Our team is growing, and we are looking for engineers with a passion for automation. You will help support the FortiCNAPP platform and play a key role in building, operating, and improving the FortiCNAPP Cloud Security Platform, the world's best real-time cloud-native threat detection system.

Our team develops and supports the infrastructure layers spanning our cloud accounts, network/connectivity, workload management, observability, and storage services. We build tooling to perform automated operations in order to scale the FortiCNAPP infrastructure and service. To be successful, you will design, define, develop, deploy, and operate internal tooling, APIs, and frameworks that streamline our workflows and automate our infrastructure.

About this role: As a Principal Site Reliability Engineer at FortiCNAPP, you will lead the design, implementation, and optimization of our highly scalable, resilient, and efficient platform infrastructure. You will drive strategic initiatives to enhance operational excellence, mentor teams, and set the standard for reliability and automation across the organization. Your expertise will shape the future of FortiCNAPP's infrastructure, ensuring it meets the demands of our customers and supports rapid growth.

Responsibilities:

  • Architect and implement advanced automation strategies to maximize operational efficiency and minimize toil across the FortiCNAPP platform.
  • Lead the design, development, and enhancement of infrastructure systems to ensure world-class scalability, resiliency, and performance.
  • Proactively identify and resolve complex, systemic issues through innovative automation, tooling, and architectural solutions, preventing customer-facing incidents.
  • Drive the evolution of monitoring, instrumentation, and observability systems to anticipate and mitigate scalability and reliability risks before they impact customers.
  • Champion company-wide adoption of reliability best practices, establishing key metrics, SLAs, and milestones to embed scalability and resiliency into all engineering processes.
  • Collaborate with cross-functional teams to define and implement industry-leading practices for infrastructure, deployment, and operational workflows.
  • Provide technical leadership and mentorship to engineering and operations teams, fostering a culture of reliability, automation, and continuous improvement.
  • Lead incident response and post-mortem processes, driving root cause analysis and implementing preventive measures.
  • Participate in an on-call rotation, serving as an escalation point for complex issues and guiding the team through critical incidents.
  • Influence strategic technology decisions, evaluating and integrating cutting-edge tools, services, and methodologies to enhance platform reliability.

Minimum Qualifications:

  • 10+ years of DevOps/SRE experience, with at least 5 years in a senior or lead role managing production systems at scale.
  • Expert-level development and automation skills, with a proven track record of building sophisticated tools and workflows.
  • Deep expertise in Infrastructure as Code (e.g., Terraform) and supporting tools (e.g., Atlantis, ArgoCD, Flux).
  • Advanced experience with Kubernetes and its ecosystem (e.g., Helm, operators, Kustomize), including managing large-scale, production-grade clusters.
  • Extensive experience with multiple cloud providers and managed services (e.g., AWS: EKS, EC2, S3, RDS, Secrets Manager; GCP, Azure).
  • Proven ability to architect and operate highly reliable, fault-tolerant cloud infrastructure that supports rapid microservice deployment with robust monitoring and high availability.
  • Exceptional cross-team communication and leadership skills, with experience driving alignment across engineering, product, and operations teams.
  • Deep knowledge of large-scale system building blocks, including load balancing, distributed/cloud computing, container orchestration, and advanced monitoring/observability.
  • Expert understanding of cloud networking, including VPC configuration, cross-cloud connectivity, and hybrid cloud architectures.
  • Proficiency in one or more programming languages (e.g., Python, Go, Rust) for building tools and automation frameworks.

Preferred Qualifications:

  • Extensive experience designing and implementing advanced monitoring and observability systems (e.g., Prometheus, Grafana, New Relic, Datadog, OpenTelemetry).
  • Strong advocate for "everything as code" principles, with experience institutionalizing IaC and GitOps practices across teams.
  • Deep expertise in Java application servers, JVM tuning, and performance optimization for high-throughput systems.
  • Experience leading cross-functional initiatives to improve system reliability, such as chaos engineering, disaster recovery planning, or zero-downtime deployments.

Educational Requirements:

  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field.

The US base salary range for this full-time position is $202,000-$247,000. Fortinet offers employees a variety of benefits, including medical, dental, vision, life and disability insurance, 401(k), 11 paid holidays, vacation time, and sick time, as well as a comprehensive leave program.

Wage ranges are based on various factors including the labor market, job type, and job level. Exact salary offers will be determined by factors such as the candidate's subject knowledge, skill level, qualifications, experience, and geographic location.

All roles are eligible to participate in the Fortinet equity program. Bonus eligibility is reviewed at time of hire and annually at the Company's discretion.

Why Join Us:

We encourage candidates from all backgrounds and identities to apply. We offer a supportive work environment and a competitive Total Rewards package to support you with your overall health and financial well-being.

Embark on a challenging, enjoyable, and rewarding career journey with Fortinet. Join us in bringing solutions that make a meaningful and lasting impact to our 660,000+ customers around the globe.
