Talent.com
Staff Site Reliability Engineer

Altana AI • New York, NY, United States
Job type: Full-time

AI can be a powerful tool for good in the world – at Altana we apply AI to the world’s largest organized body of supply chain data to power a more resilient, more secure, and more sustainable model of global commerce. Our customers connect to the Altana network to build resilience for critical industries and infrastructure, automate and safeguard cross-border trade, transform insurance underwriting, protect national security, combat modern slave labor, disrupt fentanyl trafficking, and ensure that their products are sustainable.

Altana is backed by leading investors and used by the world’s most important organizations, including Lloyd’s, Maersk, multiple government agencies across the US, UK, EU, Singapore, and Australia, General Atomics, Boston Scientific, and more. We are building a global platform connecting the public and private sectors into an AI-powered network for building trusted supply chains. We operate in accordance with our values: we focus on value creation, not capture; we foster diversity and embrace difference; we embrace reality; we get things done; we amaze our clients. When you join Altana, you’ll be joining a vibrant, collaborative team working together to solve complex problems with the potential for global societal impact.

The Opportunity at Altana

At Altana, we believe that software that ships must be reliable and efficient. As a Staff Site Reliability Engineer, you will be instrumental in ensuring the availability, performance, and scalability of Altana’s critical production services, with a strong focus on our cloud-native environments and data pipelines. You will apply Google-style SRE principles, embedding reliability into our architecture and operations through automation, proactive monitoring, and a commitment to reducing toil.

You will work hands-on with engineering teams, influencing system design for operability and contributing to the development of robust, self-healing infrastructure. This role emphasizes a deep understanding of observability practices to gain comprehensive insights into system behavior, proactive incident prevention, and efficient incident response. Success will be measured by the resilience of our production systems, the effectiveness of our observability stack, and our continuous improvement in operational efficiency and reliability.

Your Responsibilities

  • Reliability Engineering: Champion and implement SRE principles, including establishing and monitoring Service Level Objectives (SLOs) and error budgets for critical services. Drive initiatives to improve system reliability, availability, performance, and efficiency.
  • Observability & Monitoring: Design, implement, and maintain advanced monitoring, logging, and tracing solutions for our cloud-native applications and infrastructure (e.g., Kubernetes, microservices). Develop dashboards, alerts, and runbooks that provide deep insights into system health and behavior.
  • Automation & Toil Reduction: Identify and automate repetitive operational tasks and manual processes across our production environment. Develop tools and scripts to enhance system operations, deployment pipelines, and incident response.
  • Incident Management & Postmortems: Actively participate in the incident response lifecycle, including detection, triage, mitigation, and resolution of production issues. Lead thorough blameless postmortems to identify root causes and implement preventative measures and lasting improvements.
  • System Design & Optimization: Collaborate closely with development teams to influence the design of new services, ensuring they are built for operability, reliability, and cost-efficiency. Proactively identify and address performance bottlenecks and architectural weaknesses.
  • On-Call Rotation: Participate in a periodic on-call rotation, responding to critical alerts and ensuring rapid resolution of production incidents.
  • Data Reliability: Implement and maintain reliability and observability for critical data pipelines and data infrastructure, ensuring data integrity, availability, and timely processing.

About You

  • 5+ years of hands-on experience in a Site Reliability Engineering (SRE), DevOps, or equivalent role focusing on production system reliability and operations.
  • Strong understanding and practical application of Site Reliability Engineering (SRE) principles, including SLOs, error budgets, toil reduction, and blameless culture.
  • Expertise in designing, implementing, and managing observability platforms for cloud-native environments (e.g., Prometheus, Grafana, Datadog, ELK stack, OpenTelemetry, Jaeger).
  • Proficiency in at least one programming/scripting language (e.g., Python, Go) for automation and tool development.
  • Extensive hands-on experience with cloud platforms (AWS, Azure, or GCP), including their compute, networking, and database services.
  • Demonstrated experience with containerization technologies (Docker) and container orchestration platforms (Kubernetes).
  • Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, OpenTofu, CloudFormation) for managing cloud resources.
  • Proven experience participating in and improving incident management processes for critical systems.
  • Knowledge of modern software delivery paradigms, including microservices architectures and CI/CD pipelines.
  • Excellent problem-solving, analytical, and troubleshooting skills in complex distributed systems.
  • Strong communication and collaboration skills, with the ability to work effectively across engineering teams.
  • Experience with data engineering concepts, including building or operating reliable data pipelines, data streaming technologies, or managing large-scale data infrastructure.
  • This role can be based in New York City, Washington D.C., or the San Francisco Bay Area with an expectation of hybrid work or occasional travel as needed.

US Salary Range And Benefits

$170,000 - $220,000

Benefits

The salary range, to the extent specified for this role, is a good faith statement of the minimum and maximum levels of the annual base salary for the position. The base salary offered to a successful candidate will depend on a wide range of compensation factors, including, but not limited to, work experience, education and/or training, critical skills, and/or business considerations. Competitive equity grants are included in the majority of full-time offers and are considered part of Altana's total compensation package. Altana also offers either a discretionary bonus or a variable compensation plan depending on the role. Additionally, Altana offers top-tier benefits for full-time employees, including:

  • Flexible Time Off: Altana operates with a Flexible Time Off (FTO) policy that gives you agency over your own time off so you can maximize your work-life balance.
  • Parental Leave: We offer industry-leading Paid Parental Leave (PPL), providing 14 weeks of leave for non-birthing, adoptive, and foster parents and up to 26 weeks of leave for birthing parents, all paid at 100% of your base salary.
  • Health Benefits: We have a full suite of medical, vision, and dental benefits with generous employer contributions, designed to give you flexibility and choice for your individual health situation. Our high deductible health plan is 100% employer paid for employees and supplemented with an employer contribution to your Health Savings Account (HSA). There is also a Flexible Spending Account (FSA) option.
  • Supplemental Benefits: Altana provides life, short- and long-term disability, and AD&D insurance coverage, all at no cost to you, so you know that you and your loved ones are covered in case of an emergency.
  • 401(k) Savings: Save for and invest in your future using our Guideline 401(k) retirement savings program.
  • Commuter Benefits: Save money on your commute by setting aside pre-tax funds for public transit or parking!
  • Wellness: Because we value mental and emotional health, every Altana employee has access to a free premium subscription to Calm, the #1 app for meditation, sleep, and mindfulness.
  • Pet Insurance: Pets are family too! Keep them healthy with Wishbone insurance and/or our Total Pet vet service and telehealth discount plan.
  • Employee Assistance Program: Free access to confidential personal support.
  • Dependent Care FSA: You will have access to a Dependent Care FSA, which allows you to set aside pre-tax funds for childcare expenses.

The recruiter assigned to this role can share more information about the specific compensation and benefit details associated with this role during the hiring process.

Equal Opportunity Statement

At Altana, we believe that a diverse workforce enables greater creativity, performance, and adaptability. We’re proud to be an equal opportunity employer and welcome you to join us as you are. Our employment opportunities and decisions are based on business needs and individual qualifications, without regard to race, color, religious creed, national origin, ancestry, age, physical or mental disability, medical condition, marital status, sexual orientation, gender identity or expression, genetic information, family care or medical leave status, military or veteran status, or any other characteristic protected by the laws or regulations in the areas in which we operate. We prohibit discrimination and harassment of any type, in any situation.

Offers related to employment at Altana will come from an Altana.ai email address. We will never ask for payment as part of the interview or onboarding process.

Why it’s great to work at Altana

  • We love to collaborate, and we win as a team!
  • We are committed to engineering excellence
  • We value personal and professional development
  • We learn from diverse backgrounds and perspectives
  • We impact the world, from enabling developing countries to identifying drug traffickers
