
Site Reliability Engineer - US Government

Palantir
Washington, DC
$125K-$185K a year
Full-time

A World-Changing Company

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

The Role

We’re looking for Site Reliability Engineers who can help us build, operate, and maintain high-performance, scalable, and reliable services for our production infrastructure, across both cloud & on-prem environments.

Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes, with the creativity to develop novel solutions to evolving challenges.

Our team strives to automate processes wherever possible, using whichever tools are best for the job. You’ll be the expert on the environments in which you operate infrastructure, helping partner teams build & configure their software to run reliably within them.

We strongly believe in engineering teams being responsible for the operations of their services in production. In this role, you’ll work closely with engineers to advocate for and participate in sensible, scalable systems design, and share responsibility with them for diagnosing, resolving, and preventing production issues.

Core Responsibilities

  • Maintain availability of cloud & physical Linux servers that power the Palantir platform in air-gapped production environments.
  • Design, deploy, and operate infrastructure to support customer & product requirements via modern orchestration & monitoring platforms.
  • Collaborate closely with product teams on requirements & SLOs for deploying software into air-gapped environments.
  • Identify, troubleshoot, and solve network & systems issues.
  • Write scripts to automate away routine operational tasks.

What We Value

  • Active US Security clearance, or eligibility and willingness to obtain a US Security clearance.
  • Confidence in troubleshooting complex systems issues independently using stack traces and observability & systems tools.
  • Comfort with managing large-scale production systems and technologies, including configuration management, load balancing, monitoring & alerting infrastructure, and container orchestration.
  • Demonstrated ability to continuously learn and work independently, making decisions with minimal supervision while working in secure facilities.
  • Experience with containers (Docker / Podman) and orchestration (OpenShift / Kubernetes) at scale is a plus.
  • Preferred certifications: DOD 8570 IAT Level II or greater (CISSP, Sec+), Unix / Linux Computing Environment (e.g., Linux+, RHCE).
  • Proficiency with scripting in Python or Go is a plus.

What We Require

  • 5+ years of experience with Linux system administration (RHEL or equivalent preferred).
  • Experience with cloud-based hosting platforms like AWS, Azure, or GCP and / or experience with hardware-based environments.
  • Familiarity with monitoring systems using tools like Prometheus and writing health checks.

Our benefits aim to promote health and wellbeing across all areas of Palantirians’ lives. We work to continuously improve our offerings and listen to our community as we design and update them.

The list below details our available benefits and some of the perks that can be enjoyed as an employee of Palantir Technologies.

Benefits

  • Medical, dental, and vision insurance
  • Life and disability coverage
  • Paid leave for new parents and emergency back-up care for all parents
  • Family planning support, including fertility, adoption, and surrogacy assistance
  • Stipend to help with expenses that come with a new child
  • Commuter benefits
  • Relocation assistance
  • Unlimited paid time off
  • 2 weeks paid time off built into the end of each year

Salary

The estimated salary range for this position is $125,000 - $185,000 / year. Total compensation for this position may also include restricted stock units, a sign-on bonus, and other potential future incentives.

Further note that total compensation for this position will be determined by each individual’s relevant qualifications, work experience, skills, and other factors.

This estimate excludes the value of any potential sign-on bonus; the value of any benefits offered; and the potential future value of any long-term incentives.
