Site Reliability Engineer - US Government

Palantir Technologies
Washington, DC, US
Full-time

A World-Changing Company

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

The Role

We’re looking for Site Reliability Engineers who can help us build, operate, and maintain high-performance, scalable, and reliable services for our production infrastructure, across both cloud & on-prem environments.

Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes, with the creativity to develop novel solutions to evolving challenges.

Our team strives to automate processes wherever possible, using whichever tools are best for the job. You’ll be the expert on the environments in which you operate infrastructure, helping partner teams build & configure their software to run reliably within them.

We strongly believe in engineering teams being responsible for the operations of their services in production. In this role, you’ll work closely with engineers to advocate for and participate in sensible, scalable systems design, and share responsibility with them in diagnosing, resolving, and preventing production issues.

Core Responsibilities

  • Maintain availability of cloud & physical Linux servers that power the Palantir platform in air-gapped production environments.
  • Design, deploy, and operate infrastructure to support customer & product requirements via modern orchestration & monitoring platforms.
  • Collaborate closely with product teams on requirements & SLOs for deploying software into air-gapped environments.
  • Identify, troubleshoot, and solve network & systems issues.
  • Write scripts to automate away routine operational tasks.

What We Value

  • Active US security clearance, or eligibility and willingness to obtain a US security clearance.
  • Confidence in troubleshooting complex systems issues independently using stack traces and observability & systems tools.
  • Comfort managing large-scale production systems and technologies, including configuration management, load balancing, monitoring & alerting infrastructure, and container orchestration.
  • Demonstrated ability to continuously learn and work independently, making decisions with minimal supervision while working in secure facilities.
  • Experience with containers (Docker / Podman) and orchestration (OpenShift / Kubernetes) at scale is a plus.
  • Preferred certifications: DoD 8570 IAT Level II or greater (CISSP, Security+), Unix / Linux Computing Environment (e.g., Linux+, RHCE).
  • Proficiency with scripting in Python or Go is a plus.

What We Require

  • 5+ years of experience with Linux system administration (RHEL or equivalent preferred).
  • Experience with cloud-based hosting platforms like AWS, Azure, or GCP, and/or experience with hardware-based environments.
  • Familiarity with monitoring systems using tools like Prometheus and writing health checks.

Life at Palantir

We want every Palantirian to achieve their best outcomes. That’s why we celebrate individuals’ strengths, skills, and interests, from your first interview to your long-term growth, rather than rely on traditional career ladders.

Paying attention to the needs of our community enables us to optimize our opportunities to grow and helps ensure many pathways to success at Palantir.

Promoting health and well-being across all areas of Palantirians’ lives is just one of the ways we’re investing in our community.

In keeping with Palantir’s values and culture, we believe employees are better together and in-person work affords the opportunity for more creative outcomes.

Therefore, we encourage employees to work from our offices to foster connectivity and innovation. Many teams do offer hybrid options (WFH a day or two a week), allowing our employees to strike the right trade-off for their personal productivity.

Based on business need, there are a few roles that allow for Remote work on an exceptional basis. If you are applying for one of these roles, you must work from the state in which you are employed.

If the posting is specified as Onsite, you are required to work from an office.

Palantir is committed to promoting a culture of diversity, equity, and inclusion and is proud to be an Equal Employment Opportunity and Affirmative Action employer.

We believe that all Palantirians share the responsibility of upholding our commitment to these values and encourage candidates from a wide range of backgrounds, perspectives, and lived experiences to join us in solving the world’s hardest problems.

Palantir does not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

Palantir is committed to working with and providing reasonable accommodations to qualified individuals with physical and mental disabilities.

Palantir is committed to making the job application process accessible to everyone. If you are living with a disability (visible or not visible) and need to request a reasonable accommodation for any part of the application or hiring process, please reach out and let us know how we can help.

5 days ago