Site Reliability Engineer - TITAN Program

Palantir Technologies
Washington, DC, US
Full-time

A World-Changing Company

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

The Role

Palantir has been selected as the prime contractor for the development and delivery of the Tactical Intelligence Targeting Access Node (TITAN) ground station system, the Army’s next-generation deep-sensing capability enabled by artificial intelligence and machine learning (AI / ML).

The TITAN team will focus on developing 10 TITAN prototypes, including five Advanced and five Basic variants, and on integrating new critical technologies to modernize the sensor-to-shooter workflow in support of Army long-range precision fires.

TITAN is a ground station that has access to Space, High Altitude, Aerial, and Terrestrial sensors to provide actionable targeting information for enhanced mission command and long range precision fires.

Palantir’s TITAN solution is designed to maximize usability for Soldiers, incorporating tangible feedback and insights from Soldier touch points at every step of the development and configuration process.

Building off Palantir’s prior work delivering AI capabilities for the warfighter, Palantir is deploying the Army’s first AI-defined vehicle and is looking for teammates to help deliver next generation capability to meet the needs of future conflict.

Our U.S. Government team is looking for a skilled Site Reliability Engineer with an unbending commitment to security to join us.

The ideal candidate will combine engineering experience and a drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges.

Core Responsibilities

  • Collaborate with cross-functional teams to ensure the reliability, scalability, and performance of our systems and applications.
  • Design, implement, and maintain containerized environments.
  • Contribute to automation infrastructure and tools to streamline operations, deployment, and monitoring processes.
  • Monitor and troubleshoot system and application performance, identifying and resolving issues to minimize downtime.
  • Implement and maintain robust security measures to protect systems and data.
  • Collaborate with development teams to optimize application performance and ensure seamless integration of new features.
  • Participate in on-call rotations and respond to incidents to ensure timely resolution.
  • Stay up-to-date with the latest industry trends and technologies, proactively finding opportunities for improvement.

What We Value

  • Strong proficiency in Linux or Windows system administration and troubleshooting.
  • In-depth knowledge and practical experience with containerization solutions such as OpenShift, Kubernetes, Rancher, or MicroK8s.
  • Proficiency with programming and scripting languages such as Go, Bash, Java, and Python.
  • Familiarity with automation and configuration management tools (e.g., Ansible, Chef, Puppet).
  • Understanding of systems and network security principles and best practices.
  • Excellent problem-solving skills and the ability to analyze complex issues.
  • Strong communication and collaboration skills to work effectively in cross-functional teams.
  • Ability to operate autonomously and without day-to-day direction.

What We Require

  • Willingness to travel up to 25% of the time.
  • Experience with cloud platforms (e.g., AWS, Azure, Google Cloud Platform) or on-premise hardware.
  • Knowledge of DevOps and DevSecOps principles and practices.
  • Experience with CI / CD pipelines and related tools (e.g., Jenkins, GitLab CI).
  • Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Understanding of database systems and SQL.
  • An active U.S. security clearance.

Life at Palantir

We want every Palantirian to achieve their best outcomes. That’s why we celebrate individuals’ strengths, skills, and interests, from your first interview to your long-term growth, rather than rely on traditional career ladders.

Paying attention to the needs of our community enables us to optimize our opportunities to grow and helps ensure many pathways to success at Palantir.

Promoting health and well-being across all areas of Palantirians’ lives is just one of the ways we’re investing in our community.

Learn more at Life at Palantir and note that our offerings may vary by region.

In keeping with Palantir’s values and culture, we believe employees are better together and that in-person work affords the opportunity for more creative outcomes.

Therefore, we encourage employees to work from our offices to foster connectivity and innovation. Many teams do offer hybrid options (WFH a day or two a week), allowing our employees to strike the right trade-off for their personal productivity.

Based on business need, there are a few roles that allow for Remote work on an exceptional basis. If you are applying for one of these roles, you must work from the state in which you are employed.

If the posting is specified as Onsite, you are required to work from an office.

Palantir is committed to promoting a culture of diversity, equity, and inclusion and is proud to be an Equal Employment Opportunity and Affirmative Action employer.

We believe that all Palantirians share the responsibility of upholding our commitment to these values and encourage candidates from a wide range of backgrounds, perspectives, and lived experiences to join us in solving the world’s hardest problems.

Palantir does not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

Palantir is committed to working with and providing reasonable accommodations to qualified individuals with physical and mental disabilities.

Please see the United States Department of Labor’s EEO poster, EEO poster supplement, and Pay Transparency Notice for additional information.

Palantir is committed to making the job application process accessible to everyone. If you are living with a disability (visible or not visible) and need to request a reasonable accommodation for any part of the application or hiring process, please reach out and let us know how we can help.
