Talent.com
Principal Site Reliability Engineer


DMV IT Service • Washington, DC, US

Job Title: Principal Site Reliability Engineer

Location: Washington, D.C.

Employment Type: Contract

About Us:

DMV IT Service LLC, founded in 2020, is a trusted IT consulting firm specializing in IT infrastructure optimization, cybersecurity, networking, and staffing solutions. We partner with clients to achieve technology goals through expert guidance, workforce support, and innovative solutions. With a client-focused approach, we also provide online training and job placements, ensuring long-term IT success.

Job Purpose:

We are seeking a highly skilled Principal Site Reliability Engineer to lead and elevate the reliability, scalability, and security of critical infrastructure systems. This position requires a seasoned technical professional with deep expertise in infrastructure automation (IaC), CI/CD architecture, and cloud security, combined with hands-on experience in Site Reliability Engineering (SRE) principles such as SLOs, error budgets, and incident management. The ideal candidate will provide technical leadership, mentor cross-functional teams, and ensure systems are built for performance, resilience, and efficiency.

Requirements

Key Responsibilities:

  • Reliability & Operations: Establish and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs); oversee incident response, root cause analysis, and continuous service improvement initiatives.
  • Infrastructure Automation: Architect and manage scalable and secure cloud infrastructures using Infrastructure-as-Code (IaC) tools such as Terraform, Ansible, and CloudFormation.
  • CI/CD Optimization: Build and optimize secure CI/CD pipelines (e.g., GitHub Actions, Jenkins) with automated rollbacks, canary and blue-green deployments, and artifact validation processes.
  • Observability & Monitoring: Develop advanced observability systems by creating dashboards, configuring alerts, and implementing synthetic checks for complete system visibility.
  • Security Integration: Embed security testing and compliance tools (SAST, DAST, SBOM, secret scanning) into deployment workflows and enforce security policies-as-code.
  • Cost & Capacity Management: Track and optimize cloud costs, manage capacity planning, and ensure efficient infrastructure utilization and uptime.
  • Platform Enablement: Develop self-service tools and shared frameworks that enhance developer efficiency and maintain delivery consistency.
  • Leadership & Mentorship: Act as a technical leader, mentor engineering teams, and champion best practices in reliability, automation, and secure delivery.

Required Skills & Experience:

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • At least 5 years of experience in SRE, DevOps, or Platform Engineering, with leadership in reliability and automation.
  • Minimum 3 years managing production-grade cloud systems using modern security and observability tools.
  • Strong expertise in AWS, Azure, or GCP, especially in Compute, Networking, and IAM.
  • Hands-on proficiency with Terraform, CloudFormation, Kubernetes, and Docker.
  • Solid background in Linux systems, shell scripting, and programming in Python, Go, or Bash.
  • Proficient with observability tools such as Prometheus, Grafana, ELK, Datadog, or CloudWatch.
  • Proven experience designing and managing secure CI/CD pipelines and GitOps workflows.
  • Deep understanding of SRE practices, including chaos engineering, SLO/SLA management, and capacity modeling.
  • Strong documentation, communication, and leadership skills with a record of improving operational standards.