Senior Site Reliability Engineer

Federal Reserve System
San Francisco, CA
$110.2K-$151.6K a year
Full-time
Part-time

Company

Federal Reserve Bank of Richmond

The Richmond Fed is the proud home of the Federal Reserve’s National IT organization, a nationwide team delivering technology solutions and support across the Federal Reserve System.

Many National IT employees are located in Richmond, while others are based across the U.S. at other Federal locations.

When you join our team, you’ll become part of a culture that welcomes differences, cares about our communities, and empowers each other to lead from where we are to make things better.

Bring your passion and we’ll provide challenging and purposeful careers in a variety of fields, opportunities to grow and a wide range of benefits and perks that support your health and wealth.

It’s all part of what makes #MyRichmondFed a great place to work!

About the Opportunity

As a Senior Cloud Reliability Engineer in the SRE Chapter, you will be accountable for implementing reliability practices, using software as the means, for the cloud foundational product line in the Federal Reserve.

The SRE Chapter is part of the Cloud Solutions & Services department and has overall responsibility for the reliability of the numerous cloud foundational environments in the Federal Reserve System (FRS).

What Will Be Expected of You

  • Works as part of cloud foundational platform squads to demonstrate and champion site reliability culture and practices, and exerts technical influence throughout the team
  • Solves reliability problems on cloud platforms using software engineering principles
  • Develops and maintains automation, scripts and code that replace manual work and improve the reliability and stability of the cloud platform
  • Develops, integrates and maintains synthetics (canaries) code to establish the health of the platform
  • Leads SLI, SLO and error budget efforts in collaboration with the product team, instrumenting and visualizing them to proactively manage the stability of cloud platforms
  • Implements observability (logs, metrics, traces) and monitoring for cloud foundational platforms
  • Defines chaos experiments in collaboration with product owners and conducts those experiments
  • Develops and mentors junior engineers on the team
  • Performs other duties as assigned

Qualifications

  • 5-7 years of experience in the end-to-end enterprise software development life cycle, including maintenance and support
  • 3+ years of experience in observability and SRE practices.
  • Bachelor’s degree in Computer Science, Information Systems, or an equivalent background or experience.
  • The ideal candidate loves building and maintaining reliable and scalable systems and is passionate about continuous improvement.
  • Self-motivated individual with the ability to prioritize and manage changing priorities.
  • Strong analytic and problem-solving skills.
  • Strong customer focus and communication skills.
  • Independent critical thinking and decision-making abilities.
  • Excellent written and oral communication abilities.

Expertise you will bring:

  • Extensive knowledge and experience of working in AWS environments
  • Software development experience with one of the following languages: Python, Go (Golang)
  • Experience with observability and tools like Dynatrace, Prometheus, Grafana, AWS CloudWatch, AWS Canary, AWS EventBridge
  • Expertise in automation and tooling.
  • Working experience in Agile and Scaled Agile environments
  • Experience supporting infrastructure for large multi-service applications.
  • Knowledge of secure coding standards and the banking environment is a plus.

Discover the Reason Why So Many People Love It Here!

When you join the Richmond Fed, not only will you find a challenging and purposeful career, you’ll also have access to a wide range of benefits and perks that support your health and wealth, including:

  • Great medical benefits
  • Pension and 401(k) with employer match
  • Paid time off
  • Tuition reimbursement
  • Employee resource networks
  • Paid volunteer leave
  • Flexible work options
  • Onsite amenities that make working here fun!

Other Requirements and Considerations:

  • Candidates should review the to ensure compliance with conflict-of-interest rules and personal investment restrictions.
  • If you need assistance or an accommodation due to a disability, please notify .
  • Employees who work at and/or visit another Federal Reserve entity or outside business as part of their job duties are required to comply with any onsite safety and health protocols of those organizations (including, but not limited to, requirements to vaccinate or test, mask, social distance, etc.).
  • Sponsorship is not available for this role. The selected candidate will be subject to a government security investigation and must meet eligibility requirements for access to classified information.

Eligibility for this specific position requires U.S. Citizenship.

  • The hiring range for the Engineer Senior position is $110,200 to $151,580 annually.
  • For candidates outside Richmond, VA, listed hiring and salary ranges may be adjusted based on your geographic location.
  • Salary offered will be based on the job responsibilities and the individual’s knowledge, skills, and experience as defined in the job qualifications.
  • Applications are reviewed on a rolling basis. Interested candidates are strongly encouraged to apply by Sep 13th, 2024.

Full Time / Part Time

Full time

Regular / Temporary

Regular

Job Exempt (Yes / No)

Job Category

Information Technology

Work Shift

First (United States of America)

Always verify and apply to jobs on Federal Reserve System Careers or through verified Federal Reserve Bank social media channels.

30+ days ago