Sr Site Reliability Engineer - FedRAMP

Alation
Fontana, California, US
$191.5K-$220K a year
Full-time

Big Data isn’t a problem. It’s an opportunity.

At Alation, we help people find, understand, and trust data, so they not only excel in their work but also drive value for their enterprise, team, and role.

In the words of one customer, "Alation makes me look like a rockstar."

We help companies you know and trust empower their people with the best data every day. Alation helps Discover Financial Services quickly generate value from their data to create the product and customer service innovations that help the iconic credit card company remain number one in customer satisfaction.

And real estate giant Keller Williams uses Alation to govern the more than 70 TB of data that empowers their global team of over 190,000 agents.

With $340M in funding, a valuation of over $1.7 billion, and 550+ customers, including 35% of Fortune 100 companies, Alation is poised to capitalize on data as an opportunity.

Headquartered in Silicon Valley, Alation was named to Inc. Magazine’s Best Workplaces list for the fourth time. Do you want to join a team that welcomes new ideas, supports your growth, and recognizes your unique value?

Join us!

Job Description

The SRE Team manages world-class Alation cloud infrastructure and ensures our state-of-the-art services' reliability, availability, and performance.

We seek a Senior Site Reliability Engineer (SRE) to manage and enhance the configuration, stability, performance, and network connectivity of our Commercial and FedRAMP cloud offerings.

What You’ll Do

  • Manage and run backend systems like Kubernetes, RDBMS, AWS services, and everything in between
  • Work closely with internal partners and teams to ensure that we ship software that meets security, SLA, and performance requirements
  • Maintain the Alation platform by diagnosing, predicting, and correcting scaling problems
  • Participate in the on-call rotation, identify issues, and drive them to resolution while conducting blameless RCA
  • Write deployment plans, execute upgrades, develop documentation and capacity plans, and troubleshoot production issues
  • Coach and mentor junior-level engineers and contractors
  • Ensure Operational and ITIL best practices are documented and followed

Must Have

  • Must be a US citizen located on US soil to be considered for this position
  • BS / MS in Computer Science or equivalent and at least 5 years of experience in Technical Operations production SaaS roles
  • Strong working knowledge of Kubernetes & Docker
  • Experience with a high-level language such as Python, Ruby, Go, or Java
  • Experience with a SaaS Product at Scale
  • Strong understanding of DevOps / Agile Principles
  • Good note-keeping (Confluence) and ticket management skills (Confluence, JIRA)
  • Strong experience with Linux / Unix Systems and cloud providers like AWS / Azure / GCP
  • Experience working with Infrastructure as Code (IaC) at scale using tools like Terraform, Chef, and Ansible
  • Good understanding of networking and messaging between services
  • Curiosity, growth mindset, and always willing to learn new technologies
  • Good communication and strong interpersonal skills
  • Experience with monitoring and alerting tools (e.g. Prometheus / Grafana, Datadog)
  • Experience with logging tools (e.g. Datadog, ELK, Loki)
  • Proven experience leading complex projects

Nice To Have

  • Experience with SCM tools (e.g. Git, GitHub)
  • Experience with SQL and NoSQL databases
  • Ability to work across multiple teams and good knowledge of AWS cloud services (e.g. EKS, RDS, IAM, Lambda)
  • Prior FedRAMP experience

Security And Privacy Responsibilities

This position carries special Security and Privacy Responsibilities for protecting the U.S. Federal Government’s interests:

  • Know, acknowledge, and follow system-specific security policies and procedures
  • Protect data and individual privacy per requirements and regulations
  • Perform ongoing activities in compliance with service and contractual obligations
  • Participate in role-based training, completing assignments on a timely basis
  • Report security issues promptly and aid investigation when needed
  • Support controlled changes and vulnerability remediation activities
  • Work collaboratively with Information Security in designing, implementing, assessing or enhancing system-specific security and privacy controls

Compensation Pay Range

$191,481.00 - $220,000.00

Salary Information

The base salary range is specific to the United States. The salary of the final candidate selected for this role will be set based on a variety of factors, including but not limited to internal equity, experience, education, work location, specialty and training.

If the final candidate has a different level of experience, the base salary target range may be lower or higher than what is published.

Alation, Inc. is an Equal Employment Opportunity employer. All qualified applicants will receive consideration for employment without regard to that individual’s race, color, religion or creed, national origin or ancestry, sex (including pregnancy), sexual orientation, gender identity, age, physical or mental disability, veteran status, genetic information, ethnicity, citizenship, or any other characteristic protected by law.

The Company will strive to provide reasonable accommodations to permit qualified applicants who have a need for an accommodation to participate in the hiring process (e.g., accommodations for a job interview) if so requested.

This company participates in E-Verify.
