Site Reliability Engineer

SS&C Technologies Holdings
Waltham, MA
Full-time

Job Description

Location(s): Waltham, MA (Hybrid)

Get To Know The Team

The SS&C Intralinks team is currently searching for a Site Reliability Engineer to join their team.

What You Will Get To Do

  • Respond to and resolve escalated incidents arising from customer issues or monitoring alerts.
  • Perform in-depth analysis of incident root causes.
  • Work with R&D and architecture teams on defects and runtime inefficiencies identified in the production environment.
  • Build diagnostic and analytical tools that improve MTTA, MTTD, and MTTR.
  • Build system and site monitoring tools for system health and APIs to ensure smooth operation of production systems.
  • Configure and integrate commercially available monitoring tools into the production systems.
  • Validate and verify software deliverables for production readiness.
  • Assess and mitigate the risk of changes to the production systems.

What You Will Bring

  • Strong work experience in Unix / Linux.
  • BS or MS in Computer Science or a similar discipline.
  • Strong knowledge of Java web-based enterprise applications.
  • Strong work experience and troubleshooting skills in Kubernetes systems.
  • Strong work experience in microservices using Kubernetes and container applications.
  • Working experience with AWS, including CloudWatch, EKS, EFS, S3, Redshift, and other AWS services.
  • Working experience in using AWS tools to troubleshoot applications (resource constraints, connectivity, alerting, and monitoring).
  • Sound knowledge of one of the major programming languages (such as Java) and of performance tuning.
  • Ability to automate mundane tasks using shell scripts, Python, etc.
  • Experience working with one or more of the following: Splunk, Dynatrace, Zabbix, Prometheus, etc.
  • Experience working with one or more of the following: Oracle, PostgreSQL, MongoDB.
  • Experience working with messaging subsystems: RabbitMQ, Interconnect, AMQ.
  • Working experience with Jenkins and Git.

Preferred

  • Minimum 7 years of experience in developing software projects / applications.

Posted 30+ days ago