Site Reliability Engineer

Cypress HCM
Newton, MA, United States
$38.73 an hour
Full-time

Site Reliability Engineer 2

Description :

Reason : Special Project
Department : Stock US Eng 6 Months

Job Summary

We are seeking a skilled Site Reliability Engineer (SRE) Level 2 to join our dynamic team. The ideal candidate will have a strong technical background, excellent problem-solving skills, and a passion for enhancing system reliability and performance.

You will play a crucial role in monitoring, automating, and optimizing our infrastructure to ensure the seamless operation of our services.

Key Responsibilities :

  • System Monitoring and Incident Response : Monitor system health, performance metrics, and availability. Respond promptly to incidents and outages, ensuring minimal downtime.
  • Infrastructure Management : Manage and optimize both cloud and on-premises infrastructure using Infrastructure as Code (IaC) tools.
  • Automation : Develop and maintain automation scripts and tools to enhance operational efficiency and reduce manual tasks.
  • Collaboration : Work closely with development teams to implement CI/CD practices and improve deployment processes.
  • Capacity Planning : Analyze usage patterns and forecast capacity needs to ensure system scalability and reliability.
  • Documentation : Create and maintain comprehensive documentation for systems, processes, and incident response protocols.
  • Security Best Practices : Implement and enforce security measures to protect infrastructure and data.
  • Post-Incident Reviews : Conduct post-mortems on incidents to identify root causes and implement corrective actions.

Required Skills :

  • Strong knowledge of Linux/Unix systems and proficiency in scripting languages (e.g., Python, Bash).
  • Familiarity with cloud platforms (e.g., AWS) and their services.
  • Experience with containers and container orchestration (e.g., Docker, Kubernetes).
  • Proficiency in using monitoring and alerting tools (e.g., Prometheus, Grafana, Nagios).
  • Experience with version control systems (e.g., Git).
  • Strong troubleshooting skills with the ability to diagnose complex system issues.
  • Excellent verbal and written communication skills for collaboration with cross-functional teams.
  • Understanding of Agile development practices and methodologies.

Experience :

1-4 years of experience in Site Reliability Engineering or a similar role.

Location : Newton, MA

Schedule :

  • Start Date : 10/14/2024
  • Estimated End Date : 04/07/2025
  • Hours Per Week : 40.00
  • Hours Per Day : 8.00

Compensation :

Up to $38.73/hr (W2, Non-Exempt)

33897117
