Site Reliability Engineer

Altius Technologies, Inc.

San Jose, CA, United States

Full-time

- Create and support automation scripts (shell, Ansible, Python) for infrastructure deployments, validations, and monitoring to improve operational tasks
- Schedule monitoring scripts using cron and Airflow
- Monitor using tools including Dynatrace, Apica, and Grafana
- Database handling
- Build CI/CD pipelines
- Incident handling and problem management

Mandatory Skills:
- Experience in Ansible / Python
- Monitoring tools: Dynatrace / Apica / Grafana

Required Experience:
- 14+ years of IT infrastructure experience
- Extensive experience working with Linux flavors such as RHEL and CentOS, shells, filesystems, and utilities
- Experience with Python and Ansible
- Knowledge of distributed computing and experience working with container orchestration frameworks, including on-prem and Rancher Kubernetes, and good knowledge of Kubernetes objects
- Experience working with storage, ONTAP preferred: volumes, aggregates, backups, DR planning
- Experience scheduling monitoring scripts using cron and Airflow
- Experience with monitoring tools including Dynatrace, Apica, and Grafana
- Database knowledge, including SQL and NoSQL databases
- Experience building CI/CD pipelines (preferred)
- Cloud platform knowledge (specifically AWS) is required

8 days ago

Related jobs

Senior Site Reliability Engineer - Storage Platform

NVIDIA

Santa Clara, California

Site Reliability Engineering (SRE) is an engineering discipline that involves designing, building, and maintaining large-scale production systems with high efficiency and availability. It encompasses various areas, including software and systems engineering practices, storage, data management, and s...

Software Engineer III, Site Reliability Engineering, Google Cloud

Google

Sunnyvale, California

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. Master's degree in Computer Science or Engineering. SRE ensures that Google Cloud's services—both our internally critical and our externally-visib...

Site Reliability Engineer

CV Library

Santa Clara, California

Work with development teams to ensure that applications have scalability and reliability built-in from day one - Agile is second nature to you and you're excited to work in scrum teams and represent the SRE perspective. Design and enhance software architecture to improve scalability, service reliabi...

Site Reliability Engineer (SRE)

Redolent Infotech Pvt. Ltd.

Sunnyvale, California

Azure DevOps Engineer @ Sunnyvale, CA. Create frameworks, processes, and best practices to be used across Engineering. ...

Site Reliability Engineer

Atlassian

Mountain View, California

As a Site Reliability Engineer (SRE) you will actively work to improve the performance and reliability of services as well as address root causes of incidents and reduce incident rates. Love staying ahead of the growth curve and experimenting with new software and environments? Get on board as an At...

Senior Site Reliability Engineer - DGX Cloud

Nvidia Corporation

Santa Clara, California

Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and availability using the combination of software and systems engineering practices. Senior Site Reliability Engineer - DGX Cloud. SRE at NVIDI...

Site Reliability Engineer Graduate (Edge Platform) - 2024 Start (BS/MS)

ByteDance

San Jose, California

Participate in technical operations and rotations in response to performance and reliability issues. Graduate with Bachelor's or Master's degree in Software Development, Computer Science, Computer Engineering, or a related technical discipline. ...

Staff Site Reliability Engineer - Federal (US Citizen)

Zscaler

San Jose, California

Position: Staff Site Reliability Engineer. Resolve escalations and help prevent reiteration of incidents with process, monitoring and reliability improvements. Relevant experience preferably in an Operations or Engineering environment. ...

Principal Site Reliability Engineer (SASE)

Palo Alto Networks

Santa Clara, California

Experience in Site Reliability Engineering, Production Engineering, or DevOps. As a Principal Site Reliability Engineer, you will be part of a team supporting the services running on this infrastructure. This includes automation, architecture, performance, observability, troubleshooting, security, a...

Site Reliability Engineer Graduate (Technical Infrastructure) - 2025 Start (BS/MS)

ByteDance

San Jose, California

Our data infrastructure Site Reliability Engineering (SRE) team is a pioneer in innovation. Establish sustainable mechanisms for scaling systems, such as automation, to drive enhancements in reliability, efficiency, and velocity. ...
