Tech Ops-Site Reliability Engineer - 30264

Splunk Inc
Connecticut, United States
Full-time

Join us as we pursue our disruptive vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers.

At Splunk, we’re committed to our work, customers, having fun and most significantly to each other’s success. Learn more about Splunk careers and how you can become a part of our journey!

Role : Splunk is looking for a TechOps Engineer with the ability to provide day-to-day technical expertise for our Splunk Cloud Azure TechOps team and the Splunk organization.

This position is responsible for making key technical decisions that help drive the operational infrastructure that delivers Splunk’s customer-facing SaaS systems.

As a TechOps Engineer, you will interface with other cross-functional leaders on key strategic initiatives. You will partner with senior engineers to solve difficult problems.

You will help grow and mentor the broader operational team as well as interact with senior leadership to propose solutions.

We're looking for someone to bring a fresh approach to problems of all shapes and sizes and help us build a top-notch Splunk Cloud TechOps team.

This position is a remote role available in USA, Plano, TX, with the ability to support FedRAMP Moderate / High environments.

You will :

  • Own Splunk Cloud in Microsoft Azure environments and Amazon AWS FedRAMP
  • Work across the organization to deliver quality products that delight Splunk's passionate users.
  • Lead teams of tight-knit engineers who are building a state-of-the-art, cloud-based environment for massive-scale data processing.
  • Mentor and help new engineers to achieve more than they thought possible. You enjoy making other teams successful and are fulfilled through the success of others.
  • Attain the Splunk Cloud Certified Architect certification within the first 12 months of your hire date.

Qualifications :

  • You have experience or an interest in working with regulated computing environments such as FISMA and / or FedRAMP and are enthusiastic about doing it better.
  • Experience working within an Azure environment
  • Experience working in a fully remote position and team
  • You are passionate about building and running distributed systems at scale in production. You understand the challenges and trade-offs to be made when building and deploying systems to production.
  • You constantly consider "How can I automate this process?"
  • Knowledge of best practices related to security, performance, and disaster recovery.
  • Skilled in identifying performance bottlenecks, spotting anomalous system behavior, and determining the root cause of incidents.
  • Experience monitoring cloud environments using tools like Splunk, VictorOps and Nagios
  • You care about good documentation and appreciate how it allows a distributed team to function.
  • Ability to tackle complex problems, resolve operational issues, and interact with vendors to find solutions.
  • Comfortable working with critical, customer-facing issues and able to prioritize quickly when escalations happen.
  • Deep understanding of Linux systems (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.), or equivalent certification
  • Must have the AZ-900 Azure Fundamentals certification; the AZ-104 Azure Administrator certification is preferred
  • You've demonstrated the skills to effectively work across teams and functions to influence design, operations and deployment of highly available software.
  • You are interested in working hard to make the users of Splunk's products happier every day.
  • Ability to work nights, weekends and on-call shifts

Preferred skills :

  • Experience monitoring cloud environments with Splunk.
  • Experience with at least one programming language, preferably Go (Golang) or Python. Knowledge of working with and automating Linux systems tasks using this language is required, including working with configuration files and system services. Knowledge of common data structures and algorithms, as well as their performance characteristics, is required.
  • Experience with large scale distributed cloud service development, infrastructure, traffic management and architecture.
  • Experience with distributed architectures / systems with optimized and scalable software that operates on a large number of nodes.
  • Familiarity with GitLab, Puppet, Jenkins, clustering, web apps, and YAML
  • Ability to support FedRAMP Moderate environments.