Atlassian Services Site Reliability Engineer

Apple
San Diego
Full-time

Summary :

The Atlassian Services Site Reliability Engineer (SRE) role resides within the Software Delivery organization, which is at the core of the Apple software release process.

This role is responsible for applying SRE practices in maintaining Atlassian services, which are used by software engineers and project managers to develop Apple software for delivery to customers around the world.

The Atlassian Services team drives reliability and performance engineering of data center applications, instruments observability of services, responds to incident alerts, and reports on SLI/SLO metrics for visibility across the organization.

This SRE role is essential to maintaining the Bitbucket, Confluence, and Jira production systems that are used to deliver state-of-the-art operating systems, applications, and firmware to Apple customers.

Key Qualifications :

- Passion for building reliable, scalable, and performant distributed systems
- Understanding of distributed systems with respect to application, networking, and security
- SRE or DevOps experience in managing customer-facing systems in a 24/7 environment
- Experience in managing and monitoring fleets of *nix systems or container platforms
- Excellent judgment and integrity, with the ability to make timely and sound decisions
- Ability to anticipate the needs of others and adapt to changing conditions

Description :

As an Atlassian Services Site Reliability Engineer, responsibilities include:
- Configure and monitor on-prem and cloud-based dependencies
- Automate continuous integration (CI) and continuous delivery (CD) pipelines
- Maintain staging and production environments with the goal of maximizing uptime
- Implement observability of systems for monitoring, alerting, and metrics reporting
- Generate reports regarding service metrics on performance, availability, and reliability
- Champion practices regarding change control management and incident response

A successful Atlassian Services Site Reliability Engineer will be expected to:
- Proactively communicate the status of Atlassian services to stakeholders and follow through on time-sensitive tasks
- Demonstrate willingness to ask for clarification and increase awareness of the larger context
- Explore solutions to problems, evaluate risk vs. reward, then execute the best approach
- Communicate asynchronously with a global team across multiple time zones
- Document new processes or update existing documentation pages
- Be eager and curious to learn across multiple technology stacks

Additional Requirements :

Desired, but not required, skills and experiences:
- Experience as an SCM administrator (e.g. GitHub or similar)
- Experience with container platforms (e.g. Docker or similar)
- Experience with monitoring and alerting (e.g. Prometheus, Grafana, or similar)
- Experience with data analysis (e.g. Splunk or similar)

30+ days ago