Overview
Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in an increasingly interconnected world. Our solutions are designed to not only meet the current demands of the digital landscape but to also anticipate and adapt to future challenges.
At Arista, we value the diversity of thought and perspectives that each employee brings to the table. We believe that fostering an inclusive environment, where individuals from various backgrounds and experiences feel welcome, is essential for driving creativity and innovation.
Our commitment to excellence has earned us several prestigious awards, such as Best Engineering Team, Best Company for Diversity, Compensation, and Work-Life Balance. At Arista, we take pride in our track record of success and strive to maintain the highest standards of quality and performance in everything we do.
The Opportunity
This is not a traditional operations role. You will inherit a set of critical, manual, hands-on operational responsibilities essential to our customers' success. We need you to help lead the effort to systematically dismantle this operational burden through automation, tooling, and systems. You will work with a collaborative team of excellent engineers and a counterpart in the same role on both the manual toil and the systems we need to engineer.
The short-term needs are manual deployments, reactive troubleshooting, and on-call escalations. But we need you to help us build a system where programmatic solutions have replaced human intervention. You must have the pragmatism to manage the current reality and the systematic impatience to build its replacement.
Success in this role requires a dual mindset. You must be a skilled incident leader who can stabilize a crisis and a deliberate systems architect who can prevent the next one. You will work closely with our internal tools, platform, and product engineering teams to channel your direct operational knowledge into durable, long-term solutions.
What You'll Do
Your work will follow a deliberate trajectory from reactive execution to proactive design.
Qualifications
You must have a strong background in Site Reliability Engineering or a closely related DevOps function. You must also have a strong command of Linux systems administration and an understanding of networking fundamentals (TCP/IP, DNS, routing).
You must have experience working directly with external customers to solve difficult technical problems. Your communication must be clear, empathetic, and precise.
You need production experience with a major cloud provider, preferably AWS. You should be proficient in its core concepts and services (VPC, EC2, IAM, S3) and have experience building and managing infrastructure as code with tools like Terraform.
You will be responsible for both building and using our observability stack. This requires hands-on experience instrumenting applications and managing the telemetry pipelines for metrics, logs, and traces. A core part of the role is then applying this data to debug complex production incidents, understand system behavior, and define SLOs.
You must be proficient in writing code to automate operational tasks. Expertise in a high-level language like Python or Go is required, as are strong shell scripting skills (e.g., Bash).
Skills
We use this software extensively in the product in customer environments. Experience here is not required but it is a plus.
We use Nix / NixOS extensively so knowing them helps, but they will not play a large role in your initial responsibilities. We'll train you on the job if you've never used Nix before.
We value functional programming-oriented principles (compositionality, immutability, etc.). You are not required to know functional programming, but some exposure is a plus, as is a willingness to learn.
This is a hybrid work environment where office presence may be required 1-2 days a week.
Additional Information
Arista Networks is an equal opportunity employer. Arista makes all hiring and employment-related decisions in a non-discriminatory manner without regard to race, color, national origin, religion, sex, familial status, disability, age, or any other factor determined to be unlawful under applicable federal, state, or local law. All your information will be kept confidential according to EEO guidelines.
Senior Reliability Engineer • Austin, TX, US