Staff Software Engineer - Observability

Databricks
Mountain View, California, US
Full-time

P-186

At Databricks, we are inspired to help data teams solve the world’s toughest problems, from security threat detection to cancer drug development.

We do this by building and running the world’s best data and AI infrastructure platform, so our customers can focus on the high-value challenges that are central to their own missions.

Our engineering teams build technical products that fulfill real, important needs in the world. We always push the boundaries of data and AI technology, while operating with the security and scale that are important to making customers successful on our platform.

We develop and operate one of the largest-scale software platforms. The fleet consists of millions of virtual machines, generating terabytes of logs and processing exabytes of data per day.

At our scale, we observe cloud hardware, network, and operating system faults, and our software must gracefully shield our customers from any of the above.

As a software engineer in the Runtime Observability team, you will develop observability solutions that provide insights into the health and performance of our products and infrastructure.

You will report directly to the Director of Engineering.

The Impact You Will Have:

  • You will collaborate with different teams to identify metrics that allow engineers to observe how well the system and different subcomponents are performing.
  • You will build tooling and infrastructure to allow components to emit, log, and aggregate metrics that can be displayed on dashboards and used for alerting.
  • You will scale the observability solutions to support millions of instances and billions of queries per day.
  • You will develop processes and training for developers and field engineers to debug performance and reliability issues affecting customers.

What We Look For:

  • BS (or higher degree) in Computer Science or a related field.
  • 4+ years of production-level experience in one of: Java, Scala, C++, or a similar language.
  • Experience developing software for large-scale distributed systems.
  • Familiarity with metrics collection, health monitoring, and observability tools.
  • Experience building relationships with developers and field engineers to facilitate assessment and mitigation of performance and reliability problems.
  • 6+ years of production-level experience in one of: Java, Scala, C++, or a similar language.
  • Experience driving large projects involving multiple teams and providing guidance on developing large-scale systems that can handle billions of queries per day.

Benefits:

  • Comprehensive health coverage including medical, dental, and vision.
  • 401(k) plan.
  • Equity awards.
  • Flexible time off.
  • Paid parental leave.
  • Family planning.
  • Gym reimbursement.
  • Annual personal development fund.
  • Work headphones reimbursement.
  • Employee Assistance Program (EAP).
  • Business travel accident insurance.
