Distributed Data Systems - Software Engineer

Databricks
Mountain View, California, US
$150K-$190K a year
Full-time

At Databricks, we are obsessed with enabling data teams to solve the world's toughest problems, from security threat detection to cancer drug development.

We do this by building and running the world's best data and AI infrastructure platform, so our customers can focus on the high value challenges that are central to their own missions.

Founded in 2013 by the original creators of Apache Spark, Databricks has grown from a tiny corner office in Berkeley, California to a global organization with over 1000 employees.

Thousands of organizations, from small to Fortune 100, trust Databricks with their mission-critical workloads, making us one of the fastest growing SaaS companies in the world.

Our engineering teams build highly technical products that fulfill real, important needs in the world. We constantly push the boundaries of data and AI technology, while simultaneously operating with the resilience, security, and scale that are critical to making customers successful on our platform.

We develop and operate one of the largest scale software platforms. The fleet consists of millions of virtual machines, generating terabytes of logs and processing exabytes of data per day.

At our scale, we regularly observe cloud hardware, network, and operating system faults, and our software must gracefully shield our customers from any of the above.

Modern data analysis employs sophisticated methods such as machine learning that go well beyond the roll-up and drill-down capabilities of traditional SQL query engines.

As a software engineer on the Runtime team at Databricks, you will build the next generation of distributed data storage and processing systems that can outperform specialized SQL query engines in relational query performance, yet provide the expressiveness and programming abstractions to support diverse workloads ranging from ETL to data science.

Below Are Some Example Projects:

  • Apache Spark: Develop the de facto open source standard framework for big data.
  • Data Plane Storage: Deliver reliable, high-performance services and client libraries for storing and accessing massive amounts of data on cloud storage backends, e.g., AWS S3, Azure Blob Store.
  • Delta Lake: A storage management system that combines the scale and cost-efficiency of data lakes, the performance and reliability of a data warehouse, and the low latency of streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, drastically reduce the complexity of real-world data engineering architecture.
  • Delta Pipelines: It's difficult to manage even a single data engineering pipeline. The goal of the Delta Pipelines project is to make it simple to orchestrate and operate tens of thousands of data pipelines. It provides a higher-level abstraction for expressing data pipelines and enables customers to deploy, test, and upgrade pipelines, eliminating the operational burden of building and managing high-quality data pipelines.
  • Performance Engineering: Build the next-generation query optimizer and execution engine that's fast, tuning-free, scalable, and robust.

What We Look For:

  • BS in Computer Science, related technical field or equivalent practical experience.
  • Optional: MS or PhD in databases or distributed systems.
  • Comfortable working towards a multi-year vision with incremental deliverables.
  • Driven by delivering customer value and impact.
  • 2+ years of production-level experience in Java, Scala, or C++.
  • Strong foundation in algorithms and data structures and their real-world use cases.
  • Experience with distributed systems, databases, and big data systems (Spark, Hadoop).

Benefits

  • Comprehensive health coverage including medical, dental, and vision
  • 401(k) Plan
  • Equity awards
  • Flexible time off
  • Paid parental leave
  • Family planning
  • Gym reimbursement
  • Annual personal development fund
  • Work headphones reimbursement
  • Employee Assistance Program (EAP)
  • Business travel accident insurance
  • Mental wellness resources

Our Commitment to Diversity and Inclusion

At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards.

Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Pay Range Transparency

Databricks is committed to fair and equitable compensation practices. The pay range for this role is listed below and represents the base salary range for non-commissionable roles or on-target earnings for commissionable roles.

Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to job-related skills, depth of experience, relevant certifications and training, and specific work location.

Based on the factors above, Databricks utilizes the full width of the range. The total compensation package for this position may also include eligibility for annual performance bonus, equity, and the benefits listed above.

Local Pay Range

$150,000-$190,000 USD
