TITLE

Software Engineer - Distributed Query Processing

COMPANY

Dremio

LOCATION

Santa Clara, CA, US

Description

Dremio is the Data Lake Engine company. Our mission is to reshape the world of analytics to deliver on the promise of data with a fundamentally new architecture, purpose-built for the exploding trend towards cloud data lake storage such as AWS S3 and Microsoft ADLS. We dramatically reduce and even eliminate the need for the complex and expensive workarounds that have been in use for decades, such as data warehouses (whether on-premises or cloud-native), structural data prep, ETL, cubes, and extracts. We do this by enabling lightning-fast queries directly against data lake storage, combined with full self-service for data users and full governance and control for IT. The results for enterprises are extremely compelling: 100X faster time to insight, 10X greater efficiency, zero data copies, and game-changing simplicity. Equally compelling is the market opportunity for Dremio, as we are well on our way to disrupting a $25BN+ market.
 
If you, like us, say “bring it on” to exciting challenges that really do change the world (no BS), we have endless opportunities where you can make your mark.
 
Founded in 2015, Dremio is headquartered in Santa Clara, CA. Connect with Dremio on GitHub, LinkedIn, Twitter, and Facebook, and visit https://www.dremio.com/careers/ for more information on the opportunities at Dremio.
 
About the Role
Query Processing engineers at Dremio own the development of the distributed query processing engine that powers Dremio’s Data Lake Engine.

Responsibilities and Ownership

    • Own the full development cycle, from inception and design through development, testing, and production.
    • Work on problems such as distributed query planning, parallel query execution, schedulers, resource management, low-latency access to distributed storage, auto-scaling, and self-healing.
    • Care deeply about modular design patterns that deliver an architecture rooted in simplicity, one that is easy to iterate on and can constantly evolve.
    • Be passionate about quality, zero-downtime upgrades, availability, resiliency, and uptime of the platform.

Requirements

    • B.S. in Computer Science and/or Math. M.S. and Ph.D. in a related technical field or equivalent practical experience
    • Fluency in Java and/or C++ with 4+ years of experience developing production-level software
    • Strong foundation in data structures, algorithms, multi-threaded and asynchronous patterns, and their application to developing scalable systems
    • Strong relational database fundamentals and understanding of relational algebra and operators, cost-based query optimization, and query parallelization
    • Background in database internals and distributed query processing, and experience working on massively parallel processing (MPP) databases or data platforms
    • Strong knowledge of SQL
    • Experience with NoSQL and related systems (e.g. Hadoop, Spark, MongoDB, Elasticsearch) a plus
    • Understanding of distributed data stores and file systems such as HDFS, S3, or ADLS a plus
    • Excellent communication skills and affinity for collaboration and teamwork
    • Interest in and motivation to be part of a fast-moving startup with a fun and accomplished team
    • Startup experience a plus

Dremio doesn't accept unsolicited agency resumes and won't pay fees to any third-party agency or firm that doesn't have a signed agreement with Dremio.
