Title: Site Reliability Engineer

Company: Dremio

Location: Santa Clara, CA, US

Description

Dremio is an exciting, early-stage, Series B, well-funded startup in hyper-growth mode. We are backed by Lightspeed, Norwest, Redpoint and Cisco Ventures. 
Dremio’s data platform makes it ridiculously easy to discover and query huge datasets regardless of the data’s format and location. The Dremio Distributed Querying Processor can query data across on-prem and cloud data sources, including Amazon S3, ADLS, RDBMS, NoSQL, HDFS and more. Dremio accelerates data retrieval with its proprietary Data Reflections™, which make retrieval up to 1000x faster.
 
At Dremio we are committed to the open source software model. We are the co-creators of Apache Arrow and many of us have been actively committing to projects for nearly a decade. We use a number of open source projects to build Dremio, including projects we embed in our platform.
 
We’re looking for people with a strong background or interest in building successful products or systems. You’re comfortable dealing with lots of moving pieces, working in a fast-paced environment and handling ambiguity.
 
Founded in 2015, Dremio is headquartered in Santa Clara, CA. Connect with Dremio on GitHub, LinkedIn, Twitter, and Facebook, and visit https://www.dremio.com/careers/ for more information on the opportunities at Dremio.
 
About the position:
 
Dremio’s SREs ensure that our internal and externally visible services have reliability and uptime appropriate to users' needs and a fast rate of improvement. 
 
You will be joining a newly formed team that will spearhead our efforts to launch a cloud service. This is an opportunity to join a fast-growing startup and help build a cloud service from the ground up.

Responsibilities and Ownership

      • Debug and optimize code and automate routine tasks.
      • Evangelize and advocate for reliability practices across our organization.
      • Collaborate with other Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, monitoring/alerting, capacity planning and launch reviews.
      • Analyze and optimize our core product by developing and implementing reliability and performance practices.
      • Scale systems sustainably through automation, and evolve systems by pushing for changes that improve reliability and velocity.
      • Be on-call for services that the SRE team owns.
      • Practice sustainable incident response and blameless postmortems.
 

Qualifications

      • You are interested in designing, analyzing and troubleshooting large-scale distributed systems.
      • You have a systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
      • You have a great ability to debug and optimize code and automate routine tasks.
      • You have a solid background in software development and architecting resilient and reliable applications.
      • You have an excellent command of cloud services on AWS/GCP/Azure, Kubernetes and CI/CD pipelines.
 
Dremio doesn't accept unsolicited agency resumes and won't pay fees to any third-party agency or firm that doesn't have a signed agreement with Dremio.
