


Senior Staff Engineer - DevOps






Bangalore, IN


Qubole, the leading cloud-agnostic, big data-as-a-service provider, is passionate about making data-driven insights easily accessible to anyone. Qubole delivers the industry’s first autonomous data platform. The cloud-based data platform, Qubole Data Service (QDS), removes the burden of maintaining infrastructure of multiple big data processing engines, and enables customers to focus on their data. Qubole customers process nearly an exabyte of data every month. Qubole investors include Charles River, Institutional Venture Partners, Lightspeed, Norwest, Harmony and Singtel Innov8. 
We have a rapidly growing footprint on AWS, a fast-growing customer base, and up-and-coming services on GCE and Azure as well. We strongly believe in automating and codifying as many of our operational procedures as possible. Running this service securely, reliably, and within budget is a hard problem. As one of our dedicated Production Engineers, you would be responsible for managing the cloud infrastructure end to end.
If you understand the challenges of managing dozens of production environments across regions on various public clouds as a SaaS platform, we would love to talk to you.

What you'll be doing:

    • Design and improve tools, and write elegant automation, to improve the deployment, administration, and monitoring of large-scale web services across AWS, Azure, and GCP.
    • Work with development teams to harden, enhance, and document our systems, establish processes, and generally improve their operability, supportability, and resiliency.
    • Drive cloud architecture design discussions and also innovate to improve the environment in which the services run.
    • Assist in the configuration/build-out of new deployments to facilitate our constant growth.
    • Develop software/processes for better utilization of underlying cloud resources.
    • Own and deliver projects aimed at improving infrastructure for various needs like monitoring, log analysis, alerting, deployment, etc.
    • Work with Security Managers to establish and document security controls and procedures.
    • Troubleshoot and resolve live production issues by analyzing logs from different sources.
    • Serve as an escalation point for pager duty on-call during major outages.
    • Automate yourself out of the job if possible.

Required experience and skills:

    • Engineering degree in Computer Science and at least 10 years of experience in a similar job profile.
    • Expert automation skills with Python, Ruby, or Go.
    • Strong system administration background for Linux-based systems.
    • Large-scale production experience with Kubernetes, AWS EKS, or other PaaS technologies.
    • Operational expertise in deploying and managing components like MySQL, Nginx, Elasticsearch, Java applications, RoR, and load balancers.
    • Comfortable with networking fundamentals like firewalls, subnetting, routing, etc.
    • Experience working with configuration and deployment management tools like Chef, Puppet, Ansible, or Salt.
    • Experience with monitoring and logging using ELK, Datadog, SignalFx, Graphite, and StatsD.
    • Expert in cloud orchestration tools like Terraform.
    • Experience in optimizing and troubleshooting issues that span public clouds, systems, network, and code.
    • Good RESTful API and systems design sensibilities.

Apply for the job
