Data Scientist - CAFE



Health Catalyst



Salt Lake City, UT, US


About Health Catalyst

Health Catalyst was named as one of the 30 Best Workplaces in Technology by Fortune Magazine and the 11th best place to work by Glassdoor. Health Catalyst earned the highest overall score in Healthcare BI by KLAS and was named to the World’s Best 100 cloud companies by Forbes. Health Catalyst analyzes healthcare records of almost a third of the US population (65 million patients) and recently released the first open source software for healthcare machine learning:

Health Catalyst’s platform and applications are used at leading health systems, including John Muir Health, Kaiser Permanente, MultiCare Health System, Partners HealthCare, Providence Health & Services, Stanford Hospital & Clinics, Texas Children’s Hospital, and over 40 others. Health Catalyst products and services are used in over 400 hospitals and 4,000 clinics, supporting over 90 million patients.

Our team lives the cultural attributes of Smart, Hardworking, and Humble. Learn more about working at Health Catalyst here:

Job Summary

This position is a unique career opportunity to make an enormous, unprecedented impact on the US healthcare system using a large, detailed healthcare dataset. Health Catalyst’s CAFÉ product line aggregates data from Health Catalyst’s 50+ clients (representing health systems and payers), encompassing 100+ million patients and their data, and it grows every day. Ultimately, CAFÉ will be one of the world’s largest, most detailed healthcare datasets. Using the CAFÉ data, we’re creating machine learning models and software to direct hospitals and payers to their biggest improvement opportunities. This position will lead data science and machine learning development for the CAFÉ product line.

Is this You?

You are an expert in using Python for data science
You love learning about the latest machine learning developments
You care about making a positive impact in US healthcare
You are self-motivated and comfortable working independently under general direction

Duties & Responsibilities

You’ll develop the recommendation engine we use to direct users to improvement opportunities
You’ll develop machine learning models for healthcare risk adjustment
You’ll add to our machine learning pipeline
You’ll participate in planning sprints, software design, and developing and testing features

Required Skills

SQL: Intermediate to advanced query writing, including CTEs, aggregations, window functions, and pivots
Python: scikit-learn, pandas, Matplotlib, etc.
Comfortable working in the command line and with source control
Understand the pros and cons of various supervised learning modeling approaches, e.g., predictive power, bias/variance trade-off, and computational requirements
Comfortable with basic concepts from probability, statistics, and linear algebra

Desired Skills

You are an expert in either electronic health record or administrative claims data
You understand object-oriented programming and how to apply it to data science
You are world-renowned at ping pong

Education & Relevant Experience

BS/BA in computer science, mathematics, physics, statistics, economics, or a related field
Master’s or PhD preferred

The above statements describe the general nature and level of work being performed in this job function. They are not intended to be an exhaustive list of all duties, and indeed additional responsibilities may be assigned by Health Catalyst.
