A Blueprint for Better Preventive Care

The healthcare industry is shifting towards value-based care that emphasizes prevention. As a result, the search for models of preventive care that can effectively reduce the cost of chronic disease has intensified. Social determinants of health, which drive a great deal of the chronic disease epidemic, need to be addressed upstream, before they manifest as medical conditions that require treatment.

Financial solvency for many healthcare organizations rests on their ability to engage the populations they serve in order to help drive health behavior change. Research has shown that a personalized approach to prevention is much more effective than any other model. However, providers are limited in their ability to coordinate personalized care with the traditional tools of case management or public health.

Modeling Patient Behaviors

Broadly, technology advancements such as machine learning, the Internet of Things, and geographic information systems (GIS) can help providers significantly enhance their ability to engage healthcare users.

Back in 2014, I leveraged over 1000 hours spent rounding in hospitals with clinicians as a clinical software designer and technology consultant to create ProjectVision.  This was a machine learning platform that focused on predicting the likelihood of a patient successfully completing a typical 6-9 month diabetes management program.

The platform was based on a 5-step approach to understanding how patients interact with their daily environment from a health perspective:

  1. Initial assessment and grouping of patients based on psychological readiness for health behavior change
  2. Sharing personalized recommendations of healthy activities with patients via smartphone app
  3. Capturing GPS data each day to assess environmental features (crime, traffic, grocery stores, and more) that patients are exposed to daily. Optionally, capturing data from connected wearable devices
  4. Updating recommendations as needed based on environmental risks and changes to psychological readiness
  5. Repeating step 4 every 30 days


Building the Model

One of the most critical pieces of this approach is properly quantifying psychological readiness for behavior change.  Our starting point was the Health Belief Model.  Per Wikipedia, the Health Belief Model “is a psychological health behavior change model developed to explain and predict health-related behaviors, particularly in regard to the uptake of health services.  The health belief model was developed in the 1950s by social psychologists at the U.S. Public Health Service and remains one of the best known and most widely used theories in health behavior research.”

The model is built around 7 core variables:

  • Perceived severity – the patient’s understanding of the severity of their health problem
  • Perceived susceptibility – the patient’s belief in their risk of developing a health problem
  • Perceived benefits – the patient’s belief in benefits of taking action to address their health problem
  • Perceived barriers – the patient’s belief in barriers preventing them from taking action
  • Individual demographics and psychosocial features (we looked at age, gender, ethnicity, dietary habits, exercise habits, stress level, and ABSI)
  • Cues to action – things that the patient perceives as triggers for making change
  • Self-efficacy – the patient’s confidence in their own ability to successfully take action

To capture this information from each patient, we built an onboarding survey into our patient smartphone app.

Based on how the patient responded, we then presented them with daily diet, mental health, and physical health challenges tailored to their level of “activation.”
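As a rough illustration of how survey answers might be turned into an activation level, here is a minimal sketch. It assumes each construct is answered on a 1-5 scale and that barriers count against readiness; the item names, the averaging rule, and the tier cut-offs are all hypothetical and are not the scoring ProjectVision actually used.

```python
from statistics import mean

# Assumption: every survey item is a 1-5 Likert answer. Names are illustrative.
POSITIVE_ITEMS = ("severity", "susceptibility", "benefits", "cues_to_action")
NEGATIVE_ITEMS = ("barriers",)  # a higher answer here means lower readiness


def activation_score(responses: dict) -> float:
    """Average the answers, reverse-scoring items that work against readiness."""
    scores = [responses[item] for item in POSITIVE_ITEMS]
    scores += [6 - responses[item] for item in NEGATIVE_ITEMS]
    return float(mean(scores))


def activation_tier(score: float) -> str:
    """Bucket a 1-5 score into a tier; the cut-offs are arbitrary placeholders."""
    if score < 2.5:
        return "low"
    if score < 3.75:
        return "medium"
    return "high"


example = {"severity": 4, "susceptibility": 4, "benefits": 5,
           "cues_to_action": 3, "barriers": 2}
example_tier = activation_tier(activation_score(example))
```

A tier like this could then decide which starter set of challenges a new patient sees.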

Enter Machine Learning

This is where we felt a machine learning approach was necessary. While health beliefs, or “activation,” were a decent enough starting point, we were also sure that there were deeper psychographic and environmental factors that impacted actual behavior, regardless of how psychologically primed a patient was for change.

In order to capture those factors, we focused on two things. First, we (after asking for permission, of course) captured GPS data from each patient’s phone. From there, we built a simple behavior risk calculator to quantify environmental factors that had a potential negative impact on healthy behaviors within a 100-meter radius of each set of GPS coordinates captured. The overall score for each patient was based on a 7-day rolling calculation.
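One way to picture such a calculator is sketched below. It assumes each environmental feature is a (latitude, longitude, weight) tuple and that the patient score is the mean per-ping risk over the last 7 days; the weights, the data shapes, and the averaging rule are assumptions made for illustration, not the original formula.

```python
import math
from datetime import date, timedelta

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def ping_risk(ping, features, radius_m=100.0):
    """Sum the weights of all risk features within radius_m of one GPS ping."""
    lat, lon = ping
    return sum(weight for (flat, flon, weight) in features
               if haversine_m(lat, lon, flat, flon) <= radius_m)


def rolling_risk(daily_pings, features, as_of, window_days=7):
    """Mean per-ping risk over the window_days ending on as_of (inclusive)."""
    start = as_of - timedelta(days=window_days - 1)
    scores = [ping_risk(ping, features)
              for day, pings in daily_pings.items() if start <= day <= as_of
              for ping in pings]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: two weighted risk features and a few days of pings.
features = [(37.0, -122.0, 2.0), (37.01, -122.0, 5.0)]
daily_pings = {
    date(2024, 1, 10): [(37.0005, -122.0), (37.5, -122.0)],
    date(2024, 1, 1): [(37.01, -122.0)],  # falls outside the 7-day window
}
score = rolling_risk(daily_pings, features, as_of=date(2024, 1, 10))
```

The haversine formula is used here because GPS pings are latitude/longitude pairs; at a 100-meter scale a flat-earth approximation would also work.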

Second, we attached descriptive metadata to each health challenge. The goal was to break down each challenge into one or more characteristics that we could use to add more depth to our understanding of each patient’s preferences. Based on which challenges patients selected and completed, we could learn more about which specific health activities were most suited to their personality.

Diet Challenge Examples

Characteristic 1: Vegetarian vs. non-vegetarian
Characteristic 2: Involves substitution, addition, or reduction
Characteristic 3: Involves morning meal, afternoon meal, or evening meal
Characteristic 4: Involves single vs. multiple food items
Characteristic 5: Involves social vs. solitary activity
Characteristic 6: Involves liquids vs. solid food
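To make the idea concrete, the characteristics above could be stored as tags on each challenge and rolled up into a per-patient preference profile. The sketch below is only illustrative: the challenge IDs, tag names, and the "share of completed challenges" measure are hypothetical.

```python
from collections import Counter

# Hypothetical diet challenge catalog, tagged with the six characteristics.
CHALLENGES = {
    "swap_soda_for_water": {
        "diet": "vegetarian", "change": "substitution", "meal": "afternoon",
        "items": "single", "setting": "solitary", "form": "liquid"},
    "add_greens_to_dinner": {
        "diet": "vegetarian", "change": "addition", "meal": "evening",
        "items": "single", "setting": "social", "form": "solid"},
    "smaller_breakfast_portions": {
        "diet": "non-vegetarian", "change": "reduction", "meal": "morning",
        "items": "multiple", "setting": "solitary", "form": "solid"},
}


def preference_profile(completed_ids):
    """Share of completed challenges carrying each (characteristic, value) tag."""
    counts = Counter()
    for challenge_id in completed_ids:
        for characteristic, value in CHALLENGES[challenge_id].items():
            counts[(characteristic, value)] += 1
    total = len(completed_ids)
    return {tag: n / total for tag, n in counts.items()} if total else {}


profile = preference_profile(["swap_soda_for_water", "add_greens_to_dinner"])
```

A profile like this gives one number per tag, which is the kind of feature vector a clustering step can work with.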

The initial challenges we fed to users were based strictly on how we classified them following the initial onboarding survey.  Our goal was to get as close to 100% daily challenge completion rate as possible for each patient by the end of the 6-9 month diabetes management program that our platform was used to support.  In order to get there, we reassessed each patient every 30 days based on behavior risk score, characteristics of challenges they actually completed, and level of “activation.”
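The two quantities in this paragraph, the daily challenge completion rate and the 30-day reassessment cadence, can be sketched as follows. The function names and the choice to trigger on exact multiples of 30 days since enrollment are assumptions for illustration.

```python
from datetime import date

REASSESS_EVERY_DAYS = 30  # cadence described in the text


def completion_rate(assigned: int, completed: int) -> float:
    """Fraction of assigned daily challenges that the patient completed."""
    return completed / assigned if assigned else 0.0


def reassessment_due(enrolled_on: date, today: date) -> bool:
    """True on each 30-day mark after enrollment (assumed trigger rule)."""
    days_in = (today - enrolled_on).days
    return days_in > 0 and days_in % REASSESS_EVERY_DAYS == 0
```

For example, a patient who completed 24 of 30 assigned challenges would have a rate of 0.8, and a patient enrolled on January 1 would be due for reassessment on January 31.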

At each 30-day mark, we used k-means clustering to group patients based on preferred characteristics across each of diet, mental health, and physical health challenges. Then, we used collaborative filtering based on clinical goals established by the onboarding survey responses to further home in on a specific list of updated challenges. This new list was based on what other patients with similar clinical goals and shared preferences of challenge characteristics completed over the previous 30 days.
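The two-stage idea can be sketched in a few lines of plain Python. This is a simplified stand-in, not the production pipeline: the k-means here is basic Lloyd's algorithm seeded with the first k points, and the "collaborative filtering" step is reduced to ranking challenges by how many same-cluster, same-goal peers completed them. The patient records, preference vectors, and goal labels are hypothetical.

```python
import math
from collections import Counter


def kmeans(points, k, iterations=20):
    """Basic Lloyd's algorithm; centroids are seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def recommend(patient, patients, labels, top_n=3):
    """Rank challenges completed by peers in the same cluster with the same goal."""
    me = patients[patient]
    votes = Counter()
    for idx, peer in enumerate(patients):
        if idx == patient:
            continue
        if labels[idx] == labels[patient] and peer["goal"] == me["goal"]:
            votes.update(peer["completed"] - me["completed"])
    return [challenge for challenge, _ in votes.most_common(top_n)]


# Hypothetical patients: a 2-dimensional preference vector, a goal, and the
# challenges completed over the previous 30 days.
patients = [
    {"prefs": (0.9, 0.1), "goal": "lower_a1c", "completed": {"swap_soda", "morning_walk"}},
    {"prefs": (0.1, 0.9), "goal": "lower_a1c", "completed": {"group_cooking"}},
    {"prefs": (0.8, 0.2), "goal": "lower_a1c", "completed": {"swap_soda", "add_greens"}},
    {"prefs": (0.85, 0.1), "goal": "weight_loss", "completed": {"skip_dessert"}},
    {"prefs": (0.95, 0.05), "goal": "lower_a1c", "completed": {"add_greens", "morning_walk"}},
]
labels = kmeans([p["prefs"] for p in patients], k=2)
suggestions = recommend(0, patients, labels)
```

Clustering first keeps recommendations within a group of patients who like the same kinds of challenges, while the goal filter keeps them clinically relevant. A real deployment would more likely rely on a tested library implementation of both steps.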

Deploying the Platform

Since we were dealing with both structured user demographics data as well as unstructured challenge and GPS data, we organized core functions using a microservices approach.  We stored the unstructured data as JSON via MongoDB and the structured data via MySQL.  The microservices layer was deployed via Docker on an AWS EC2 instance, while the challenge assignment and behavior risk calculations were performed via triggered jobs run on AWS Lambda.
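For readers unfamiliar with triggered jobs on AWS Lambda, a Python handler is simply a function that receives an event and a context object. The sketch below shows that general shape only; the event fields, the placeholder scoring function, and the response body are hypothetical and do not reproduce the original jobs.

```python
import json


def score_exposures(exposures):
    """Placeholder scoring: average of per-ping risk values passed in the event."""
    return sum(exposures) / len(exposures) if exposures else 0.0


def lambda_handler(event, context):
    """Entry point in the (event, context) form AWS Lambda uses for Python."""
    patient_id = event["patient_id"]  # assumed event field
    score = score_exposures(event.get("exposures", []))  # assumed event field
    body = {"patient_id": patient_id, "behavior_risk": score}
    return {"statusCode": 200, "body": json.dumps(body)}
```

Keeping the scoring logic in its own function, separate from the handler, makes it testable without any AWS infrastructure.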



We worked with 6 different diabetes management programs across California and Nevada, each of which used our platform and app as part of a 6-9 month program. To use our system, each program built the onboarding survey in our patient-facing app into its own initial onboarding process.

Prior to implementing our platform, the historical average program completion rate across the 6 programs was 32%.  With our approach, we were able to increase that to 74%.  At the top performing program, we saw a completion rate of 92%.

From a less quantifiable perspective, we were able to offer providers a way of capturing analytics around the unique preferences of their populations (and even segments within their populations). Those analytics could help drive future changes to the overall curriculum, tailoring it toward the items that have the biggest positive impact on program completion rate.

The financial incentives for most of the programs were such that payer reimbursement only occurred on successful program completion. Our approach enabled providers to rapidly scale preventive care programming that not only maximized the likelihood of sustained health behavior change, but also provided a rapid return on investment.

Shingai Samudzi

Data and decision scientist by training, Shingai has spent much of the past decade building solutions to help solve clinical, business, and customer problems.