Recommender systems have shown tremendous value for the prediction of personalized item recommendations for individuals in a variety of settings (e.g., marketing, e-commerce, etc.). User-based collaborative filtering is a popular recommender system, which leverages an individual's prior satisfaction with items, as well as the satisfaction of individuals who are “similar”. Recently, there have been applications of collaborative-filtering-based recommender systems for clinical risk prediction. In these applications, individuals represent patients, and items represent clinical data, which includes an outcome.

Methods

Application of recommender systems to a problem of this type requires recasting a supervised learning problem as an unsupervised one. The rationale is that patients with similar clinical features carry a similar disease risk. As the “Big Data” era progresses, approaches of this type are likely to be reached for more often as biomedical data continue to grow in both size and complexity (e.g., electronic health records). In the present study, we set out to understand and assess the performance of recommender systems in a controlled yet realistic setting. User-based collaborative filtering recommender systems are compared to logistic regression and random forests with different types of imputation and varying amounts of missingness on four publicly available medical data sets: National Health and Nutrition Examination Survey (NHANES, 2011-2012 on Obesity), Study to Understand Prognoses Preferences Outcomes and Risks of Treatment (SUPPORT), chronic kidney disease, and dermatology data.
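
To make the recasting concrete, the following is a minimal sketch of user-based collaborative filtering in which the outcome is treated as one more item, so predicting risk becomes filling in a missing entry of the patient-by-item matrix. It is an illustration only, not the configuration evaluated in the study: the function name `predict_outcome`, the use of cosine similarity over co-observed features, the choice of `k`, the clipping of negative similarities to zero, and the toy data are all assumptions made for this example.

```python
import numpy as np


def predict_outcome(train, target, outcome_col, k=3):
    """Estimate a missing outcome from the k most similar patients.

    train:  2-D float array, rows = patients, columns = clinical items
            (the outcome is one of the columns); NaN marks a missing value.
    target: 1-D float array for the new patient; the outcome entry is NaN.
    Hypothetical helper written for illustration only.
    """
    sims = np.full(train.shape[0], -np.inf)
    for i, row in enumerate(train):
        # Compare only on items observed for both patients, never on the outcome.
        shared = ~np.isnan(row) & ~np.isnan(target)
        shared[outcome_col] = False
        if np.isnan(row[outcome_col]) or not shared.any():
            continue  # this patient cannot serve as a neighbor
        a, b = row[shared], target[shared]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        sims[i] = (a @ b) / denom if denom > 0 else 0.0

    # Keep the k most similar patients that had a usable similarity.
    neighbors = np.argsort(sims)[::-1][:k]
    neighbors = neighbors[np.isfinite(sims[neighbors])]
    if neighbors.size == 0:
        return float("nan")

    # Assumption: dissimilar patients (negative similarity) get zero weight.
    weights = np.clip(sims[neighbors], 0.0, None)
    outcomes = train[neighbors, outcome_col]
    if weights.sum() == 0:
        return float(outcomes.mean())
    return float(np.average(outcomes, weights=weights))


# Toy, standardized features; the last column is a binary outcome.
train = np.array([
    [ 1.0,  0.9, np.nan, 1.0],
    [ 0.8,  1.1,  1.0,   1.0],
    [-1.0, -0.9, -1.1,   0.0],
    [-0.9, np.nan, -1.0, 0.0],
    [ 1.1,  1.0,  0.9,   1.0],
])

risk_high = predict_outcome(train, np.array([0.9, 1.0, np.nan, np.nan]), outcome_col=3)
risk_low = predict_outcome(train, np.array([-1.0, -1.0, -1.0, np.nan]), outcome_col=3)
print(risk_high, risk_low)
```

Because similarity is computed only over items observed for both patients, the sketch tolerates missing values without a separate imputation step. Whether that is preferable to imputing first and then fitting a supervised model is the kind of question the comparison in this study addresses.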