Much of today’s psychological research focuses on mental health treatment. Whether a patient completes treatment depends on many variables, and professionals often find it difficult to determine the best course of treatment for a given patient. According to an Idaho University study, 21% of mental health patients terminate their treatment prematurely, which makes it crucial to find a method that accurately predicts treatment completion.

With the rise of machine learning, we now have powerful tools to do just that. This project uses machine learning to predict treatment completion for people diagnosed with major depressive disorder. Using data from the National Comorbidity Survey Replication (NCS-R), a group of UC Berkeley students trained a dense neural network (DNN) to predict treatment completion from variables including sex, age, other demographics, socioeconomic status, medication, treatment type, and symptoms.

The students encountered several challenges along the way. The biggest was gaining access to a large psychological dataset. Although they found a large national study (the NCS-R) to work with, they ended up using only a small subset of it because much of the data was not usable. By applying data engineering techniques, they were nevertheless able to train several machine learning models on that subset. After experimenting with models including k-nearest neighbors (KNN), Random Forest, and XGBoost, they ultimately chose a DNN for its relatively high accuracy and F1 score.
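To make the two selection metrics concrete, here is a minimal sketch of how accuracy and F1 can be computed for a binary outcome. It is not the students' code. The function name `accuracy_and_f1`, the toy labels, and the choice to treat early termination as the positive class are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy labels (assumption): 1 = terminated early, 0 = completed treatment.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_completes = [0] * 10                        # never predicts early termination
some_model = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]        # catches one of the two dropouts

baseline_scores = accuracy_and_f1(y_true, always_completes)
model_scores = accuracy_and_f1(y_true, some_model)
print("baseline (accuracy, F1):", baseline_scores)
print("model    (accuracy, F1):", model_scores)
```

In this toy example both sets of predictions have the same accuracy, but the baseline that never predicts early termination has an F1 of zero. That is one reason to report F1 alongside accuracy when the outcome of interest is the less common class.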

Based on the promising results of their project, the students hope to see machine learning methods used more widely in future psychological research on mental health treatment.