These novel methods enable researchers to use larger amounts of data than is possible with traditional statistical analysis.

UNC Thurston Arthritis Research Center (TARC) investigators recently published a paper in the journal Osteoarthritis and Cartilage on the application of machine learning methods to knee osteoarthritis phenotypes. The paper showed that this approach could assess numerous variables of different types and scales simultaneously, allowing key variables associated with disease progression to be identified in a data-driven manner.

Machine learning can be particularly helpful when researchers need to work with a large amount of data. This is especially true when the number of characteristics assessed for each person is large and the number of people studied is comparatively small, a situation known as a “high dimension, low sample size” setting. This type of analysis can identify new, previously unconsidered subgroups in the data, opening avenues of investigation that may lead to improved treatments in the future.

In the Osteoarthritis and Cartilage paper, TARC investigators found, in a single analysis, that imaging variables were more strongly related to knee osteoarthritis progression than were clinical or biomarker variables, and that specific combinations of clinical, biomarker, and imaging features were associated with progression or non-progression, confirming prior findings that required many separate analyses.

Dr. Amanda Nelson will present a poster on a related analysis using data from the Johnston County Osteoarthritis Project at the Osteoarthritis Research Society International (OARSI) meeting on May 3-4, 2019. In this project, machine learning methods helped identify potential clusters in the data (based on genetic polymorphisms) that had unique risk factor profiles for progression. She has also been invited to present this work in a poster tour, and will summarize her work to date in this rapidly developing research area at the Osteoarthritis Phenotype Research Discussion Group Meeting on May 4.