Table 2. A summary of reviewed studies on knee osteoarthritis diagnosis and knee arthroplasty prediction

From: Artificial intelligence in diagnosis of knee osteoarthritis and prediction of arthroplasty outcomes: a review

| Author (Year) | Journal | Prediction outcome | AI/ML algorithm(s) | Statistical performance | Strengths | Weaknesses | Clinical significance of study |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Norman (2019) [18] | Journal of Digital Imaging | OA severity (KL grade) | DenseNet neural network architectures | Sensitivity & specificity: 84% & 86% (KL grades 0–1), 70% & 84% (KL grade 2), 69% & 97% (KL grade 3), 86% & 99% (KL grade 4). | Comparable sensitivity and specificity to manual KL grading and to previous automatic systems employing different AI/ML algorithms. | Training, validation and testing sets were selected from the same dataset. Misclassifications of KL grade typically occurred when there was hardware in the knee. | Provides additional data supporting the potential of AI in automatic assessment of OA radiological severity. |
| Tiulpin (2018) [19] | Scientific Reports | OA severity (KL grade) | Deep Siamese CNN architecture | Average multi-class accuracy: 66.71%. AUC: 0.93. Kappa coefficient (agreement with expert annotations on test dataset): 0.83 (excellent). MSE value: 0.48. | Different datasets used for initial training and testing. | Validation and testing sets were selected from the same dataset. | The provision of probability distributions for each KL grade prediction may assist clinicians in choosing a KL grade in ambiguous cases. |
| Heisinger (2020) [13] | Journal of Clinical Medicine | Need for TKA | Artificial neural networks (ANNs) with linear, radial basis function and three-layer perceptron neural network architectures | Total percentage of correctly predicted knees: 80%. Positive predictive value: 84%. Negative predictive value: 73%. Sensitivity: 41%. Specificity: 30%. | First study to consider longitudinal change in symptomatology (pain, function, quality of life) and radiographic structural change in a 4-year period prior to TKA. | Training and testing sets were selected from the same dataset. | Future externally validated algorithms that can predict TKA need in advance using routinely available patient data could be highly useful for referral and triage decisions in a primary care setting. |
| Leung (2020) [15] | Radiology | Need for TKA | Multitask deep learning model (ResNet34) trained with transfer learning | AUC: 0.87. Sensitivity: 83%. Specificity: 77%. | First study to directly predict TKA from knee radiographs using a deep learning model. | Limited data size (radiographs from 728 individuals in total) / Training and testing sets were selected from the same dataset. | TKA prediction models solely based on radiological data have limited clinical utility, although they may serve as a reference for future ML studies. |
| El-Galaly (2020) [12] | Clinical Orthopaedics and Related Research | Need for early revision TKA | LASSO regression, random forest classifier, gradient boosting model, neural network | AUCs: 0.57–0.60. | First study to predict early revision TKA (≤ 2 years after primary TKA) using preoperative patient data from arthroplasty registries / Temporal external validation was conducted (testing set selected from a separate hold-out year not included in the training set). | Training and testing sets were selected from the same dataset. | Results from this study suggest that future models predicting early revision TKA may benefit from including more preoperative information or predicting revision over a longer follow-up duration. |
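
The statistical performance column reports sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and overall accuracy. As a reading aid, the minimal sketch below shows how these quantities follow from the four cells of a binary confusion matrix. The names `ConfusionCounts` and `diagnostic_metrics` and the example counts are hypothetical illustrations; they are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # true positives: condition present, predicted present
    fp: int  # false positives: condition absent, predicted present
    tn: int  # true negatives: condition absent, predicted absent
    fn: int  # false negatives: condition present, predicted absent


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return standard diagnostic metrics as fractions in [0, 1].

    Assumes every denominator is non-zero, i.e. there is at least one
    actual positive, actual negative, predicted positive and predicted
    negative case.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of actual positives detected
        "specificity": c.tn / (c.tn + c.fp),  # share of actual negatives detected
        "ppv": c.tp / (c.tp + c.fp),          # share of positive calls that are correct
        "npv": c.tn / (c.tn + c.fn),          # share of negative calls that are correct
        "accuracy": (c.tp + c.tn) / total,    # share of all cases classified correctly
    }


# Made-up counts, for illustration only.
example = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are computed within the actual positive and actual negative groups respectively, whereas PPV and NPV also depend on how common the outcome is in the test set, so the two pairs of values are not directly comparable across studies with different outcome prevalence.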