Artificial intelligence in diagnosis of knee osteoarthritis and prediction of arthroplasty outcomes: a review

Lee, Lok Sze; Chan, Ping Keung; Wen, Chunyi; Fung, Wing Chiu; Cheung, Amy; Chan, Vincent Wai Kwan; Cheung, Man Hong; Fu, Henry; Yan, Chun Hoi; Chiu, Kwong Yuen

doi:10.1186/s42836-022-00118-7

Arthroplasty

Table 2 A summary of reviewed studies on knee osteoarthritis diagnosis and knee arthroplasty prediction

Author (Year)	Journal	Prediction outcome	AI/ML algorithm(s)	Statistical performance	Strengths	Weaknesses	Clinical significance of study
Norman (2019) [18]	Journal of Digital Imaging	OA severity (KL grade)	DenseNet neural network architectures	Sensitivity & specificity: 84% & 86% (KL grades 0–1), 70% & 84% (KL grade 2), 69% & 97% (KL grade 3), 86% & 99% (KL grade 4).	Comparable sensitivity and specificity to manual KL grading and previous automatic systems employing different AI/ML algorithms	Training, validation and testing sets were selected from the same dataset. Misclassifications of KL grading typically occurred when there was hardware in the knee.	Provides additional data supporting the potential of AI in automatic assessment of OA radiological severity.
Tiulpin (2018) [19]	Scientific Reports	OA severity (KL grade)	Deep Siamese CNN architecture	Average multi-class accuracy: 66.71%. AUC: 0.93. Kappa coefficient (agreement with expert annotations on test dataset): 0.83 (excellent). MSE value: 0.48.	Different datasets used for initial training and testing	Validation and testing sets were selected from the same dataset.	The provision of probability distributions for each KL grade prediction may assist clinicians in choosing a KL grade in ambiguous cases.
Heisinger (2020) [13]	Journal of Clinical Medicine	Need for TKA	Artificial neural networks (ANNs) with linear, radial basis function and three-layer perceptron neural network architectures	Total percentage of correctly predicted knees: 80%. Positive predictive value: 84%. Negative predictive value: 73%. Sensitivity: 41%. Specificity: 30%.	First study to consider longitudinal change in symptomology (pain, function, quality of life) and radiographic structural change in a 4-year period prior to TKA	Training and testing sets were selected from the same dataset.	Future externally validated algorithms that can predict TKA need in advance using routinely available patient data could be highly useful for referral and triage decisions in a primary care setting.
Leung (2020) [15]	Radiology	Need for TKA	Multitask deep learning model (ResNet34) trained with transfer learning	AUC: 0.87. Sensitivity: 83%. Specificity: 77%.	First study to directly predict TKA from knee radiographs using a deep learning model	Limited data size (radiographs from 728 individuals in total) / Training and testing sets were selected from the same dataset.	TKA prediction models solely based on radiological data have limited clinical utility, although they may serve as a reference for future ML studies.
El-Galaly (2020) [12]	Clinical Orthopaedics and Related Research	Need for early revision TKA	LASSO regression, random forest classifier, gradient boosting model, neural network	AUCs: 0.57–0.60.	First study to predict early revision TKA (≤ 2 years of primary TKA) using preoperative patient data from arthroplasty registries / Temporal external validation was conducted (testing set selected from a separate hold-out year not included in training set).	Training and testing sets were selected from the same dataset.	Results from this study suggest that future models predicting early revision TKA may benefit from including more pre-operative information or predicting revision over a longer follow-up duration.
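
As a reading aid for the "Statistical performance" column, the following minimal Python sketch shows how sensitivity, specificity, positive and negative predictive value, and overall accuracy are derived from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and are not taken from any of the reviewed studies.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return standard diagnostic metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly ruled out
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only (not data from the table).
metrics = binary_metrics(tp=40, fp=10, tn=90, fn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Because all five quantities come from the same four counts, they constrain one another: overall accuracy, for example, always lies between sensitivity and specificity.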

ISSN: 2524-7948