
Table 1 Description of machine learning models

From: Improved performance of machine learning models in predicting length of stay, discharge disposition, and inpatient mortality after total knee arthroplasty using patient-specific variables

Machine Learning Models

Description

Random Forest (RF)

Ensemble algorithm that uses many individual decision trees to generate a collective prediction. Its strength comes from injected randomness: bootstrapping creates an individual training set for each tree through sampling with replacement, and bootstrap aggregating, otherwise known as bagging, combines trees that are each trained on a shuffled subset of the variables. The algorithm works in a voting manner, so that the collective decision is supported by the number of individual trees that cast a vote for it
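As a minimal sketch of the bagging-and-voting behavior described above, the following Python example uses scikit-learn's RandomForestClassifier on synthetic data; the parameters are illustrative and are not the configuration used in the study.

```python
# Minimal sketch of bagging + voting with a random forest (scikit-learn),
# on synthetic data; not the study's feature set or tuning.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 100 trees is trained on a bootstrap sample of the training set
# and considers a random subset of features at each split.
rf = RandomForestClassifier(n_estimators=100, bootstrap=True,
                            max_features="sqrt", random_state=0)
rf.fit(X_train, y_train)

# The ensemble prediction is the majority vote across the individual trees.
print("test accuracy:", rf.score(X_test, y_test))
```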

Neural Network (NN)

Network built from layers of neurons programmed to interpret data through weighted connections (channels) during the forward propagation of decision making. Backpropagation trains the neurons by comparing the predicted output with the correct output and adjusting the weight of each channel accordingly
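A minimal sketch of forward propagation and backpropagation-based training, here using scikit-learn's MLPClassifier on synthetic data; the layer sizes are assumptions for illustration only.

```python
# Minimal sketch of a feed-forward neural network trained by backpropagation
# (scikit-learn MLPClassifier on synthetic data; architecture is illustrative).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers of neurons; forward propagation computes weighted sums,
# and backpropagation adjusts the weights to reduce the output error.
nn = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
nn.fit(X_train, y_train)
print("test accuracy:", nn.score(X_test, y_test))
```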

Extreme Gradient Boost Tree (XGBoost Tree)

Expands on existing tree algorithms by training each tree on smaller subsets of the data. This small-batch training strengthens the individual trees, while the gradient boosting process combines their collective output: each new generation of trees is built sequentially to reduce the loss left by the previous generation. The process repeats until the boosted ensemble can no longer improve upon the previous generation
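The sequential tree-boosting idea can be sketched with the xgboost package's scikit-learn interface; the parameters below (number of rounds, learning rate, row subsampling) are illustrative, not the study's settings.

```python
# Minimal sketch of gradient-boosted trees with XGBoost on synthetic data.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# booster="gbtree": each new tree is fit to the gradient of the loss left by
# the previous trees; subsample < 1 trains each tree on a smaller batch of rows.
xgb_tree = XGBClassifier(booster="gbtree", n_estimators=200,
                         learning_rate=0.1, subsample=0.8, random_state=0)
xgb_tree.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, xgb_tree.predict(X_test)))
```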

Extreme Gradient Boost Linear (XGBoost Linear)

Similar to XGBoost Tree, however, its base learners are linear models rather than trees, so it is most useful for smaller datasets or data with low noise. Gradient boosting adds one linear rule after another until a new rule can no longer improve upon the previous generation. It is generally faster than XGBoost Tree, but accuracy decreases if noise is high
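For contrast with the tree booster above, the same xgboost interface with booster="gblinear" boosts linear terms instead of trees; again, the parameters are illustrative only.

```python
# Minimal sketch of the linear booster: same XGBoost interface, but each
# boosting round updates a linear model instead of adding a tree.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# booster="gblinear" fits boosted linear terms; typically faster than gbtree
# but, as noted above, less robust when the data are noisy or non-linear.
xgb_linear = XGBClassifier(booster="gblinear", n_estimators=200,
                           learning_rate=0.5, random_state=0)
xgb_linear.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, xgb_linear.predict(X_test)))
```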

Linear Support Vector Machine (LSVM)

Classifies a dataset into two classes and performs well even with small training datasets. Each observation is represented as a point in N-dimensional space, and the LSVM finds the separating hyperplane that maximizes the margin, the distance to the nearest data points of each class, which is then used to predict outcomes
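A minimal sketch of a maximum-margin linear classifier using scikit-learn's LinearSVC on synthetic data; the feature scaling step and the value of C are standard illustrative choices, not the study's.

```python
# Minimal sketch of a linear support vector machine separating two classes
# with a maximum-margin hyperplane (scikit-learn, synthetic data).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The SVM places a hyperplane that maximizes the margin to the nearest
# training points of each class; scaling the inputs first is standard practice.
svm = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=10000))
svm.fit(X_train, y_train)
print("test accuracy:", svm.score(X_test, y_test))
```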

Chi square Automatic Interaction Detector (CHAID)

Tree model that splits parent nodes into child nodes where there are statistically significant (chi-square) differences among qualitative descriptors. Development requires large datasets so the model can identify the patterns needed to generate accurate predictions
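CHAID implementations are less standardized than the other models listed here, so the sketch below only illustrates the chi-square test at the heart of CHAID's splitting decision; the contingency table is invented for illustration.

```python
# Chi-square test of independence between a candidate predictor and the
# outcome, which CHAID uses to decide whether a parent node should be split
# into child nodes. Data below are invented for illustration.
import numpy as np
from scipy.stats import chi2_contingency

# Contingency table: rows = categories of a candidate predictor,
# columns = outcome classes (e.g., discharged home vs. not).
table = np.array([[40, 10],
                  [25, 25],
                  [10, 40]])

chi2, p, dof, expected = chi2_contingency(table)
# A small p-value indicates the predictor is associated with the outcome,
# so CHAID would accept this split and create one child node per category.
print(f"chi2={chi2:.1f}, p={p:.4f}, dof={dof}")
```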

Decision lists

Boolean function model based on “if–then-else” statements with all subsets having either a true or false functional value, which is also known as an ordered rule set. Rules in this form are usually learned with a covering algorithm, learning one rule at a time

The rules are tried in order until one fires; if no rule applies, a default rule is invoked, as sketched below
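A minimal sketch of an ordered rule set with a default rule; the features, thresholds, and rule order are invented for illustration and are not rules learned in the study.

```python
# Minimal sketch of a decision list: ordered if-then-else rules tried in turn,
# with a default rule when none fires. All features and thresholds are invented.
def decision_list(patient: dict) -> str:
    """Return a predicted discharge disposition from ordered rules."""
    if patient.get("age", 0) >= 80 and patient.get("lives_alone", False):
        return "facility"            # the first rule that fires wins
    if patient.get("asa_class", 1) >= 3:
        return "facility"
    if patient.get("walks_independently", True):
        return "home"
    return "home"                    # default rule when nothing else fires

print(decision_list({"age": 82, "lives_alone": True}))  # -> facility
print(decision_list({"age": 65}))                        # -> home
```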

Linear Discriminant Analysis (Discriminant)

Calculates summary statistics of the data in the form of means and standard deviations. Using a training data source, predictions for new data are made by assigning a class label based on each input feature. This machine learning method assumes the input variables are normally distributed and share the same overall variance
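A minimal sketch using scikit-learn's LinearDiscriminantAnalysis on synthetic data, showing the per-class summary statistics the model learns and its classification of held-out observations.

```python
# Minimal sketch of linear discriminant analysis (scikit-learn, synthetic data):
# the model summarizes each class by its mean and a shared covariance estimate,
# then assigns new observations to the most likely class.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)
print("per-class feature means:", lda.means_.shape)  # (n_classes, n_features)
print("test accuracy:", lda.score(X_test, y_test))
```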

Logistic Regression

Similar to other linear regression models, but instead of solving a regression problem it solves a classification problem. Given the independent variables of a dataset, the model outputs the probability of a binary, discrete outcome. The benefit of logistic regression is its ability to classify observations, determine which combination of variables classifies them most efficiently, and then estimate the probability that new data belong to that classification
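A minimal sketch of logistic regression producing class probabilities for a binary outcome, using scikit-learn on synthetic data; the variables are not those of the study.

```python
# Minimal sketch of logistic regression for a binary outcome
# (scikit-learn, synthetic data).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# predict_proba returns the probability of each class for new observations,
# which is how new data can be scored against the learned classification.
print("P(class=1) for first test row:", clf.predict_proba(X_test[:1])[0, 1])
print("test accuracy:", clf.score(X_test, y_test))
```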

Bayesian Networks

Probabilistic graphical machine learning models. They use a data source to estimate probabilities for prediction, anomaly detection, and time-series forecasting. The data are encoded as nodes representing the variables, which are linked to one another to indicate their influence on each other. These links are part of structure learning and are identified automatically from the data. The resulting network can then be depicted graphically as a directed acyclic graph (the classic "Asia" network is a well-known example), making the computed relationships easy to understand
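Full Bayesian-network structure learning requires a dedicated library (e.g., pgmpy); the toy sketch below only illustrates the underlying idea of a link between two nodes carrying a conditional probability table, using invented data.

```python
# Toy illustration of the node-and-link idea behind a Bayesian network:
# one parent node ("high_risk") influencing one child node ("readmit").
# Both variables and all values are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "high_risk": [1, 1, 1, 0, 0, 0, 0, 1, 0, 0],
    "readmit":   [1, 0, 1, 0, 0, 0, 1, 1, 0, 0],
})

# Conditional probability table P(readmit | high_risk), the quantity a
# Bayesian network stores on the link between the two nodes.
cpt = pd.crosstab(df["high_risk"], df["readmit"], normalize="index")
print(cpt)
print("P(readmit=1 | high_risk=1) =", cpt.loc[1, 1])
```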