- Open Access
Reliability and validity of the Hebrew version of the forgotten joint score for assessing the outcomes of total knee arthroplasty
Arthroplasty volume 3, Article number: 27 (2021)
This prospective study aimed to assess the reliability and validity of the Hebrew version of the forgotten joint score-12 in patients undergoing total knee arthroplasty, because it is going to be used in the Hebrew-speaking populations in Israel.
The English version of the forgotten joint score-12 was translated into Hebrew using standard procedures and in collaboration with its authors. Consecutive patients who had undergone total knee arthroplasty in a single hospital were asked to complete the Hebrew version of the forgotten joint score-12, the Oxford knee score, the Short Form 12, and a visual analog scale. A random subgroup of 60 patients was then asked to complete the Hebrew version of the forgotten joint score-12 a second time after an interval of at least 2 weeks. Reliability was assessed in terms of internal consistency, test-retest reliability, and split-half reliability. Validity was assessed by correlating the forgotten joint score-12 with the other outcome measures listed above.
A total of 102 patients participated in the study. The Hebrew version of the forgotten joint score-12 showed high reliability. Internal consistency was excellent (Cronbach’s α = 0.943) and test-retest reliability was high (intraclass correlation coefficient = 0.97). The forgotten joint score correlated with the Oxford knee score, Short Form 12, and visual analog scale (r = 0.86, r = 0.72, and r = -0.8, respectively), indicating high validity.
The Hebrew version of forgotten joint score-12 has excellent reliability, excellent test-retest reliability and good validity. It can be safely used for assessing outcomes of TKA.
Total knee arthroplasty (TKA) is one of the most common surgeries performed worldwide. However, approximately 20 % of patients report dissatisfaction following primary TKA. The English version of the forgotten joint score (FJS)-12 is a self-administered questionnaire for assessing joint awareness after TKA, but a language barrier may pose a challenge when the questionnaire is used in non-English-speaking populations in clinical practice [3,4,5,6,7,8,9].
The exact reason for this relatively high dissatisfaction rate is unknown. Many surgeons suggest that it might result from the inability to restore natural joint sensation. Traditional “surgeon-centered” tools have been developed to assess the outcomes of TKA, including active and passive range of motion, muscle strength, functional tasks, and implant survival [7,8,9]. However, none of these tools assesses the patient’s joint awareness.
In recent years, patient-reported outcomes (PROs) have become increasingly common, reflecting a growing interest in understanding and measuring surgical outcomes from the patient’s point of view. The forgotten joint score (FJS)-12 was developed by Behrend et al. in 2012. It is a questionnaire based on the notion that a successful surgery enables patients to be unaware of their artificial joint in daily living. The questionnaire comprises 12 self-administered questions regarding joint awareness. It is assumed that a lack of joint awareness implies a successful outcome. Joint awareness encompasses many factors that contribute to a good outcome, including knee pain, mobility, joint stiffness, daily function, and patient expectation. The original English version of the FJS-12 shows high internal consistency (Cronbach’s α = 0.95) and a good correlation with other PROs (r = 0.69–0.79). It also includes a few sociodemographic features that may impact the outcomes. Therefore, a validation study is needed before the FJS-12 is introduced to a new population, e.g., a Hebrew-speaking population.
Moreover, because of improved outcomes of TKA and increased patient expectations, many PRO tools show ceiling effects. In addition, these tools are weak at differentiating between “good” and “excellent” outcomes. The FJS-12 has shown a lower ceiling effect than other PROs, with strong discriminatory power. Its effectiveness has made it popular in many countries, including Germany, Italy, Spain, the Netherlands, France, Poland, Portugal, Sweden, Norway, China, Japan, and the Republic of Korea. It has also been used as a research tool in more than 180 papers published globally (http://www.forgotten-joint-score.info/).
The purpose of this prospective study was to examine the reliability and validity of the Hebrew version of the FJS-12 in patients undergoing TKA, since it is going to be used in the Hebrew-speaking populations in Israel.
Materials and methods
The protocol was approved by the Institutional Review Board that was responsible for human experiments in accordance with the ethical standards. All patients gave informed consent to participate in the study.
Patients were included if they had undergone a primary TKA in a single hospital between March 2018 and December 2019 and were sufficiently proficient in Hebrew. The exclusion criteria were another injury or illness of the lower limb, mental disorder, revision TKA, and lack of informed consent.
English Version of FJS-12
The FJS-12 is a questionnaire consisting of 12 items regarding a patient’s ability to “forget” the artificial joint in everyday life. The 12 questions are about daily living. For each question, there are 6 options: “never”, “almost never”, “seldom”, “sometimes”, “mostly”, and a last option, “irrelevant for me”. The score ranges from 0 to 100, with 100 representing the lowest awareness of the knee implant. If the response to more than 4 items is “irrelevant for me”, the score should not be used. The English version of the FJS-12 is shown in Table 1.
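The scoring rule described above can be illustrated with a minimal sketch. It assumes that the five graded answers are coded from 0 (“never”) to 4 (“mostly”), that items marked “irrelevant for me” are passed as `None`, and that the mean of the answered items is rescaled and inverted so that 100 means the lowest joint awareness. The coding and the function name `fjs12_score` are illustrative assumptions and are not taken from the official scoring instructions.

```python
from typing import Optional, Sequence

MAX_ITEM = 4        # assumed coding: 0 = "never" ... 4 = "mostly"
MAX_IRRELEVANT = 4  # more than 4 irrelevant items: the score is not used


def fjs12_score(items: Sequence[Optional[int]]) -> Optional[float]:
    """Return a 0-100 score (100 = lowest joint awareness), or None if unusable."""
    if len(items) != 12:
        raise ValueError("FJS-12 expects exactly 12 item responses")
    answered = [x for x in items if x is not None]
    if any(not 0 <= x <= MAX_ITEM for x in answered):
        raise ValueError("item responses must be between 0 and 4")
    # Too many "irrelevant for me" answers: do not report a score.
    if len(items) - len(answered) > MAX_IRRELEVANT:
        return None
    # Mean awareness of the answered items, rescaled to 0-100 and inverted.
    mean_awareness = sum(answered) / len(answered)
    return 100.0 - mean_awareness * (100.0 / MAX_ITEM)
```

For example, a patient who answers “never” to every item would score 100 under these assumptions, and a patient with five irrelevant items would receive no score.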
Translation and Validation
Translation and validation were performed in collaboration with the official developers and according to the accepted guidelines  in the following order: (1) preparation of files; (2) two forward translations into the Hebrew language by two independent working translators; (3) reconciliation of these two translations into one optimal version; (4) two back translations of the reconciled version into English; (5) review and discussion of the translated report sent to the developers; (6) proofreading arranged by the developers who sent the results back for approval; (7) pilot testing in 10 patients (10 knees); (8) review of the report; and (9) finalization of the project.
The research was conducted in a home setting. The patients received an initial phone call in which they were given an explanation of the research and asked to give oral consent. Following consent, the patients were asked to answer the Hebrew version of the FJS-12, the Oxford knee score (OKS), a 10-cm visual analog scale (VAS), and the Short Form (SF)-12 Health Survey. The patients who had undergone staged bilateral knee arthroplasty were asked to answer the questions in terms of the side that had received TKA most recently. Sixty patients were randomly selected for assessing the test-retest reliability. At least 2 weeks after the first call, these patients received a second phone call and were asked to answer the Hebrew version of the FJS-12 again. We chose a minimum interval of 2 weeks to reduce the likelihood that patients would remember the questions. Before the second questionnaire, the patients were asked whether a change in their physical status had occurred.
The SF-12 Health Survey Questionnaire is an abridged version of the SF-36 developed in 1996. All 12 items are used to calculate the physical and mental component summary scores by applying a scoring algorithm. The SF-12 can serve as a general tool to evaluate the patients’ general health or well-being following a specific procedure, such as TKA [12, 13]. The SF-12 gives two separate scores (physical and mental). In this study, we assessed only the physical score.
In 1998, the OKS was developed following the Oxford hip score. It is comprised of 12 items, and all are related to the knee joint. Its main application is to assess pain and function in patients with knee osteoarthritis, either before or after surgery. The scores range between 0 and 48 . The VAS was used to rate knee pain in patient’s daily living. The pain was rated on a 1–10 point scale.
We measured the ceiling and floor effects as the percentage of patients with the best or worst possible score. Ceiling and floor effects are commonly considered acceptable when this percentage is less than 15 %. Low ceiling and floor effects indicate a high ability to distinguish between “good” and “excellent” outcomes, which means that the questionnaire as a whole possesses high discriminatory power.
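As an illustration of this calculation, the following sketch computes ceiling and floor effects as the percentage of respondents at the best and worst possible scores and compares them with the 15 % threshold. The function name, the default score bounds, and the sample data are hypothetical.

```python
from typing import Sequence, Tuple

THRESHOLD_PCT = 15.0  # acceptance threshold quoted in the text


def ceiling_floor_effects(
    scores: Sequence[float], best: float = 100.0, worst: float = 0.0
) -> Tuple[float, float]:
    """Return (ceiling %, floor %) for a list of total scores."""
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    ceiling = 100.0 * sum(1 for s in scores if s == best) / n
    floor = 100.0 * sum(1 for s in scores if s == worst) / n
    return ceiling, floor


# Hypothetical sample: 2 of 100 respondents at the maximum, none at the minimum.
ceiling, floor = ceiling_floor_effects([100.0] * 2 + [50.0] * 98)
acceptable = ceiling < THRESHOLD_PCT and floor < THRESHOLD_PCT
```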
To assess the reliability of the FJS-12, we measured the internal consistency, test-retest reliability, split-half reliability, and the standard error of measurement (SEM). Internal consistency, measured with Cronbach’s α, tests whether the items measure a single, unified construct. Values greater than 0.7 were considered sufficient, values greater than 0.8 good, and values greater than 0.9 excellent. The test-retest reliability was assessed in terms of the intraclass correlation coefficient (ICC). An ICC greater than 0.7 was considered sufficient. The split-half reliability was rated in terms of the Spearman-Brown coefficient, and a value higher than 0.6 was considered adequate. The SEM was calculated using the following formula: SEM = SD × √(1 − ICC), where SD is the standard deviation of the scores. A small SEM is indicative of high reliability. The smallest detectable change (SDC) was also calculated. The SDC is the smallest change in a score that can be interpreted as a real change and was calculated using the formula: SDC = 1.96 × √2 × SEM.
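The reliability measures named above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in the study: it uses the textbook formula for Cronbach’s α on complete data, the Spearman-Brown correction for a half-test correlation, and the conventional form of the SEM in which the spread term is the standard deviation of the scores (an assumption about what was computed). All function names are hypothetical.

```python
import math
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """rows[i][j] is the response of patient i to item j (complete data only)."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least 2 patients and 2 items")
    k = len(rows[0])
    # Sample variance of each item and of the per-patient total score.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement from the score SD and the ICC."""
    return sd * math.sqrt(1 - icc)


def sdc(sem_value: float) -> float:
    """Smallest detectable change at the 95 % level."""
    return 1.96 * math.sqrt(2) * sem_value
```

With an SEM of 4.97, `sdc(4.97)` evaluates to about 13.78, in line with the SDC formula quoted above.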
Validity is the degree to which the scores of a PRO instrument are consistent with hypotheses based on the assumption that the PRO instrument validly measures the construct to be measured . Validity was measured in terms of the Pearson correlation coefficient with the OKS, SF-12, and VAS. A correlation coefficient was taken as low if it was less than 0.3, moderate if it was in the range of 0.3 to 0.7 and high if it was greater than 0.7.
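The correlation bands described above can be expressed as a short sketch. It computes a Pearson coefficient from paired scores and classifies its strength; applying the bands to the absolute value of r, so that a strong negative correlation also counts as high, is an assumption, as are the function names.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r: float) -> str:
    """Classify r as low (< 0.3), moderate (0.3 to 0.7), or high (> 0.7)."""
    size = abs(r)  # assumption: the sign is ignored when grading strength
    if size < 0.3:
        return "low"
    if size <= 0.7:
        return "moderate"
    return "high"
```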
Additionally, we measured the discriminatory ability of the Hebrew version of the FJS-12. For each item on the questionnaire, we conducted a t-test between the high-score group (top 25 %) and the low-score group (bottom 25 %). A statistically significant difference indicated that the item was able to discriminate between the two groups of patients.
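A sketch of this item-discrimination check is given below. The text does not state which t-test variant was used or how the groups were formed, so the sketch assumes that patients are ranked by total score and that Welch’s unequal-variance t statistic is computed; the function names and sample data are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for two independent samples."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        raise ValueError("both groups have zero variance")
    return (mean(a) - mean(b)) / se


def item_discrimination(
    totals: Sequence[float], item: Sequence[float], fraction: float = 0.25
) -> float:
    """t statistic for one item, top vs bottom `fraction` of patients by total score."""
    n = len(totals)
    if n != len(item):
        raise ValueError("totals and item must have the same length")
    k = int(n * fraction)
    if k < 2 or 2 * k > n:
        raise ValueError("need at least 2 patients per group and no overlap")
    # Rank patients by total score, then compare the item in the two tails.
    order = sorted(range(n), key=lambda i: totals[i])
    low = [item[i] for i in order[:k]]
    high = [item[i] for i in order[-k:]]
    return welch_t(high, low)


# Hypothetical data: 8 patients, one item that tracks the total score.
t_stat = item_discrimination(list(range(8)), [0, 1, 0, 1, 3, 4, 3, 4])
```

A large positive t statistic suggests that the item separates high scorers from low scorers; a P value would additionally require the t distribution, which is not included in this sketch.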
Results
A total of 110 patients met the inclusion criteria and were contacted. Five patients refused to participate (95 % acceptance rate): 2 refused because of privacy concerns and 3 did not provide any answer. In accordance with the FJS-12 protocol, 3 patients were excluded from the analysis because they answered “irrelevant” to more than 4 items. One patient completed the first FJS-12 but declined to finish the second FJS-12. The mean patient age was 67.42 ± 7.15 years (mean ± standard deviation); 68 (67 %) patients were female and 34 (33 %) were male. The average follow-up time was 12.6 ± 6.47 months. A random group of 60 patients was reached a second time and responded to the FJS-12 for test-retest reliability (Table 2). The average time interval between calls was 31.26 ± 13.1 days.
Only 15 (14 %) patients responded to question number 12, which is related to sports activity, and this significantly impacted the evaluation of internal consistency. Therefore, we conducted two internal consistency tests (one including this question and one excluding it). Before exclusion, Cronbach’s α was 0.92 and the inter-item correlation was 0.53. After excluding question 12, the Hebrew version of the FJS demonstrated excellent internal consistency, with a Cronbach’s α of 0.943 (95 % confidence interval [CI] 0.92–0.95) and an inter-item correlation coefficient of 0.6. The test-retest reliability was very high, with an ICC of 0.97 (95 % CI 0.95–0.98) (Fig. 1). The split-half reliability was high, with a Spearman-Brown coefficient of 0.93 (95 % CI 0.87–0.94); the SEM was low (4.97) and, accordingly, the SDC was 13.77.
Figure 2 shows the score distribution over the scales of the FJS-12, OKS, SF-12 (physical), and VAS pain. The FJS and the OKS showed a positive correlation (r = 0.86, P < 0.001), which was in the “high” correlation category. The FJS and the SF-12 (physical) also exhibited a positive correlation (r = 0.72, P < 0.001), which was also in the “high” correlation category. The FJS and the VAS pain score revealed a high negative correlation (r = -0.8, P < 0.001) (Table 3). Only 2 patients scored the maximum of 100 on the FJS, resulting in a negligible ceiling effect of 1.9 %. No patient scored 0, so there was no floor effect; the FJS-12 as a whole therefore had high discriminatory power and good content validity. For questions 1–11, the t statistics comparing the high- and low-score groups were all greater than 2, with P < 0.0001 for every test.
Discussion
TKA has been proven to be highly effective in treating severe osteoarthritis, relieving pain, and restoring joint function [4, 7, 8]. The clinical success of the procedure has made it an increasingly common operation [1, 2]. As a result, patients undergoing TKA nowadays are younger and more physically active, and their expectations of the procedure are rising. This trend towards better function and higher success rates, in combination with the shift to patient-centered care, has led to the development of multiple PRO tools.
These tools have enabled doctors to better evaluate postoperative successes and failures from the patient’s perspective, and they have shown relatively good discriminatory power. Nonetheless, many believe that they lack the critical criterion for judging a successful arthroplasty: a natural joint feeling, that is, low joint awareness. What is more, many of these tools have shown considerable ceiling and floor effects, which make it difficult to distinguish between a good and an excellent score [20, 21]. The FJS-12 was developed by Behrend et al. to address these issues. In this study, we were able to reproduce the results of the original paper, which showed excellent reliability. Moreover, we examined the correlation between the FJS-12 and other commonly used PROs, which enabled us to show that the Hebrew version of the FJS-12 has high validity and is culturally adapted to the Hebrew-speaking population.
The average patient age in this study was comparable to that of earlier studies on the subject, and the female-to-male ratio (2:1) was also similar [22, 23]. The sample size was determined using the recommended guidelines for PRO validation and was applied to each questionnaire item. Notably, question number 12, regarding awareness during physical activity, was excluded from the analyses because only 15 (14 %) participants answered it. This phenomenon was also observed in other Mediterranean countries. The low response rate for this question, as compared with all other questions, implies that it is irrelevant to our population. Because this work was based on phone calls, we were able to learn the reason for the non-response: most of the subjects do not perform any form of regular physical activity, regardless of their knee status.
The Hebrew version of the FJS-12 demonstrated excellent internal consistency with a Cronbach’s α of 0.943, which was virtually identical to the one achieved by Behrend et al. in the original FJS study (Cronbach’s α of 0.95). The test-retest reliability was similar to that reported in earlier studies conducted in Mediterranean and European countries [23, 24]. The split-half reliability was very high. Finally, the floor effect (zero) and the ceiling effect were well below the accepted threshold of 15 %, indicating that the test as a whole has good discriminatory power. Questions 1 to 11 each possessed significant discriminatory power.
In this study, we chose to assess the OKS, SF-12, and VAS because there is no gold standard for the postoperative evaluation of TKA. Earlier studies chose different PROs for comparison, so it is difficult to compare our outcomes with theirs precisely. Although not directly comparable, our results, like theirs, showed a high correlation with other PROs [24,25,26]. The stronger correlation between the FJS-12 and the OKS can be explained by the fact that both questionnaires were explicitly designed to measure knee function during daily activities. The SF-12 measures general physical function and health, which are influenced not only by TKA outcomes but also by other factors.
In addition, the strong negative correlation between the FJS-12 and the VAS (r = -0.8) indicates that knee pain is still a significant factor impacting joint awareness. The correlation was negative because a high VAS score suggests an undesirable outcome, while a high FJS-12 score indicates a desirable outcome.
This study has several limitations. First, we did not assess responsiveness, which measures the ability to detect change in a patient’s condition over time. Second, this study focused only on the postoperative evaluation that the original FJS-12 was designed for, and further assessment is required to evaluate its preoperative use. Finally, this study was conducted by phone, so it assessed only the patients’ ability to understand the questions when heard, not their ability to read them.
The Hebrew version of FJS-12 has excellent reliability, excellent test-retest reliability, and good validity. It can be safely used for assessing patient outcomes of TKA in the Hebrew-speaking population.
Availability of data and materials
All necessary data will be provided on demand.
TKA: Total knee arthroplasty
ROM: Range of motion
FJS: Forgotten joint score
OKS: Oxford knee score
SF-12: Short form 12
VAS: Visual analog scale
ICC: Intraclass correlation coefficient
SEM: Standard error of measurement
SDC: Smallest detectable change
Sloan M, Premkumar A, Sheth NP. Projected Volume of Primary Total Joint Arthroplasty in the U.S., 2014 to 2030. J Bone Joint Surg Am. 2018;100:1455–60. https://doi.org/10.2106/JBJS.17.01617.
Kurtz SM, Ong KL, Lau E, Widmer M, Maravic M, Gómez-Barrena E, de Pina Mde F, Manno V, Torre M, Walter WL, de Steiger R, Geesink RG, Peltola M, Röder C. International Survey of Primary and Revision Total Knee Replacement. Int Orthop. 2011;35:1783–89. https://doi.org/10.1007/s00264-011-1235-5.
Williams DP, Price AJ, Beard DJ, Hadfield SG, Arden NK, Murray DW, Field RE. The Effects of Age on Patient-Reported Outcome Measures in Total Knee Replacements. Bone Joint J. 2013;95-B:38–44. https://doi.org/10.1302/0301-620X.95B1.28061.
Lange T, Schmitt J, Kopkow C, Rataj E, Günther KP, Lützner J. What Do Patients Expect From Total Knee Arthroplasty? A Delphi Consensus Study on Patient Treatment Goals. J Arthroplasty. 2017;32:2093–2099.e1. https://doi.org/10.1016/j.arth.2017.01.053.
Scuderi GR, Bourne RB, Noble PC, Benjamin JB, Lonner JH, Scott WN. The New Knee Society Knee Scoring System. Clin Orthop Relat Res. 2012;470:3–19. https://doi.org/10.1007/s11999-011-2135-0.
Bullens PH, van Loon CJ, de Waal Malefijt MC, Laan RF, Veth RP. Patient satisfaction after total knee arthroplasty: A comparison between subjective and objective outcome assessments. J Arthroplasty. 2001;16:740–47. https://doi.org/10.1054/arth.2001.23922.
Hawker G, Wright J, Coyte P, Paul J, Dittus R, Croxford R, Katz B, Bombardier C, Heck D, Freund D. Health-Related Quality of Life after Knee Replacement. J Bone Joint Surg Am. 1998;80:163–73. https://doi.org/10.2106/00004623-199802000-00003.
Bryan S, Goldsmith LJ, Davis JC, Hejazi S, MacDonald V, McAllister P, Randall E, Suryaprakash N, Wu AD, Sawatzky R. Revisiting patient satisfaction following total knee arthroplasty: a longitudinal observational study. BMC Musculoskelet Disord. 2018;19:423. https://doi.org/10.1186/s12891-018-2340-z.
Behrend H, Giesinger K, Giesinger JM, Kuster MS. The “forgotten joint” as the ultimate goal in joint arthroplasty: validation of a new patient-reported outcome measure. J Arthroplasty. 2012;27(1):430–6.e. https://doi.org/10.1016/j.arth.2011.06.035.
Marx RG, Jones EC, Atwan NC, Closkey RF, Salvati EA, Sculco TP. Measuring improvement following total hip and knee arthroplasty using patient-based measures of outcome. J Bone Joint Surg Am. 2005;87:1999–2005. https://doi.org/10.2106/JBJS.D.02286.
Bullinger M, Alonso J, Apolone G, Leplège A, Sullivan M, Wood-Dauphinee S, Gandek B, Wagner A, Aaronson N, Bech P, Fukuhara S, Kaasa S, Ware JE Jr. Translating health status questionnaires and evaluating their quality: the IQOLA Project approach. International Quality of Life Assessment. J Clin Epidemiol. 1998;51:913–23. https://doi.org/10.1016/s0895-4356(98)00082-1.
Ware J Jr, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34:220–33. https://doi.org/10.1097/00005650-199603000-00003.
Bentur N, King Y. The challenge of validating SF-12 for its use with community-dwelling elderly in Israel. Qual Life Res. 2010;19:91–5. https://doi.org/10.1007/s11136-009-9562-3.
Dawson J, Fitzpatrick R, Murray D, Carr A. Questionnaire on the perceptions of patients about total knee replacement. J Bone Joint Surg Br. 1998;80:63–9. https://doi.org/10.1302/0301-620x.80b1.7859.
Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de Vet HC. The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res. 2010;19:539–49. https://doi.org/10.1007/s11136-010-9606-8.
Cronbach LJ, Warrington WG. Time-Limit Tests: Estimating Their Reliability and Degree of Speeding. Psychometrika. 1951;16:167–88. https://doi.org/10.1007/BF02289113.
Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, Bouter LM, de Vet HC. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60:34–42. https://doi.org/10.1016/j.jclinepi.2006.03.012.
Streiner DL. Starting at the beginning: an introduction to coefficient alpha and internal consistency. J Pers Assess. 2003;80:99–103. https://doi.org/10.1207/S15327752JPA8001_18.
de Vet HC, Terwee CB, Knol DL, Bouter LM. When to use agreement versus reliability measures. J Clin Epidemiol. 2006;59:1033–9. https://doi.org/10.1016/j.jclinepi.2005.10.015.
Clement ND, MacDonald D, Simpson AH. Erratum to: The minimal clinically important difference in the Oxford knee score and Short Form 12 score after total knee arthroplasty. Knee Surg Sports Traumatol Arthrosc. 2016;24:3696. https://doi.org/10.1007/s00167-015-3959-z.
Eymard F, Charles-Nelson A, Katsahian S, Chevalier X, Bercovy M. “Forgotten knee” after total knee replacement: A pragmatic study from a single-centre cohort. Joint Bone Spine. 2015;82:177–81. https://doi.org/10.1016/j.jbspin.2014.11.006.
Sansone V, Fennema P, Applefield RC, Marchina S, Ronco R, Pascale W, Pascale V. Translation, cross-cultural adaptation, and validation of the Italian language Forgotten Joint Score-12 (FJS-12) as an outcome measure for total knee arthroplasty in an Italian population. BMC Musculoskelet Disord. 2020;21:23. https://doi.org/10.1186/s12891-019-2985-2.
Kinikli G, Deni̇z HG, Karahan S, Yüksel E, Kalkan S, Kara DD, Önal S, Sevinc C, Çaglar Ö, Atilla B, Yuksel İ. Validity and Reliability of Turkish Version of the Forgotten Joint Score-12. Journal of Exercise Therapy Rehabilitation. 2017;4:18–25.
Shadid MB, Vinken NS, Marting LN, Wolterbeek N. The Dutch version of the Forgotten Joint Score: test-retesting reliability and validation. Acta Orthop Belg. 2016;82:112–8.
Hamilton DF, Loth FL, Giesinger JM, Giesinger K, MacDonald DJ, Patton JT, Simpson AH, Howie CR. Validation of the English language Forgotten Joint Score-12 as an outcome measure for total hip and knee arthroplasty in a British population. Bone Joint J. 2017;99-B:218–24. https://doi.org/10.1302/0301-620X.99B2.BJJ-2016-0606.R1.
Heijbel S, Naili JE, Hedin A, W-Dahl A, Nilsson KG, Hedström M. The Forgotten Joint Score-12 in Swedish patients undergoing knee arthroplasty: a validation study with the Knee Injury and Osteoarthritis Outcome Score (KOOS) as comparator. Acta Orthop. 2020;91:88–93. https://doi.org/10.1080/17453674.2019.1689327.
We thank Ms. Tali Septon for her technical support.
No funding was provided.
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of Assaf ha rofe Hospital (16/12/19, 0305-19-ASF).
Consent for publication
Verbal consent to publish was obtained prior to the interview and the paper contains no personal data/images.
Consent to participate
Verbal informed consent was obtained prior to the interview.
The authors declare that they have no competing interests and that they were not involved in the journal’s review of, or decisions related to, this manuscript.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was performed in partial fulfillment of the M.D. thesis requirements of the Sackler Faculty of Medicine, Tel Aviv University
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pansky, A., Bar-Ziv, Y., Tamir, E. et al. Reliability and validity of the Hebrew version of the forgotten joint score for assessing the outcomes of total knee arthroplasty. Arthroplasty 3, 27 (2021). https://doi.org/10.1186/s42836-021-00084-6
- Total knee arthroplasty
- Patient-reported outcome
- Forgotten joint score