Table 3:

Variables included in final XGBoost model ranked by SHAP values of importance

| Predictor variable | SHAP value* |
| --- | --- |
| Age | 0.7567 |
| Days since last creatinine blood test | 0.1320 |
| Geographical latitude | 0.1299 |
| Days since last basophils test | 0.1196 |
| Male | 0.1196 |
| No. of family doctor visits in the last 2 yr | 0.1165 |
| No. of comorbidities | 0.1072 |
| No. of unique drug subclasses taken in the last 2 yr | 0.0845 |
| Highest recorded level of creatinine in the last 2 yr | 0.0773 |
| No. of diagnostic radiology studies in the last 2 yr | 0.0381 |
| Average measurement of neutrophils in blood in the last 2 yr | 0.0289 |
| No. of doctor visits in the last 2 yr | 0.0237 |
| Median level of neutrophils in the last 2 yr | 0.0165 |
| Average level of leukocytes in the last 2 yr | 0.0144 |
| No. of creatinine tests in the last 2 yr | 0.0144 |
| Highest recorded level of hemoglobin in blood in the last 2 yr | 0.0021 |
| History of chronic kidney disease | 0.0021 |
| Days since last mean corpuscular hemoglobin test in the last 2 yr | 0.0010 |

  • Note: SHAP = Shapley Additive Explanations, XGBoost = Extreme Gradient Boosting.

  • * SHAP values represent the weighted average of marginal contributions for each predictive variable included in the XGBoost model.
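
To make the phrase "weighted average of marginal contributions" concrete, the following is a minimal sketch of the exact Shapley computation on a toy example. It is not the authors' pipeline: the function `shapley_values`, the value function `toy_value`, and its numbers are hypothetical and chosen only for illustration. In practice, SHAP values for tree models are computed by a dedicated library, and how per-patient values were aggregated into the single figures in Table 3 is not stated here.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a set function `value` defined on coalitions of `features`."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for a coalition of size k that excludes f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when it joins coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(coalition):
    """Hypothetical model output given which features are 'known' (illustrative numbers only)."""
    score = 0.0
    if "age" in coalition:
        score += 0.6
    if "male" in coalition:
        score += 0.1
    # An interaction term: creatinine only matters once age is known.
    if "age" in coalition and "creatinine" in coalition:
        score += 0.2
    return score


features = ["age", "male", "creatinine"]
phi = shapley_values(features, toy_value)
for name in sorted(phi, key=phi.get, reverse=True):
    print(f"{name}: {phi[name]:.4f}")
```

In this toy setup the interaction term is split evenly between the two features that produce it, and the values sum to the difference between the model output with all features and with none. A ranking like the one in the table would then follow from sorting features by the size of their values.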