Performance of models for estimating absolute risk difference in multicenter trials with binary outcome

BMC Med Res Methodol. 2016 Aug 30;16(1):113. doi: 10.1186/s12874-016-0217-0.

Abstract

Background: Reporting of absolute risk difference (RD) is recommended for clinical and epidemiological prospective studies. In analyses of multicenter studies, adjustment for center is necessary when randomization is stratified by center or when there is large variation in patient outcomes across centers. While regression methods are used to estimate RD adjusted for baseline predictors and clustering, no formal evaluation of their performance has been previously conducted.

Methods: We performed a simulation study to evaluate 6 regression methods fitted under a generalized estimating equation (GEE) framework: binomial identity, Poisson identity, Normal identity, log binomial, log Poisson, and logistic regression models. We compared the model estimates to unadjusted estimates. We varied the true response function (identity or log), number of subjects per center, true risk difference, control outcome rate, effect of baseline predictor, and intracenter correlation. We compared the models in terms of convergence, absolute bias, and coverage of 95% confidence intervals for RD.
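The evaluation loop described above (simulate clustered binary data, estimate RD, then summarize absolute bias and confidence interval coverage) can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the authors' simulation code. The function names, the default design values, and the Gaussian center effect are all assumptions made for the example. The estimator is the plain difference in proportions with a cluster-robust (sandwich) standard error, which matches a Normal identity GEE with an independence working correlation and treatment as the only covariate. It includes no baseline predictor and no small-sample correction.

```python
import math
import random


def simulate_trial(rng, n_centers=20, n_per_center=20,
                   p0=0.3, rd=0.1, center_sd=0.05):
    """Simulate one trial under an identity response: risk = p0 + rd*t + u_c.

    All default values are illustrative assumptions, not the paper's scenarios.
    """
    rows = []
    for c in range(n_centers):
        # A shared center effect induces intracenter correlation.
        u = rng.gauss(0.0, center_sd)
        for i in range(n_per_center):
            t = i % 2  # alternate arms so allocation is balanced within center
            p = min(max(p0 + rd * t + u, 0.0), 1.0)  # clip to a valid probability
            y = 1 if rng.random() < p else 0
            rows.append((c, t, y))
    return rows


def rd_with_cluster_robust_ci(rows, z=1.96):
    """Difference in proportions with a cluster-robust (sandwich) Wald interval."""
    n1 = sum(t for _, t, _ in rows)
    n0 = len(rows) - n1
    m1 = sum(y for _, t, y in rows if t == 1) / n1
    m0 = sum(y for _, t, y in rows if t == 0) / n0
    est = m1 - m0
    # Sum each center's weighted residuals; squaring the center totals
    # (not the individual terms) is what allows for within-center correlation.
    center_score = {}
    for c, t, y in rows:
        resid = y - (m1 if t else m0)
        weight = 1.0 / n1 if t else -1.0 / n0
        center_score[c] = center_score.get(c, 0.0) + weight * resid
    se = math.sqrt(sum(s * s for s in center_score.values()))
    return est, est - z * se, est + z * se


def evaluate(n_sims=300, seed=1, true_rd=0.1, **design):
    """Return (absolute bias, coverage of the 95% interval) over n_sims trials."""
    rng = random.Random(seed)
    estimates, covered = [], 0
    for _ in range(n_sims):
        rows = simulate_trial(rng, rd=true_rd, **design)
        est, lo, hi = rd_with_cluster_robust_ci(rows)
        estimates.append(est)
        covered += lo <= true_rd <= hi
    abs_bias = abs(sum(estimates) / n_sims - true_rd)
    return abs_bias, covered / n_sims


if __name__ == "__main__":
    bias, coverage = evaluate()
    print(f"absolute bias = {bias:.4f}, coverage = {coverage:.3f}")
```

Calling `evaluate` with other keyword values (for example `n_per_center`, `p0`, or `center_sd`) gives a rough analogue of varying the simulation factors listed above, within the limits of this simplified setup.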

Results: The 6 models performed very similarly to each other for the majority of scenarios. However, the log binomial model did not converge for a large portion of the scenarios including a baseline predictor. In scenarios with outcome rate close to the parameter boundary, the binomial and Poisson identity models had the best performance, but differences from other models were negligible. The unadjusted method introduced little bias to the RD estimates, but its coverage was larger than the nominal value in some scenarios with an identity response. Under the log response, coverage from the unadjusted method was well below the nominal value (<80%) for some scenarios.

Conclusions: We recommend the use of a binomial or Poisson GEE model with identity link to estimate RD for correlated binary outcome data. If these models fail to run, then a logistic regression, log Poisson regression, or linear regression GEE model can be used instead.

Keywords: Clustered data; Correlated binary data; Generalized estimating equation; Multicenter trial; Risk difference; Robust standard errors.

MeSH terms

  • Clinical Trials as Topic*
  • Humans
  • Models, Theoretical*
  • Multicenter Studies as Topic*
  • Outcome Assessment, Health Care / methods
  • Outcome Assessment, Health Care / statistics & numerical data
  • Risk Assessment / methods
  • Risk Assessment / statistics & numerical data*
  • Risk Factors