Abstract
Background: Burnout among postgraduate medical trainees (PMTs) is increasingly being recognized as a crisis in the medical profession. We aimed to establish the prevalence of burnout among PMTs, identify risk and protective factors, and assess whether burnout varied by country of training, year of study and specialty of practice.
Methods: We systematically searched MEDLINE, Embase, PsycINFO, the Cochrane Database of Systematic Reviews, Web of Science and Education Resources Information Center from their inception to Aug. 21, 2018, for studies of burnout among PMTs. The primary objective was to identify the global prevalence of burnout among PMTs. Our secondary objective was to evaluate the association between burnout and country of training, year of study, specialty of training and other sociodemographic factors commonly thought to be related to burnout. We employed random-effects meta-analysis and meta-regression techniques to estimate a pooled prevalence and conduct secondary analyses.
Results: In total, 8505 published studies were screened, 196 met eligibility and 114 were included in the meta-analysis. The pooled prevalence of burnout was 47.3% (95% confidence interval 43.1% to 51.5%), based on studies published over 20 years involving 31 210 PMTs from 47 countries. The prevalence of burnout remained unchanged over the past 2 decades. Burnout varied by region, with PMTs of European countries experiencing the lowest level. Burnout rates among medical and surgical PMTs were similar.
Interpretation: Current wellness efforts and policies have not changed the prevalence of burnout worldwide. Future research should focus on understanding systemic factors and leveraging these findings to design interventions to combat burnout.
Study registration: PROSPERO no. CRD42018108774
We are in a critical period within the medical profession as alarming rates of suicide among physicians bring burnout to the forefront.1–3 The effects of burnout are widespread, affecting physician wellness and productivity as well as patient health outcomes.2,4 Burnout is characterized by physical, emotional and mental exhaustion, resulting from long-term involvement in emotionally taxing situations.5 In the United States, about half of practising physicians experience an episode of burnout during their career.1,6,7 Canadian data report a slightly lower prevalence; about 30% of the surveyed physicians endorsed burnout.8
Residency is a particularly stressful time; the trainee is tasked with the tremendous responsibility of consistently providing high-quality care while learning and integrating new skills. Adapting to these job demands has a direct consequence on one’s emotional and intellectual reserve, and the ability to establish a healthy home–work interface.9 The prevalence of burnout among postgraduate medical trainees (PMTs) varies widely, from 3% to 88%.7,10–13 However, existing attempts at systematic investigation of burnout in this group have been limited by methodological flaws including restrictive search strategies, lack of evaluation of temporal and associated factors, and lack of global investigation.7,10–13
The objective of our study was twofold. Our main aim was to establish the prevalence of burnout among PMTs based on a meta-analysis of studies from across the world. Second, we used the extracted data to explore whether commonly studied factors, such as age, sex and relationship status, protect against or increase the risk of burnout. We also used meta-regression to assess whether country of training, year of study and specialty of training were associated with burnout, as these factors may explain heterogeneity in the prevalence of burnout.
Methods
Study design and data sources
We conducted a systematic review of the literature to identify all studies evaluating burnout among PMTs. The search strategy was developed and conducted by a health research librarian (L.B.) at McMaster University. We searched MEDLINE (1946 to Aug. 21, 2018), Embase (1974 to Aug. 21, 2018), PsycINFO (1987 to Aug. 21, 2018), the Cochrane Database of Systematic Reviews (2005 to Aug. 21, 2018), Web of Science (1976 to Aug. 21, 2018) and Education Resources Information Center database (ERIC) (1966 to Aug. 21, 2018). The search encompassed terms used to refer to PMTs worldwide (e.g., resident, intern, junior physician and house officer), burnout and its components (emotional exhaustion, physical exhaustion, depersonalization and cynicism) and the setting (medical, hospital and clinical).
MEDLINE was used to develop the initial search and the search strategy was used to search the other databases. Results were exported from each database after all searches had been made and finalized. The reference lists of reviews identified were searched for relevant articles. No restrictions on geography or date were applied. The full search strategy is provided in Appendix 1, available at www.cmajopen.ca/content/9/1/E189/suppl/DC1. The protocol was registered in PROSPERO (CRD42018108774).
Study selection
All studies measuring burnout among PMTs were included regardless of country of training, specialty, year of training or setting.
The Maslach Burnout Inventory (MBI) is a validated and commonly used tool to measure burnout.14 Although this tool was used by most included studies, the current literature lacks a unified, standardized definition of burnout. Therefore, we accepted the definition of burnout as used in each study, recognizing that it is measured and defined variably across the literature.
The MBI measures burnout in the context of emotional exhaustion, depersonalization and lack of personal accomplishment. This is a 22-item self-administered questionnaire whereby respondents are asked to rank their responses on a 7-point Likert scale (ranging from 0 to 6, or less commonly, from 1 to 7). Although the MBI was initially created to assess burnout on a continuum, it has commonly been adapted to dichotomize burnout.14 However, there is a lack of standardization regarding which of the 3 dimensions are necessary to constitute burnout or specific cut-off values for each of these dimensions.2 In addition, modified versions of the MBI, including single item measures, are sometimes employed.15,16
We included studies that either reported or provided data necessary to quantify burnout, such as through the prevalence of burnout, PMTs’ scores on a burnout scale or their classifications into percentiles based on score. We included studies published in English only. Studies investigating doctors of osteopathic medicine were excluded, as were case studies, dissertations and opinion papers.
All titles, abstracts and full-text articles were evaluated for eligibility independently and in duplicate by 5 reviewers (L.N., B.S., A.S., O.K. and F.N.) using the Covidence software.17 Discrepancies were discussed and resolved by consensus; if consensus was not reached, the decision was taken by an independent reviewer (L.N. or B.S.). In addition, we reviewed the reference list of each identified study.
Data extraction and quality assessment
From each study, we extracted study characteristics, participant demographic characteristics, definition and measurement of burnout, burnout rates and factors associated with burnout. Definitions for the following extracted associated factors were accepted as they were reported by study authors: depression, job satisfaction and income satisfaction.
Data were extracted independently and in duplicate by 4 reviewers (L.N., B.S., A.S. and F.N.). As before, discrepancies were resolved by an independent reviewer (L.N. or B.S.). There is a lack of a validated tool to assess the risk of bias in cross-sectional studies, and this prohibited us from assessing risk systematically. However, using the general framework of the Newcastle–Ottawa Scale, a well-established tool to assess risk of bias, the same reviewers also rated the quality of included studies based on representativeness of the sample, sample size, ascertainment of outcome and reporting of findings.18
Statistical analysis
Primary analysis
For our analyses, we accepted the reported value of burnout as defined by each individual study, regardless of the tool employed. We estimated a random-effects pooled prevalence for all included studies using a restricted maximum-likelihood estimator. We used raw proportions without transforming the data based on recommendations by Lipsey and Wilson, since most of our proportions were between 0.2 and 0.8.19 The meta-analysis was conducted in R using the metafor package.20
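The pooling step described above can be illustrated with a minimal sketch. It assumes the simpler DerSimonian-Laird estimator of between-study variance rather than the restricted maximum-likelihood estimator used in the study, and the function name and study counts are hypothetical, chosen only for illustration.

```python
import math

def pool_prevalence(events, totals):
    """Pool raw (untransformed) proportions with a random-effects model.

    Assumption: DerSimonian-Laird estimate of tau^2, not REML.
    """
    p = [e / n for e, n in zip(events, totals)]
    # Sampling variance of a raw proportion: p(1 - p) / n
    v = [pi * (1 - pi) / n for pi, n in zip(p, totals)]
    w = [1 / vi for vi in v]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(p) - 1)) / c)  # between-study variance
    w_re = [1 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * pi for wi, pi in zip(w_re, p)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2

# Hypothetical counts of trainees with burnout and sample sizes (not study data)
pooled, ci, tau2 = pool_prevalence([45, 30, 120, 60], [100, 90, 200, 150])
print(f"pooled prevalence {pooled:.3f}, 95% CI {ci[0]:.3f} to {ci[1]:.3f}")
```

Because the variance p(1 − p)/n is undefined at proportions of 0 or 1 and unstable near them, this untransformed approach is reasonable only when most proportions lie in the middle of the range, as was the case here.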
As we anticipated systematic differences among the results of studies (heterogeneity), we report both the τ² values of heterogeneity and calculated I². We sought to understand whether prevalence of burnout changed depending on the tool used to ascertain the prevalence. Therefore, we conducted a meta-regression analysis with the use of the MBI as a categorical moderator variable (yes v. no). We hypothesized that since use of the MBI decreases heterogeneity in how burnout is defined, it would be a significant predictor in our meta-regression.21,22 If use of the MBI was found to be a significant variable, we intended to adjust all additional analyses for the use of the MBI.
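As a rough illustration of how heterogeneity can be quantified, the sketch below computes Cochran's Q and the Q-based form of I². This is an assumption for illustration: software such as metafor may derive I² from the estimated τ² instead, so values can differ slightly. The function name and inputs are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom and the Q-based I^2 (%)."""
    w = [1 / v for v in variances]  # inverse-variance weights
    mean = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - mean) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributable to between-study differences
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2

# Hypothetical prevalence estimates with equal sampling variances (not study data)
q, df, i2 = heterogeneity([0.2, 0.5, 0.8], [0.001, 0.001, 0.001])
print(f"Q = {q:.1f} on {df} df, I2 = {i2:.1f}%")
```

An I² close to 100% indicates that almost all of the observed variation reflects real differences between studies rather than sampling error.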
We conducted a subgroup analysis of North American studies to establish a pooled estimate of burnout among North American PMTs. As before, we employed a random-effects model to pool data.
Secondary analysis
We conducted 2 secondary analyses. First, we extracted data on reported risk and protective factors, including age, sex, relationship status (single v. having a partner), depression, level of stress, work hours, frequency of call shifts, job satisfaction, wage or income satisfaction, family or network support, sleep and level of training. We present a descriptive summary of associations found for these factors in the literature. Second, we employed meta-regression, a regression technique of aggregate data which allows for study of the impact of moderator variables on pooled effect size, to study the effect of region of training, program of residency (medicine v. surgery) and the year burnout data were collected on the pooled measure of burnout.
We first categorized regions as continents, but as only a few studies were conducted in Africa, Asia, Australia, the Middle East and South America, we collapsed these regions into one and compared them against Europe and North America, which had larger samples.
A random-effects inverse-variance weighted model was used to conduct the meta-regression. A 2-tailed Q-statistic was used to test the significance of the slope in a multivariate analysis and the standard Z-statistic in a univariate analysis. We used the Comprehensive Meta-Analysis software (version 3) to conduct our analysis. Because of the post hoc nature of this analysis, we did not take a significant finding to be definitive, but rather to promote a direction for future research.
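The univariate case of this approach can be sketched as a weighted least-squares fit with a Z-test on the slope. The sketch assumes that the between-study variance τ² is supplied rather than estimated, and it is not a reproduction of the Comprehensive Meta-Analysis implementation; the function name and data are hypothetical.

```python
import math

def meta_regression(y, v, x, tau2=0.0):
    """Inverse-variance weighted regression of effect sizes y on one moderator x.

    Assumption: tau2 (between-study variance) is given, not estimated here.
    """
    w = [1 / (vi + tau2) for vi in v]  # random-effects weights
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    se = math.sqrt(1 / sxx)  # standard error of the slope
    z = slope / se
    p = math.erfc(abs(z) / math.sqrt(2))  # 2-tailed normal p value
    return intercept, slope, se, z, p

# Hypothetical data: prevalence by a binary moderator such as MBI use (0 = no, 1 = yes)
b0, b1, se, z, p = meta_regression(
    y=[0.30, 0.35, 0.50, 0.55], v=[0.002] * 4, x=[0, 0, 1, 1], tau2=0.001
)
print(f"slope {b1:.3f} (SE {se:.3f}), z = {z:.2f}, p = {p:.4f}")
```

With a binary moderator, the slope is simply the weighted difference in prevalence between the 2 groups of studies; a continuous moderator such as year of study is handled by the same function.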
Ethics approval
In keeping with research ethics board guidelines at McMaster University, ethics approval was not required for this systematic review.
Results
Upon completion of screening, 196 of the 8505 studies met our eligibility criteria (Figure 1). These studies were published between 1987 and 2018 and represent data from 44 128 PMTs across 47 countries; a large proportion of studies (82/196) were conducted in the US. The studies included PMTs from a variety of programs and at different levels of training. Of the 196 studies meeting eligibility criteria, an overall proportion of burnout was reported for 31 210 PMTs in 114 studies conducted between 2001 and 2017. Four of these 114 studies report burnout in 2 separate populations of PMTs, rendering 118 data sets eligible for our meta-analysis (Appendix 2, available at www.cmajopen.ca/content/9/1/E189/suppl/DC1). Therefore, our analysis of risk and protective factors is based on 44 128 PMTs from 196 studies, whereas our meta-analysis includes 31 210 PMTs from 114 studies.
Measurement of burnout
Among the studies included in our review, burnout was measured using a variety of tools, detailed in Appendix 3, available at www.cmajopen.ca/content/9/1/E189/suppl/DC1. The most commonly used tool to assess burnout was the MBI (138 of 196 studies). Among the 138 studies that used the MBI, 83 studies reported an overall proportion of PMTs experiencing burnout. These studies defined burnout using 9 different definitions, with the most common one being a high score in either emotional exhaustion or depersonalization (42 of 83). Five of the 83 studies did not report how overall burnout was determined. The cut-off values for the individual dimensions also varied, as described in Appendix 2. For instance, there were 6 different definitions for high emotional exhaustion, whereas 24 of 83 studies did not report a cut-off value for emotional exhaustion.
Some studies (29 of 196) used a modified version of the MBI. A single item measure for emotional exhaustion and depersonalization was the most commonly employed modified version. Thirty-one of 196 studies used a different tool altogether, as described in Appendix 3.
Pooled prevalence of burnout
Overall
As mentioned previously, 114 studies were included in our meta-analysis. These data came from 31 210 PMTs from 47 countries. The pooled random-effects estimate of burnout was 47.3% (95% confidence interval [CI] 43.1% to 51.5%). An analysis of heterogeneity suggests significant differences among the pooled studies; τ² was 0.052 (p for heterogeneity < 10⁻¹⁶) and I² was 98.56%. A forest plot of all studies is presented in Figure 2 and Figure 3.7,9,10,22–132 We then sought to understand whether capturing burnout in a standardized manner using the MBI would explain heterogeneity in prevalence. Thus, we conducted a meta-regression analysis with MBI use as a categorical variable. As expected, use of the MBI to capture burnout significantly explained heterogeneity in the prevalence (meta-regression β 0.117, 95% CI 0.027 to 0.207). We therefore adjusted our subsequent analyses for use of the MBI.
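For readers who wish to relate the reported interval for the MBI coefficient to a test statistic, the following sketch back-calculates a standard error, Z value and p value from the published estimate and 95% CI. It assumes a symmetric Wald interval based on the normal distribution, which may not match exactly how the interval was computed; the function name is hypothetical.

```python
import math

def wald_from_ci(estimate, lower, upper, z_crit=1.96):
    """Recover SE, Z and 2-tailed p from an estimate and its symmetric 95% CI."""
    se = (upper - lower) / (2 * z_crit)  # half-width divided by the critical value
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # 2-tailed normal p value
    return se, z, p

# Reported coefficient for MBI use: beta 0.117, 95% CI 0.027 to 0.207
se, z, p = wald_from_ci(0.117, 0.027, 0.207)
print(f"SE {se:.4f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the p value is about 0.01, in keeping with an interval that excludes 0.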
North America
Sixty studies captured North American data; the pooled random-effects estimate of burnout among North American PMTs was 51.2% (95% CI 45.9% to 56.6%). We further explored regional variation in burnout across the world and results of this analysis are presented below.
Meta-regression
Year of study
We undertook a meta-regression of burnout with the year in which a study was conducted to evaluate whether heterogeneity in the prevalence of burnout was explained by time (i.e., whether burnout changed over time). Data for this analysis were available from 100 studies; 14 did not report the year of survey. Our analysis, adjusted for MBI use, found that the year of study was not a significant moderator of burnout (estimate of meta-regression β 0.002, 95% CI −0.009 to 0.013). Burnout prevalence over the years is presented in Figure 4.
Medical versus surgical training
We also investigated whether prevalence of burnout was affected by the choice of specialty; specifically, we were interested in understanding whether medical PMTs experienced differing rates of burnout compared with surgical PMTs. Data were available from 82 studies. Our meta-regression analysis, adjusted for MBI use, showed no evidence that specialty of training was associated with burnout prevalence (estimate of meta-regression β −0.005, 95% CI −0.110 to 0.099).
Geographic region
We anticipated that geographic region of study would be an important predictor of burnout prevalence. We categorized regions by continents, but due to limited studies conducted in some of these regions, we collapsed Africa, Asia, Australia, Middle East and South America into 1 category. Therefore, we conducted an analysis with region as a 3-category variable (North America, Europe and rest of the world). Data from all 114 studies were available for this analysis. Our analysis, adjusted for MBI use, found that region was a significant predictor of variation in burnout prevalence (estimate of meta-regression p < 0.001). Appendix 2 includes details of studies from each region, and Table 1 presents the results of our meta-analysis of burnout prevalence by region. As limited data were available for other regions, stronger conclusions can be made about burnout only among North American and European residents; the prevalence of burnout among European PMTs was 30.8% versus 51.2% in North America. Figure 5 illustrates burnout prevalence by region.
Risk and protective factors
We aimed to study the following factors and their role in burnout: age, sex, relationship status, depression, level of stress, work hours, frequency of call shifts, job satisfaction, wage or income satisfaction, family or network support, sleep and level of training. However, because of heterogeneity in how these factors are studied and reported in literature, we were unable to pool results but rather present only a descriptive analysis of our findings (Figure 6). In brief, we found that most studies did not find a significant association between burnout and age, sex, relationship status and level of training. However, stress and lower job satisfaction were commonly associated with higher rates of burnout in the literature. In addition, although 28 of 58 studies investigating the association between work hours and burnout found a positive association, 27 of 58 concluded that no significant association existed. Similarly, only 7 of 23 studies found that burnout was positively and significantly associated with more call shifts.
Risk-of-bias assessment
Many of the included studies had methodological flaws, limiting the reliability of their findings. Specifically, 30.6% (60/196) of studies included a consecutive or obviously representative sample of PMTs, and only 9.7% (19/196) of studies justified their sample size by using a sample size calculation. In addition, 25.5% (50/196) of studies compared respondents’ characteristics to those of nonrespondents or had a satisfactory response rate of greater than 80%. Although 98.5% (193/196) of studies used a well-described validated tool to measure burnout, this is of doubtful significance given the heterogeneity in interpreting the tool and establishing cut-off values highlighted above. A total of 44.9% (88/196) appropriately reported on descriptive statistics to describe the population with proper measures of dispersion. Lastly, 49.5% (97/196) of studies provided adequate statistics to describe burnout with proper measures of dispersion.
Interpretation
Our analyses underscore 4 key findings: the prevalence of burnout has not changed significantly over time; prevalence of burnout is associated with region; burnout rates among medical and surgical PMTs were similar; and most of the commonly studied risk and protective factors were not associated with burnout.
First, although policies over the past 2 decades have aimed to circumvent systemic causes of burnout by limiting work hours, on-call responsibilities and, more recently, instituting wellness programs, our results show that the prevalence of burnout has not significantly changed over time.133 Our results are consistent with the equivocal findings of a recent systematic review.134 Notably, our findings are likely more reflective of policy changes around work hours, rather than wellness programs, as these are more recent additions to the battle against burnout. Taken together, our findings suggest that unknown or underrecognized systemic factors are likely major contributors, and efforts should be aimed to uncover these. To this end, considerable qualitative work has focused on understanding the pitfalls of medical training, the hidden curriculum and challenges within the medical culture. Some training programs have sought to overcome the potential toxic culture by incorporating mentorship programs to promote collegiality135 and create platforms to give voice to PMTs.136 It is likely that answers lie at the intersection of further quantitative research of structural differences between geographically diverse training systems and qualitative work understanding the prevailing toxic culture of medicine and its impact on physicians and patients alike.137
Second, we report that burnout is associated with region, suggesting a role of systemic factors on PMT wellness. Among North American and European PMTs, the regions for which we had the most data available, there exists a stark difference in the prevalence of burnout. Although there is a paucity of research comparing health care systems among these regions, a study of general workplace trends finds that factors such as more involved unions and longer paid vacations, among other such social policies, contribute to overall improved work–life balance and less burnout.138 It is possible that our findings are biased by methodological considerations such as the possibility that the MBI may be filled out in a different manner across cultures, contributing to the observed variation in prevalence. Nonetheless, our findings warrant future research to identify cultural and systemic differences that may explain our results, both within and outside the training environment. The possible effect of cultural and systemic differences is supported by other recent work; a cross-sectional study reports that physicians as a group are more resilient than the general population, lending less credence to the view that individual factors lead to burnout.139 Furthermore, a systematic review by Panagioti and colleagues evaluating strategies to mitigate burnout emphasized the need for organizational level change.4
Third, our meta-regression suggests that whether a PMT is a surgical trainee or medical trainee does not significantly explain the heterogeneity in burnout prevalence. Although there may exist differences between the 2 training programs, they are likely small in comparison to other determinants of burnout. It is often hypothesized that surgical residents experience greater stress and harassment during their training, leading to high rates of burnout.132,140–142 However, our findings suggest that rates likely do not differ and support alliance of efforts, both policy and research, by medical and surgical training programs to address a crisis that affects all PMTs equally.
Lastly, with the exception of stress and depression, our descriptive analysis of the literature failed to identify any consistent relation between commonly studied risk and protective factors with burnout. For example, although commonly believed to be associated with burnout, we found that most literature does not support an association between relationship status and level of training with burnout. We also found an equivocal relation between work hours and burnout. Given the cross-sectional nature of the included studies and the likely between-study variance in how these factors are measured, it is difficult to make strong conclusions; nevertheless, our results suggest that research to date into causes of burnout has failed to yield definitive risk factors that can be mitigated or protective factors that can be enhanced to combat the increasing prevalence.
Although previous reviews have aimed to summarize rates of burnout among resident physicians, these studies have been limited by restrictive search terms; have included largely North American studies; have been quite small (e.g., the review by Rodriguez and colleagues included only 26 studies); or have focused on attending physicians, excluding trainees.2,11–13 The comprehensiveness of our data makes our results generalizable and provides a solid platform on which additional data can be added to make more robust conclusions.
Our results propose a clear direction for future research on burnout among trainees. Our study suggests that the key to mitigating burnout lies in systemic changes that may be uncovered by studying regional variation in the medical culture. Our study highlights that we do not yet have a grasp on what factors cause burnout among physicians; it is critical that we amend research efforts to gain a better understanding of burnout so that appropriate interventions can be developed to alleviate this crisis.
Limitations
There are 2 key limitations to our study. First, we included only studies published in English. This limitation is reflected in our reduced sample size from continents outside Europe and North America. Future studies should focus on translating non-English studies and searching the grey literature to gain a better understanding of burnout among all regions worldwide. Nonetheless, even with our geographically limited sample size, we observed that significant variation in burnout exists globally. This supports our view that the next steps in burnout research should focus on understanding differences between health care and education systems. To be cautious, we predominantly limit the discussion in this paper to North America and Europe.
Second, there is significant heterogeneity in the measurement of burnout, subsequently leading to pooled estimates that are less reliable and should be interpreted with caution. Notably, use of the MBI to define burnout explained some heterogeneity and we subsequently adjusted our meta-regression accordingly to ensure robustness of our findings. As previously noted, some studies used a modified version of the MBI. While the one most commonly employed — using single item measures for emotional exhaustion and depersonalization — has previously been validated and found to correlate strongly with the full version of the MBI,15,16 many of these studies used arbitrary versions that are of questionable validity. Therefore, we encourage readers to assess the results critically. The bias resulting from substantial heterogeneity is a limitation that exists in literature and highlights the need for standardized measurement of burnout. The MBI is the most commonly used and widely validated tool; we encourage its use by future researchers to facilitate further research in this field and to assess adequately temporal trends in burnout globally.
Conclusion
Despite burnout’s substantial impact, interventions appear to have had little effect on its prevalence over the last few decades. We provide a comprehensive characterization of burnout within our profession and a new direction for future research.
Footnotes
Competing interests: None declared.
This article has been peer reviewed.
Contributors: Leen Naji, Zahra Sohani, Brendan Singh and Jason Profetto conceived the research question. Leen Naji, Brendan Singh, Brittany Dennis, Zahra Sohani, Zainab Samaan and Lehana Thabane designed the review protocol. Leen Naji and Laura Banfield designed the search strategy, which was completed by Laura Banfield. Leen Naji, Brendan Singh, Ajay Shah, Faysal Naji and Owen Kavanagh completed the systematic screening of studies for inclusion independently and in duplicate. Leen Naji, Brendan Singh, Ajay Shah, Faysal Naji and Owen Kavanagh performed data extraction and quality assessment of included studies independently and in duplicate. Leen Naji, Akram Alyass and Zahra Sohani performed data analyses. Fahad Razak reviewed the manuscript. All authors contributed to the writing and revision of the manuscript. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted. Leen Naji and Zahra Sohani accept full responsibility for the finished article, had access to any data, and controlled the decision to publish. All authors gave final approval of the version to be published and agreed to be accountable for all aspects of the work. Lehana Thabane and Zahra Sohani are co-senior authors.
Funding: No funding was received for this study.
Data sharing: All extracted data are available for sharing. Data may be accessed by contacting Zahra Sohani (zahra.sohani{at}mail.mcgill.ca) or Leen Naji (leen.naji{at}medportal.ca).
Supplemental information: For reviewer comments and the original submission of this manuscript, please see www.cmajopen.ca/content/9/1/E189/suppl/DC1.
This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY-NC-ND 4.0) licence, which permits use, distribution and reproduction in any medium, provided that the original publication is properly cited, the use is noncommercial (i.e., research or educational use), and no modifications or adaptations are made. See: https://creativecommons.org/licenses/by-nc-nd/4.0/
References
© 2021 Joule Inc. or its licensors