Abstract
Objectives Regular breast cancer screening is a widely used cancer prevention strategy. Important quality indicators of screening include cancer detection rate, false positive rate, benign biopsy rate and post-screen invasive cancer rate. We compared quality indicators of community radiology clinics to those of ‘Screen Test’, which features centralised batch reading and quality control processes. Both types of providers operated under a single provincial Breast Cancer Screening Programme.
Setting Community radiology clinics are operated by independent fee-for-service radiologists serving large and small communities throughout the Canadian province of Alberta. Launched by the provincial cancer agency, the Screen Test operates two physical clinics serving metropolises and mobile units serving remote regions. Eligible women may self-refer to any provider for screening mammography.
Participants Women aged 50 to 69 years who had at least one screening mammogram between July 2006 and June 2010 in Alberta were included. Women with missing health region information or prior breast cancer diagnosis were excluded.
Results A total of 389 788 screening mammograms were analysed, of which 12.7% were performed by Screen Test. Compared with Screen Test during 2006 to 2008, community radiology clinics had a lower cancer detection rate (3.6 vs 4.6 per 1000 screens, risk ratio (RR): 0.81, 95% CI: 0.67 to 0.98) and a much higher false positive rate (9.4% vs 3.4%, RR: 2.72, 95% CI: 2.55 to 2.90). Most other performance indicators were also better in Screen Test overall and across all health regions. These performance indicators were similar during 2008 to 2010, showing no improvement with time.
Conclusions Screen Test has a quality assurance process in place and performed significantly better. This provides empirical evidence of the effectiveness of a quality assurance process and may explain some of the large differences in breast cancer screening indicators between provinces and countries with formal programmes and those without.
- quality in health care
- public health
- oncology
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
Our study used population data that include all women participating in breast cancer screening in one jurisdiction over a 4-year period, which minimises selection bias.
The presence of two breast cancer screening models, in which women arbitrarily receive screening services from one or the other, provides a natural experiment that allows the assessment of quality between the two service models in Alberta.
Our study used administrative data, which lack detailed patient information.
Introduction
Mammography is a widely used screening test for breast cancer, though estimates of its effectiveness vary widely, from 30% reduction in mortality to possibly none.1 2 Recommendations for screening presume that the same benefits and harms occur to women in routine clinical practice as in the randomised clinical trials.3–5 Given that the estimated effectiveness is at best moderate, it is critical that quality of mammography is high, otherwise the harms of screening, such as false positives and benign biopsies, may outweigh the benefits from early detection and treatment.6–8 Quality of performance is ensured by use of high quality equipment, working under high technical standards, well trained technicians undertaking the mammography and trained, experienced radiologists reading the image with feedback.9
In the USA, false positive rates in screening programmes tend to be around 9%, while they are below 5% in most European programmes and some Canadian provincial programmes such as Manitoba and Nova Scotia.6 10–13 In the province of Alberta, Canada, breast cancer screening is provided via an organised breast cancer screening programme, Screen Test, which uses centralised batch reading and a strict quality assurance process, as well as through independently operated community radiology clinics. Operationally, Screen Test is similar to other organised screening programmes in Canada (eg, Manitoba and Nova Scotia) and Europe, while the screening conducted in the community radiology clinics shares common features of the opportunistic screening typically provided in the private clinics and healthcare organisations in the USA.
In Alberta, mammography was performed on request by many radiology clinics from the mid-1980s, initially without dedicated equipment or specifically trained staff. In 1990, the Alberta Cancer Agency launched Screen Test through two physical clinics, one established in each of Edmonton and Calgary, the two primary cities in the province. In 1991, outreach programmes were launched operating mobile mammography units for rural areas.14 All mammography images were, and still are, processed and interpreted by salaried sessional radiologists in a central office. Parallel to Screen Test, radiologists working in community radiology clinics and hospitals also perform screening mammograms. Screen-eligible women can self-refer to any service provider. To establish a systematic screening programme which can send screening invitation letters to women who had not been screened, the Alberta Cancer Agency worked on a registry of eligible women who had mammograms. By June 2006, the Screen Test database and the database maintained by the Alberta Radiology Association, to which most community radiologists belong, were fully linked as the Alberta Breast Cancer Screening Program (ABCSP).
The purpose of this study was to assess whether there is systematic variation in the quality of breast cancer screening based on standard performance indicators between the two groups of mammography providers and whether the establishment of the ABCSP improved screening quality.
Methods
Study population
We studied screening results from screen-eligible women from July 2006 to June 2010, at the beginning of the ABCSP, to assess whether differences in quality between the two types of providers existed at the beginning of ABCSP and whether there was improvement in screening mammography quality over time after the initiation of the ABCSP. We split the 4-year period into two 2-year study periods, to assess quality indicators in the two service models over two distinct time periods. Screen-eligible women were defined as those aged 50 to 69 in either study period who had not been diagnosed with breast cancer prior to their screening mammogram.
During the study period, Screen Test had a policy of screening every 2 years, consistent with the Canadian Guidelines at the time,3 while most Alberta radiologists in the community recommended annual screening.
During the study period, about 3.4 to 3.7 million Albertans occupied a land area of 662 000 km²,15 about the size of France. Approximately 65% of the total population lived in one of the two urban regions, Edmonton and Calgary.16
Data sources
Every breast cancer screening and diagnostic procedure, corresponding date, patients’ birthdate and health zone were obtained from the Screen Test database for procedures conducted by Screen Test. The same data were obtained from the Alberta Health Physician Claims database for procedures performed by radiologists in the community radiology clinics. Separate procedure codes distinguish screening and diagnostic mammograms. These two databases are complementary to each other and their quality is high.17 They were used to identify all screening mammograms and subsequent diagnostic tests conducted for eligible women. The first screening mammogram for a woman in each study period was defined as the woman’s index screen, which marks a screening episode.
The Alberta Cancer Registry (ACR) was used to exclude women not eligible for screening due to prior breast cancer diagnosis and to identify breast cancers diagnosed in the screening cohort. ACR is a population-based cancer registry that includes all newly diagnosed cancers in Alberta since 1942. It is regularly awarded the highest degree of certification for data completeness by the North American Association of Comprehensive Cancer Registries.18
The three data sets were linked at the individual level by the unique provincial healthcare identification number. All data were obtained from the data custodian Alberta Health Services.
Primary outcomes
We used the following standard performance indicators as our primary outcomes to assess screening quality: abnormal recall rates, benign biopsy rates, false positive rates, positive predictive value (PPV), screen-detected cancer rates (invasive and non-invasive) and post-screen invasive cancer rates. These routine performance indicators are used by screening programmes in many countries including Canada.5 19–21 In addition, we used overall cancer diagnosed within a specific time frame regardless of mode of detection to approximate the cancer incidence in the underlying population. The derivation of the key quantities using administrative data sources is detailed in the online supplemental materials. Each indicator was stratified by service provider, region of residence and study period. Region of residence was classified into three categories: the two major urban centres Edmonton (Edmonton zone), Calgary (Calgary zone), and the rest of Alberta (comprising the South, Central and North health zones as defined by Alberta Health Services, online supplementary figure 1). Women with missing health zone were excluded (n=267).
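As an illustration of how indicators of this kind are commonly computed from aggregate counts, the following is a minimal sketch. The study's own derivations are given in its supplemental materials and may differ; the formulas below are widely used textbook definitions and are assumptions here, and the counts are hypothetical, not study data.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Hypothetical aggregate counts for one provider in one period."""
    screens: int                   # index screening mammograms
    abnormal: int                  # screens recalled as abnormal
    screen_detected_cancers: int   # cancers found after an abnormal screen
    benign_biopsies: int           # biopsies with a benign result


def indicators(c: ScreeningCounts) -> dict:
    """Return common screening indicators (assumed definitions)."""
    return {
        # Share of all screens that were recalled as abnormal.
        "abnormal_recall_rate_pct": 100 * c.abnormal / c.screens,
        # Abnormal screens with no cancer found, as a share of all screens.
        "false_positive_rate_pct": 100 * (c.abnormal - c.screen_detected_cancers) / c.screens,
        # Share of abnormal screens that led to a cancer diagnosis.
        "ppv_pct": 100 * c.screen_detected_cancers / c.abnormal,
        # Cancers detected through screening per 1000 screens.
        "cancer_detection_per_1000": 1000 * c.screen_detected_cancers / c.screens,
        # Benign biopsies per 1000 screens.
        "benign_biopsy_per_1000": 1000 * c.benign_biopsies / c.screens,
    }


# Hypothetical example values, chosen only to make the arithmetic easy to follow.
example = ScreeningCounts(screens=10_000, abnormal=400,
                          screen_detected_cancers=40, benign_biopsies=50)
result = indicators(example)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Under these assumed definitions, PPV is the cancer detection rate divided by the abnormal recall rate, so a provider with a higher recall rate and a lower detection rate necessarily has a lower PPV.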
Data analysis
Multivariate log-binomial models were used to estimate the rate ratio between providers for abnormal recalls, cancer detection through screening, false positives, benign biopsy and overall cancer. Poisson regression models were used to estimate the rate ratios for post-screen invasive cancer for specific follow-up periods, accounting for person-year follow-up. PPV was examined using a logistic regression model to estimate the OR between the providers because the log-binomial model did not converge. The 95% confidence intervals (CIs) for these rates and rate ratios were estimated. Stratified analyses by region and study period were performed. All models adjusted for patient age. Statistical analyses were conducted using SAS 9.4 (SAS Institute, Cary, North Carolina).
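To make the idea of a risk ratio with a 95% CI concrete, here is a minimal sketch of an unadjusted calculation using the standard log-scale Wald interval. It is not the age-adjusted log-binomial model fitted in SAS for the study; the function name and the counts are hypothetical and chosen only for illustration.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted risk ratio of group A versus group B with a Wald CI.

    The interval is computed on the log scale, where the standard error
    of log(RR) is sqrt(1/a - 1/n_a + 1/b - 1/n_b).
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rr = risk_a / risk_b
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 940 false positives in 10 000 screens (group A)
# versus 340 false positives in 10 000 screens (group B).
rr, lower, upper = risk_ratio_ci(940, 10_000, 340, 10_000)
print(f"RR = {rr:.2f} (95% CI: {lower:.2f} to {upper:.2f})")
```

A regression model is needed in place of this closed-form calculation once the comparison has to be adjusted for covariates such as age, which is why the study reports model-based estimates.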
Patient and public involvement
This project was done without involving screen-eligible women. Screen-eligible women were not invited to comment on the study design and were not consulted to develop relevant outcomes or interpret the results. They were not invited to contribute to the writing or editing of this document for readability or accuracy.
Results
There were approximately 200 000 index screens in each of the two study periods. Over 84% of screens were performed in the community radiology clinics. The total number of screens performed by Screen Test decreased from 2006–2008 to 2008–2010 (online supplementary table 1). Age distributions of screened women were very similar across the two service providers and all regions with a median age of 57 to 58 years and an IQR of 53 to 62 years (online supplementary table 2).
Table 1 shows that the rates of abnormal recall, false positives and benign biopsies were higher in community clinics than in Screen Test. The overall abnormal recall rates in the two study periods were 9.7% and 8.5% in community clinics, compared with 3.8% and 3.5% in Screen Test, respectively. The corresponding risk ratio estimates were 2.50 (95% CI: 2.35 to 2.65) for study period A (2006 to 2008) and 2.29 (95% CI: 2.14 to 2.46) for study period B (2008 to 2010). The abnormal recall rates in community clinics ranged widely from 5.9% in ‘other’ areas to 11.2% in Calgary, while Screen Test had fairly consistent abnormal recall rates regardless of residence area (range 3.2% to 4.1%). Similarly, the false positive rates in the two study periods were 9.4% and 8.2% in community clinics and 3.4% and 3.0% in Screen Test, respectively, with corresponding risk ratios (RRs) of 2.72 (95% CI: 2.55 to 2.90) and 2.56 (95% CI: 2.38 to 2.76). The benign biopsy rates (per 1000 screens) were higher in community clinics during both study period A (8.5 vs 5.6, RR=1.47, 95% CI: 1.25 to 1.74) and B (6.7 vs 4.8, RR=1.35, 95% CI: 1.11 to 1.65). The community clinics had a lower positive predictive value (PPV) than Screen Test.
Table 2 compares the cancer detection rate (through screening) per 1000 screens. Though community clinics had higher abnormal recall rates than Screen Test, their overall screen-detected cancer rates were lower in both study periods (A: 3.6 vs 4.6, RR=0.81, 95% CI: 0.67 to 0.98, and B: 3.3 vs 5.0, RR=0.68, 95% CI: 0.55 to 0.83). The performance difference in cancer detection occurred primarily from the detection of invasive cancer. In the two study periods, the community clinics on average detected 2.7 and 2.6 invasive cancers per 1000 screens, while Screen Test on average detected 3.7 and 4.3 invasive cancers per 1000 screens, respectively. The corresponding estimated risk ratios were 0.79 (95% CI: 0.63 to 0.98) and 0.65 (95% CI: 0.52 to 0.81).
The post-screen invasive cancer rates were 6.8 (95% CI: 5.6 to 8.3) and 18.6 (95% CI: 16.0 to 21.7) in community clinics for the 0 to 12 and 12 to 24 month periods, respectively, following a normal screening mammogram in study period A (table 3). The corresponding rates were 3.4 (95% CI: 1.8 to 6.5) and 7.5 (95% CI: 4.7 to 12.1) in Screen Test. When compared, the community clinics had much higher rates for both the first 12 months (RR: 2.36, 95% CI: 1.18 to 4.73) and 12 to 24 months (RR: 2.27, 95% CI: 1.36 to 3.78).
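Because post-screen cancer rates depend on follow-up time, they are compared as rates per person-time rather than as simple proportions. The following minimal sketch shows an unadjusted incidence rate ratio with a log-scale Wald interval. It is not the Poisson regression used in the study, and the case counts and person-years are hypothetical.

```python
import math


def rate_ratio_ci(cases_a, person_years_a, cases_b, person_years_b, z=1.96):
    """Unadjusted incidence rate ratio (A versus B) with a Wald CI.

    For Poisson counts, the standard error of log(rate ratio) is
    approximately sqrt(1/cases_a + 1/cases_b).
    """
    rate_a = cases_a / person_years_a
    rate_b = cases_b / person_years_b
    irr = rate_a / rate_b
    se_log_irr = math.sqrt(1 / cases_a + 1 / cases_b)
    lower = math.exp(math.log(irr) - z * se_log_irr)
    upper = math.exp(math.log(irr) + z * se_log_irr)
    return irr, lower, upper


# Hypothetical follow-up data: 60 cancers over 100 000 person-years (group A)
# versus 15 cancers over 50 000 person-years (group B).
irr, lower, upper = rate_ratio_ci(60, 100_000, 15, 50_000)
print(f"Rate ratio = {irr:.2f} (95% CI: {lower:.2f} to {upper:.2f})")
```

The interval is wide when one group has few events, which is consistent with the broad intervals reported for Screen Test, where post-screen cancers were less frequent.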
Women screened in the community clinics on average had shorter screening intervals than those screened in Screen Test (online supplementary figure 2). Thus, we carried out a sensitivity analysis and compared service providers by the overall cancer diagnosed within prespecified time frames regardless of cancer detection mode (screen or non-screen). Table 4 shows that the overall cancer incidence and the overall invasive cancer incidence were similar between providers in both study periods A and B.
Discussion
Overall, the community clinics had lower cancer detection rates, higher abnormal recall rates, higher false positive rates, lower positive predictive value, higher benign biopsy rates and higher rates of post-screen invasive cancers when compared with Screen Test in both 2-year study periods and across all regions. Since the overall cancer incidence in women attending each type of clinic was similar, these findings are unlikely to be due to systematic differences in the population. Our results are consistent with a previous study comparing opportunistic and organised screening which showed better sensitivity in the organised screening programme.22
Screen Test performance measures were consistent across health zones, presumably resulting from its centralised structure. In contrast, community clinics had variation in abnormal recall rates, likely attributable to variation in quality assurance activity across community clinics (online supplementary figure 3). The consistency within Screen Test and the variation among community clinics across the five health zones further support our conclusion that the difference observed in these indicators can be largely attributed to quality assurance differences rather than to differences in the characteristics of women attending the two types of clinics. Measures for community clinics in Edmonton and Calgary improved from the first study period to the second, possibly from the positive influence of formation of the Alberta Breast Cancer Screening Program, whereas for community clinics in other health zones, both false positive rates and PPV were worse in the second period.
At the time, both community radiology clinics and Screen Test typically sent abnormal screening results to their patients’ primary care providers who then coordinated the diagnostic follow-up, and such diagnostic imaging was done through community clinics. One might expect that this practice should produce similar benign biopsy rates between the two types of screening service providers. The rate ratios, however, range from 1.38 to 1.90 comparing community clinics to Screen Test in Calgary and Edmonton in study periods A and B (table 2), showing that women screened in a community clinic had a higher rate of benign biopsy than women screened by Screen Test. This highlights that the higher rate of false positive findings increases the risk of invasive follow-up procedures. Thus, the effectiveness of screening was lower in the community radiology clinics than Screen Test due to both higher false positive rates and higher benign biopsy rates. Additionally, diagnostic tests after a false positive have been shown to reduce the likelihood of future screening participation: benign biopsies have the largest negative impact.23
The operations and quality assurance practices of Screen Test and community clinics were distinct in several ways during 2006 to 2010. In Screen Test, staff radiologists read batches of 70 to 90 screening mammograms in a 2 to 4 hour screening session. At monthly quality assurance meetings, they reviewed and discussed difficult cases and received personal statistics as a feedback mechanism; these practices are still standard today. The operation and quality assurance practices of community radiology clinics varied, and still do today. First, many clinics interpret screening mammograms while women are still present, to decide whether further testing is needed. Our previous analysis showed that more than 55% of the follow-up imaging procedures were performed the same day.23 Although this practice minimises time to diagnosis/resolution, it encourages a diagnostic rather than a screening mindset. Second, screening mammograms are not read separately from diagnostic mammograms; thus, images from symptomatic and asymptomatic patients are read without clear distinction. Third, it is unknown whether recall statistics are produced to help radiologists self-evaluate against the screening mammography standard.
Another contributing factor to the quality of reading could be reading volume; a high volume has been shown to reduce radiologists’ false positive rate.24 25 While the average reading volume in Screen Test was about 2300 screening mammograms per year, the minimum requirement for radiologists in the community was 400 mammograms per year during the study period, in accordance with the Canadian Association of Radiologists Mammography Accreditation Program.26
While the Canadian target for PPV is ≥5% for initial screen and ≥6% for rescreens,21 PPVs were 3.7% for community clinics in both study periods compared with 11.9% and 14.1% for Screen Test in the respective time periods. The average false positive rate of radiologists in community clinics in Alberta is similar to that of US radiologists in the Breast Cancer Screening Consortium.13 False positive rates vary greatly among US radiologists,24 which is likely also true for community radiologists in Alberta. In comparison, the false positive rate of organised screening programmes with strong standard quality assurance programmes, for example, Screen Test and Canadian provinces including Manitoba and Nova Scotia, are comparable to countries in the European Union at less than 5%.6
Study limitations and advantages
The main advantage of this study is that the two ways to access breast cancer screening in Alberta provide a natural experiment to assess differences in quality of screening between the two breast cancer screening models. Neither screen-eligible women nor the majority of family physicians in the province are aware that there are two distinct systems offering screening services. Women living in Edmonton and Calgary choose to receive screening at either a community-based clinic or a Screen Test clinic. Their choice is independent of their disease state. The arbitrary selection of screening provider conceptually mimics a single blind randomised experiment.
There are several limitations of the study. We could not distinguish whether a mammogram was an initial or subsequent screen, which influences abnormal recall rate and cancer detection rate. However, the ages of women were similar across the two types of service provider, which mitigates the potential bias. Women attending Screen Test generally had longer intervals between screens than those attending community radiology clinics. We performed a sensitivity analysis for women who had a normal index screen in study period A and were subsequently screened between 18 and 30 months later. The same pattern of differences emerged (online supplementary table 3): when compared with Screen Test in the subsequent screening, community clinics had a lower cancer detection rate (2.4 vs 4.0 per 1000 screens) and a higher abnormal recall rate (6.6% vs 2.5%). Additionally, the low cancer detection rate was attributed to the lower invasive cancer rate (online supplementary table 4). There are likely variations between different community clinics and individual radiologists;24 25 27 however, we had only aggregated data and so could not identify individual performance.
Conclusion
The majority of quality measures examined were better in the organised screening programme where an effective quality assurance process is in place. This study thus provides empirical evidence that the high false positive rates observed in some provinces of Canada, for example, Quebec and New Brunswick,6 and the USA (compared with European Union countries) and significant variation world-wide in effectiveness of screening programmes may be in large part due to lack of rigorous, standardised quality assurance processes (including minimum reading volume requirement) among providers of screening mammography. Moreover, the differences in quality measures between the community clinics and Screen Test strongly suggest that quality assurance is essential to reduce harm from false positives and missed cancers. Public health programmes must be carefully implemented to maximise the benefits and minimise potential harms to participants.
Acknowledgments
The following individuals assisted in the development of our manuscript: anonymous radiologists for providing insights into our study findings, Jingyu Bu for contributing to the statistical analysis and Zhe Lu for reference management. The authors would also like to acknowledge valuable comments from reviewers which strengthened this manuscript.
Footnotes
Contributors Conception and design: YY, MW and JD. Data acquisition, analysis and interpretation: YS, KV, MW, YY and JD. Manuscript draft and revising: JD, YY, MW, KV and YS. Final approval and accountable for the study: YY, KV, YS, JD and MW.
Funding This research is jointly supported by Canadian Institutes of Health Research (DC0190GP to MW) and the MSI foundation (869 to YY). Dr Yuan’s research program is supported by Natural Sciences and Engineering Research Council of Canada (FRN: RGPIN-2019-04862).
Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval Ethics approval for the study was obtained from the Health Research Ethics Board of Alberta, Cancer Committee.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data may be obtained from a third party and are not publicly available. Alberta Health Services, in the province of Alberta, Canada, is the data custodian of the administrative data sets analysed in this article. A research protocol and ethics approval are required in order to request the data for research use.