Geocoding large population-level administrative datasets at highly resolved spatial scales

Trans GIS. 2014 Aug;18(4):586-603. doi: 10.1111/tgis.12052.

Abstract

Using geographic information systems to link administrative databases with demographic, social, and environmental data allows researchers to use spatial approaches to explore relationships between exposures and health. Traditionally, spatial analysis in public health has focused on the county, ZIP code, or tract level because of limitations to geocoding at highly resolved scales. Using 2005 birth and death data from North Carolina, we examine our ability to geocode population-level datasets at three spatial resolutions: ZIP code, street, and parcel. We achieve high geocoding rates at all three resolutions, with statewide street geocoding rates of 88.0% for births and 93.2% for deaths. We observe differences in geocoding rates across demographics and health outcomes, with lower geocoding rates in disadvantaged populations and the most dramatic differences occurring across the urban-rural spectrum. Our results suggest that highly resolved spatial data architectures for population-level datasets are viable through geocoding individual street addresses. We recommend routinely geocoding administrative datasets to the highest spatial resolution feasible, allowing public health researchers to choose the spatial resolution used in analysis based on an understanding of the spatial dimensions of the health outcomes and exposures being investigated. Such research, however, must acknowledge how disparate geocoding success across subpopulations may affect findings.

Keywords: Birth record; Death record; Geocoding; Spatial resolution.