Evaluation of the positional difference between two common geocoding methods

Dustin Duncan, Marcia C. Castro, Jeffrey C. Blossom, Gary G. Bennett, Steven L. Gortmaker

Research output: Contribution to journal (Article)

Abstract

Geocoding, the process of matching addresses to geographic coordinates, is a necessary first step when using geographical information systems (GIS) technology. However, different geocoding methodologies can result in different geographic coordinates. The objective of this study was to compare the positional (i.e. longitude/latitude) difference between two common geocoding methods, i.e. ArcGIS (Environmental Systems Research Institute, Redlands, CA, USA) and Batchgeo (freely available online at http://www.batchgeo.com). Address data came from the YMCA-Harvard After School Food and Fitness Project, an obesity prevention intervention involving children aged 5-11 years and their families participating in YMCA-administered, after-school programmes located in four geographically diverse metropolitan areas in the USA. Our analyses include baseline addresses (n = 748) collected from the parents of the children in the after-school sites. Addresses were first geocoded to the street level and assigned longitude and latitude coordinates with ArcGIS, version 9.3; the same addresses were then geocoded with Batchgeo. For this analysis, the ArcGIS minimum match score was 80. The resulting geocodes were projected into state plane coordinates, and the differences in longitude and latitude coordinates between the two methods were calculated in meters for all data points in each of the four metropolitan areas. We also used the match scores from ArcGIS to quantify the descriptions of geocoding accuracy provided by Batchgeo. Using ArcGIS, 94% of the addresses (n = 705) were matched, 2% (n = 18) were tied and 3% (n = 25) were unmatched. Forty-eight addresses (6.4%) were not matched in ArcGIS with a match score ≥80 (therefore only 700 addresses were included in our positional difference analysis). Six hundred thirteen (87.6%) of these addresses had a match score of 100. Batchgeo yielded a 100% match rate for the addresses that ArcGIS geocoded.
The median difference in longitude and latitude coordinates for all the data was just over 25 m. Overall, the differences ranged from 0.04 to 12,911.8 m for longitude and from 0.02 to 37,766.6 m for latitude. Comparisons showed minimal differences in the median and minimum values, while there were slightly larger differences in the maximum values. The majority (>75%) of the paired geocodes were within 50 m of each other, and about 49% were <25 m apart. Only about 4% overall were ≥400 m apart. We also found geographic differences in the proportion of addresses that fell within certain distance ranges. The match-score range associated with the Batchgeo accuracy level "approximate" (least accurate) was 84-100 (mean = 92), while the "rooftop" Batchgeo accuracy level (most accurate) had a mean of 98.9 over the same range. Although future research should compare the positional difference of Batchgeo to criterion measures of longitude/latitude (e.g. with global positioning system measurement), this study suggests that Batchgeo is a good, free-of-charge option to geocode addresses.
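
The per-axis comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow: the study projected geocodes into state plane coordinates in ArcGIS, whereas this sketch assumes a spherical Earth and a local equirectangular approximation, which is only reasonable for short distances. The function names and the sample coordinate pairs are hypothetical and are not study data.

```python
import math
from statistics import median

# Assumption: spherical Earth with the mean radius in meters.
EARTH_RADIUS_M = 6_371_008.8


def axis_differences_m(lon_a, lat_a, lon_b, lat_b):
    """Return absolute (east-west, north-south) offsets in meters.

    Uses a local equirectangular approximation, not a state plane projection.
    """
    mean_lat = math.radians((lat_a + lat_b) / 2.0)
    dx = math.radians(lon_b - lon_a) * math.cos(mean_lat) * EARTH_RADIUS_M
    dy = math.radians(lat_b - lat_a) * EARTH_RADIUS_M
    return abs(dx), abs(dy)


def summarize(values):
    """Minimum, median and maximum of a list of differences."""
    return {"min": min(values), "median": median(values), "max": max(values)}


def share_within(values, limit_m):
    """Fraction of differences strictly below limit_m meters."""
    return sum(v < limit_m for v in values) / len(values)


# Hypothetical pairs: (lon, lat) from method A, then (lon, lat) from method B.
pairs = [
    (-71.1000, 42.3500, -71.1002, 42.3501),
    (-71.0800, 42.3600, -71.0800, 42.3600),
    (-71.0500, 42.3300, -71.0560, 42.3300),
]

diffs = [axis_differences_m(*p) for p in pairs]
lon_diffs = [d[0] for d in diffs]
lat_diffs = [d[1] for d in diffs]

print("longitude differences (m):", summarize(lon_diffs))
print("latitude differences (m):", summarize(lat_diffs))
print("share of longitude differences under 50 m:", share_within(lon_diffs, 50.0))
```

Reporting the two axes separately mirrors the abstract, which gives separate ranges for longitude and latitude rather than a single straight-line distance.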

Original language: English (US)
Pages (from-to): 265-273
Number of pages: 9
Journal: Geospatial health
Volume: 5
Issue number: 2
State: Published - May 2011

Keywords

  • Addresses
  • ArcGIS
  • Batchgeo
  • Geocoding
  • Positional difference
  • USA

ASJC Scopus subject areas

  • Medicine (miscellaneous)
  • Health Policy
  • Geography, Planning and Development
  • Health (social science)

Cite this

Duncan, D., Castro, M. C., Blossom, J. C., Bennett, G. G., & Gortmaker, S. L. (2011). Evaluation of the positional difference between two common geocoding methods. Geospatial health, 5(2), 265-273.

@article{c3b82d7559fb48ae9bb5142b2b5469ae,
title = "Evaluation of the positional difference between two common geocoding methods",
keywords = "Addresses, ArcGIS, Batchgeo, Geocoding, Positional difference, USA",
author = "Dustin Duncan and Castro, {Marcia C.} and Blossom, {Jeffrey C.} and Bennett, {Gary G.} and Gortmaker, {Steven L.}",
year = "2011",
month = "5",
language = "English (US)",
volume = "5",
pages = "265--273",
journal = "Geospatial health",
issn = "1827-1987",
publisher = "University of Naples Federico II",
number = "2",

}

TY - JOUR

T1 - Evaluation of the positional difference between two common geocoding methods

AU - Duncan, Dustin

AU - Castro, Marcia C.

AU - Blossom, Jeffrey C.

AU - Bennett, Gary G.

AU - Gortmaker, Steven L.

PY - 2011/5

Y1 - 2011/5

KW - Addresses

KW - ArcGIS

KW - Batchgeo

KW - Geocoding

KW - Positional difference

KW - USA

UR - http://www.scopus.com/inward/record.url?scp=80052049834&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80052049834&partnerID=8YFLogxK

M3 - Article

VL - 5

SP - 265

EP - 273

JO - Geospatial health

JF - Geospatial health

SN - 1827-1987

IS - 2

ER -