Topic modeling to discover the thematic structure and spatial-temporal patterns of building renovation and adaptive reuse in cities

Research output: Contribution to journal › Article

Abstract

Building alteration and redevelopment play a central role in the revitalization of developed cities, where the scarcity of available land limits the construction of new buildings. The adaptive reuse of existing space reflects the underlying socioeconomic dynamics of the city and can be a leading indicator of economic growth and diversification. However, the collective understanding of building alteration patterns is constrained by significant barriers to data accessibility and analysis. We present a data mining and knowledge discovery process for extracting, analyzing, and integrating building permit data for more than 2,500,000 alteration projects from seven major U.S. cities. We utilize natural language processing and topic modeling to discover the thematic structure of construction activities from permit descriptions and merge with other urban data to explore the dynamics of urban change. The knowledge discovery process proceeds in three steps: (1) text mining to identify popular words, popularity change, and their co-appearance likelihood; (2) topic modeling using latent Dirichlet allocation (LDA); and (3) integrating the topic modeling output with building information and ancillary data to discover the spatial, temporal, and thematic patterns of urban redevelopment and regeneration. The results demonstrate a generalizable approach that can be used to analyze unstructured text data extracted from permit records across varying database structures, permit typologies, and local contexts. Our machine learning methodology can assist cities to better monitor building alteration activity, analyze spatiotemporal patterns of redevelopment, and more fully understand the economic, social, and environmental implications of changes to the urban built environment.
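The first two steps of the process described above (text mining for word popularity and co-appearance, then LDA topic modeling) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the permit descriptions, the stopword list, the function names, and the hyperparameters (`n_topics`, `alpha`, `beta`, `n_iter`) are all hypothetical, and collapsed Gibbs sampling is only one common way to fit LDA; the abstract does not say which inference method or library was used.

```python
import random
from collections import Counter
from itertools import combinations

# Hypothetical permit descriptions, invented for illustration only.
PERMITS = [
    "replace kitchen cabinets and plumbing fixtures",
    "install new plumbing fixtures in bathroom",
    "renovate kitchen and bathroom plumbing",
    "install rooftop solar panels and electrical wiring",
    "upgrade electrical wiring and solar inverter",
    "install solar panels with new electrical panel",
]
STOPWORDS = {"and", "in", "with", "new"}  # assumed, not from the paper


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def word_counts(docs):
    """Step 1a: corpus-wide term frequency (word popularity)."""
    return Counter(w for doc in docs for w in doc)


def cooccurrence(docs):
    """Step 1b: number of documents in which each word pair appears together."""
    pairs = Counter()
    for doc in docs:
        pairs.update(combinations(sorted(set(doc)), 2))
    return pairs


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Step 2: LDA fitted by collapsed Gibbs sampling.

    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


docs = [tokenize(p) for p in PERMITS]
counts = word_counts(docs)
pairs = cooccurrence(docs)
doc_topic, topic_word = lda_gibbs(docs)
for k, words in enumerate(topic_word):
    print(k, [w for w, _ in words.most_common(4)])
```

For the third step, each row of `doc_topic` could be normalized into topic shares and joined to building and ancillary records on a permit or parcel identifier. That join is left out here because the abstract does not describe the schema of those datasets.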

Original language: English (US)
Article number: 101383
Journal: Computers, Environment and Urban Systems
Volume: 78
DOI: 10.1016/j.compenvurbsys.2019.101383
State: Published - Nov 1 2019

Fingerprint

renovation
redevelopment
modeling
economic diversification
data mining
new building
typology
accessibility
economic growth
diversification
regeneration
knowledge
popularity
city
building
methodology
economics
language
learning

Keywords

  • Big data
  • Building alteration
  • Machine learning
  • Natural language processing
  • Topic modeling

ASJC Scopus subject areas

  • Geography, Planning and Development
  • Ecological Modeling
  • Environmental Science(all)
  • Urban Studies

Cite this

@article{1357832360ad48e680b7c2be96e7927b,
title = "Topic modeling to discover the thematic structure and spatial-temporal patterns of building renovation and adaptive reuse in cities",
abstract = "Building alteration and redevelopment play a central role in the revitalization of developed cities, where the scarcity of available land limits the construction of new buildings. The adaptive reuse of existing space reflects the underlying socioeconomic dynamics of the city and can be a leading indicator of economic growth and diversification. However, the collective understanding of building alteration patterns is constrained by significant barriers to data accessibility and analysis. We present a data mining and knowledge discovery process for extracting, analyzing, and integrating building permit data for more than 2,500,000 alteration projects from seven major U.S. cities. We utilize natural language processing and topic modeling to discover the thematic structure of construction activities from permit descriptions and merge with other urban data to explore the dynamics of urban change. The knowledge discovery process proceeds in three steps: (1) text mining to identify popular words, popularity change, and their co-appearance likelihood; (2) topic modeling using latent Dirichlet allocation (LDA); and (3) integrating the topic modeling output with building information and ancillary data to discover the spatial, temporal, and thematic patterns of urban redevelopment and regeneration. The results demonstrate a generalizable approach that can be used to analyze unstructured text data extracted from permit records across varying database structures, permit typologies, and local contexts. Our machine learning methodology can assist cities to better monitor building alteration activity, analyze spatiotemporal patterns of redevelopment, and more fully understand the economic, social, and environmental implications of changes to the urban built environment.",
keywords = "Big data, Building alteration, Machine learning, Natural language processing, Topic modeling",
author = "Yuan Lai and Constantine Kontokosta",
year = "2019",
month = "11",
day = "1",
doi = "10.1016/j.compenvurbsys.2019.101383",
language = "English (US)",
volume = "78",
journal = "Computers, Environment and Urban Systems",
issn = "0198-9715",
publisher = "Elsevier Limited",
}

TY - JOUR

T1 - Topic modeling to discover the thematic structure and spatial-temporal patterns of building renovation and adaptive reuse in cities

AU - Lai, Yuan

AU - Kontokosta, Constantine

PY - 2019/11/1

Y1 - 2019/11/1

AB - Building alteration and redevelopment play a central role in the revitalization of developed cities, where the scarcity of available land limits the construction of new buildings. The adaptive reuse of existing space reflects the underlying socioeconomic dynamics of the city and can be a leading indicator of economic growth and diversification. However, the collective understanding of building alteration patterns is constrained by significant barriers to data accessibility and analysis. We present a data mining and knowledge discovery process for extracting, analyzing, and integrating building permit data for more than 2,500,000 alteration projects from seven major U.S. cities. We utilize natural language processing and topic modeling to discover the thematic structure of construction activities from permit descriptions and merge with other urban data to explore the dynamics of urban change. The knowledge discovery process proceeds in three steps: (1) text mining to identify popular words, popularity change, and their co-appearance likelihood; (2) topic modeling using latent Dirichlet allocation (LDA); and (3) integrating the topic modeling output with building information and ancillary data to discover the spatial, temporal, and thematic patterns of urban redevelopment and regeneration. The results demonstrate a generalizable approach that can be used to analyze unstructured text data extracted from permit records across varying database structures, permit typologies, and local contexts. Our machine learning methodology can assist cities to better monitor building alteration activity, analyze spatiotemporal patterns of redevelopment, and more fully understand the economic, social, and environmental implications of changes to the urban built environment.

KW - Big data

KW - Building alteration

KW - Machine learning

KW - Natural language processing

KW - Topic modeling

UR - http://www.scopus.com/inward/record.url?scp=85070719049&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070719049&partnerID=8YFLogxK

U2 - 10.1016/j.compenvurbsys.2019.101383

DO - 10.1016/j.compenvurbsys.2019.101383

M3 - Article

VL - 78

JO - Computers, Environment and Urban Systems

JF - Computers, Environment and Urban Systems

SN - 0198-9715

M1 - 101383

ER -