Feature selection for ranking of most influential variables for evacuation behavior modeling across disasters

Sami Demiroluk, M. Anil Yazici, Kaan Ozbay, Jon A. Carnegie

Research output: Contribution to journal › Article

Abstract

The extensive list of factors that affect the evacuee decision process makes it difficult to design effective surveys and to develop decision models with high predictive power. Regression models and significance levels can help identify relevant variables and overcome this problem to an extent. However, such approaches fall short of ranking these variables or recognizing the redundant ones. In this study, a feature selection method is proposed to ensure that the selected features are relevant but not redundant. This method, called conditional mutual information maximization, picks at each step the feature that most reduces the uncertainty in the decision, conditional on the response of any feature already picked. As a case study, the variables influencing evacuation behavior in the Northern New Jersey Evacuation Survey were ranked and compared across disaster scenarios. To validate the method and to demonstrate how it compares with traditional methods, logistic regression models were also estimated with the same data set. It was found that some of the top-ranked variables might be available through an existing database such as the U.S. census, and others could be calculated on the basis of the threat type and government action. This can be useful for emergency planners when an evacuation survey for a study area is not readily available. Overall, the feature selection algorithm succeeds in identifying the most influential factors for all threat types. The suggested approach can help with both preprocessing (e.g., defining a set of input variables) and postprocessing (e.g., identifying variables that should be kept) for behavioral modeling.
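The greedy selection rule described in the abstract can be illustrated with a minimal Python sketch. It assumes discrete (categorical) features and estimates entropies from raw counts; it follows one common greedy formulation of conditional mutual information maximization and is not necessarily the authors' implementation. The function names (`entropy`, `cond_mutual_info`, `cmim_rank`) and the six-row toy data are hypothetical and are not taken from the Northern New Jersey Evacuation Survey.

```python
import numpy as np

def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    joint = np.stack(columns, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def cond_mutual_info(y, x, z=None):
    """Empirical I(y; x) if z is None, otherwise I(y; x | z)."""
    if z is None:
        return entropy(y) + entropy(x) - entropy(y, x)
    return entropy(y, z) + entropy(x, z) - entropy(y, x, z) - entropy(z)

def cmim_rank(X, y, k):
    """Greedy CMIM ranking: indices of k features, most influential first."""
    n_features = X.shape[1]
    # Start from the plain relevance I(y; x_j) of every candidate.
    score = np.array([cond_mutual_info(y, X[:, j]) for j in range(n_features)])
    selected = []
    for _ in range(k):
        candidates = [j for j in range(n_features) if j not in selected]
        best = max(candidates, key=lambda j: score[j])
        selected.append(best)
        # A candidate is scored by its worst case over the features already
        # picked, so a redundant copy of a picked feature drops to about zero.
        for j in candidates:
            if j != best:
                score[j] = min(score[j], cond_mutual_info(y, X[:, j], X[:, best]))
    return selected

# Hypothetical toy data: column 0 is informative, column 1 is an exact copy
# of column 0 (redundant), column 2 is weaker but adds new information.
a = np.array([0, 0, 1, 1, 2, 2])
b = np.array([0, 1, 0, 1, 0, 1])
y = np.array([0, 0, 0, 1, 1, 1])   # hypothetical binary decision
X = np.column_stack([a, a, b])

ranking = cmim_rank(X, y, k=3)
print(ranking)  # → [0, 2, 1]
```

In this toy case the duplicated column is ranked last even though its individual relevance ties for the highest, which is the redundancy behavior that significance levels in a regression model do not expose directly.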

Original language: English (US)
Pages (from-to): 24-32
Number of pages: 9
Journal: Transportation Research Record
Volume: 2599
DOI: 10.3141/2599-04
State: Published - 2016

Fingerprint

Disasters
Feature extraction
Logistics
Uncertainty

ASJC Scopus subject areas

  • Civil and Structural Engineering
  • Mechanical Engineering

Cite this

Feature selection for ranking of most influential variables for evacuation behavior modeling across disasters. / Demiroluk, Sami; Yazici, M. Anil; Ozbay, Kaan; Carnegie, Jon A.

In: Transportation Research Record, Vol. 2599, 2016, p. 24-32.

Demiroluk, Sami; Yazici, M. Anil; Ozbay, Kaan; Carnegie, Jon A. / Feature selection for ranking of most influential variables for evacuation behavior modeling across disasters. In: Transportation Research Record. 2016; Vol. 2599. pp. 24-32.
@article{36799b898c7340adbdaa29b2d1b531b1,
title = "Feature selection for ranking of most influential variables for evacuation behavior modeling across disasters",
author = "Sami Demiroluk and Yazici, {M. Anil} and Kaan Ozbay and Carnegie, {Jon A.}",
year = "2016",
doi = "10.3141/2599-04",
language = "English (US)",
volume = "2599",
pages = "24--32",
journal = "Transportation Research Record",
issn = "0361-1981",
publisher = "US National Research Council",

}

TY - JOUR

T1 - Feature selection for ranking of most influential variables for evacuation behavior modeling across disasters

AU - Demiroluk, Sami

AU - Yazici, M. Anil

AU - Ozbay, Kaan

AU - Carnegie, Jon A.

PY - 2016

Y1 - 2016

UR - http://www.scopus.com/inward/record.url?scp=85012039875&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012039875&partnerID=8YFLogxK

U2 - 10.3141/2599-04

DO - 10.3141/2599-04

M3 - Article

VL - 2599

SP - 24

EP - 32

JO - Transportation Research Record

JF - Transportation Research Record

SN - 0361-1981

ER -