### Abstract

Case-based reasoning (CBR) is a methodology that is seeing increasing use to make predictions during the early phases of a project. It allows estimators to exploit existing knowledge to make predictions that are considerably better than those made without it. Not all CBR is identical, however, and variations in how CBR is done can affect the accuracy of the predictions. One particular area of sensitivity is the retrieval phase, i.e., the way in which the CBR determines the closeness between the new and the existing cases. In this paper, CBR is used to estimate resources for construction projects. The paper investigates the use of the nearest neighbor technique to determine similarity in the retrieval phase when predicting the construction material quantities (CMQs) in concrete structures. Two types of distances, 1) the City-block distance and 2) the Euclidean distance, are investigated, together with four different types of weights, based on regression analysis and feature counting, that account for the relative importance of the different parameters. The four types of weights used were 1) the adjusted unstandardized coefficients from the regression models, 2) the unadjusted unstandardized coefficients from the regression models, 3) the standardized coefficients from the regression models, and 4) equal weights (i.e., feature counting), in which the weights applied are 1/k, where k is the number of parameters being compared to determine the distance. The mean absolute percentage error (MAPE) was used to evaluate each combination investigated. It was found that, for a similarity threshold of 90%, the CBR methodology using the City-block distance, weighted with the adjusted unstandardized coefficients from the regression analysis models fitted to the transformed (LN) dataset, gave the best results, with a MAPE of 8.16%. The worst results were obtained from the CBR methodology using the Euclidean distance with feature counting weights, with a MAPE of 28.40%.
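The retrieval step and the error measure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how distance is converted to similarity, so the sketch assumes features scaled to [0, 1], weights that sum to 1, and similarity = 1 - distance. All function names and the shape of the case base are hypothetical.

```python
import math


def equal_weights(k):
    """Feature-counting weights: each of the k parameters gets 1/k."""
    return [1.0 / k] * k


def city_block(a, b, w):
    """Weighted City-block (Manhattan) distance between two cases."""
    return sum(wi * abs(x - y) for wi, x, y in zip(w, a, b))


def euclidean(a, b, w):
    """Weighted Euclidean distance between two cases."""
    return math.sqrt(sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b)))


def retrieve(new_case, case_base, w, distance, threshold=0.90):
    """Return (similarity, quantity) pairs at or above the threshold.

    Assumption (not from the paper): features lie in [0, 1] and the
    weights sum to 1, so both distances fall in [0, 1] and
    similarity can be taken as 1 - distance.
    """
    hits = []
    for features, quantity in case_base:
        similarity = 1.0 - distance(new_case, features, w)
        if similarity >= threshold:
            hits.append((similarity, quantity))
    # Most similar cases first.
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)
```

Regression-based weights would simply replace the list returned by `equal_weights`; under the assumption above they would first need to be rescaled so that they sum to 1.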

Original language | English (US) |
---|---|
Pages (from-to) | 169-181 |
Number of pages | 13 |
Journal | Procedia Engineering |
Volume | 123 |
DOIs | https://doi.org/10.1016/j.proeng.2015.10.074 |
State | Published - Jan 1 2015 |
Event | 4th Creative Construction Conference, CCC 2015 - Krakow, Poland. Duration: Jun 21 2015 → Jun 24 2015 |

### Keywords

- artificial intelligence
- case-based reasoning
- preliminary estimates
- resource estimates
- retrieval process

### ASJC Scopus subject areas

- Engineering(all)

### Cite this

**Investigation of the Case-based Reasoning Retrieval Process to Estimate Resources in Construction Projects.** / Garcia de Soto, Borja; Adey, Bryan T.

Research output: Contribution to journal › Conference article

*Procedia Engineering*, vol. 123, pp. 169-181. https://doi.org/10.1016/j.proeng.2015.10.074

TY - JOUR

T1 - Investigation of the Case-based Reasoning Retrieval Process to Estimate Resources in Construction Projects

AU - Garcia de Soto, Borja

AU - Adey, Bryan T.

PY - 2015/1/1

Y1 - 2015/1/1

KW - artificial intelligence

KW - case-based reasoning

KW - preliminary estimates

KW - resource estimates

KW - retrieval process

UR - http://www.scopus.com/inward/record.url?scp=84953282518&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84953282518&partnerID=8YFLogxK

U2 - 10.1016/j.proeng.2015.10.074

DO - 10.1016/j.proeng.2015.10.074

M3 - Conference article

VL - 123

SP - 169

EP - 181

JO - Procedia Engineering

JF - Procedia Engineering

SN - 1877-7058

ER -