THE ROLE OF DATA MINING IN THE PREDICTION OF WINDSTORM DAMAGES

N. O. Nawari, Ph.D., P.E., M.ASCE, Kent State University, College of Architecture and Environmental Design, Taylor Building, Kent, OH 44242 Email: [email protected]

Post on 11-Jul-2015

Abstract

Prediction and assessment of hurricane damage to buildings involve many sources of uncertain data, which makes the task difficult for conventional prediction models. These data are imprecise and multidimensional in nature; some hidden relationships within the data can only be retrieved using comprehensive data analysis techniques such as data mining. Data mining employs algorithms that are a mixture of statistics, fuzzy logic, genetic algorithms, mathematics and artificial intelligence. A large number of algorithms seek relationships within datasets from which rules of some kind can be derived and subsequently used for design prediction, classification or other functions. This work discusses the role of data mining algorithms in the prediction and classification of damages due to hurricane forces. The research focuses on the conceptual framework for data mining models that assist in the prediction and assessment of building damages caused by hurricanes. The data mining models considered are those defined in MS SQL2005; in particular, the algorithms chosen are among the simplest examples of these groups of models, namely Association Rules, Decision Tree, Naïve Bayes, Clustering and Neural Network.

INTRODUCTION

The areas along the United States Gulf and Atlantic coasts where most of this country's windstorm-related fatalities have occurred are also now experiencing the country's most significant growth in population. This situation, in combination with continued building along the coast, will lead to serious problems for many areas during hurricanes. Because it is likely that people will always be attracted to live along the shoreline, a solution to the problem lies in proper engineering design and mitigation.

OBJECTIVES

The objective of the study is to establish the importance of data mining models in assisting the prediction and assessment of building damages due to tropical cyclones. The presentation below offers a conceptual perspective on the role of data mining algorithms in supporting changes and improvements to building codes and standards, in order to achieve higher performance and safety of residential buildings.

UNCERTAINTIES

A windstorm is a very complicated phenomenon. It is air and water in turbulent flow, which means that the motion of individual air or water particles is so erratic that in studying a storm one ought to be concerned with statistical distributions of speeds and directions rather than with simple averages or fixed physical quantities. For an analytical model, storm forces can be classified as one or a combination of:

Wind Pressure

Windborne Debris

Falling objects

Flood Pressure

Rain Forces

The resistance of buildings to wind pressures has been the subject of considerable research and is addressed by building codes. However, normal design loads specified in these codes are substantially lower than those that occur during a windstorm. This is due to many sources of uncertainties involved in the computational model.

The ASCE 7 provision describes a computational method for wind pressure using a number of coefficients. Considerable judgment is required to determine which pressure coefficients to use, how to determine tributary areas for cladding and framing elements, and whether building elements should be designed as part of the main wind force resisting system or as components and cladding. The corners, edges, and eave overhangs of a building are subjected to complicated forces as a windstorm passes these obstructions, causing higher localized suction forces that are not considered appropriately in Building Codes.

In addition, there is no computational model or standard test protocol in the industry for the critical structural elements that addresses storm pressures generated by hurricanes or tornadoes.
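
To make the sensitivity to these coefficients concrete, the sketch below evaluates the velocity pressure in the form used by ASCE 7-05, q = 0.00256 Kz Kzt Kd V^2 I (lb/ft^2, with V in mph), and a components-and-cladding pressure p = q (GCp - GCpi). The function names and every numeric input (wind speed, exposure coefficient, and the field and corner GCp values) are illustrative assumptions, not values taken from the presentation or looked up for a specific building.

```python
def velocity_pressure(v_mph, kz, kzt=1.0, kd=0.85, importance=1.0):
    """Velocity pressure in lb/ft^2: 0.00256 * Kz * Kzt * Kd * V^2 * I."""
    return 0.00256 * kz * kzt * kd * v_mph ** 2 * importance


def cladding_pressure(q, gcp, gcpi):
    """Net pressure on a component: p = q * (GCp - GCpi)."""
    return q * (gcp - gcpi)


# Assumed inputs for illustration only.
q = velocity_pressure(v_mph=130.0, kz=0.85)

# Same roof, two assumed external pressure coefficients:
# a field zone versus a corner zone with stronger suction.
field = cladding_pressure(q, gcp=-1.0, gcpi=0.18)
corner = cladding_pressure(q, gcp=-2.8, gcpi=0.18)

print(f"velocity pressure: {q:.1f} psf")
print(f"field zone:  {field:.1f} psf")
print(f"corner zone: {corner:.1f} psf")
```

Under these assumed coefficients the corner suction is roughly two and a half times the field value, which illustrates why the choice of zone and coefficient carries so much weight in the computed load.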

DATA MINING MODELS

Data mining is the process of extracting valid, authentic, and meaningful relationships from large quantities of data. It involves uncovering patterns in the data and is often tied to data warehousing because it attempts to make large amounts of data actionable. Data mining employs algorithms that are a mixture of statistics, fuzzy logic, genetic algorithms, and artificial intelligence.

Building a mining model is part of a larger process that includes everything from defining the basic problem that the model will solve to deploying the model into a working environment. This process can be defined by using the following basic steps:

Defining the problem

Preparing data

Defining models

Validation and exploration

Deploying and updating models
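
The five steps above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions: the records are made up, the "model" is a simple lookup of the most frequent damage label per roof type, and the names `fit` and `predict` are hypothetical helpers, not part of any SSAS API.

```python
import random
from collections import Counter, defaultdict

# Step 1, define the problem: predict a damage label from the roof type.
# Step 2, prepare data: hypothetical records, shuffled and split 70/30.
records = [("gable", "high"), ("gable", "high"), ("gable", "low"),
           ("hip", "low"), ("hip", "low"), ("hip", "high"),
           ("flat", "high"), ("flat", "high"), ("hip", "low"),
           ("gable", "high")]
shuffled = records[:]
random.Random(0).shuffle(shuffled)
split = int(0.7 * len(shuffled))
train, test = shuffled[:split], shuffled[split:]


# Step 3, define the model: most frequent label per roof type,
# falling back to the overall most frequent label for unseen types.
def fit(rows):
    per_roof = defaultdict(Counter)
    for roof, damage in rows:
        per_roof[roof][damage] += 1
    overall = Counter(d for _, d in rows).most_common(1)[0][0]
    table = {roof: c.most_common(1)[0][0] for roof, c in per_roof.items()}
    return table, overall


def predict(model, roof):
    table, overall = model
    return table.get(roof, overall)


# Step 4, validate: accuracy on the held-out records.
accuracy = sum(predict(fit(train), r) == d for r, d in test) / len(test)
print(f"holdout accuracy: {accuracy:.2f}")

# Step 5, deploy and update: refit on all available records.
model = fit(records)
print(predict(model, "hip"))
```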

The following diagram shows the steps involved in a typical data-mining project.

Figure 1- Data mining components

A mining model is defined by a data mining structure object, a data mining model object, and a data mining algorithm. Microsoft SQL Server 2005 Analysis Services (SSAS) provides several algorithms for use in data mining solutions: Decision Trees, Clustering, Association Rules, Naïve Bayes, and Neural Network. The following is a brief description of these algorithms.

Decision Trees

The Decision Trees algorithm performs classification and regression analysis for discrete or continuous attributes. It makes predictions based on the relationships between input columns in a dataset (Figure 2), using the values, or states, of those columns to predict the states of a column that is designated as predictable.

Figure 2- Decision Tree Diagram
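
As a minimal sketch of how a tree picks the input column to split on, the code below scores each column by entropy-based information gain with respect to the predictable column. Information gain is one common split criterion and is used here as an assumption; the scoring method SSAS applies may differ. The records and column names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy after splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical cases: two input columns and one predictable column.
rows = [
    {"roof": "gable", "shutters": "no", "damage": "high"},
    {"roof": "gable", "shutters": "no", "damage": "high"},
    {"roof": "hip", "shutters": "no", "damage": "high"},
    {"roof": "hip", "shutters": "yes", "damage": "low"},
    {"roof": "gable", "shutters": "yes", "damage": "low"},
    {"roof": "hip", "shutters": "yes", "damage": "low"},
]

gains = {a: information_gain(rows, a, "damage") for a in ("roof", "shutters")}
best = max(gains, key=gains.get)
print(gains, "-> split on", best)
```

A full tree would repeat this choice recursively on each resulting subset until the subsets are pure or too small to split.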

Association Rules

The Association Rules algorithm is a mining mechanism for finding correlations between different attributes in a dataset. The most common application of this kind of algorithm is creating association rules, which can be used in a forecast analysis. Association models are built on datasets that contain identifiers both for individual cases and for the items that the cases contain.
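
The two standard measures behind an association rule, support and confidence, can be sketched as follows. The dataset mirrors the structure described above (a case identifier mapped to the items the case contains), but the cases, item names, and helper functions are hypothetical.

```python
def support(cases, items):
    """Fraction of cases that contain every item in `items`."""
    items = set(items)
    return sum(items <= contents for contents in cases.values()) / len(cases)


def confidence(cases, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(cases, both) / support(cases, antecedent)


# Hypothetical cases keyed by case identifier.
cases = {
    "house_1": {"gable_roof", "no_shutters", "roof_damage"},
    "house_2": {"gable_roof", "shutters", "roof_damage"},
    "house_3": {"hip_roof", "no_shutters", "window_damage"},
    "house_4": {"gable_roof", "no_shutters", "roof_damage", "window_damage"},
    "house_5": {"hip_roof", "shutters"},
}

rule_support = support(cases, {"gable_roof", "roof_damage"})
rule_confidence = confidence(cases, {"gable_roof"}, {"roof_damage"})
print(f"gable_roof -> roof_damage: support={rule_support:.2f}, "
      f"confidence={rule_confidence:.2f}")
```

An association algorithm searches for item sets whose support exceeds a chosen threshold and then keeps the rules whose confidence is high enough to be useful.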

Naïve Bayes

This algorithm calculates the conditional probability between input and predictable columns, and assumes that the input columns are independent of one another given the predictable column. It is based upon the simplifying hypothesis that when you evaluate column A as a predictor for target columns B1, B2, and so on, you can disregard dependencies between these target columns.
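
A minimal sketch of the calculation for categorical columns follows. It multiplies the class prior by one conditional probability per input column, which is exactly where the independence assumption enters. Laplace (add-one) smoothing, the helper names, and the records are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(rows, target):
    """Count classes and, per (input column, class), the observed values."""
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)
    values = defaultdict(set)
    for r in rows:
        for attr, val in r.items():
            if attr != target:
                value_counts[(attr, r[target])][val] += 1
                values[attr].add(val)
    return class_counts, value_counts, values


def posterior(model, case):
    """Normalized class probabilities for one case (dict of input columns)."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        p = n / total  # prior
        for attr, val in case.items():
            # Add-one smoothing keeps unseen values from zeroing the product.
            p *= (value_counts[(attr, cls)][val] + 1) / (n + len(values[attr]))
        scores[cls] = p
    norm = sum(scores.values())
    return {cls: p / norm for cls, p in scores.items()}


# Hypothetical training cases.
rows = [
    {"roof": "gable", "shutters": "no", "damage": "high"},
    {"roof": "gable", "shutters": "no", "damage": "high"},
    {"roof": "hip", "shutters": "no", "damage": "high"},
    {"roof": "hip", "shutters": "yes", "damage": "low"},
    {"roof": "gable", "shutters": "yes", "damage": "low"},
    {"roof": "hip", "shutters": "yes", "damage": "low"},
]

result = posterior(train(rows, "damage"), {"roof": "gable", "shutters": "no"})
print(result)
```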

Clustering

The Microsoft Clustering algorithm is a segmentation algorithm that uses iterative techniques to group data cases into clusters that contain similar characteristics. These groupings are useful for exploring data, identifying anomalies in the data, and creating predictions. Clustering models identify relationships in a dataset that might not be derived logically through normal observation. The Microsoft Clustering algorithm first identifies relationships in a dataset and generates a series of clusters based on those relationships. A scatter plot is a useful way to visually represent how the algorithm groups data, as shown in the following diagram.

Figure 4. Cluster groups diagram.
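
The iterative grouping can be illustrated with k-means, which alternates between assigning each case to its nearest centroid and moving each centroid to the mean of its members. K-means is used here as one well-known iterative technique and is an assumption; it is not necessarily the method SSAS applies by default. The points (wind speed in mph, damage in percent) and starting centroids are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means on tuples of numbers, starting from given centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical cases: (wind speed in mph, damage in percent).
points = [(90, 5), (95, 8), (100, 10), (140, 45), (150, 55), (155, 60)]
centroids, clusters = kmeans(points, centroids=[(90, 5), (155, 60)])

for center, members in zip(centroids, clusters):
    print(center, members)
```

In practice the columns would be rescaled first so that no single attribute dominates the distance.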

Neural Network

The Microsoft Neural Network algorithm creates classification and regression mining models by constructing a Multilayer Perceptron network of neurons. In this Multilayer Perceptron network, each neuron receives one or more inputs and produces one or more identical outputs. Similar to the Microsoft Decision Trees algorithm, the Neural Network algorithm calculates probabilities for each possible state of the input attribute when given each state of the predictable attribute. These probabilities can be used to predict an outcome of the predicted attribute, based on the input attributes.
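
A forward pass through a small Multilayer Perceptron shows how input attributes are turned into a probability for each state of the predictable attribute. This is a sketch under assumptions: one hidden layer with tanh activation, a softmax output, and hand-picked untrained weights. None of these choices is taken from the presentation or from the SSAS implementation.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer (tanh) followed by a softmax over output states."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two scaled inputs (for example wind intensity and building height),
# two hidden neurons, three output states. All weights are arbitrary.
x = [0.8, 0.3]
w_hidden = [[1.5, -0.5], [-1.0, 2.0]]
b_hidden = [0.0, 0.1]
w_out = [[-1.0, 0.5], [0.2, 0.2], [1.2, -0.7]]
b_out = [0.0, 0.0, 0.0]

probs = forward(x, w_hidden, b_hidden, w_out, b_out)
for state, p in zip(("low", "moderate", "extreme"), probs):
    print(f"{state}: {p:.3f}")
```

Training would adjust the weights so that these probabilities match the damage states observed in historical cases.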

Analysis

The relationship between common sizes and geometric shapes of concrete, steel, timber and masonry residential buildings, gravity and lateral resistive systems, the intensity of the storm, and the degree of damage can be analyzed using data mining techniques to provide a supportive damage prediction system. The output prediction vector is based on the damage categories specified by FEMA320. These categories are modified to include the distinction between envelope and structural damages. In contrast to FEMA320, ten damage categories are proposed in this study:

(1)- Minimal: No real structural damage is done. Minor building envelope damages may occur (less than 5%).

(2)- Low Moderate: 6%-10% of roof and other envelope components are damaged.

(3)- Moderate: 11%-20% of roof and other envelope components are damaged.

(4)- High Moderate: More than 20% of roof and other envelope components are damaged.

(5)- Low Extensive: Less than 10% structural damage is done, along with damages to the envelope.

(6)- Extensive: Less than 20% structural damage is done, along with damages to the envelope.

(7)- Low Extreme: Extensive damage is done to many envelope components (21%-30%), accompanied by structural damages (21%-30%).

(8)- Extreme: Extensive damage is done to many envelope components (31%-50%), accompanied by structural damages (31%-50%) that may result in complete building failure.

(9)- Very Extreme: Extensive damage is done to many envelope components (51%-60%), accompanied by structural damages (51%-60%) that may result in complete building failure.

(10)- Catastrophic: Envelope damage is extensive and widespread (> 60%). Structural damage is considerable and there is complete to near-complete building failure.
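
The ten categories can be turned into a label for training data with a small lookup function. The sketch below rests on assumptions the slides do not settle: when structural damage is present, the structural percentage alone selects the category; boundary values that fall between the stated ranges (for example between 5% and 6%) are assigned to the lower category; and the function name and signature are hypothetical.

```python
CATEGORY_NAMES = {
    1: "Minimal", 2: "Low Moderate", 3: "Moderate", 4: "High Moderate",
    5: "Low Extensive", 6: "Extensive", 7: "Low Extreme", 8: "Extreme",
    9: "Very Extreme", 10: "Catastrophic",
}


def damage_category(envelope_pct, structural_pct=0.0):
    """Map damage percentages to one of the ten proposed categories.

    Assumption: categories 1-4 apply only when there is no structural
    damage; otherwise the structural percentage governs.
    """
    if structural_pct <= 0:
        if envelope_pct <= 5:
            return 1
        if envelope_pct <= 10:
            return 2
        if envelope_pct <= 20:
            return 3
        return 4
    if structural_pct < 10:
        return 5
    if structural_pct <= 20:
        return 6
    if structural_pct <= 30:
        return 7
    if structural_pct <= 50:
        return 8
    if structural_pct <= 60:
        return 9
    return 10


for envelope, structural in [(3, 0), (15, 0), (30, 5), (40, 40), (70, 70)]:
    number = damage_category(envelope, structural)
    print(envelope, structural, "->", number, CATEGORY_NAMES[number])
```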

CONCLUSIONS

The relationship between windstorms and the building damages cited raises many uncertainties about current building design and construction practices. The application of data mining techniques provides a supportive tool to handle uncertainty and to discover hidden relationships and rules that assist in classifying, predicting and associating different building damages and windstorm patterns. The system could also be instrumental in updating Building Codes and Standards. From the severity of the destruction shown in recent hurricanes and tornadoes, it is apparent that Building Codes and Standards need to re-address windstorm resistive systems.