Data Warehousing and Knowledge Discovery

A Min Tjoa, Juan Trujillo (Eds.): Data Warehousing and Knowledge Discovery. 8th International Conference, DaWaK 2006, Krakow, Poland, September 4-8, 2006, Proceedings. Springer




Table of Contents

ETL Processing

ETLDiff: A Semi-automatic Framework for Regression Test of ETL Software

Christian Thomsen, Torben Bach Pedersen

Applying Transformations to Model Driven Data Warehouses

Jose-Norberto Mazón, Jesus Pardillo, Juan Trujillo

Bulk Loading a Linear Hash File

Davood Rafiei, Cheng Hu

Materialized View

Dynamic View Selection for OLAP

Michael Lawrence, Andrew Rau-Chaplin

Preview: Optimizing View Materialization Cost in Spatial Data Warehouses

Songmei Yu, Vijayalakshmi Atluri, Nabil Adam

Preprocessing for Fast Refreshing Materialized Views in DB2

Wugang Xu, Calisto Zuzarte, Dimitri Theodoratos, Wenbin Ma

Multidimensional Design

A Multiversion-Based Multidimensional Model

Franck Ravat, Olivier Teste, Gilles Zurfluh

Towards Multidimensional Requirement Design

Estella Annoni, Franck Ravat, Olivier Teste, Gilles Zurfluh

Multidimensional Design by Examples

Oscar Romero, Alberto Abello

OLAP and Multidimensional Model

Extending Visual OLAP for Handling Irregular Dimensional Hierarchies

Svetlana Mansmann, Marc H. Scholl


A Hierarchy-Driven Compression Technique for Advanced OLAP Visualization of Multidimensional Data Cubes

Alfredo Cuzzocrea, Domenico Saccà, Paolo Serafino

Analysing Multi-dimensional Data Across Autonomous Data Warehouses

Stefan Berger, Michael Schrefl

What Time Is It in the Data Warehouse?

Stefano Rizzi, Matteo Golfarelli

Cubes Processing

Computing Iceberg Quotient Cubes with Bounding

Xiuzhen Zhang, Pauline Lienhua Chou, Kotagiri Ramamohanarao

An Effective Algorithm to Extract Dense Sub-cubes from a Large Sparse Cube

Seok-Lyong Lee

On the Computation of Maximal-Correlated Cuboids Cells

Ronnie Alves, Orlando Belo

Data Warehouse Applications

Warehousing Dynamic XML Documents

Laura Irina Rusu, Wenny Rahayu, David Taniar

Integrating Different Grain Levels in a Medical Data Warehouse Federation

Marko Banek, A Min Tjoa, Nevena Stolba

A Versioning Management Model for Ontology-Based Data Warehouses

Dung Nguyen Xuan, Ladjel Bellatreche, Guy Pierra

Data Warehouses in Grids with High QoS

Rogerio Luis de Carvalho Costa, Pedro Furtado

Mining Techniques (1)

Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods

Jerzy Blaszczynski, Krzysztof Dembczynski, Wojciech Kotlowski, Mariusz Pawlowski

Efficient Mining of Dissociation Rules

Mikolaj Morzy


Optimized Rule Mining Through a Unified Framework for Interestingness Measures

Celine Hebert, Bruno Cremilleux

An Information-Theoretic Framework for Process Structure and Data Mining

Antonio D. Chiaravalloti, Gianluigi Greco, Antonella Guzzo, Luigi Pontieri

Mining Techniques (2)

Mixed Decision Trees: An Evolutionary Approach

Marek Krętowski, Marek Grzes

ITER: An Algorithm for Predictive Regression Rule Extraction

Johan Huysmans, Bart Baesens, Jan Vanthienen

COBRA: Closed Sequential Pattern Mining Using Bi-phase Reduction Approach

Kuo-Yu Huang, Chia-Hui Chang, Jiun-Hung Tung, Cheng-Tao Ho

Frequent Itemsets

A Greedy Approach to Concurrent Processing of Frequent Itemset Queries

Pawel Boinski, Marek Wojciechowski, Maciej Zakrzewicz

Two New Techniques for Hiding Sensitive Itemsets and Their Empirical Evaluation

Ahmed HajYasien, Vladimir Estivill-Castro

EStream: Online Mining of Frequent Sets with Precise Error Guarantee

Xuan Hong Dang, Wee-Keong Ng, Kok-Leong Ong

Mining Data Streams

Granularity Adaptive Density Estimation and on Demand Clustering of Concept-Drifting Data Streams

Weiheng Zhu, Jian Pei, Jian Yin, Yihuang Xie

Classification of Hidden Network Streams

Matthew Gebski, Alex Penev, Raymond K. Wong

Adaptive Load Shedding for Mining Frequent Patterns from Data Streams

Xuan Hong Dang, Wee-Keong Ng, Kok-Leong Ong


An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams

Jia-Ling Koh, Shu-Ning Shin

Ontology-Based Mining

Learning Classifiers from Distributed, Ontology-Extended Data Sources

Doina Caragea, Jun Zhang, Jyotishman Pathak, Vasant Honavar

A Coherent Biomedical Literature Clustering and Summarization Approach Through Ontology-Enriched Graphical Representations

Illhoi Yoo, Xiaohua Hu, Il-Yeol Song

Automatic Extraction for Creating a Lexical Repository of Abbreviations in the Biomedical Literature

Min Song, Il-Yeol Song, Ki Jung Lee

Clustering

Priority-Based k-Anonymity Accomplished by Weighted Generalisation Structures

Konrad Stark, Johann Eder, Kurt Zatloukal

Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures

Jiuyong Li, Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Jian Pei

Calculation of Density-Based Clustering Parameters Supported with Distributed Processing

Marcin Gorawski, Rafal Malczok

Cluster-Based Sampling Approaches to Imbalanced Data Distributions

Show-Jane Yen, Yue-Shi Lee

Advanced Mining Techniques

Efficient Mining of Large Maximal Bicliques

Guimei Liu, Kelvin S.H. Sim, Jinyan Li

Automatic Image Annotation by Mining the Web

Zhiguo Gong, Qian Liu, Jingbai Zhang

Privacy Preserving Spatio-temporal Clustering on Horizontally Partitioned Data

Ali Inan, Yücel Saygın


Association Rules

Discovering Semantic Sibling Associations from Web Documents with XTREEM-SP

Marko Brunzel, Myra Spiliopoulou

Difference Detection Between Two Contrast Sets

Hui-jing Huang, Yongsong Qin, Xiaofeng Zhu, Jilian Zhang, Shichao Zhang

EGEA: A New Hybrid Approach Towards Extracting Reduced Generic Association Rule Set (Application to AML Blood Cancer Therapy)

M.A. Esseghir, G. Gasmi, Sadok Ben Yahia, Y. Slimani

Miscellaneous Applications

AISS: An Index for Non-timestamped Set Subsequence Queries

Witold Andrzejewski, Tadeusz Morzy

A Method for Feature Selection on Microarray Data Using Support Vector Machine

Xiao Bing Huang, Jian Tang

Providing Persistence for Sensor Data Streams by Remote WAL

Hideyuki Kawashima, Michita Imai, Yuichiro Anzai

Classification

Support Vector Machine Approach for Fast Classification

Keivan Kianmehr, Reda Alhajj

Document Representations for Classification of Short Web-Page Descriptions

Milos Radovanovic, Mirjana Ivanovic

GARC: A New Associative Classification Approach

Ines Bouzouita, Samir Elloumi, Sadok Ben Yahia

Conceptual Modeling for Classification Mining in Data Warehouses

Jose Zubcoff, Juan Trujillo

Author Index