A Min Tjoa Juan Trujillo (Eds.)
Data Warehousing and Knowledge Discovery
8th International Conference, DaWaK 2006 Krakow, Poland, September 4-8, 2006 Proceedings
Springer
Table of Contents
ETL Processing
ETLDiff: A Semi-automatic Framework for Regression Test of ETL Software 1
Christian Thomsen, Torben Bach Pedersen
Applying Transformations to Model Driven Data Warehouses 13
Jose-Norberto Mazón, Jesus Pardillo, Juan Trujillo
Bulk Loading a Linear Hash File 23
Davood Rafiei, Cheng Hu
Materialized View
Dynamic View Selection for OLAP 33
Michael Lawrence, Andrew Rau-Chaplin
Preview: Optimizing View Materialization Cost in Spatial Data Warehouses 45
Songmei Yu, Vijayalakshmi Atluri, Nabil Adam
Preprocessing for Fast Refreshing Materialized Views in DB2 55
Wugang Xu, Calisto Zuzarte, Dimitri Theodoratos, Wenbin Ma
Multidimensional Design
A Multiversion-Based Multidimensional Model 65
Franck Ravat, Olivier Teste, Gilles Zurfluh
Towards Multidimensional Requirement Design 75
Estella Annoni, Franck Ravat, Olivier Teste, Gilles Zurfluh
Multidimensional Design by Examples 85
Oscar Romero, Alberto Abelló
OLAP and Multidimensional Model
Extending Visual OLAP for Handling Irregular Dimensional Hierarchies 95
Svetlana Mansmann, Marc H. Scholl
A Hierarchy-Driven Compression Technique for Advanced OLAP Visualization of Multidimensional Data Cubes
Alfredo Cuzzocrea, Domenico Saccà, Paolo Serafino
Analysing Multi-dimensional Data Across Autonomous Data Warehouses
Stefan Berger, Michael Schrefl
What Time Is It in the Data Warehouse? 134
Stefano Rizzi, Matteo Golfarelli
Cubes Processing
Computing Iceberg Quotient Cubes with Bounding 145
Xiuzhen Zhang, Pauline Lienhua Chou, Kotagiri Ramamohanarao
An Effective Algorithm to Extract Dense Sub-cubes from a Large Sparse Cube
Seok-Lyong Lee
On the Computation of Maximal-Correlated Cuboids Cells 165
Ronnie Alves, Orlando Belo
Data Warehouse Applications
Warehousing Dynamic XML Documents 175
Laura Irina Rusu, Wenny Rahayu, David Taniar
Integrating Different Grain Levels in a Medical Data Warehouse Federation
Marko Banek, A Min Tjoa, Nevena Stolba
A Versioning Management Model for Ontology-Based Data Warehouses
Dung Nguyen Xuan, Ladjel Bellatreche, Guy Pierra
Data Warehouses in Grids with High QoS 207
Rogerio Luis de Carvalho Costa, Pedro Furtado
Mining Techniques (1)
Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods
Jerzy Błaszczyński, Krzysztof Dembczyński, Wojciech Kotłowski, Mariusz Pawłowski
Efficient Mining of Dissociation Rules 228
Mikolaj Morzy
Optimized Rule Mining Through a Unified Framework for Interestingness Measures 238
Celine Hebert, Bruno Cremilleux
An Information-Theoretic Framework for Process Structure and Data Mining 248
Antonio D. Chiaravalloti, Gianluigi Greco, Antonella Guzzo, Luigi Pontieri
Mining Techniques (2)
Mixed Decision Trees: An Evolutionary Approach 260
Marek Krętowski, Marek Grześ
ITER: An Algorithm for Predictive Regression Rule Extraction 270
Johan Huysmans, Bart Baesens, Jan Vanthienen
COBRA: Closed Sequential Pattern Mining Using Bi-phase Reduction Approach 280
Kuo-Yu Huang, Chia-Hui Chang, Jiun-Hung Tung, Cheng-Tao Ho
Frequent Itemsets
A Greedy Approach to Concurrent Processing of Frequent Itemset Queries
Pawel Boinski, Marek Wojciechowski, Maciej Zakrzewicz
Two New Techniques for Hiding Sensitive Itemsets and Their Empirical Evaluation 302
Ahmed HajYasien, Vladimir Estivill-Castro
EStream: Online Mining of Frequent Sets with Precise Error Guarantee 312
Xuan Hong Dang, Wee-Keong Ng, Kok-Leong Ong
Mining Data Streams
Granularity Adaptive Density Estimation and on Demand Clustering of Concept-Drifting Data Streams 322
Weiheng Zhu, Jian Pei, Jian Yin, Yihuang Xie
Classification of Hidden Network Streams 332
Matthew Gebski, Alex Penev, Raymond K. Wong
Adaptive Load Shedding for Mining Frequent Patterns from Data Streams 342
Xuan Hong Dang, Wee-Keong Ng, Kok-Leong Ong
An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams 352
Jia-Ling Koh, Shu-Ning Shin
Ontology-Based Mining
Learning Classifiers from Distributed, Ontology-Extended Data Sources
Doina Caragea, Jun Zhang, Jyotishman Pathak, Vasant Honavar
A Coherent Biomedical Literature Clustering and Summarization Approach Through Ontology-Enriched Graphical Representations 374
Illhoi Yoo, Xiaohua Hu, Il-Yeol Song
Automatic Extraction for Creating a Lexical Repository of Abbreviations in the Biomedical Literature 384
Min Song, Il-Yeol Song, Ki Jung Lee
Clustering
Priority-Based k-Anonymity Accomplished by Weighted Generalisation Structures 394
Konrad Stark, Johann Eder, Kurt Zatloukal
Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures 495
Jiuyong Li, Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Jian Pei
Calculation of Density-Based Clustering Parameters Supported with Distributed Processing
Marcin Gorawski, Rafal Malczok
Cluster-Based Sampling Approaches to Imbalanced Data Distributions 427
Show-Jane Yen, Yue-Shi Lee
Advanced Mining Techniques
Efficient Mining of Large Maximal Bicliques 437
Guimei Liu, Kelvin S.H. Sim, Jinyan Li
Automatic Image Annotation by Mining the Web 449
Zhiguo Gong, Qian Liu, Jingbai Zhang
Privacy Preserving Spatio-temporal Clustering on Horizontally Partitioned Data 459
Ali Inan, Yücel Saygın
Association Rules
Discovering Semantic Sibling Associations from Web Documents with XTREEM-SP 469
Marko Brunzel, Myra Spiliopoulou
Difference Detection Between Two Contrast Sets 481
Hui-jing Huang, Yongsong Qin, Xiaofeng Zhu, Jilian Zhang, Shichao Zhang
EGEA: A New Hybrid Approach Towards Extracting Reduced Generic Association Rule Set (Application to AML Blood Cancer Therapy) 491
M.A. Esseghir, G. Gasmi, Sadok Ben Yahia, Y. Slimani
Miscellaneous Applications
AISS: An Index for Non-timestamped Set Subsequence Queries 503
Witold Andrzejewski, Tadeusz Morzy
A Method for Feature Selection on Microarray Data Using Support Vector Machine 513
Xiao Bing Huang, Man Tang
Providing Persistence for Sensor Data Streams by Remote WAL 524
Hideyuki Kawashima, Michita Imai, Yuichiro Anzai
Classification
Support Vector Machine Approach for Fast Classification 534
Keivan Kianmehr, Reda Alhajj
Document Representations for Classification of Short Web-Page Descriptions 544
Milos Radovanovic, Mirjana Ivanovic
GARC: A New Associative Classification Approach 554
Ines Bouzouita, Samir Elloumi, Sadok Ben Yahia
Conceptual Modeling for Classification Mining in Data Warehouses 566
Jose Zubcoff, Juan Trujillo
Author Index 577