PROGRAM
1st INTERNATIONAL CONFERENCE on
ELECTRONICS, COMPUTERS and ARTIFICIAL INTELLIGENCE
ECAI 2005
1-2 July 2005
hosted by
UNIVERSITY OF PITESTI
DEPARTMENT OF ELECTRONICS AND COMPUTERS
Organizers:
University of Pitesti:
- Department of Electronics and Computers
- Research Centre for Systems and Processes' Modelling and Simulation
- Medical College
Co-organizers:
Politehnica University of Bucharest
- Faculty of Electronics and Telecommunications
- Faculty of Automatics and Computers
Romanian Academy - Section for Theoretical Informatics, Iaşi branch
National Institute of Inventors, Iaşi
Nuclear Research Institute, Piteşti
Romanian Medical College - Argeş branch
Romanian Medical Association - Argeş branch
Honorary Chair: Takeshi Yamakawa, Japan
General Chair: Emil Sofron
General Co-chairs: Gheorghe Serban and Ilie Popa
Program Chair: Nicu Bizon
Local Arrangements Chair: Ion Tutanescu
International Scientific Committee
Horia N. Teodorescu (Romania)
Alexandru Şerbănescu (Romania)
Constantin Negoita (USA)
Gheorghe Secară (Romania)
Gheorghe Barbu (Romania)
Ioan Dumitrache (Romania)
Maaruf Ali (UK)
Marin Drăgulinescu (Romania)
Gheorghe Şerban (Romania)
Ognyan Manolov (Bulgaria)
Emil Sofron (Romania)
Harold Szu (USA)
Rodica Strungaru (Romania)
Junzo Watada (Japan)
Ilie Popa (Romania)
Lucien Dascalescu (France)
Teodor Petrescu (Romania)
Dumitru Popescu (Romania)
Paul Svasta (Romania)
Eugene Roventa (Canada)
Stoichescu Alexandru (Romania)
Silviu Ioniţă (Romania)
Ioan Liţă (Romania)
Ion Sima (Romania)
Nicu Bizon (Romania)
Ion Tutănescu (Romania)
Tiberiu Stănescu (Romania)
Mihai Man (Romania)
Ştefănescu Mircea (Romania)
Timetable and Program
Date | Time | Session or activity | Location
June 30 | 14:00-20:00 | Registration | “T” building, 4th floor, secretary room
June 30 | 20:00-22:00 | Welcome cocktail | “T” building, 4th floor, protocol room
July 1 | 9:00-10:00 | Registration | “T” building, conference desk
July 1 | 10:00-10:15 | Opening Ceremony | “T” building, T308
July 1 | 10:15-11:00 | Plenary Session I | “T” building, T308
July 1 | 11:00-11:15 | Coffee Break
July 1 | 11:15-12:30 | Artificial Intelligence I | “T” building, T308
July 1 | 11:15-12:30 | Research and Educational multimedia applications | “T” building, T307
July 1 | 12:30-13:30 | Student presentation and exhibition | “T” building, Laboratories
July 1 | 13:30-15:30 | Lunch Break | “University House” restaurant
July 1 | 15:30-16:45 | Communications I | “T” building, T308
July 1 | 15:30-16:45 | Bio-medical applications I | “T” building, T307
July 1 | 16:45-17:00 | Coffee Break
July 1 | 17:00-19:00 | Electronic circuits and equipments I | “T” building, T308
July 1 | 17:00-19:00 | Software and computer applications I | “T” building, T307
July 1 | 20:00 | Dinner | “University House” restaurant
July 2 | 9:00-9:45 | Plenary Session II | “T” building, T308
July 2 | 9:45-10:00 | Coffee Break
July 2 | 10:00-11:15 | Artificial Intelligence II | “T” building, T308
July 2 | 10:00-11:15 | Bio-medical applications II | “T” building, T307
July 2 | 11:15-13:15 | Electronic circuits and equipments II | “T” building, T308
July 2 | 11:15-13:15 | Software and computer applications II | “T” building, T307
July 2 | 11:15-13:15 | Communications II | “T” building, T309
July 2 | 13:15-13:30 | Closing remarks and Coffee Break
July 2 | 13:30-15:00 | Lunch Break | “University House” restaurant
July 2 | 15:00-21:00 | Excursion in Argeş county
PROGRAM
June 30, 14:00-20:00 Registration of the participants “T” building, 4th floor, secretary room
June 30, 20:00-22:00 Welcome cocktail “T” building, 4th floor, protocol room
July 1, 9:00-10:00 Registration of the participants “T” building, conference desk
OPENING CEREMONY July 1, 10:00-10:15 “T” building, T308
Welcome and Opening Addresses
PLENARY SESSION I and EXHIBITION July 1, 10:15-11:00 “T” building, T308
Chairs: Harold SZU, Emil SOFRON, Ilie POPA
Fuzzy multi-criteria decision making method for diagnostic selection
Plamena ANDREEVA, Ognyan MANOLOV Bulgarian Academy of Sciences, Bulgaria
Internet security: using hacker techniques to improve information systems
Philip FOGARTY, Ali MAARUF Department of Electronic Engineering, Oxford Brookes University, Oxford, UK
Telemonitoring system for cardiological diseases
Sorin PUSCOCI, Florin SERBANESCU INSCC Bucharest, Quickweb INFO Bucharest
Bluetooth: profiles and applications
Ion BOGDAN Technical University „Gheorghe Asachi”
PLENARY SESSION II and EXHIBITION July 2, 9:00-9:45 “T” building, T308
Chairs: Junzo WATADA, Teodor PETRESCU, Ion BOGDAN
On image smoothing filters - new approaches
Horia – Nicolai TEODORESCU Technical University „Gheorghe Asachi”
Automatic speech recognition user interface
Doru MUNTEANU Military Technical Academy
REGULAR AND INVITED SESSIONS
I. Artificial Intelligence I (July 1, “T” building, T308, 11:15-12:30)
Co-Chairs: Ognyan MANOLOV, Nicu BIZON
Postmodernism and fuzzy set theory
Constantin Virgil NEGOITA City University of New York, USA
Neurofuzzy networks in motor imagery
Stefan COSOSCHI, Alexandru UNGUREANU, Rodica STRUNGARU “Politehnica” University of Bucharest
Knock detection by time analysis of the vibration signal using a neural
Dan LAZARESCU, Mihaela UNGUREANU, Vasile LAZARESCU “Politehnica” University of Bucharest
Hysteretic fuzzy control of the boost converter
Nicu BIZON, Mihai OPROESCU University of Pitesti
II. Artificial Intelligence II (July 2, “T” building, T308, 10:00-11:15)
Co-Chairs: Horia – Nicolai TEODORESCU, Silviu IONITA
Clocked hysteretic fuzzy control of the boost converter
Nicu BIZON, Mihai OPROESCU University of Pitesti
Correlation between the information access time and the benefits the commercial players obtain, in a fuzzy network model
Horia – Nicolai TEODORESCU, Marius ZBANCIOC Technical University „Gheorghe Asachi”
Controlling the trajectories for mobile robots with neural networks
Robert BELOIU, Mariana IORGULESCU, Octavian DUMITRU, Petre COMBEI, Adrinel STANCESCU University of Pitesti
Neural networks and SPAM detection
Constantin Alin MIROIU, Florin SMARANDA Draexlmaier Group, University of Pitesti
III. Research and Educational multimedia applications (July 1, “T” building, T307, 11:15-12:30)
Co-Chairs: Gheorghe SERBAN, Eugene ROVENTA, Mircea STEFANESCU
Technological transfer and business incubation processes - international and national perspectives
Alexandru MARIN, Claudia-Roberta VISAN “Politehnica” University of Bucharest
Case study: hybrid, internet-based medical informatics education course
Mircea STEFANESCU Clinical Hospital “Coltea”
The Copyright in Internet Media
Dumitru SCHEIANU, Ilie IORADCHE, Adriana GIJU University of Pitesti, Gr. Sc. Ind. Energetic Craiova
Student support in distance learning
Luminiţa ŞERBĂNESCU University of Pitesti
Internet-based distance education
Luminiţa ŞERBĂNESCU University of Pitesti
Multimedia data management by object-oriented semantic tool
Florentina Magda ENESCU, Marian Marius POPESCU University of Pitesti, STS-Valcea
IV. Electronic circuits and equipments I (July 1, “T” building, T308, 17:00-19:00)
Co-Chairs: Lucien DASCALESCU, Horia ANDREI, Ioan LITA
Accelerometric signal processing for uncertain environments investigation
Silviu IONITA, Horia – Nicolai TEODORESCU, Emil SOFRON, Vasile Gabriel IANA University of Pitesti, Technical University „Gheorghe Asachi”
The minimum dissipated power in stationary regime
Horia ANDREI, Fanica SPINEI, Costin CEPISCA, Valentin DOGARU Valahia University of Targoviste, Politehnica University of Bucharest
Control structure of a walking robot without degrees of freedom
Anca PETRISOR University of Craiova
Consideration on the evaluation of the reliability parameters for electronic equipments
Gheorghe VIERU Institute for Nuclear Research Pitesti
Electric drive system with dc motor with closed control speed loop
Robert BELOIU, Mariana IORGULESCU, Octavian DUMITRU, Petre COMBEI, Adrinel STANCESCU University of Piteşti
Implementation of a RISC architecture in a reduced complexity FPGA
Vasile Gabriel IANA, Gheorghe SERBAN University of Piteşti
Circuit for Low Inductance Measurement
Marian RADUCU, Silviu IONITA University of Piteşti
V. Electronic circuits and equipments II (July 2, “T” building, T308, 11:15-13:15)
Co-Chairs: Marin DRAGULINESCU, Gheorghe GAVRILOAIA, Octavian DUMITRU
Utilization of radar polarimetry in environment analysis
Gheorghe GAVRILOAIA, O. SUCIU University of Piteşti
Performance optimization for fuel vehicle
Gheorghe GAVRILOAIA, M. CIUPERCEANU University of Piteşti
About LC oscillators
Luiza GRIGORESCU Dunarea de Jos University
Oscillator with magnetic coupling
Luiza GRIGORESCU Dunarea de Jos University
Electric drive system with dc motor with closed current loop and open speed loop
Robert BELOIU, Mariana IORGULESCU, Octavian DUMITRU, Petre COMBEI, Adrinel STANCESCU University of Piteşti
Induction motors faults diagnosis based on neural networks
Mariana IORGULESCU, Robert BELOIU University of Piteşti
Induction motors models and faults simulation
Mariana IORGULESCU University of Piteşti
Implementing multipliers for computational intensive applications in reconfigurable hardware
Valeriu Manuel IONESCU, Petre ANGHELESCU, Laurentiu IONESCU, Vasile Gabriel IANA University of Piteşti
LabVIEW Application for a Power Line Communication system
Ioan LIŢĂ, Daniel Alexandru VIŞAN, Ion Bogdan CIOC, George ANGELESCU University of Pitesti
Spherical Coordonating Command of Stepper Motors Using LabVIEW
Ioan LIŢĂ, Ion Bogdan CIOC, Daniel Alexandru VIŞAN, Alexandru DOGARU University of Pitesti
VI. Communications I (July 1, “T” building, T308, 15:30-16:45)
Co-Chairs: Ali MAARUF, Ion TUTANESCU
Beamforming techniques in wireless networks
Ion BOGDAN Technical University „Gheorghe Asachi”
Analysis of NEWPRED Implementation for MPEG-4 over MTS Moving Propagation Environment
Bhumin H. PATHAK, Geoff CHILDS, Ali MAARUF Communications Research Group, Oxford Brookes University, Oxford UK
Speech synthesis methods – formant synthesis
Valentin BOŞCĂNICI, Constantin ANTON, Paul ROŞIANU, Dumitru BREBEANU Special Telecommunication Service - Argeş
The EZW algorithm in wavelet-based image compression
Dumitru BREBEANU, Mariana JURIAN, Constantin ANTON, Valentin BOSCANICI Special Telecommunication Service – Argeş, University of Pitesti
VII. Communications II (July 2, “T” building, T309, 11:15-13:15)
Co-Chairs: Ion SIMA, Mariana JURIAN
Design of tunable filter based on distributed MEMS resonators
Stefan SIMION Military Technical Academy
An efficient method to design the broad-band equivalent antennas
Ion T. SIMA University of Pitesti
Watermarking - system for copyright protection
Paul ROŞIANU, Constantin ANTON, Dumitru BREBEANU, Valentin BOŞCĂNICI Special Telecommunication Service - Argeş
Digital to analog converter based on sigma-delta modulation with reprogrammable structures
Vasile Gabriel IANA, Petre ANGELESCU, Cosmin IVAN, Emil SOFRON, Serban GHEORGHE, Alexandru SERBANESCU University of Piteşti, Military Technical Academy
Chaos router: increasing performance and stability of computers network
Daniel MURUGĂ, Alexandru ŞERBĂNESCU University of Piteşti, Military Technical Academy
Embedded system for measure and wireless communication in the band 433 MHz
Alin MAZĂRE, Ionel BOSTAN, Laurenţiu IONESCU University of Pitesti
An automatic analog modulation recognition algorithm based on a decision tree approach
Viorel GRIGORE, Loredan MOLOCENIUC, Toni-Cristian VOICULESCU University of Piteşti
DSL
Toni-Cristian VOICULESCU, Loredan MOLOCENIUC, Viorel GRIGORE Special Telecommunication Service – Argeş, University of Pitesti
VIII. Bio-medical applications I (July 1, “T” building, T308, 15:30-16:45)
Co-Chairs: Rodica STRUNGARU, Harold SZU, Sever PASCA
Image information mining for medical diagnosis help
M. DATCU, A. COLAPICCHIONI, Rodica STRUNGARU, Clara PASQUALI, R. MURRI, S. PASCA “Politehnica” University of Bucharest, Advanced Computer Systems, Rome, Italy
Heart rate variability analysis using time-frequency analysis
Alexandru UNGUREANU, Mihaela UNGUREANU, Rodica STRUNGARU “Politehnica” University of Bucharest
E-health applications for mobile biosignal analysis
Liviu MORARU, Rodica STRUNGARU, Mihaela UNGUREANU “Politehnica” University of Bucharest
Content-based image retrieval in medical image databases
Mihaela UNGUREANU, Rodica STRUNGARU, Sever PASCA “Politehnica” University of Bucharest
Application of surface fractal analysis in medical images
Elena MIRCEA, Florin MUNTEANU “Politehnica” University of Bucharest
The effect of electrical stimulation on the persistent vegetative coma patients
Zabach BAREAA, Sever PASCA, Mihaela UNGUREANU, Jean CIUREA “Politehnica” University of Bucharest, Bagdazar Hospital, Bucharest
IX. Bio-medical applications II (July 2, “T” building, T309, 10:00-11:15)
Co-Chairs: Horia – Nicolai TEODORESCU, Mihaela UNGUREANU
Computational analysis of a new image filter for echography
Horia – Nicolai TEODORESCU, Raluca GANEA, Constantin MOCANASU Technical University “Gh. Asachi”
Digital Image Processing in Dentistry
Stefan OPREA, Ilie POPA, Costin MARINESCU, Laurentiu APOSTOL University of Pitesti, UCLA School of Dentistry, Los Angeles, USA
The Robot System with Autonomous Moving in the Tubes for in-service inspection
Ilie POPA, Stefan OPREA University of Pitesti
The impact of the apical morphology of a layer V burst firing neocortical pyramidal cell on its discharge pattern
Otilia PĂDURARU Institute for Theoretical Computer Science, Romanian Academy, Iaşi Branch
Full mouth reconstruction using computers instead of scalpels
Costin MARINESCU, Stefan OPREA UCLA School of Dentistry, Los Angeles, USA, University of Pitesti
The statistics of nonlinear parameters for the normal and emotional voice
Horia – Nicolai TEODORESCU, Monica FERARU Technical University “Gh. Asachi”
Knowledge representation into expert systems using a relational model
Emil SOFRON, Gheorghe GAVRILOAIA, Viorel PAUN, Doru CONSTANTIN University of Pitesti
X. Software and computer applications I (July 1, “T” building, T308, 17:00-19:00)
Co-Chairs: Ioan DUMITRACHE, Ilie POPA, Ion STEFANESCU
Mobile agents in remote energy meter reading and management systems
Radwan TAHBOUB, Vasile LAZARESCU “Politehnica” University of Bucharest
A statistical approach to fast algorithms for vector quantization
Spiridon Florin BELDIANU Technical University “Gh. Asachi”
Discrimination, a new principle in the efficient simplification of the digital binary and multi-valent functions
Ion STEFANESCU University of Piteşti
Technical aspects of hazard parts of making evident and hazard elimination in binary and multivalent combination logic structure
Ion STEFANESCU University of Piteşti
Aspects of minimization hardware and software decisional systems - an extension of discrimination method
Adrian ZAFIU, Ion STEFANESCU University of Piteşti
A new version of the Flusser moments set
Iulian-Constantin VIZITIU Military Technical Academy
Use of the autocorrelation function for identifying the dominant noise type
Constantin ANTON, Paul ROŞIANU, Dumitru BREBEANU Special Telecommunication Service - Argeş
XI. Software and computer applications II (July 2, “T” building, T308, 11:15-13:15)
Co-Chairs: Constantin NEGOIESCU, Alexandru SERBANESCU, Eugen DIACONESCU
Human body models for electromagnetic absorption analysis
Gheorghe GAVRILOAIA University of Pitesti
Blue data fusion in the presence of incertitude
Gheorghe GAVRILOAIA, Adrian STOICA, Augustin SPERILA University of Pitesti, Military Technical Academy, Romanian Air Force
Pseudorandom pattern generators based on cellular automata
Petre ANGHELESCU, Emil SOFRON, Vasile Gabriel IANA, Valeriu IONESCU University of Pitesti
The implementation of a control and command automat on evolutive hardware based structure
Laurenţiu IONESCU, Alin MAZĂRE, Valeriu IONESCU, Gheorghe ŞERBAN University of Pitesti
Using of discrimination method for optimizing and hardware implementation of an algorithm for computing the maximum and the minimum of two numbers coded on four bits each
Adrian ZAFIU, Laurenţiu IONESCU, Ion STEFANESCU University of Pitesti
A comparison between traditional and discrimination method of maximum and minimum circuit synthesis for implementation on field programmable gates array
Laurenţiu IONESCU, Adrian ZAFIU, Ion ŞTEFĂNESCU University of Pitesti
PLC with Rabbit 3000
Eugen DIACONESCU, Adrian ZAFIU University of Pitesti
A new simple and practical corner orientation detector
Eugen DIACONESCU, Cristian DRAGOMIRESCU University of Pitesti
A data-parallel implementation of a problem regarding multiphase flows in porous media
Catalin DIACONESCU University of Pitesti
Local Organizing Committee:
Valentin Bănică
Marian Răducu
Corina Savulescu
Florin Smaranda
Valeriu Ionescu
Daniel Visan
Cioc Bogdan
Mihai Oproescu
SCIENTIFIC BULLETIN
Series: ELECTRONICS AND COMPUTERS SCIENCE
UNIVERSITY OF PITESTI
Proceedings of the 1st INTERNATIONAL CONFERENCE on ELECTRONICS, COMPUTERS and ARTIFICIAL INTELLIGENCE
ECAI 2005
Number 5/ 2005 ISSN – 1453 – 1119
Continued from front cover
Software and computer applications S4
Use of the autocorrelation function for identifying the dominant noise type
Constantin ANTON, Paul ROŞIANU, Dumitru BREBEANU
Discrimination, a new principle in the efficient simplification of the digital binary and multi-valent functions
Ion STEFANESCU
A new version of the Flusser moments set
Iulian-Constantin VIZITIU
Pseudorandom pattern generators based on cellular automata
Petre ANGHELESCU, Emil SOFRON, Gabriel IANA, Valeriu IONESCU
A new simple and practical corner orientation detector
Eugen DIACONESCU, Cristian DRAGOMIRESCU
A comparison between traditional and discrimination method of maximum and minimum circuit synthesis for implementation on field programmable gates array
Laurenţiu IONESCU, Adrian ZAFIU, Ion ŞTEFĂNESCU
Mobile agents in remote energy meter reading and management systems
Radwan TAHBOUB, Vasile LAZARESCU
A data-parallel implementation of a problem regarding multiphase flows in porous media
Catalin DIACONESCU
Aspects of the minimization for hardware and software decisional systems - an extension of discrimination method
Adrian ZAFIU, Ion STEFANESCU
A model of motions trajectory control by neural networks for two degree-of-freedom robot
Ilie POPA
New technical aspects in identifying and eliminating of logic and essential hazards of the combinational networks
Ion STEFANESCU
Implementing multipliers for computational intensive applications in reconfigurable hardware
Valeriu IONESCU, Petre ANGHELESCU, Laurenţiu IONESCU, Gabriel IANA
Communications S5
Analysis of NEWPRED Implementation for MPEG-4 over MTS Moving Propagation Environment
Bhumin H. PATHAK, Geoff CHILDS, Ali MAARUF
Beamforming Techniques in Wireless Networks
Ion BOGDAN
Design of Tunable Filter Based on Distributed MEMS Resonators
Stefan SIMION
LabVIEW Application for a Power Line Communication System
Ioan LITA, Daniel Alexandru VISAN, Bogdan Ion CIOC, George ANGELESCU
An Efficient Method to Design the Broad-Band Equivalent Antennas
Ion T. SIMA
The EZW algorithm in wavelet-based image compression
Dumitru BREBEANU, Mariana JURIAN, Constantin ANTON, Valentin BOSCANICI
Embedded system for measure and wireless communication in the band 433 MHz
Alin MAZĂRE, Ionel BOSTAN, Laurenţiu IONESCU
Chaos router: increasing performance and stability of computers network
Daniel MURUGĂ, Alexandru ŞERBĂNESCU
Speech synthesis methods – formant synthesis
Valentin BOŞCĂNICI, Constantin ANTON, Paul ROŞIANU, Dumitru BREBEANU
Digital to Analog Converter Based on Sigma-Delta Modulation with Reprogrammable Structures
Vasile G. IANA, Petre ANGELESCU, Cosmin IVAN, Emil SOFRON, Serban GHEORGHE, Alexandru SERBANESCU
Watermarking - system for copyright protection
Paul ROŞIANU, Constantin ANTON, Dumitru BREBEANU, Valentin BOŞCĂNICI
DSL
Toni-Cristian VOICULESCU, Loredan MOLOCENIUC, Viorel GRIGORE
An automatic analog modulation recognition algorithm based on a decision tree approach
Viorel GRIGORE, Loredan MOLOCENIUC, Toni-Cristian VOICULESCU
Research and Educational multimedia applications S6
Technological transfer and business incubation processes - international and national perspectives
Alexandru MARIN, Claudia-Roberta VISAN
Case study: hybrid, internet-based medical informatics education course
Mircea STEFANESCU
Student support in distance learning
Luminiţa ŞERBĂNESCU
PLC with Rabbit 3000
Eugen DIACONESCU, Adrian ZAFIU
Internet-based distance education
Luminiţa ŞERBĂNESCU
Multimedia data management by object-oriented semantic tool
Florentina Magda ENESCU, Marian Marius POPESCU
Authors Index
Contents
EDITOR-IN-CHIEF: EMIL SOFRON
ASSOCIATE EDITORS:
Electronic circuits and equipments: Ion Sima, Alexandru Şerbănescu
Software and computer applications: Ilie Popa, Ioan Liţă
Communications: Maaruf Ali, Ion Bogdan
Artificial Intelligence: Horia N. Teodorescu, Ognyan Manolov
Bio-medical applications: Rodica Strungaru, Nicu Bizon
Research and Educational multimedia applications: Gheorghe Şerban, Silviu Ioniţă
VOLUME EDITORS: NICU BIZON, MIHAI OPROESCU
EDITORIAL ADVISORY BOARD:
Takeshi Yamakawa (Japan)
Horia N. Teodorescu (Romania)
Alexandru Şerbănescu (Romania)
Constantin Negoita (USA)
Gheorghe Secară (Romania)
Gheorghe Barbu (Romania)
Ioan Dumitrache (Romania)
Maaruf Ali (UK)
Marin Drăgulinescu (Romania)
Gheorghe Şerban (Romania)
Ognyan Manolov (Bulgaria)
Emil Sofron (Romania)
Harold Szu (USA)
Rodica Strungaru (Romania)
Junzo Watada (Japan)
Ilie Popa (Romania)
Lucien Dascalescu (France)
Teodor Petrescu (Romania)
Dumitru Popescu (Romania)
Paul Svasta (Romania)
Eugene Roventa (Canada)
Stoichescu Alexandru (Romania)
Silviu Ioniţă (Romania)
Ioan Liţă (Romania)
Ion Sima (Romania)
Nicu Bizon (Romania)
Ion Tutănescu (Romania)
Tiberiu Stănescu (Romania)
Mihai Man (Romania)
Ştefănescu Mircea (Romania)
ECAI organizers:
University of Pitesti:
- Department of Electronics and Computers
- Research Centre for Systems and Processes' Modelling and Simulation
- Medical College
ECAI co-organizers:
Politehnica University of Bucharest
- Faculty of Electronics and Telecommunications
- Faculty of Automatics and Computers
Romanian Academy - Section for Theoretical Informatics, Iaşi branch
National Institute of Inventors, Iaşi
Military Technical Academy, Bucharest
Nuclear Research Institute, Piteşti
Romanian Medical College - Argeş branch
Romanian Medical Association - Argeş branch
Edited by University of PITESTI
ADDRESS: Street: Târgu din Vale, No. 1, 110040, Piteşti 55111, Argeş Romania
PHONE / FAX: 0248 222949
Plenary Session P
Fuzzy multi-criteria decision making method for diagnostic selection
Plamena ANDREEVA, Ognyan MANOLOV
Internet security: using hacker techniques to improve information systems
Philip FOGARTY, Ali MAARUF
Bluetooth: Profiles and Applications Networks
Ion BOGDAN
Automatic speech recognition user interface
Doru MUNTEANU
Bio-medical applications S1
E-health applications for mobile biosignal analysis
Liviu MORARU, Rodica STRUNGARU, Mihaela UNGUREANU
Image information mining for medical diagnosis help
M. DATCU, A. COLAPICCHIONI, Rodica STRUNGARU, Clara PASQUALI, R. MURRI, S. PASCA
Content-based image retrieval in medical image databases
Mihaela UNGUREANU, Rodica STRUNGARU, Sever PASCA, Radu STANCIU, Mihai DATCU
A statistical approach to fast algorithms for vector quantization
Spiridon Florin BELDIANU
Application of surface fractal analysis in medical images
Elena MIRCEA, Florin MUNTEANU
The effect of electrical stimulation on the persistent vegetative coma patients
Zabach BAREAA, Sever PASCA, Mihaela UNGUREANU, Jean CIUREA
Heart rate variability analysis using time-frequency analysis
Alexandru UNGUREANU, Mihaela UNGUREANU, Rodica STRUNGARU
The impact of the apical morphology of a layer V burst firing neocortical pyramidal cell on its discharge pattern
Otilia PĂDURARU
Digital Image Processing in the Area of Mask Mode Radiography
Ştefan OPREA, Ilie POPA, Costin MARINESCU
Full mouth reconstruction using computers instead of scalpels
Costin MARINESCU, Stefan OPREA
Artificial Intelligence S2
Hysteretic Fuzzy Control of The Boost Converter
Nicu BIZON, Mihai OPROESCU
Clocked Hysteretic Fuzzy Control of The Boost Converter
Nicu BIZON, Mihai OPROESCU
Neurofuzzy networks in motor imagery
Stefan COSOSCHI, Alexandru UNGUREANU, Rodica STRUNGARU
Knock detection by time analysis of the vibration signal using a neural
Dan LAZARESCU, Vasile LAZARESCU, Mihaela UNGUREANU
Neural networks and SPAM detection
Constantin Alin MIROIU, Florin SMARANDA
Electronic circuits and equipments S3
The minimum dissipated power in stationary regime
Horia ANDREI, Fanica SPINEI, Costin CEPISCA, Valentin DOGARU
Symbolic Generation of Network Functions by Two-Graph Tree Enumeration Method
Lucia DUMITRIU, Mihai IORDACHE
Design of Matching Networks For MMIC
Ion T. SIMA, Teodor PETRESCU
Control structure of a walking robot without degrees of freedom
Anca PETRISOR
Consideration on the evaluation of the reliability parameters for electronic equipments
Gheorghe VIERU
Implementation of a RISC architecture in a reduced complexity FPGA
Vasile Gabriel IANA, Gheorghe SERBAN, Marius PREOTEASA
The implementation of a command and control machine on evolved hardware based structure
Laurentiu IONESCU, Ionel BOSTAN, Alin MAZARE, Valeriu IONESCU, Gheorghe SERBAN
Induction motors faults diagnosis based on neural networks
Mariana IORGULESCU, Robert BELOIU
Induction motors models and faults simulation
Mariana IORGULESCU
Electric drive system with dc motor with closed control speed loop
Robert BELOIU, Mariana IORGULESCU, Octavian DUMITRU, Petre COMBEI, Adrinel STANCESCU
Electric drive system with dc motor with closed current loop and open speed loop
Robert BELOIU, Mariana IORGULESCU, Octavian DUMITRU, Petre COMBEI, Adrinel STANCESCU
ECAI 2005 - International Conference - First Edition
Electronics, Computers and Artificial Intelligence
FUZZY MULTI-CRITERIA DECISION MAKING
METHOD FOR DIAGNOSTIC SELECTION
PLAMENA Andreeva, OGNYAN Manolov
Bulgarian Academy of Sciences,
Block 2, Acad. G. Bonchev St., P.O.Box 79, 1113 Sofia, BULGARIA
plamena@icsr.bas.bg , omanolov@bas.bg
Keywords: Medical data classification, Diagnostic inference, Decision making, Robotics-sensors data processing.
Abstract. An integrated method for diagnostic classification is proposed.
It combines similarity models and a fuzzy rule evaluation criterion to predict
early forms of cancer in discrete data sets. The proposed decision making
method supports diagnostic inference in multi-criteria problems under
uncertainty of the fuzzy type. The self-organizing map (SOM) technique,
originally stated in T. Kohonen (1995), is applied for data visualization.
Kohonen maps are also used for modeling the multidimensional data. The
obtained results exceed those of other neural network approaches. The
proposed method has been tested with two medical examples, achieving good
results close to the best reported in Ben-Hur et al. (2001) and in P.
Mangiameli et al. (2002), with better interpretability and
understandability, which are important for practical medical application.
Here the integrated multi-criteria diagnostic (IMD) method has been shown
to provide a viable alternative.
1. INTRODUCTION
In general, diagnostic inference can be taken
as a typical multi-criteria decision making task due
to the multiple factors involved in the analysis of the
specific medical problem. The multiple-criteria decision making approach is a major part of decision
theory and analysis, requiring explicit account of
more than one criterion in supporting the decision
process.
Diagnosis of diseases is an important and
difficult task in medicine. Detecting a disease from
several factors or symptoms is a many-layered
problem that also may lead to false assumptions
with often unpredictable effects. Therefore, using
the knowledge and experience of many specialists collected in databases to support the decision
making process is a necessary step. In this paper, we
propose an integrated method for diagnostic
classification, which combines known similarity
models (clustering and classification) and fuzzy
rules for the evaluation criterion. Recently, many
different methods [Provost et al., 1998; Hall, 2000;
Veropoulos, 2001] have been proposed with the aim of introducing a trade-off between accuracy and
interpretability.
1.1 Problem statement.
Diabetes mellitus and cancer are the
two most frequent forms of diseases that lead to
important symptoms resulting in functional
impairment. These are treatable diseases and the key
to their control is rapid identification and immediate
treatment of patients. According to World Health
Organization criteria the diagnosis of diabetes in early stages is investigated. In medical data analysis
one has to select useful knowledge for medical
decisions, such as the diagnosis of presence or
absence of breast cancer disease. A reliable and
precise classification of tumors is essential for
successful diagnosis and treatment of cancer. Current methods for classifying human
malignancies rely on a variety of morphological,
clinical, and molecular variables. In spite of recent
progress, there are still uncertainties in diagnosis.
The existing classes are heterogeneous and follow
different clinical courses. The first task includes the
identification of new tumor classes (unsupervised
learning). The classification of malignancies into
known classes is subject to the methods of
supervised learning techniques. An important task is the identification of marker genes that
characterize the different tumor classes, which is
subject to feature selection methods. The most
important thing for such domain in practical medical
application is usefulness, understandability and
better interpretation of the selected decision. The
potential benefits of the proposed integrated method
for diagnostic classification include increased
feasibility, enhanced accuracy, low complexity analysis and reduced time.
Current research
For medical diagnosis there are many expert
systems based on logical rules [3], [4], [8] for
decision making and prediction. The most precise rules are often with low completeness as they are
special cases and do not bring any help in prediction
and future trends. The real medical applications are
systems with imprecision and uncertainty in logical
rules. This is why fuzzy logic is well suited for
decision-making and rule extraction. The analysis of collected medical data is performed in two steps:
selection of the training samples and input features,
and construction of classifier. For the first step a modified k-means type algorithm is used, which
combines fuzzy sets and k-means to achieve the
optimal selection of the training samples and input features simultaneously. By this algorithm, the
"singular" samples will be eliminated according to
the classification accuracy and the features that
facilitate the classification will be enhanced. Conversely,
the useless features will be suppressed and
eliminated. For the second step the hierarchical
strategy is proposed for the construction of
classifiers expressed as decision trees or as sets of
rules.
First a brief introduction to the problem and
self-organizing maps [6] is presented; then this
technique is applied together with a fuzzy reasoning
evaluation criterion to diagnostic classification.
Finally the paper shows that the Kohonen neural network achieves good performance that exceeds
the results of other neural network approaches and
decision trees. To verify the feasibility of this
approach, a case study was conducted with
Breast.Cancer and Diabetes data sets
(http://www.ics.uci.edu ). The first one contains 698
cases and 9 attributes, and the second – 768 cases
and 8 attributes as given in table 1. The goal of the
experimental examples refers to the presence of
breast cancer disease (diabetes mellitus) in the patient. The target variable (class attribute)
distinguishes between five levels of cancer: 0 - no
presence and 1, 2, 3, 4 - presence at a gradually
increasing level. So the problem is twofold: first, a
binary classification problem (0/1) of
distinguishing presence from absence, and second,
finding a model that classifies the five levels of disease
(0-4) most accurately.
2. INTEGRATED METHOD
DESCRIPTION
There are four types of decision rules: exact
but not complete; inexact but complete; exact and
complete; and inexact and incomplete. In the first
case an exact rule means that it is always true, for
example “Part of the people with cancer die”. This
rule is not complete, because it does not state
exactly what part will die and under which circumstances.
The second case is the most
interesting. Such a rule holds with a given probability, for example “Smokers develop
lung cancer”. It is not an exact rule because only 6%
of all smokers develop lung cancer (Pcancer = 0,06), but 95% of lung cancer patients
are smokers, so it becomes an almost exact rule. In
the third case the rule is always fully true (P = 1), for
example “The sum of the lengths of any two sides of a triangle
is greater than the length of the third side”. The last case is not
used in the analysis of information systems, as it
does not lead to useful decisions.
2.1. Decision Rules Using Fuzzy Sets
When using fuzzy sets in the algorithm for
data clustering we generate decision rules that map
the values of an object's attributes to a decision value.
The rules are generated from a test set. A
deterministic decision system can be completely expressed by a set of definite rules. Depending on
the threshold value, zero, one or
more mutually inconsistent rules will be generated. A threshold of 1
gives only the set of default rules that are exactly
identical to the set of definite rules. A threshold
value greater than 0.5 maps the attribute values of the objects
in an inconsistent equivalence class into
only one decision class, whereas a threshold value
below 0.5 may map them into several decision classes.
Definite and default rules generated directly from the training set tend to be very specialized.
This can be a problem when the rules are applied to
unknown data. By varying the
threshold value, one can tune the number of default
rules to get the optimal result. Rules are easy
for humans to interpret and use. If we make the
rules more general, chances are that a greater
number of the objects can be matched by one or
more of the rules. One way to do this is to remove some of the attributes in the decision system and
glue together equivalence classes. This will certainly
increase the inconsistency in the decision system, but assuming that the removed attributes only
discern a few rare objects from a much larger
decision class, this may not be a problem in most cases.
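The effect of the threshold can be illustrated with a minimal sketch. It assumes a toy decision table of (condition, decision) pairs; the function name `default_rules`, the data, and the choice of relative frequency within an equivalence class as the rule strength are illustrative assumptions, not the procedure used in the experiments.

```python
from collections import Counter, defaultdict

def default_rules(rows, threshold):
    """Return (condition, decision, share) rules.

    rows: list of (condition_tuple, decision) pairs.
    A rule is kept when the share of its decision inside the
    equivalence class (objects with equal condition values)
    is at least the threshold.
    """
    classes = defaultdict(Counter)
    for condition, decision in rows:
        classes[condition][decision] += 1
    rules = []
    for condition, counts in classes.items():
        total = sum(counts.values())
        for decision, count in counts.items():
            share = count / total
            if share >= threshold:
                rules.append((condition, decision, share))
    return rules

# Hypothetical toy table: "high" is a consistent class, "low" is inconsistent.
rows = [
    (("high",), "malignant"),
    (("high",), "malignant"),
    (("low",), "benign"),
    (("low",), "benign"),
    (("low",), "benign"),
    (("low",), "malignant"),
]

definite = default_rules(rows, 1.0)   # only the consistent class yields a rule
majority = default_rules(rows, 0.6)   # one rule per equivalence class
several = default_rules(rows, 0.2)    # the inconsistent class yields two rules
```

With this toy table a threshold of 1 keeps only the definite rule, a threshold above 0.5 keeps at most one rule per equivalence class, and a threshold below 0.5 lets the inconsistent class map to several decisions.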
Clustering is a more difficult problem than
classification because there is no learning set of
labeled observations, the number of groups is
usually unknown, the relevant features and distance
measure have to be selected in advance. Most of the algorithms that are appealing are computationally
too complex to have exact solutions, therefore
approximate solutions including fuzzy rules are used. Clustering procedures fall into two broad
categories:
Hierarchical methods, which provide a hierarchy of clusters, from the smallest set, where all
objects are in one cluster, through to the largest set
(each case forms its own cluster);
Partitioning methods, which usually require
the specification of the number of clusters.
Most methods used in practice are
agglomerative hierarchical methods. In large part,
this is due to the availability of efficient exact
algorithms. Examples of partitioning methods are k-
means and Self-organizing Maps [7].
2.2. Decision making for Medical Diagnosis
The human interpretability of rules and trees is a
major benefit. We concentrated on decision tree learners
and rule learners as they generate clear descriptions of
how the ML method arrives at a particular classification.
These models tend to be simple and understandable. In
medical domains comprehensibility is particularly
important.
In medical diagnosis, sensitivity gives the
percentage of correctly classified diseased cases and
specificity the percentage of correctly classified individuals without the disease. The area of overlap
indicates where the test cannot distinguish normal
from diseased cases. In practice, therefore, a cut-off value is chosen (cut point indicated by the
vertical dashed line) above which the test is
considered to be abnormal and below which the test
is considered to be normal. For the ideal medical
case a frequency distribution is given in figure 1.
Fig. 1. Frequency distribution for the ideal medical
case
Fuzzy logics, as examples of well-studied
many-valued truth structures, have shown how an
algebraic treatment can be used for measuring the
usefulness of implicators. The central notion in
possibility theory is that of a so-called elastic
restriction that allows us to discriminate between the
more or less plausible values for a variable X in a
universe U. In this sense, it reflects our uncertainty about the true value of X. This elastic restriction is
modeled by a mapping p from U to a set L,
whose values represent degrees of possibility, so that p(u2) = 0,7 means that it is possible to
degree 0,7 from L that X takes the value u2 from U.
Typically a mix of positive and negative evidence
contributes to our knowledge about X; positive
evidence here means that we get information that
particular values are to a given extent possible for X,
while negative evidence includes those statements
that tell us something about the necessity that X
cannot in fact take a particular value.
In statistics it is customary to call the components of the data vectors observations
recorded on variables. The components may also be
called features as is customary in pattern
recognition literature. Central questions in applying a
method to large, high-dimensional data sets are:
what kinds of structures the method is
able to extract from the data set, how it
illustrates the structures, whether it reduces the
dimensionality of the data, and whether it reduces the
number of data items. The major drawback that applies to all methods in the data mining setting is
that they do not reduce the amount of data. The
methods could be useful for illustrating some kinds
of summaries of the data set like the cluster
centroids or the reference vectors of a self-
organizing map. Clusters can be interpreted as if-
then rules. The structure information discovered by
fuzzy clustering can therefore be translated to
human readable fuzzy rule bases.
Clustering Algorithm
Partitional clustering attempts to directly decompose the data set into a set of disjoint clusters.
The criterion function that the clustering algorithm
tries to minimize may emphasize the local structure of the data, as by assigning clusters to peaks in the
probability density function, or the global structure.
Typically the global criteria involve minimizing
some measure of dissimilarity in the samples. In this
paper a commonly used partitional clustering
method, K-means clustering is applied with the modified fuzzy sets for the attribute class
(diagnosis). The functional to be minimized is given
in Eq. (1). In K-means clustering the criterion function is the average squared distance of the data
items from their nearest cluster centroids. For IMD
method the distance is iteratively computed and applied, so a fine granulation is done. An overall
data analysis scheme with plausible prediction,
based on fuzzy evaluation criterion is given on
figure 2.
The fuzzy clustering algorithm uses
assignment of fuzzy membership values as a
confidence measure in tumor classification. Given
an input data space X = {x_1, x_2, ..., x_n}, where n is the
number of records in the dataset and each x_i is a vector in the
attribute space, we assume the existence of C = 2 clusters for binary classification. We are interested
in minimizing the following cost:
$$ J = \sum_{i=1}^{n} \sum_{j=1}^{C} \mu_{ij}^{m} \, \lVert x_i - C_j \rVert^{2}, \qquad m > 1 \qquad (1) $$
The parameter m controls the degree of
fuzziness in the process. The following algorithm finds a solution that converges to a local minimum
of J(U).
For 1 ≤ i ≤ n and 1 ≤ j ≤ C calculate the
membership values µij;
For 1 ≤ j ≤ C update the cluster centers by
(2);
The process stops when the difference in the
µij's between two consecutive iterations is smaller than a given tolerance ε;
otherwise the membership calculation and the center update are repeated.
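The iteration can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the membership update is the standard fuzzy c-means formula (it is not written out in the text), the fuzziness exponent is called `m`, and the data, the initial centers and the function name `fuzzy_c_means` are hypothetical.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, tol=1e-5, max_iter=100):
    """Fuzzy c-means on a list of equal-length tuples.

    centers: initial cluster centers (list of tuples).
    Returns (centers, memberships), memberships[i][j] = mu_ij.
    """
    def dist(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    memberships = None
    for _ in range(max_iter):
        # Membership update (standard FCM form, assumed here):
        # mu_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        new_memberships = []
        for x in points:
            d = [max(dist(x, c), 1e-12) for c in centers]
            row = [1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in d)
                   for dj in d]
            new_memberships.append(row)
        # Center update: mean of the points weighted by mu_ij^m.
        new_centers = []
        for j in range(len(centers)):
            weights = [row[j] ** m for row in new_memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * x[k] for w, x in zip(weights, points)) / total
                for k in range(len(points[0]))
            ))
        centers = new_centers
        # Stop when no membership changed by more than the tolerance.
        converged = memberships is not None and max(
            abs(a - b)
            for old_row, new_row in zip(memberships, new_memberships)
            for a, b in zip(old_row, new_row)
        ) < tol
        memberships = new_memberships
        if converged:
            break
    return centers, memberships

# Hypothetical one-dimensional data with two well separated groups.
points = [(1.0,), (1.2,), (0.8,), (5.0,), (5.2,), (4.8,)]
centers, memberships = fuzzy_c_means(points, [(0.0,), (6.0,)])
```

Each row of `memberships` sums to 1, so an ambiguous sample would receive comparable degrees for both clusters instead of a single crisp label.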
Fig. 2. Data analysis scheme with plausible
prediction, based on fuzzy evaluation criterion
It may be especially advantageous to
introduce fuzzy sets [16] in diagnosis classification, where frequently unlabeled tumor samples may not
necessarily be clear members of one class or
another. Using crisp techniques, an ambiguous
sample will be assigned to one class only, resulting
in an aura of precision and definiteness to the
assignment that is not warranted. On the other hand,
fuzzy techniques will specify to what degree the
object belongs to each class, which is information
that will frequently be useful. For instance, if we use
fuzzy membership values to construct a weighted SOM component plane of each type of tumor class,
we may identify some important expression features
of the tumor class.
The cluster centers are calculated as a
weighted fuzzy sum of the data vectors, as given in Eq. (2).
As the first step, a single decision tree (rule)
is constructed from the training data (e.g.
BreastCancer dataset). This classifier will usually make mistakes on some cases in the data. When the
second classifier is constructed, more attention is
paid to these cases in an attempt to get them right.
As a consequence, the second classifier will
generally be different from the first. It also will
make errors on some cases, and these become the
focus of attention during construction of the third
classifier. This process continues for a predetermined
number of iterations or trials. The degrees of match in a rule are combined with the
Min operator. In this way, the importance of
the attributes of a rule, which can change from time to time, is reflected in the reasoning. After stepwise
refinement, simple and useful models are achieved.
Rules have the following structure:
Rk : if x1 is A1k and . . . and xn is Ank then Y is
Cj ,
where x1 . . . xn are the attributes in the
antecedent affecting status of the consequent, A1k . .
. Ank are linguistic labels used to discretize the
continuous domain of the variables, and Y is the
class Cj to which the object belongs. Classical
fuzzy reasoning methods derive a class based on
maximum matching among the rules in the fuzzy production system. Let us suppose that we have a set of rules R
= {R1, . . . Rm} and a set of input patterns A. = {A.1 .
. . A.n}, where A.j is an input pattern for the j-th
attribute of the antecedent part of the rule base R.
Then the strength of activation of the if-part of
rule k (the matching degree) is usually obtained by
applying a t-norm to the degrees of satisfaction of the
clauses (xj is Ajk).
In the literature, there are some proposals for
this conjunction operator, such as the product and minimum t-norms. In our method each attribute was
fuzzified into 5 linguistic variables of triangular
type. To get a sharper difference between presence and absence of disease, we transformed the target
variable into -1 for absence, and 1 to 4 for presence
correspondingly. From Fuzzy Rule Induction using
the Breast.Cancer data set two models were
generated for the binary classification task:
IF NB_attr2 AND PB_attr3 THEN Negative_diagnose
IF PB_attr2 AND PB_attr3 OR PS_attr6 OR PS_attr7 THEN Positive_diagnose
Here the attribute2 is the most important
attribute, and has importance weight =1.
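How such a rule fires can be sketched with triangular membership functions and the minimum t-norm. The breakpoints of the fuzzy sets, the assumed attribute scale of 1 to 10, the reading of NB and PB as "negative big" and "positive big", and the use of the maximum for OR are all illustrative assumptions; the source does not give these details.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with support (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def fuzzy_and(*degrees):
    """Conjunction of clause degrees with the minimum t-norm."""
    return min(degrees)

def fuzzy_or(*degrees):
    """Disjunction of clause degrees with the maximum (assumed)."""
    return max(degrees)

# Hypothetical fuzzy sets on an assumed 1..10 attribute scale.
def nb(x):
    return triangular(x, 0.0, 1.0, 4.0)

def pb(x):
    return triangular(x, 7.0, 10.0, 11.0)

# Matching degree of "IF NB_attr2 AND PB_attr3 THEN Negative_diagnose"
# for a hypothetical case with attr2 = 2 and attr3 = 8.5.
negative_degree = fuzzy_and(nb(2.0), pb(8.5))
```

Replacing `min` by a product in `fuzzy_and` gives the other conjunction operator mentioned in the text; the minimum lets the weakest clause alone determine the matching degree.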
An importance considered here can be
interpreted as a weight of respective attribute in a
fuzzy rule. Various techniques are used to determine weight coefficients [1], [4], [11], [14] to allow the
decision maker to assign weighting coefficients to
the criteria in the multi- criteria decision making
situation such as Direct Determination Method,
Comparative Matrix Method, and Analytic
Hierarchy Method. The FeatureSelection algorithm
in package WEKA [15] has been used to derive
weights of attributes in multi-attribute decision-
making situations. The fuzzy rules are designed automatically by means of the learning function of a neural
network. The input data are divided into 5 clusters
using a conventional hierarchical clustering method. The output y from the neural network is:
$$ y = \frac{\sum_{i=1}^{n} \mu_i w_i}{\sum_{i=1}^{n} \mu_i} \qquad (3) $$
where µ_i is the membership
value of the i-th rule and w_i is the weight.
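As a small worked illustration of Eq. (3), the output is the membership-weighted average of the rule weights. The function name `network_output` and the numbers are hypothetical.

```python
def network_output(memberships, weights):
    """Membership-weighted average of the rule weights, as in Eq. (3)."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("no rule fires for this input")
    return sum(mu * w for mu, w in zip(memberships, weights)) / total

# Two hypothetical rules: one voting for absence (-1), one for presence (1).
y = network_output([0.2, 0.8], [-1.0, 1.0])
```

Here the second rule fires more strongly, so the output (0.6) lies closer to its weight.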
This way the fuzzy rules are created by means of the neural network. For multiclass classifier they
combine binary classifiers and choose the class
where the argument of the decision function is
maximal. First, they guide the clustering process to
partition the data set into more meaningful clusters.
Second, they can be used in the subsequent steps of a learning system to improve its learning behavior.
This proposed integrated method combines the
advantages of neural networks (ability for identification and control) with the advantages of
fuzzy logic (ability for decision and use of expert
knowledge).
$$ C_j = \frac{\sum_{i=1}^{n} \mu_{ij}^{m} x_i}{\sum_{i=1}^{n} \mu_{ij}^{m}} \qquad (2) $$
where m is the fuzziness parameter of Eq. (1).
A problem with the clustering methods is that
the interpretation of the clusters may be difficult.
Most clustering algorithms prefer certain cluster
shapes, and the algorithms will always assign the
data to clusters of such shapes even if there were no
clusters in the data. Therefore, if the goal is not just
to compress the data set but also to make inferences
about its cluster structure, it is essential to analyze
whether the data set exhibits a clustering tendency, as stated in [1]. The results of the cluster analysis
need to be validated. Another potential problem is
that the choice of the number of clusters may be
critical: quite different kinds of clusters may emerge
when K is changed. Good initialization of the cluster
centroids may also be crucial; some clusters may
even be left empty if their centroids lie initially far
from the distribution of data. The Bayesian rule is
the optimal classification rule if the underlying
distribution of the data is known. In practice we do not know the underlying distribution.
The clusters should be illustrated somehow to
aid in understanding what they are like. For example
in the case of the K-means algorithm the centroids
that represent the clusters are still high-dimensional,
and some additional illustration methods are needed
for visualizing them. The goal of clustering is to
reduce the amount of data by categorizing or
grouping similar data items together. Such grouping is pervasive in the way humans process information,
and one of the motivations for using clustering
algorithms is to provide automated tools to help in constructing categories or taxonomies.
3. SOM
The Self-Organizing Map (SOM, also called
Kohonen map) is a popular neural network based on
unsupervised learning. The SOM Toolbox for
MATLAB [7] is suited for data understanding and
can be used for classification and modeling. Each neuron is a d-dimensional weight vector, where d is
equal to the dimension of the input vectors. The
neurons are connected to adjacent neurons by a
neighborhood relation, which dictates the topology,
or structure, of the map. In the traditional sequential
training, samples are presented to the map one at a
time, and the algorithm gradually moves the weight
vectors towards them, as shown in the experiments
on Figure 6. In the batch training, the data set is
presented to the SOM as a whole, and the new weight vectors are weighted averages of the data
vectors. Both algorithms are iterative. The
MATLAB version in both tests was 6.1. The SOM is used mainly in data visualization, as it can be
effectively used to reduce high-dimensional data to
a two dimensional map. One of the main strengths
of the method is the ability to automatically cluster
similar patterns in its training set.
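One step of the sequential training can be sketched as follows. For brevity the sketch assumes a one-dimensional chain of map units and a Gaussian neighborhood function; the function name `som_step` and all numbers are hypothetical, and this is not the SOM Toolbox code.

```python
import math

def som_step(weights, sample, learning_rate, radius):
    """One sequential SOM update on a 1-D chain of map units.

    weights: list of weight vectors (lists), one per map unit,
    modified in place. Returns the index of the best-matching unit.
    """
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Best-matching unit: the neuron whose weight vector is closest.
    bmu = min(range(len(weights)), key=lambda i: sq_dist(weights[i], sample))
    for i, w in enumerate(weights):
        # Gaussian neighborhood on the map grid, centered on the BMU.
        h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
        for k in range(len(w)):
            w[k] += learning_rate * h * (sample[k] - w[k])
    return bmu

# Three map units with 2-dimensional weight vectors (hypothetical values).
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
bmu = som_step(weights, [1.0, 1.2], learning_rate=0.5, radius=1.0)
```

The best-matching unit moves most strongly towards the sample, and its grid neighbors follow with a smaller step, which is what preserves the neighborhood relations on the map.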
With each iteration, the criteria are losing their external character, i.e., overfitting is not
completely avoided. Very often a self-organization
of models can be found in those cases where the
data sample contains only a part of the model
arguments. This case corresponds to the presence of
large additive noise. If the noise dispersion is larger
than the dispersion of the output variable, a
minimum of the external criterion does not exist.
Fuzzy, very noisy variables have to go through
several levels of modeling. Exact variables need no modeling. For complex models the internal criterion
RRA will decrease with increasing complexity of the
model while the external criterion RRB will grow.
The intersection determines the structure of the
physical model by the given reference function (fig.
3).
Fig. 3. Self-organization of a physical model
using inductive selection procedures. RRA -
min of the criterion of accuracy, calculated on
the learning sequence; RRB - min of the
criterion of accuracy, calculated on the testing
sequence
The self-organizing maps illustrate structures
in the data in a different manner than, for example,
multidimensional scaling, a more traditional
multivariate data analysis methodology. The SOM
algorithm concentrates on preserving the
neighborhood relations in the data instead of trying
to preserve the distances between the data items.
Comparisons between methods having different goals must eventually be based on their practical
applicability. Here the SOM has been shown to
provide a viable alternative. No assumptions about the distribution of the data need to be made, it may
even find quite unexpected structures from the data
map units, n is the number of data samples and d is
the input space dimension. For [3000 x 10] data
matrix and 300 map units the amount of memory
required is still moderate, in the order of 3.5 MBs. But for [30000 x 50] data matrix and 3000 map
units, the memory requirement is more than 280
MBs. SOM_PAK requires much less memory, about
20 MBs for the [30000 x 50] case, and can operate
with buffered data.
4. CASE STUDIES
In practical applications one has to decide which models and parameters may be appropriate
for diagnosis and prediction problems. An algorithm
proven useful for a medical database may turn out not
to be useful in a corporate database. Different ML
program packages (WizWhy from Wizsoft.com,
See5/C5.0 from RuleQuest, WEKA, and others) are
widely developed and used for decision making and
data analysis. The approaches used for data analysis
differ: traditional statistical methods, neural nets, case-based reasoning, decision trees
and genetic methods.
The software package WEKA [12,15] has a number of ML tools for data analysis – Decision
Trees, Naive Bayes, Decision Table, Sequential
Minimal Optimization, NN, Linear Regression and Voting Feature Intervals. The learning methods are called
classifiers. WEKA also implements cost-sensitive
classification. When a cost matrix is provided, the
dataset will be reweighted. Because of its object-oriented
program code, good interface and
several visual tools, we prefer this program shell to conduct the experiments.
TABLE 1. Features of the examined medical data sets, taken from the UCI ML Repository
Dataset | Attributes | Instances | Continuous | Classes
1. Breast Cancer | 10 | 698 | 2 | 2 -> 457 benign; 241 malignant
2. Diabetes Pima | 9 | 768 | 8 | 2 -> 500 negative; 268 positive
The accuracy level of WizWhy predictions is
usually much higher in comparison with other
approaches. In the BreastCancer experiment it
predicted class “malignant” with 93.3% probability,
with 6 incorrectly classified cases. In See5 a very simple majority classifier predicts that every
new case belongs to the most common class in the
training data. In this example, 365 of the 460
training cases belong to class “benign”, so that a
majority classifier would always opt for
“benign”. The decision tree has a lower error rate
of 0.4% on the new cases, but this is higher than its
error rate on the training cases. If boosting is used, this confidence is measured using an artificial
weighting of the training cases and so does not
reflect the accuracy of the rule. The diagnostic sensitivity (94.1%) and specificity (97.4%), which
are two important factors, may be used in
conjunction with the overall accuracy of the method to determine the performance of a neural network in
a medical application. (Sensitivity is the ratio of true
positive decisions to the total number of positive objects,
specificity is the ratio of true negative decisions to
the total number of negative objects, and the overall
accuracy is the ratio of correct decisions to
the total number of objects.)
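These three ratios can be computed directly from the confusion counts. The sketch below uses hypothetical counts, not the values from the experiments, and the function name `diagnostic_metrics` is illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: diseased cases classified correctly/incorrectly,
    tn/fp: healthy cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical test set: 10 diseased and 10 healthy cases.
sensitivity, specificity, accuracy = diagnostic_metrics(tp=8, fn=2, tn=9, fp=1)
```

Reporting sensitivity and specificity next to the accuracy matters when the classes are unbalanced, because a high accuracy can hide a poor detection rate for the rarer class.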
4.1. Experimental results tested with WEKA
The program package WEKA has good
explanatory and visual parts. One great advantage is
the object-oriented structure of the implemented
learning models and algorithms. The program has
sufficient interface and a variety of tools for
transforming datasets. The unique feature of WEKA
is the Distributed Experiments. The experimenter
includes the ability to split and distribute an
experiment to multiple hosts. We have chosen to
test the different classifiers in WEKA with
breast.cancer dataset and the best results are achieved from the NaiveBayes + Kernel estimation
algorithm. The incorrectly classified instances are 3.
The same results are achieved with SMO classifier. (Data sets are taken from UCI ML repository
http://www.ics.uci.edu). The features of data sets are
given in table 1.
For the first dataset we use 460 randomly
selected instances for training. When the class
distribution is not well balanced the training set is
locked, so every classification is done under the
same conditions. For the second dataset the negative cases in the whole dataset are 65,1% and in the
training set they are 63,70%, and 67,82% for testing.
This gives us a well-spread distribution, by analogy
with the first dataset. A detailed study of
comparative experiments is performed and the results are shown in table 2. When tested with the
Diabetes Pima dataset WEKA gives 76,71% accuracy
with the DecisionTable classifier and 75,95% with
NaiveBayes. The best result is with the SMO
classifier – 76,34% accuracy. The only drawback is
its increased time consumption. The breast.cancer
data set (Wisconsin) has nonlinearly separated classes “benign” and “malignant” and is chosen for
the testing dataset for a number of different
classifiers available in WEKA.
TABLE 2. Results from the examined medical data set Breast.Cancer. NaiveBayes´ indicates kernel estimation; the Voted Perceptron classifier obtains 127 perceptrons
Classifier | Prediction benign | Mean abs. error | Mean squared error | Correctly classified | Time to build model
ZeroR | All benign | 0,458 | 0,4726 | 66,39% | 0,71 s
DecisionStump | Cell_Size5 | 0,123 | 0,2334 | 94,11% | 0,49 s
Decision Table | No (23 rules) | 0,0767 | 0,2343 | 94,95% | 3,84 s
IB1 | | 0,0588 | 0,2425 | 94,12% | 0,94 s
IBk | 1 neighbor | 0,2054 | 0,2887 | 91,18% | 0,72 s
IBk | 3 neighbors | 0,1873 | 0,277 | 93,28% | 0,33 s
j48.J48 | Cell_Size | 0,0527 | 0,1695 | 96,64% | 3,35 s
j48.PART | 10 Rules | 0,0461 | 0,1775 | 96,64% | 2,92 s
Kernel Density | | 0,058 | 0,2384 | 94,12% | 0,77 s
kStar.KStar | | 0,0643 | 0,2025 | 94,54% | 0,55 s
Linear Regression | 0.9344 correl. | 0.2371 | 0.3409 | - | 3,07 s
LogisticRegress-2class | | 0.0409 | 0.1193 | 97,89% | 1,04 s
LocalWeighRegresson | 0.9453 correl. | 0.2044 | 0.3113 | - | 0,77 s
Naive Bayes (simple) | P (C) = 0.6548 | 0.0277 | 0,1631 | 97,06% | 1,05 s
Naive Bayes | Prior P = 0.65 | 0,0282 | 0,1652 | 97.058% | 1,49 s
Naïve Bayes Kernel Es | P = 0,65 | 0.0145 | 0.1129 | 98,74% | 3,75 s
OneR | Cell_Size<3,5 | 0.0588 | 0.2425 | 94.12% | 1.92 s
SMOptimization | | 0.1512 | 0.1799 | 98,74% | 39,1 s
Voted Perceptron | =127 | 0.1092 | 0.3305 | 89.076% | 1,15 s
Voted Perceptron | 8 iter. 536 perceptr. | 0.0462 | 0.215 | 95.378% | 4.78 s
AdTree | Cell_Size<2,5 | 0.0742 | 0.175 | 96,64% | 3,79 s
Neural Network | 7 nodes | 0.0452 | 0.1712 | 96,22% | 24,5 s
Neural Network | 2 nodes | 0,0405 | 0,1348 | 97,89% | 11 s
Voting Feat. Intervals | | 0.3967 | 0.4095 | 96,22% | 0,55 s
AdaBoostM1 | 10 iteration | 0.1221 | 0.1897 | 97.0588% | 6.48 s
AdaBoostM1 | 5 iteration | 0.0803 | 0,1878 | 94.958% | 4.6 s
J48 ReducedErrorPrun | 10 iteration | 0.0947 | 0.1729 | 97.8992% | 4.45 s
J48 ReducedErrorPrun | 5 iteration | 0.0745 | 0.1639 | 98.3193% | 1.27 s
AttributeSelected | | 0.0145 | 0.1129 | 98.74% | 3,68 s
The program outputs the mean absolute error and the root mean-squared error of the
probability estimates. The root mean-squared
error is the square root of the average quadratic
loss. The mean absolute error is calculated in a similar way by using the absolute instead of the
squared difference.
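The two error measures described above can be illustrated with a minimal sketch. The probability estimates and 0/1 targets below are hypothetical values chosen for illustration; they are not taken from the experiments, and the function names are not WEKA's.

```python
import math

def mean_absolute_error(estimates, targets):
    """Average absolute difference between probability estimates and 0/1 targets."""
    return sum(abs(p - t) for p, t in zip(estimates, targets)) / len(targets)

def root_mean_squared_error(estimates, targets):
    """Square root of the average squared difference (the average quadratic loss)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(estimates, targets)) / len(targets))

# Hypothetical probability estimates for the class "malignant" and the true labels.
estimates = [0.9, 0.2, 0.6, 0.1]
targets = [1, 0, 1, 0]

mae = mean_absolute_error(estimates, targets)
rmse = root_mean_squared_error(estimates, targets)
print(f"mean absolute error = {mae:.4f}, root mean-squared error = {rmse:.4f}")
```

Because squaring penalises large deviations more heavily, the root mean-squared error is never smaller than the mean absolute error on the same data.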
Fig. 4. Working screen in WEKA for the PimaDiabetes dataset. The two-dimensional distribution of 698 cases (X axis:
attribute 2, “Plasma glucose concentration a 2 hours in an oral glucose tolerance test”; Y axis:
attribute 7, “Diabetes pedigree function”). The benign cases are shown in blue and the
malignant cases in red.
ZeroR classifier simply predicts the majority
class in the training data. Although it makes little
sense to use this scheme for prediction, it can be
useful for determining a baseline performance as a
benchmark for other learning schemes. DecisionStump model builds a simple one-level
binary decision tree (with an extra branch for
missing values).
DecisionTable produces a decision table
using the wrapper method to find a good subset of
attributes for inclusion in the table. This is done using a best-first search.
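As a minimal sketch of the ZeroR baseline described above (not WEKA's implementation), the majority class is learned from the training labels and then predicted for every case. The label list is hypothetical.

```python
from collections import Counter

def zero_r_fit(labels):
    """Return the majority class of the training labels."""
    return Counter(labels).most_common(1)[0][0]

def accuracy(predictions, labels):
    """Fraction of cases whose predicted class equals the true class."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

# Hypothetical training labels; the real data set is not reproduced here.
train_labels = ["benign"] * 4 + ["malignant"] * 2

majority = zero_r_fit(train_labels)
predictions = [majority] * len(train_labels)
baseline = accuracy(predictions, train_labels)
print(f"ZeroR always predicts '{majority}', baseline accuracy = {baseline:.2%}")
```

Any learning scheme worth using should beat this baseline accuracy.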
PLAMENA Andreeva, OGNYAN ManolovP 10
NaiveBayes is a probabilistic classifier. By
default it uses the normal distribution to model
numeric attributes. Using kernel density estimators instead improves the performance. The j48.J48
algorithm uses a confidence threshold of 0,25.
The SMO (sequential minimal optimization
algorithm) is one of the fastest methods for learning
but it works only for 2 classes.
VFI (Classification by voting feature
intervals) model uses intervals, which are
constructed around each class for each attribute.
Higher weight is assigned to more confident
intervals, where confidence is a function of entropy. When WeightedConfidence is not selected, the
number of correctly classified instances increases by 1.
The SMO class implements the sequential
minimal optimization algorithm, which learns this
type of classifier. Despite being one of the fastest
methods for learning support vector machines [2],
sequential minimal optimization is often slow to
converge to a solution—particularly when the data
is not linearly separable in the space spanned by the nonlinear mapping. Because of noise, this often
happens. Both run time and accuracy depend
critically on the values that are given to two parameters: the upper bound on the coefficients’
values, and the degree of the polynomials in the
non-linear mapping. Both are set to 1 by default.
The results of the binary tree method are
shown in Fig. 5. The confidence factor (CF) is set
by the user. The most important attribute is found to be Cell_Size_Uniformity, but compared to other
methods it separates at level <=3.
Cell_Size_Uniformity <= 3.0
|   Normal_Nucleoli <= 3.0
|   |   Bare_Nuclei <= 2.0: benign (210.66)
|   |   Bare_Nuclei > 2.0
|   |   |   Cell_Size_Uniformity <= 2.0: benign (11.34/2.0)
|   |   |   Cell_Size_Uniformity > 2.0: malignant (4.0)
|   Normal_Nucleoli > 3.0: malignant (13.0/4.0)
Cell_Size_Uniformity > 3.0: malignant (111.0/5.0)

Number of Leaves : 5
Size of the tree : 9
Correctly Classified Instances 221 92.8571 %
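The pruned tree above can be read as nested IF-THEN rules. The following sketch transcribes its five leaves into a function; the function and argument names are illustrative and are not part of the WEKA output.

```python
def classify(cell_size_uniformity, normal_nucleoli, bare_nuclei):
    """Follow the branches of the pruned tree and return the predicted class."""
    if cell_size_uniformity <= 3.0:
        if normal_nucleoli <= 3.0:
            if bare_nuclei <= 2.0:
                return "benign"
            # Bare_Nuclei > 2.0: split once more on Cell_Size_Uniformity
            if cell_size_uniformity <= 2.0:
                return "benign"
            return "malignant"
        # Normal_Nucleoli > 3.0
        return "malignant"
    # Cell_Size_Uniformity > 3.0
    return "malignant"

# Hypothetical attribute values for a single record.
print(classify(cell_size_uniformity=2, normal_nucleoli=1, bare_nuclei=5))
```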
The experiments gave the highest accuracy
(98,74%) with NaiveBayes´ and SMO, while the
performance of the other classifiers, except ZeroR,
was comparable. The LogisticRegression and NN
algorithms came close with an accuracy of 97,89%.
When modeling all five levels separately, the
neural net model correctly classified 14 of 17
previously misclassified records (they were for the
diagnosis “absence of disease”). To apply such schemes to multiclass datasets, the problem must be
transformed into several two-class ones and the
results combined.
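The text does not say which decomposition was used to turn a multiclass problem into two-class ones. The sketch below assumes the common one-vs-rest scheme: one binary target list per class, with the results combined by picking the class whose two-class model is most confident. The labels and scores are hypothetical.

```python
def one_vs_rest_targets(labels):
    """Build one binary (1/0) target list per class from multiclass labels."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}

def combine(scores_by_class):
    """Combine the two-class results: choose the class with the highest score."""
    return max(scores_by_class, key=scores_by_class.get)

# Hypothetical five-level diagnosis codes (0 = absence of disease).
labels = [0, 2, 1, 0, 4, 3]
targets = one_vs_rest_targets(labels)

# Hypothetical confidence of each two-class model for one new record.
scores = {0: 0.10, 1: 0.15, 2: 0.70, 3: 0.03, 4: 0.02}
predicted = combine(scores)
print(targets[0], predicted)
```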
Fig. 5. Rules extracted with the J48 pruned tree with CF = 0,25. If the confidence factor decreases, the number of
extracted rules increases, but the error also rises, up to 7,14%.
FUZZY MULTI-CRITERIA DECISION MAKING METHOD FOR DIAGNOSTIC SELECTION P 11
4.2. Application of Fuzzy Rules
The Fuzzy Sets Theory [16] allows the
mathematical modeling of imprecise propositions.
The fuzzy based model has been employed in many
areas to simulate how inferences are made by
humans, or to manage uncertain information. Such a
model can be applied to data and image analysis. The idea in fuzzy analysis is formulating the
decision model by logical rules. Cluster analysis
deals with the discovery of structures or groupings within data. Fuzzy cluster analysis dispenses with
unambiguous mapping of the data to classes and
clusters, and instead computes degrees of membership that specify to what extent data belong
to clusters. The objective function assigns a quality
or error to each cluster, based on the distance
between the data and the typical representatives of
the clusters. The structure information discovered
by fuzzy clustering can therefore be translated to
human readable fuzzy rule bases. The advantage of
the fuzzy system approach compared with other
non-linear regression methods is that fuzzy systems not only are universal approximators, but also have
clear physical interpretation for their structures and
parameters so that the results can be interpreted in
terms of fuzzy IF-THEN rules.
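The exact membership formula is not given in the text. As an illustration only, the sketch below uses the standard fuzzy c-means membership rule, in which the degree of membership of a data point in a cluster depends on its distance to each cluster representative; the fuzzifier m = 2 and the sample distances are assumptions.

```python
def fuzzy_memberships(distances, m=2.0):
    """Degrees of membership of one data point in each cluster.

    distances: distance from the point to each cluster representative.
    m: fuzzifier (> 1); m = 2 is a common default and is assumed here.
    """
    # A point that coincides with a representative belongs fully to it.
    zero = [i for i, d in enumerate(distances) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(distances))]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]

# Hypothetical distances from one record to two cluster representatives.
u = fuzzy_memberships([1.0, 3.0])
print(u)
```

The memberships sum to one, and the closer representative receives the larger degree, which is what allows the clusters to be translated into graded IF-THEN rules.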
Let $x_1, x_2, \ldots, x_n$ be elementary logical variables,
each taking its values from the sets
$X_1, X_2, \ldots, X_n$ respectively, called feature domains. In
general, the domains are supposed to be continuous
intervals, but in our examples, for the sake of
simplicity, we consider only domains consisting of a finite number of values $a_{ij}$, where
$i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n_i$. The Cartesian product of
all domains, $X_1 \times X_2 \times \ldots \times X_n$, forms the universe of discourse $U$, whose cardinality in the finite case is
$n_1 \times n_2 \times \ldots \times n_n$. Each element $u = \langle x_1, x_2, \ldots, x_n \rangle$ in $U$ is
an ordered n-tuple of the values of all variables.
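For finite domains, the universe of discourse and its size can be enumerated directly. The sketch below uses three hypothetical feature domains; the variable names and values are illustrative only.

```python
from itertools import product
from math import prod

# Hypothetical finite feature domains X1, X2, X3.
domains = {
    "x1": ["low", "high"],
    "x2": [1, 2, 3],
    "x3": ["yes", "no"],
}

# The universe of discourse U is the Cartesian product of all domains;
# each element is an ordered n-tuple with one value per variable.
universe = list(product(*domains.values()))

# Its size in the finite case is the product of the domain sizes n1 * n2 * ... * nn.
cardinality = prod(len(values) for values in domains.values())

print(cardinality, universe[0])
```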
In the experiments we cluster the input data
set using a modified K-means algorithm and obtain
the fuzzy class parameters. The rules are then
evaluated according to a given criterion and for a
selected granulation. Only two parameters are
required: the number of classes, and the 'alpha' coefficient that measures the desired spatial
smoothing. The process is otherwise entirely
unsupervised.
For every ML tool the notion of distance
between the objects to be clustered or
classified is important and is selected a priori. In
supervised learning, new observations are assigned to classes on the basis of their distances from objects
with known class labels. The SVM [2] is based on
the Euclidean distance between an individual
observation and a separating hyperplane (margin).
The choice of distance can have a large impact on
the results of supervised and unsupervised learning
analysis. Subject matter knowledge is very helpful
in selecting an appropriate distance (a distance based on
correlations may be a better choice). The distance we
calculate between clusters is single linkage = min distance between any two objects, one from each
cluster. Other possible distances between clusters
are: average linkage = the average of all pairwise
distances between the members of both clusters;
complete linkage = max distance between two
objects, one from each cluster; centroid distance =
distance between their centroids. Single linkage
distance leads to long thin clusters, while average
linkage leads to round clusters.
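The four between-cluster distances defined above can be sketched as follows, assuming the Euclidean distance between individual objects. The two small clusters are hypothetical.

```python
import math

def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distances between every pair of objects, one from each cluster."""
    return [math.dist(a, b) for a in cluster_a for b in cluster_b]

def single_linkage(cluster_a, cluster_b):
    """Minimum distance between any two objects, one from each cluster."""
    return min(pairwise_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    """Maximum distance between two objects, one from each cluster."""
    return max(pairwise_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    """Average of all pairwise distances between the members of both clusters."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)

def centroid(cluster):
    """Coordinate-wise mean of the objects in a cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def centroid_distance(cluster_a, cluster_b):
    """Distance between the centroids of the two clusters."""
    return math.dist(centroid(cluster_a), centroid(cluster_b))

# Two hypothetical clusters of two-dimensional objects.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (5.0, 0.0)]
print(single_linkage(a, b), complete_linkage(a, b),
      average_linkage(a, b), centroid_distance(a, b))
```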
Fig. 6. Experimental results from MATLAB
The SOM is an excellent tool in the
visualization of high dimensional data [7]. As such
it is most suitable for the data understanding phase of
the knowledge discovery process, although it can be
used for data preparation, modeling and
classification as well. In future work, our research
will concentrate on the quantitative analysis of SOM
mappings, especially analysis of clusters and their
properties. The classification accuracy is measured
by the percentage of correctly classified cases. The
experiments were performed using a Pentium II 300 MHz PC with 128 MB of RAM, running
MS-Windows ME.
5. CONCLUSIONS AND DISCUSSION
Complex classifiers, especially those
generated with the boosting option, can be difficult to interpret.
Partitioning is also appealing when we classify in order to predict, particularly when we want to
predict some property of a medical treatment.
However, partitioning implies sharp boundaries, either logical subsets of discrete attributes or
subranges of measured attributes. We may be
limited in our ability to distinguish discrete attribute
values, and we are always limited in our ability to
measure continuous attribute values. Thus a
partitioning classifier may find some cases (records)
lying "near" the class boundaries for which
unequivocal class assignment is not possible and
may not be desirable.
Knowledge about the particular
problem domain is needed to do any more than filter
the discovered rules or patterns on their statistical attributes. One possible solution to this problem is to
express class assignment with a fuzzy membership
function. The membership function $C_j(a_i)$ gives the grade to which record $a_i$ belongs to each of the
possible classes $C_j$. Each record must belong to
some class, even though there is insufficient
information to say exactly which class. This form
of "fuzzy" classification produces classes that are
closer to everyday concepts of "low", "very" and "malignant" than partitioning does. There are no
boundaries and no class assignments, only the
membership functions.
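A minimal sketch of class assignment by membership function follows. The piecewise-linear shape and the breakpoints 2.0 and 4.0 are assumptions chosen for illustration; they are not the functions derived in the experiments.

```python
def malignant_grade(value, lower=2.0, upper=4.0):
    """Grade of membership in class 'malignant' for one attribute value.

    Below `lower` the grade is 0, above `upper` it is 1, and in between it
    rises linearly. Both breakpoints are hypothetical.
    """
    if value <= lower:
        return 0.0
    if value >= upper:
        return 1.0
    return (value - lower) / (upper - lower)

def class_memberships(value):
    """Return the grade with which a record belongs to each possible class."""
    grade = malignant_grade(value)
    return {"benign": 1.0 - grade, "malignant": grade}

# A record lying "near" the class boundary receives partial grades in both classes.
print(class_memberships(3.0))
```

A record far from the boundary still receives a crisp-looking grade (0 or 1), while records near the boundary are no longer forced into a single class.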
For most medical applications the logical
rules are not precise but vague and the uncertainty is
present both in premise and in the decision. For this
kind of application a good methodology is the rule
representation, which is easily understood by the
user. Therefore, the integration of fuzzy set and
similarity measure methods gives a much better and more exact representation of the relationship between
symptoms and diagnosis. Self-organizing data
mining technologies in medical data analysis have to select automatically useful knowledge for
medical decisions, such as diagnosis of breast
cancer disease. With the proposed IMD method we try to granulate the extracted knowledge and refine
the rules. The detection of disease may also get
more efficient by reducing both time and costs for
the corresponding procedure. Besides classification accuracy, the effort (time, cost) of
creating and applying a classification model is also quite
important. Since self-organizing data mining
selects a subset of attributes necessary to obtain a
certain classification quality, this gives us cost
saving effects.
For the medical datasets tested with WEKA, the
Bayes classifier and the SMO model show the highest
accuracy and the most correctly classified cases. The
program used only the most important attributes, and discarded the rest. In the breast cancer dataset the
most important attribute was Cell_Size_Uniformity,
but there were 14 misclassifications. The same
attribute was selected in our method, and after the
refinement the accuracy rises to 97,45 % (only 3
errors). In most cases the rules from WizWhy
suffered from overfitting: there were too many
specialized rules.
Using the SOM toolbox for the visualization of high dimensional data and applying the fuzzy
rules evaluation is most suitable for the data
understanding phase of the diagnosis selection process. The experiments show that induced
decision trees are useful for the analysis of the
importance of clinical parameters and their combinations for the prediction of the diagnosis. We
achieve 97,45% correctly detected
cases with more explanatory power and at little
cost. This provides a viable alternative to
Bayesian and neural net methods, without requiring
extensive knowledge of mathematical, cybernetic and statistical techniques or sufficient
time for complex dialog driven modeling tools.
Self-organising data mining creates optimal complex models systematically and autonomously by
employing both parameter and structure
identification. The IMD method could solve the basic problem of experimental systems analysis, namely
avoiding "overfitted" models based on the data's
information only. We are also planning to provide
new training examples for other medical datasets
in order to derive simple rules for diagnosis
determination on a distributed information system.
For this purpose the Experimenter in Weka-3-1-9
will be used. The Experimenter includes the ability
to split an experiment up and distribute it to multiple
hosts. This works best when all results are being sent to a central DB and is appropriate to implement
on the Web.
REFERENCES
[1]. Bandyopadhyay S. (2004), An automatic shape
independent clustering technique, Pattern
Recognition v.37 pp. 33– 45
[2]. Ben-Hur A., D. Horn, H. Siegelmann and V.
Vapnik (2001), Support Vector Clustering, Journal
of Machine Learning Research, v2, pp.125-137
[3]. Data Mining Software,
http://www.chel.com.ru/~rav/data_mining_software.html
[4]. Gómez-Rubio V., J. Ferrándiz, A. López (2003),
Detecting Clusters of Diseases with R, Proceedings
of the 3rd International Workshop on Distributed
Statistical Computing (DSC 2003) March 20–22,
Vienna, Austria ISSN 1609-395X, Kurt Hornik,
Friedrich Leisch & Achim Zeileis (eds.)
[5]. Hall, M. (2000), Correlation-based feature selection
of discrete and numeric class machine learning. In
Proc .of the Int. Conf. on Machine Learning, 359–
366, Morgan Kaufmann, CA.
[6]. Kohonen T., (1995), Self-Organizing Maps, 30,
Springer Series in Information Sciences. Springer,
Berlin, Heidelberg
[7]. Kohonen T., Hynninen J., Kangas J., Laaksonen J.
(1996), SOM_PAK: The Self-Organizing Map
program package, Technical Report A31,
http://www.cis.hut.fi/nnrc/nnrc-programs.html, Helsinki.
[8]. Mangiameli P., D. West and R. Rampal (2002),
Model selection for medical diagnosis decision
support systems, Decision Support Systems,
Elsevier Press, v.36, 3, pp.247-259
[9]. Provost, F., Fawcett, T., and Kohavi, R. (1998). The
case against accuracy estimation for comparing
induction algorithms. In Shavlik, J., (ed.) Proc. of
the Fifteenth Int. Conf. on Machine Learning
(ICML98), 445–453, Morgan Kaufmann. San
Francisco, CA.
[10]. Schölkopf, B., Burges, C. J. C., and Vapnik, V.
(1995). Extracting Support data for a given task. In
Fayyad, U. M. and Uthurusamy, R., (eds,) Proc. Of
First Int. Conf. on Knowledge Discovery and Data
Mining, pp 252–257, AAAI Press. Menlo Park, CA.
[11]. Stutz J., P. Cheeseman. Bayesian classification
(autoclass): Theory and results. In Advances in
Knowledge Discovery and Data Mining.
AAAI/MIT Press, 1996.
[12]. University of Waikato in New Zealand,
www.cs.waikato.ac.nz/ml/weka
[13]. Veropoulos, K. (2001), Machine learning
approaches to medical decision making, in Proc.
ECCAI Advanced
[14]. Wang L.-X. and J. M. Mendel (1992), Generating
Fuzzy Rules by Learning from Examples, IEEE
Trans. on Systems, Man and Cybernetics, 22, pp. 1414-1427
[15]. Witten Ian H., E. Frank, Data Mining: Practical
Machine Learning Tools and Techniques with Java
Implementations, Ch. 8, © 2000 Morgan Kaufmann
Publishers
[16]. Zadeh L.A., Fuzzy sets, Information and Control 8
(1965), 338-358
ECAI 2005 - International Conference - First Edition
Electronics, Computers and Artificial Intelligence
1-
INTERNET SECURITY: USING HACKER TECHNIQUES TO IMPROVE INFORMATION SYSTEMS
Philip Fogarty & Maaruf Ali
Department of Electronic Engineering, Oxford Brookes University
Gipsy Lane Campus, Headington, Oxford, OX3 0BP, UK
maaruf@ieee.org
Keywords: Internet Security, Hacker, Information Systems
Abstract. The objective of this paper is to investigate and discuss the
issues of security in computer based information systems. The paper will
primarily focus on the statement “Using hacker techniques to improve
information systems” and will address the issues predominantly to
Information Systems (IS) network managers (also with some relevance to the
average home user) so that they become more aware of the dangers to their
computers and information systems that need to be prevented and controlled.
The paper shall discuss the consequences of an attack, the breaches made by
hackers and their prevention, security improvements, legislation
and possible solutions.
1. INTRODUCTION
Since the birth of the computer and computer networks, a minority of people in the Internet community have set out solely to demonstrate their intellectual prowess by attacking computer systems. In other cases attacks have come from people who seek financial gain, want revenge for a wrongdoing, or act simply for fun. There are also known cases of hacking to fund terrorism, known as cyber-terrorism. The nature of computing has changed tremendously over the last few years. As computer products and services have become cheaper and more powerful, they have also become more ubiquitous. One unfortunate side effect of these changes is that computer crime has become more common. To quote Detective Sergeant Clive Blake, of the U.K. Metropolitan Police Computer Crime Unit, “Computers are the future of crime…They will become as crucial to the criminal as a gun or a getaway vehicle”.
Legislation has also been endorsed in the form of a Convention, the first international treaty on crimes committed via the Internet and other computer networks, dealing particularly with infringements of copyright, computer-related fraud, child pornography and violations of network security.
This report makes recommendations for IS and network managers of small organisations who are concerned about the security of the transactions on their systems and via the Internet, whether these are financial or involve private information. The report also concentrates on the prevention of criminal activities that use data in a fraudulent manner, along with examples of previous criminal activities.
1.0 INTERNET SECURITY ISSUES IN ORGANISATIONS
As the Internet has grown and become increasingly important over the years, more organisations have joined it for advertising and web shopping. Because of such trends, Internet security has become a major issue and concern for every organisation.
The evidence is clear that cybercrime is on the rise and is costing the global corporate market hundreds of billions of pounds. CERT (Computer Emergency Readiness Team), the organisation that tracks incidents of cybercrime, decided to stop tracking them because the number of incidents rose so rapidly: from 3,734 in 1998 to 137,529 in 2003, an increase of about 3,583% (roughly 37-fold) in 5 years. In June 2002 Katherine McLuskie published an article on www.globalcontinuity.com explaining how Internet-based attacks might be costing US companies over $350 billion, a figure equal to 3.5% of America's Gross National Product. Figure 1, below, shows the growing threat that network and IS managers face.
Fig. 1. Threat Evolution. [Mark Jackson’s Network Security Beyond the Firewall Report. Page 4 Cisco Systems 2004].
Figure 2, below, shows the spread of worms.
Fig. 2. Internet Worms. [Mark Jackson’s Network Security Beyond the Firewall Report. Page 3 Cisco Systems 2004].
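The growth in the CERT incident counts quoted above can be checked with a few lines of arithmetic. The sketch distinguishes the ratio of the two counts (the "fold" rise) from the percentage increase, which is computed relative to the 1998 figure; the variable names are illustrative.

```python
incidents_1998 = 3_734
incidents_2003 = 137_529

# Ratio of the two counts: how many times larger the 2003 figure is.
fold_rise = incidents_2003 / incidents_1998

# Percentage increase: the growth relative to the starting value.
percent_increase = (incidents_2003 - incidents_1998) / incidents_1998 * 100

print(f"{fold_rise:.2f}-fold rise, {percent_increase:.0f}% increase")
```

A ratio of about 36.8 therefore corresponds to an increase of about 3,583%, since the percentage increase is always (ratio - 1) x 100.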
It is clear that over the last decade there has been a significant increase in the reported cases; in a 1998 survey, 46 per cent of the 900 U.K. respondents reported some form of incident. A clear factor influencing this increase is the explosion in virus incidents observed in the 1990s. It is also worth noting that “hacking” is the only category of abuse in which the reported incidents have risen. A possible reason for this is that technological barriers to entry have been reduced in recent years and there are now numerous tools that provide automated support for would-be hackers.
Even though the costs are running into the millions, the reported number of incidents is small. It is often conjectured that the true level of computer crime is much higher than reported, as organisations do not wish to risk undesirable consequences such as bad publicity, legal liability, or loss of custom. Financial loss is merely one type of impact that may result from cybercrime. Other impacts, such as disruption to services, loss of data or damage to reputation, are more difficult to quantify and may actually be more significant in many contexts.
Some Internet users think that hacking is pretty harmless fun and even quite clever, but it can be a serious invasion of privacy and a significant threat to e-commerce. The Information Security Advisory Group estimates that world-wide there are now over 100,000 hackers. White-hat hackers test computer security at the request of organisations; black-hat hackers act privately to break into systems; and grey-hat hackers work both sides of the fence.
The hacking of personal computers is actually quite easy because most people use Microsoft software, which is notoriously open to abuse. The latest operating system, Microsoft Windows XP, has been no different, and Internet security experts have found flaws. Hackers have developed a program called “The Backdoor” which provides them with access to systems running Microsoft Windows. This could potentially allow a hacker to remotely switch on a microphone or webcam attached to the PC being hacked.
1.1 Hacked Advertising
Organisations open a web site to serve as a shop window, where they can advertise their products and promote their services. Hackers can enter the web site illegally and modify the appearance of the homepages; a comparable scenario would be walking into a retail outlet and changing the entire shop layout. In either case, when a hacker attacks a web site, the new homepage no longer promotes the activities of the organisation. A recent example of this was published on PCWorld.com: “On November 22nd 2004, users who visited a number of popular European web sites and clicked on banner ads could have infected their computers with variants of the Bofra worm. Experts warned that PCs operating on Windows XP with Service Pack 1 and Windows 2000 would be vulnerable due to the flaws in the software”.
1.1.1 Retail Outlet Crime via Internet
In 1998, eBay allowed its customers to handle the auctions themselves even though there was a risk of ‘scam artists’ selling illegal items or pirated goods via the eBay trading web site. According to eBay CEO Meg Whitman, “eBay has a zero tolerance to fraud policy”; she also argued that “We have committed resources to have the most comprehensive programs in order to keep eBay a safe harbour for person to person trading”. However, in 1998, eBay expelled a man from Oklahoma (USA) from its trading site after discovering that he was already under investigation by postal authorities for mail fraud.
More recently, in November 2004, Paul Rogers of IG news reported on an anonymous group of malicious hackers who opened an online store that sells the stolen source code of prominent software products. The group is offering the code for Cisco Systems' PIX firewall software to interested parties for $24,000, according to messages posted in online discussion groups. The group is using e-mail and messages posted in a Usenet group to communicate with customers and receive orders for the source code of several security products, including Cisco's PIX 6.3.1 firewall and intrusion detection system (IDS) software.
1.1.2 Hacktivism
“The word hacktivism is a combination of
hacking and activism. A hacktivist is someone who
uses system penetration to propagate a political,
social or religious message.” 1
Over the past ten years there have been several examples of politically motivated hacking breaches, which have given these hackers the name “hacktivists”. An instructive series of hacktivist attacks happened back in June of 1998: “The group called "Milw0rm" hacked the website and LANs of India's Bhabha Atomic Research Centre (BARC). They uploaded a spoofed web page showing a mushroom cloud and the text "If a nuclear war does start, you will be the first to scream". They also downloaded several thousand pages of e-mail and research documents, including messages between India's nuclear scientists and Israeli government officials. Milw0rm defaced hundreds of other sites”.5
1.1.3 Industrial Espionage
Today the majority of Internet hosts are corporate sites. Some companies use the Internet as a network to transmit data. There are many examples of company hosts that have been attacked. “According to a survey by PricewaterhouseCoopers in November 2003, nearly half (46 per cent) of the fastest growing companies in the United States have suffered a recent breach of their information security, despite increased security precautions since Sept. 11, 2001. In most cases, these businesses were victims of computer viruses or worms, with hackers and e-mail”.2
When a hacker intrudes, the main risk to such a system is the theft of private and confidential information and data, whether financial or not. A further risk, which in some cases causes more problems for a business or organisation, is that the hacker may have modified some of the data. This may prove to be more
1 YOUNG, S., AITEL, D., The Hacker's Handbook, Auerbach Publications, 2004, p. 35
2 http://sacramento.bizjournals.com/sacramento/stories/2003/1124/daily3.html Accessed on 4th September 2004.
costly to a business or organisation because it may take several months before the modifications are detected, which may cause the business to lose money and profits. The danger to an organisation is that it will work with modified data and will undoubtedly produce inaccurate and faulty results, which may not be noticed if the modifications go undetected.
Since September 2001 the Bush administration has toughened anti-hacking laws and increasingly lobbied foreign governments to cooperate in international computer-crime investigations. The United States and the United Kingdom were among 26 nations that signed the Council of Europe Convention on Cybercrime, an international treaty, in 2001.
Examples of these types of crimes are abundant in the world of the Internet. However, a large number of attacks made by hackers are not reported to the police, either because the company wants to conceal them or because the attack is so sophisticated that IS and network managers are not able to detect it.
1.1.4 Business Security
In today’s business environment, credit card security is one of the largest issues of the Internet as more businesses and companies are using the Internet to advertise and promote their services. This is due to the potential to create truly worldwide commerce, in which the majority of purchasing can and will be done from the customer's home by ordering over the Internet. However, the downside of creating such a worldwide electronic commerce environment is that it opens a very large hole for credit card fraud.
Today there are numerous virtual outlets on the Internet that sell books, computer accessories, clothes, mobile ring tones, etc. The method of payment that these virtual outlets use is the ‘credit card’. This means that when a customer wants to purchase something from a web site, the customer sends his/her credit card number and details to the virtual outlet, which then either debits the money from the customer’s bank account or charges the credit card. However, there are two problems with this method of payment:
Firstly, when the credit card number is transmitted from the customer’s computer, it can be picked up by a third party without the knowledge of the merchant and the customer.
Secondly, the majority of commercial web sites are not secure at all. Even if the transmission of the credit card number through the Internet is safe, the number is far from safe when stored on a web site.
Credit card numbers can also be taken from inside a company’s database. There have been several incidents where companies that specialise in Internet commerce have been hacked in the search for credit card numbers or customer files.
The concept of ethical hacking also exists, where hacker skills are employed in order to test and improve the security of a computer system. The underlying theme is that, depending on the context, the use of these techniques and technologies in pursuit of a commercial or politically oriented objective may be legitimate and does not automatically cast the perpetrator as a criminal.
Despite the claimed security of many systems and products, it is often uncertain whether a system will actually stand up to an assault until someone actually tries it. It is obviously undesirable for this person to be an uninvited hacker, and as such, security-conscious companies are often keen to test their protection for themselves. A market consequently exists for what is known as “ethical” hacking, whereby hacking skills and methods are applied against a system by persons who can be trusted not to use any discovered weaknesses for illegitimate purposes.
The provision of ethical hacking services is frequently an internal company function in larger organisations. For example, Microsoft has a dedicated group known as the Rapid Exposure Detection (RED) team, who represent the first line of defence against genuine hackers from the outside world. The role of the team is to identify security vulnerabilities within software systems at the development stage, and suggest a solution before systems are released and can be exploited by hackers. An alternate approach that can be considered is to engage the services of external hackers to test the security of the system, so that
organisations can demonstrate the effectiveness of their products.
These arguments suggest that IS/network managers need to employ the expertise and knowledge of hackers, and their techniques, in order to improve and maintain their systems. In the commercial sphere, ethical hackers are applying their skills for the benefit of security, taking techniques that would otherwise be regarded as illegal and using them as the basis for business operations. Of course, the whole domain of ethical hacking is driven by cybercrime: if traditional hackers were not there to threaten systems in the first place, then the services of ethical hackers would not be required either.
2. HOW HACKERS ATTACK
This section covers the weaknesses of networks and of the Internet, and how hackers work.
2.1 Definition of a Hacker
Hacker (n). Slang. “A computer fanatic, esp. one who through a personal computer breaks into the computer system of a company, government, etc.” Collins English Dictionary, Harper Collins Publishers, 2000
In the context of the Internet, “hacker” usually describes a person who has great in-depth knowledge and expertise in the field of computer networking. Originally, a hacker was someone who makes furniture with an axe. A hacker can also be identified as:
1. A person who enjoys exploring the details of programmable systems and how to stretch their capabilities, as opposed to most users, who prefer to learn only the minimum necessary.
2. One who programs enthusiastically (even obsessively) or who enjoys programming rather than just theorising about programming.
3. A person capable of appreciating hack value.
4. A person who is good at programming quickly.
5. An expert at a particular program, or one who frequently does work using it or on it; as in ‘a UNIX hacker’. (Definitions 1 through 5 are correlated, and people who fit them congregate.)
6. An expert or enthusiast of any kind. One might be an astronomy hacker, for example: someone who enjoys the intellectual challenge of creatively overcoming or circumventing limitations.
7. [deprecated] A malicious meddler who tries to discover sensitive information by poking around. Hence ‘password hacker’, ‘network hacker’.
It is better to be described as a hacker by others than to describe oneself that way. Hackers consider themselves to be something of an elitist group. A cracker is a person who breaks into other people’s computer systems for various reasons. The particularly antisocial cracker has a vandalistic streak: he or she deletes file stores, crashes machines and stops running processes in pursuit of a “kick”. Various media people in the ‘real world’ have used the term “hacker” to mean the same thing as “cracker”. Amongst the Internet community this conflation of the two terms is considered wholly inappropriate.12
2.2 Network File Systems
All over the world there are millions of offices, schools, colleges, universities and other environments that have networked computers. In a local area network, for example the Oxford Brookes intranet, there is a need to share files, create files and manage who has access rights to certain files. To do this you need a program that implements the desired functions. The Network File System (NFS), developed by Sun Microsystems, is a network file system capable of carrying out all of the functions mentioned. NFS is the most widely used UNIX network file system.
Operating systems (OS) in use today can decide whether a user is allowed to access a file he/she is trying to open. The system makes the decision by checking the ownership of the file, the identity of the user trying to gain access, and the access permissions that the owner has set on the file. The access permissions define who will have access to the file.
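The decision described above can be sketched in a few lines of Python. This is only a simplified illustration of a UNIX-style owner/group/other check; the function name `can_access` and the numeric user and group IDs are assumptions made for the example, and real systems add further rules (for instance, the root user normally bypasses these checks).

```python
import stat

def can_access(mode, file_uid, file_gid, user_uid, user_gids, want):
    """Simplified UNIX-style check (illustrative only).

    Select the owner, group or other permission triplet that applies
    to the user, then test the requested bit: 'r', 'w' or 'x'.
    """
    owner_bit, group_bit, other_bit = {
        "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
        "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
        "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
    }[want]
    if user_uid == file_uid:          # the user owns the file
        return bool(mode & owner_bit)
    if file_gid in user_gids:         # the user is in the file's group
        return bool(mode & group_bit)
    return bool(mode & other_bit)     # everybody else

mode = 0o640  # owner: read/write, group: read, others: nothing
print(stat.filemode(stat.S_IFREG | mode))                 # -rw-r-----
print(can_access(mode, 1000, 100, 1000, {100}, "w"))      # True  (owner)
print(can_access(mode, 1000, 100, 1001, {100}, "w"))      # False (group member)
print(can_access(mode, 1000, 100, 1002, {200}, "r"))      # False (other user)
```

A file left writable by "others" would make the last kind of check succeed for every user on the system, which is the situation the following paragraphs warn against.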
Philip Fogarty , Maaruf Ali P19
There are several reasons why people should care about protecting their files from other users. The first and most obvious reason is that the owner may consider the contents of his/her files to be private and may not wish other users to be able to read or modify them. The second main reason is to prevent another user from modifying the owner’s files, because a user who can modify those files is then in a position to obtain access to the owner’s account. For example, a malicious user can create or modify the owner’s “.rhosts” file to give any user access to the account.
“When an intruder gains access to your account/computer, the intruder’s main aim is to gain root access. The reason why an intruder wants to gain root access to the system is because he or she can then delete, modify or add new files to the system. When an intruder breaks-in and gains access to the root access (directory), his/her initial way of entering the system is from a regular user’s account. This is the method by which the majority of computer break-ins occur. By using a security hole in the operating system, an intruder can launch an attack to gain root access on a machine. This can only happen once the intruder is on the machine as a regular user”.3
2.3 Social Engineering and Phishing
Social engineering is often used to gather the information needed to carry out a successful attack on a system, since some cracking techniques rely on human weaknesses rather than on software flaws. According to US-CERT (United States Computer Emergency Readiness Team), the aim of the attacker is to “obtain or compromise information about an organization or its computer systems. An attacker may seem unassuming and respectable, possibly claiming to be a new employee, repair person, or researcher and even offering credentials to support that identity. However, by asking questions, he or she may be able to piece together enough information to infiltrate an organization's network”.4 If you fall victim to this social trickery, attackers will be able to break the target system’s security from the inside, as opposed to tampering with the software on the system.
3 Bryant, R. UNIX Security, SAMS Publishing, p. 53.
4 http://www.us-cert.gov/cas/tips/ST04-014.html Accessed March Jan 17th 2005.
Even when a system is secure if used properly, its users can subvert its security by accident - especially if the system is not designed very well. The classic example of this is the user who gives his or her password to a co-worker so they can solve some problem when he or she is out of the office. Users may not report missing smart cards for a few days, in case they are just misplaced. They may not carefully check the name on a digital certificate. They may reuse their secure passwords on other insecure systems or they may not change their software's default weak security settings.
A classic trick is to phone a mark who has the required information, posing as a field service technician or a fellow employee with an urgent access problem. Common variations are carried out over the phone, in general conversation or via Internet Relay Chat (IRC).
Another classic social engineering trick that a hacker may use is called ‘phishing’. “Phishing attacks come in the form of email or malicious websites with the sole purpose to trick a user into revealing personal information, normally of a financial nature. Attackers may send an email pretending to be a well known credit card company or financial institution that requests account information, when a user innocently responds the attackers (hackers) can use the information to gain access to the accounts”.13
On a local server or intranet, an e-mail could be sent to a user from someone who claims to be a system administrator, who requires the user’s password for some important administration work and asks the user to reply via e-mail. The hacker may forge the e-mail so that it seems to have come from a reliable source whom the user knows as a legitimate system administrator. Hackers realise that if they send such e-mails to only a few users, the chances of someone falling for the trick are minimal. However, if such cunning and devious e-mails are sent to every user on a multi-user system, the chances of a couple of users falling for the trick and replying are far higher.
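The effect of sending the same forged e-mail to many users can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that each recipient independently falls for the trick with the same small probability p; the value p = 0.01 is a made-up figure, not a measured one.

```python
def chance_of_at_least_one_reply(p, n):
    """Probability that at least one of n recipients replies, if each
    replies independently with probability p (illustrative model)."""
    return 1.0 - (1.0 - p) ** n

p = 0.01  # assumed per-user probability, for illustration only
for n in (3, 100, 1000):
    print(n, round(chance_of_at_least_one_reply(p, n), 3))
```

Under these assumptions the chance of success is small for a handful of recipients but approaches certainty once the message reaches every account on a large multi-user system.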
An older form of social engineering is guessing a user’s password. Individuals who can find out, or who already have, personal information about you can use that information to guess your password. Examples include your children’s names, a pet’s name, your birth date or your car registration number. These are all common candidates for guessing. Hackers can and will go to great lengths to guess user passwords.
This highlights the importance of password protection to IS and network managers, and emphasises that users should be made aware that passwords must remain protected at all times and never be disclosed to any other member of staff. Adopting such basic principles, and testing staff by randomly trying to obtain their passwords using social engineering methods, will hopefully minimise risks and help maintain the system, because members of staff will be more aware of the risks and will therefore be reluctant to disclose such private information.
2.4 Password Cracking
“A password cracker is any program that can decrypt passwords or otherwise disable password protection. A password cracker need not decrypt anything. In fact, most of them don't. Real encrypted passwords cannot be reverse-decrypted”.5
Password cracking is a very important topic in the world of Internet security. There are many tools in the area of password cracking, not all of which are designed to hack into sites. The term password cracker is often misunderstood; it can be defined as in the quotation above. “Password recovery” is the alternative term to use instead of “password cracking” if you are one of the good guys (not a hacker).
A password can be defined as a secret series of characters that enables a user to access a file, computer or program. On multi-user systems, each user must enter his or her password before the computer will respond to commands. The password helps ensure that unauthorised users do not access the computer, and on some systems data files and programs may also be protected by a password. “In theory, an ideal password is one that nobody could guess. However, in practice, most people choose a password that is easy to remember such as their name or their initials. This is one main reason why it is relatively easy for hackers to break into most computer systems”.6 Passwords can be cracked by Brute Force, Dictionary or Hybrid Attacks.
5 http://docs.rinet.ru/LomamVse/ch10/ch10.htm Accessed on 12th January 2005.
Dictionary Attacks
Hackers have programs such as Crack 5.0 that can guess hundreds or thousands of words per second, and with such software they can mount a “dictionary attack”. This is the most common method by which a hacker will try to obtain a password. In a dictionary attack, the attacker takes a dictionary of words and names and tries each one to see if it is the correct password. The speed of the software makes it easy for hackers to try lots of variations as well: words spelled in reverse, in lower and upper case, and with numeric values added to the end.
Brute Force Attacks
Here the hackers use a program to guess a password by going through every possible combination. For example, the program could start alphabetically at aaaaaaaa, then aaaaaaab, and so on. This takes some time, as the program would have roughly 200 billion combinations to check for an eight-character password made up of lowercase letters only.
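The figure of roughly 200 billion can be checked directly: with 26 lowercase letters and 8 positions there are 26^8 possible passwords. The short sketch below does the arithmetic; the guessing rate of 1,000 attempts per second is an assumption taken from the upper end of the "hundreds or thousands of words per second" mentioned earlier, and the helper names are invented for the example.

```python
def keyspace(alphabet_size, length):
    """Number of distinct passwords of exactly this length."""
    return alphabet_size ** length

def worst_case_years(alphabet_size, length, guesses_per_second):
    """Time to try every combination at a constant guessing rate."""
    seconds = keyspace(alphabet_size, length) / guesses_per_second
    return seconds / (365 * 24 * 3600)

print(keyspace(26, 8))                           # 208827064576
print(round(worst_case_years(26, 8, 1000), 1))   # about 6.6 years
print(keyspace(62, 8))                           # upper case, lower case and digits
```

Allowing upper-case letters and digits as well (62 symbols) multiplies the search space by about a thousand, which is why longer passwords drawn from a larger alphabet are so much more expensive to brute-force.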
Hybrid Attacks
This is simply a mixture of the above techniques. A hybrid attack might, for example, add a couple of numerical values to the end of each word in a dictionary attack. This method saves time by testing the most likely combinations without resorting to a full brute force attack. A lot of cracking programs incorporate all three methods.
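From the defender's side, the dictionary and hybrid techniques can be sketched as a small password audit that an administrator might run against test data. Everything here is an illustrative assumption: the word list, the salt, the helper names `candidates` and `audit`, and the use of SHA-256 as a stand-in for whatever hashing scheme a real system uses.

```python
import hashlib
from itertools import product

def candidates(words, max_digits=2):
    """Dictionary words plus simple variations (case changes, reversal)
    and, as the hybrid step, up to max_digits digits appended."""
    for word in words:
        for base in (word, word.upper(), word.capitalize(), word[::-1]):
            yield base
            for n in range(1, max_digits + 1):
                for digits in product("0123456789", repeat=n):
                    yield base + "".join(digits)

def audit(stored_hash, salt, words):
    """Return the guess that matches the stored hash, or None."""
    for guess in candidates(words):
        if hashlib.sha256(salt + guess.encode()).hexdigest() == stored_hash:
            return guess
    return None

salt = b"demo-salt"                                       # made-up salt
stored = hashlib.sha256(salt + b"Rover42").hexdigest()    # a weak test password
print(audit(stored, salt, ["felix", "rover"]))            # Rover42
```

A password built from a pet's name and two digits falls to a few hundred guesses per dictionary word, whereas a password that is not derived from any dictionary word is not found by this approach at all.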
Passwords are the first line of defence and are very important in the defence against interactive attacks. The advice to home users and IS/network managers is to ensure that the hacker has no chance to mount an attack: if the hacker cannot interact with a remote system and has no access to read or write the information contained in the password file, it becomes almost impossible for the hacker to mount an attack. If a hacker can at least read the password file of a remote host, then it is very important that the hacker is not able to break any of the passwords enclosed within the file. It is fair to assume that if the hacker can break a password then he or she can log on to the system and is then in a position to gain ‘root’ access via an operating system hole.
6 http://www.webopedia.com/TERM/p/password.html (Accessed on 19th December 2004)
2.4.1 Password Sniffing
“A device or program that monitors the data travelling between computers on a network” or “Passive wire tapping, usually on local network to gain knowledge of passwords.”
If a hacker is unsuccessful in guessing a password, there are alternative routes which can be used to obtain such details. One method which has become very popular is called “password sniffing”. The majority of networks use “broadcast” technology, in that every message that a computer on a network transmits can be read by any other computer on that network. In practice, all the computers except the recipient of the message will notice that the message is not meant for them and ignore it. However, many computers can be programmed to look at every message on the network, and a computer configured this way can read messages that are not intended for it. Addressing this weakness is beneficial for IS and network managers: if messages are directed and routed only to their correct destination addresses, they will be read only by the intended recipients, and network congestion and traffic should be reduced as well.
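The behaviour of a broadcast network can be illustrated with a small simulation. No real network traffic is involved; the classes `Frame` and `Interface`, the addresses and the payload are all invented for the sketch. Each interface normally keeps only frames addressed to it, while an interface in so-called promiscuous mode keeps everything.

```python
from dataclasses import dataclass

BROADCAST = "ff:ff:ff:ff:ff:ff"

@dataclass
class Frame:
    dst: str
    payload: str

class Interface:
    def __init__(self, mac, promiscuous=False):
        self.mac = mac
        self.promiscuous = promiscuous
        self.received = []

    def on_wire(self, frame):
        # A normal interface ignores frames meant for someone else;
        # a promiscuous one keeps every frame it sees.
        if self.promiscuous or frame.dst in (self.mac, BROADCAST):
            self.received.append(frame.payload)

def shared_medium(interfaces, frame):
    """On a broadcast network every attached interface sees the frame."""
    for nic in interfaces:
        nic.on_wire(frame)

server = Interface("00:00:00:00:00:01")
bystander = Interface("00:00:00:00:00:02")
sniffer = Interface("00:00:00:00:00:03", promiscuous=True)

shared_medium([server, bystander, sniffer],
              Frame(dst=server.mac, payload="USER alice PASS secret"))

print(server.received)     # ['USER alice PASS secret']
print(bystander.received)  # []
print(sniffer.received)    # ['USER alice PASS secret']
```

The sniffer captures the login even though the frame was addressed to the server, which is why passwords sent in clear text over a shared network segment are exposed.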
If a user logs in to a computer across a network and another computer on that network has been compromised in this way, the user may unwittingly give his/her password to the attacker. This is a serious threat to users who log in from remote sites. If someone logs in on the console of a computer, their password never crosses a network where it can be sniffed. But if someone logs in from some other network or from an Internet service provider, the user is dependent on the security of these networks. In recent times Trojan programs have taken password sniffing to another level, with the ability to sniff a password as a user types it on the keyboard. There are also devices available that can monitor the typing of keyboards. Below are some password sniffing programs:
TCPDump (the most used network sniffer/analyzer for UNIX)
See http://www.phaster.com/find_info_net_traffic.html [Accessed March 17th 2005] for a full list of similar programs and in-depth information on packet and password sniffing.
2.5 IP Spoofing
(I-P Spoofing) (n.) “A technique used to gain unauthorized access to computers, whereby the intruder sends messages to a computer with an IP address indicating that the message is coming from a trusted host”.7
Some examples of attacks that use spoofing techniques include:
- “So called man-in-the-middle attacks, in which a hacker intercepts and captures traffic between two communicating systems;
- Session hijacking attacks, in which a hacker is able to hijack or take over an active session between two communicating systems;
- Source routing attacks, in which a hacker spoofs an IP address and sets source route options in IP packets to route packets in and out of a network (the attacker’s system);
- Denial-of-service attacks, which utilize IP spoofing to ensure that packet responses flood a target network, as opposed to the originating system (the attacker’s system);
- Trust relationship exploitation, in which a hacker is able to spoof a source IP address to circumvent IP-based operating system or application access controls.”8
An even simpler method for spoofing a client is to wait until the client system is turned off and then impersonate the client’s system. In many organisations, staff members use personal computers and TCP/IP network software to connect to a local area network server. The personal computers often use the Network File System (NFS) to obtain access to server directories and files (NFS uses IP addresses only to authenticate clients). An attacker could, after hours, configure a personal computer with the same name and IP address as another, and then initiate connections to the UNIX host as if it were the ‘real’ client. This is very simple to accomplish and would most likely be an inside attack, probably from a bitter and angry employee.9
7 http://www.webopedia.com/TERM/I/IP_spoofing.html Accessed on 18th December 2004.
8 Young, S., Aitel, D., The Hacker’s Handbook: The Strategy behind Breaking into and Defending Networks, CRC Press, London, 2004.
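The weakness of authenticating clients by IP address alone can be shown with a minimal sketch. It contrasts an address-only check with a check that also requires proof of a shared secret; the addresses (taken from a documentation range), the key and the function names are assumptions made for the example and do not describe how NFS itself is implemented.

```python
import hashlib
import hmac

TRUSTED_CLIENTS = {"192.0.2.10"}              # address-only trust list
CLIENT_KEYS = {"192.0.2.10": b"shared-key"}   # hypothetical per-client secret

def address_only_auth(claimed_ip):
    """Trust anything that claims a trusted source address."""
    return claimed_ip in TRUSTED_CLIENTS

def keyed_auth(claimed_ip, message, tag):
    """Additionally require a message tag computed with the client's key."""
    key = CLIENT_KEYS.get(claimed_ip)
    if key is None:
        return False
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

request = b"mount /home"
genuine_tag = hmac.new(b"shared-key", request, hashlib.sha256).digest()
forged_tag = hmac.new(b"guessed-key", request, hashlib.sha256).digest()

# An impostor configured with the trusted client's address:
print(address_only_auth("192.0.2.10"))                 # True, impostor accepted
print(keyed_auth("192.0.2.10", request, forged_tag))   # False, impostor rejected
print(keyed_auth("192.0.2.10", request, genuine_tag))  # True, real client accepted
```

The address-only check cannot tell the real client from a machine that has simply been given the same address, which is exactly the after-hours scenario described above.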
Electronic mail on the Internet is easy to spoof and, without enhancements such as digital signatures, generally cannot be trusted. Consider the process that takes place when Internet hosts exchange mail (when people send and receive e-mail). This exchange takes place using a simple protocol consisting of ASCII-character commands. An intruder could easily enter these commands by hand, using TELNET to connect directly to a system’s Simple Mail Transfer Protocol (SMTP) port. SMTP is a protocol for sending e-mail messages between servers. Most e-mail systems that send mail over the Internet use SMTP to send messages from one server to another; the messages can then be retrieved with an e-mail client using either POP or IMAP.
In addition, SMTP is generally used to send messages from a mail client to a mail server. This is why you need to specify both the POP or IMAP server and the SMTP server when you configure your e-mail application.10 The receiving host trusts that the sending host is who it says it is, thus the origin of the e-mail can be spoofed easily by entering a sender address that is different from the true address. As a result, any user, without privileges, can falsify or spoof e-mail.
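Because the sender address shown to the user is supplied by the sending side, one simple sanity check a receiving system can make is to compare the address given during the SMTP exchange (the envelope sender) with the address in the From header. The sketch below uses only Python's standard `email` package; the function name and the example addresses are hypothetical, and a mismatch is only a hint, since legitimate mailing lists and forwarding services also produce mismatches.

```python
from email.message import EmailMessage
from email.utils import parseaddr

def header_matches_envelope(envelope_sender, msg):
    """Compare the domain of the From header with the envelope sender's domain."""
    header_addr = parseaddr(str(msg["From"]))[1]
    header_domain = header_addr.rsplit("@", 1)[-1].lower()
    envelope_domain = envelope_sender.rsplit("@", 1)[-1].lower()
    return header_domain == envelope_domain

msg = EmailMessage()
msg["From"] = "System Administrator <admin@example.org>"  # claimed sender
msg["To"] = "user@example.org"
msg["Subject"] = "Password required"
msg.set_content("Please reply with your password.")

print(header_matches_envelope("admin@example.org", msg))          # True
print(header_matches_envelope("someone@elsewhere.example", msg))  # False
```

Nothing in the basic protocol stops a sender from writing any address into the From header, so checks like this, and stronger ones such as digital signatures, have to be added on the receiving side.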
2.6 Viruses, Malicious code and Spam
Viruses are the most publicised form of attack. A computer virus is able to replicate itself and spread to other computers. It is possible to get a virus by e-mail (e.g. the “I love you” and Melissa e-mails), by downloading infected software, or by loading an infected floppy, zip disk or CD on to a computer. Malicious code is generally a program that an attacker has written with the intention of causing mischief or damage to a victim or victims. In broad terms, malicious code covers all viruses, worms, Trojan horses, adware and spyware. The worst kind of virus can wipe your hard drive or cause unfixable damage, forcing you to replace your hard drive. This is why it is always very important to keep a backup of all data on some sort of medium.
9 http://www.deter.com/unix/papers/cert_ip_spoof.txt
10 http://www.deter.com/unix/papers/cert_ip_spoof.txt
Spam, along with adware, is one of the most annoying aspects of computer use. Spam is the electronic form of junk mail. According to The Spamcon Foundation News at http://spamcon.org/ on 4th April 2004: “U.S. businesses lost about US$4 billion in productivity last year because of spam, and those losses could mount without an intervening technology or policy to curb unwanted messages. But, on a macabre note, the anti-spam market has a bright future and could eclipse $500 million annually. Anti-spam service provider Brightmail reported spam reached the highest levels ever in March.”
2.7 Trojan Horses
[Image]11
Trojan (n.) “A computer program that appears to be useful but that actually does damage.”
The Trojan horse was originally famous from a mythological legend. The Greeks won the Trojan War by hiding in a huge, hollow wooden horse to sneak into the fortified city of Troy. In the modern computer age, a Trojan horse is a security-breaking program that hides itself inside a program pretending to be some useful software.
“A Trojan horse is defined as a malicious, security-breaking program that is disguised as something benign. For example, you download what appears to be a movie or music file, but when you click on it, you unleash a dangerous program that erases your disk, sends your credit card numbers and passwords to a stranger, or lets that stranger hijack your computer to commit illegal denial of service attacks. When the victim runs the apparently benign program they also trigger and run the hidden Trojan horse program as well. Trojan horses are destructive programs that masquerade as benign applications. Unlike viruses, Trojan horses do not replicate themselves but they can be just as destructive. One of the most insidious types of Trojan horse is a program that claims to rid a user’s computer of viruses but instead does the exact opposite and infects and introduces viruses onto the machine”.12
11 Image taken from <http://www.microsoft.com/athome/security/viruses/virus101.mspx> (Accessed on 25th March 2005)
There are examples of UNIX Trojan horse programs on the Internet. There have been incidents of hackers breaking into an FTP archive and modifying a popular program available from the site, allowing them to break into computers that subsequently downloaded and installed the program. Examples like this are of great importance to this report: IS and network managers should be in a position to monitor their employees’ use of computers and the Internet, so that employees do not install programs they have no permission or access rights to install. Such examples also make the average home user more aware and conscious of the sites they access and the content they download. Preventing such attacks provides a barrier against potential attackers, who will find it hard to penetrate computer systems through modified and manipulated programs.
“It can be argued that if a technically competent hacker wishes to get into your system, then he or she will be able to do so, regardless of assistance from analyser and exploit programs. This can be illustrated by considering the use of the Back Orifice 2000 (BO2K) tool, which provides a means for remote administration of a target system (Back Orifice is chosen here as it has been around for some time and, therefore, to show the process of attack is revealing nothing new. In addition, most security conscious organisations should already be protected against it via standard anti-virus software). Back Orifice 2000 is a tool consisting of two main elements, a client application and a server application. The client, running on one machine, can be used to monitor and control a second machine running the server. The use of BO2K, therefore, requires that the server program be installed onto each target machine. This could be explicitly installed by an administrator wishing to conduct remote administration duties, but it will more typically arrive as a Trojan horse, attached to an e-mail message or similar, and rely on installation by an unwary end-user. Then, anyone with the other half of the BO2K software (the administrator tool) can control the victim's PC from anywhere on the Internet. The remote user can stealthily do anything to the victim's machine that the victim could do locally. Some of the operations that can be performed remotely include:
12 http://www.irchelp.org/irchelp/security/trojan.html Accessed on 25th March 2005.
- Execute any application on the target machine;
- Log keystrokes from the target machine;
- Restart the target machine;
- Lock up the target machine;
- View the contents of any file on the target machine;
- Transfer files to and from the target machine;
- Display the screen saver password of the current user of the target machine.
Whilst all of the above features could conceivably be of use to an IS/network manager or system administrator wishing to remotely monitor and control a machine within his/her network, it can also be seen that the facilities would represent a significant security risk if placed in the wrong hands. The clear problem in the case of BO2K is that it can be distributed, using stealth methods, by someone other than the legitimate administrator”.13
2.8 Worms
A worm is an autonomous agent capable of propagating itself without the use of another program or any action by a person. It will most commonly perform malicious actions, such as using up the computer’s resources and possibly shutting the system down. A recent and famous example was ‘The Sapphire Worm’ (also known as Slammer), which struck in 2003. This was the fastest computer worm in history. According to www.cnn.com, “Within 10 minutes of the first infection, Slammer had reached 90 percent of the world's vulnerable hosts, doubling in size every 8.5 seconds, according to computer scientists at CAIDA, the Cooperative Association for Internet Data Analysis, and other research groups. It caused network failures, canceled airline flights, interrupted elections, and crashed ATMs and it could have been much worse.
13 FURNELL, S., CHILIARCHAKI, P., DOWLAND, P.S., Security analysers: administrator assistants or hacker helpers?, Information Management and Computer Security, Volume 9, No. 2, pp. 95-101, Emerald Group Publishing Limited, 2001.
The researchers reported at the time: "It is important to realize that if the worm had carried a malicious payload, had attacked a more widespread vulnerability, or had targeted a more popular service, the effects would likely have been far more severe."
"There is no conceivable way for system administrators to respond to threats of this speed."
The worm took advantage of a vulnerability in some Microsoft Corp. software that had been discovered in July. Microsoft had made software updates available to patch the vulnerability in its SQL Server 2000 software - used mostly by businesses and governments - but many system administrators had yet to install them when the attack hit. As the worm infected one computer, it was programmed to seek other victims by sending out thousands of probes a second, saturating many Internet data pipelines.
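The quoted doubling time of 8.5 seconds gives a feel for why no administrator could react in time. The sketch below computes how long pure doubling takes to grow from a single infected machine to a given number of hosts. The target of 75,000 hosts is an assumed figure used only for illustration (it does not come from the text above), and unchecked doubling only describes the early phase of an outbreak, before the supply of vulnerable machines and network capacity begin to limit the spread.

```python
import math

def seconds_to_reach(target_hosts, doubling_time_s=8.5, initial_hosts=1):
    """Time for an infection that doubles every doubling_time_s seconds
    to grow from initial_hosts to target_hosts (idealised early phase)."""
    return doubling_time_s * math.log2(target_hosts / initial_hosts)

print(seconds_to_reach(2))              # 8.5
print(round(seconds_to_reach(75_000)))  # roughly 138 seconds
```

Under these assumptions the worm needs only a little over two minutes to go from one machine to tens of thousands, far less time than it takes to notice a problem, let alone apply a patch.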
Unlike most viruses and worms, it spread directly through network connections and did not need e-mail as a carrier. Thus, only network administrators who run the servers, not end users, could generally do anything to remedy the situation. However, many machines may have been overlooked in the repairs because they run related programs, Microsoft Desktop Engine or Data Engine, that reside on individual desktops or laptops.
"While the weekend focus was on servers, now the problems persist in desktop machines," said Russ Cooper, a security analyst at TruSecure Corp.
He said users can get rid of the worm by simply turning off the machine, but he suggests users then contact their network managers to prevent getting it again.
Worm attacks are rare, but they remain a method used by hackers when a new bug is found in an operating system (OS). This has the “advantage” of making it possible to hack a lot of sites in little time. These examples highlight why network/IS managers need to keep computers up to date to improve system maintainability: security experts are concerned that too many system managers are only fixing problems as they occur, rather than keeping their defences up to date. The Sapphire worm is of course an extreme case, but it nevertheless gives clear evidence that Internet security is a very important issue in the modern computer-dependent world. Network and IS managers need to be aware of security updates and to implement them. Two previous major outbreaks, Code Red and Nimda, exploited known problems for which fixes were available; if remedies are not readily available, the resulting problems will be harder to detect and fix.
2.9 Trap Door
A trap door, or back door, is an entry point into a computer system that bypasses the normal security measures. It is a mechanism hidden in software or hardware that permits system protection mechanisms to be circumvented, leaving the system vulnerable to attack. It is activated in some non-apparent manner.
It can be a hole in the security of a system deliberately left in place by designers or maintainers. The motivation for such holes is not always sinister; some operating systems, for example, come out of the box with privileged accounts intended for use by field service technicians or the vendor’s maintenance programmers.
On 19 July 2001, the White House narrowly averted a terrorist attack when security personnel were able to exploit a flaw in a bomb’s trigger mechanism and evacuate key personnel to a remote location, causing the bomb to fizzle. The attack was a denial-of-service attack, the target was the White House web site, and the flaw was in malicious code. The Code Red worm was programmed to flood the www.whitehouse.gov web site in a massively coordinated distributed denial-of-service attack at 8.00pm on July 19; however, the attack failed because of programming errors in the worm. Firstly, the attack was against a specific IP address and not a URL, which meant that whitehouse.gov could move from one IP address to another to avoid the attack. Secondly, the worm was programmed to check for a valid connection before flooding its target. With whitehouse.gov at a different IP address, there was no valid connection, which resulted in no flooding. The worm was programmed to continue to spread until July 20, and to try to attack the former IP address of whitehouse.gov until July 28, as there was a back door which was left open by the hackers. Unless a network’s IDS signatures were updated, firewalls would not have caught this problem because they would not know where the problem started from.
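The two programming errors can be illustrated with a toy model. It is not based on the worm's actual code: the host name, the addresses (from a documentation range) and the function name are all invented. The point is only that a target fixed as an IP address is never re-resolved, so moving the site to a new address leaves the attack pointed at an address where nothing answers.

```python
def worm_floods(target_ip, reachable):
    """Mimic the connection check described above: flood only if the
    hard-coded address still answers."""
    return target_ip in reachable

# Hypothetical name and addresses, not the real ones.
dns = {"www.example.gov": "198.51.100.7"}
reachable = {"198.51.100.7"}
HARD_CODED_TARGET = "198.51.100.7"   # fixed inside the worm, never re-resolved

before = worm_floods(HARD_CODED_TARGET, reachable)

# Defenders move the site: the name now resolves to a new address and
# the old address stops answering.
reachable.discard(dns["www.example.gov"])
dns["www.example.gov"] = "198.51.100.99"
reachable.add(dns["www.example.gov"])

after = worm_floods(HARD_CODED_TARGET, reachable)
visitors_ok = dns["www.example.gov"] in reachable

print(before, after, visitors_ok)  # True False True
```

Legitimate visitors, who look the site up by name, follow it to the new address, while the hard-coded attack finds no valid connection and never starts flooding.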
Conclusions of Section Two
Other attack methods also exist, but they are too technical for non-advanced UNIX users. Here is a short list of them:
- Sendmail attack: attack via the mail system on port (25);
- NIS and NFS attack;
- FTP attack: attack via the ftp port (21);
- Telnet attack: attack via the telnet port (23);
- Rlogin and rsh attack.
The discussion in this chapter has really only scratched the surface of the tools and techniques that hackers are able to employ. Nonetheless, it is clear that the various activities can represent a major problem for both IS/network managers and the general users who wish to use and access the system, with the effects of incidents like web defacement and denial of service being felt by more than just the people within the target organisation. Methods of attack can take many forms and can be utilised by hackers of varying levels of abilities with significant impacts as a result. This highlights the threats that hackers bring to information systems such that, the problem posed by hackers is certainly not one that can be dismissed lightly.
If the illicit activities of hackers are to be stopped then a major element of responsibility falls upon the IS and network managers (administrators). The discussion has demonstrated that they often face a significant problem due to the sheer volume of security patches and advisories that need to be heeded in order to keep a system secure. While there are a variety of tools that administrators can use to assist them in assessing their protection, these are equally available to the hacker community and can be deployed to the detriment of security. From this perspective, the maintenance of security and the prevention of cybercrime can be regarded as an ongoing battle, in which each side strives to regain the advantage. Although specific vulnerabilities have been addressed, the underlying problem of hacking will not disappear, and so the only solution that can be given to IS and network managers is constant vigilance.
Although significant discussion has been devoted to the issue, it is important to note that hacking is only one face of the cybercrime problem. The next section will focus on ways of securing systems and on methods that could help counteract the threat of hackers to computer-based information systems.
3. PROTECTION AGAINST THE HACKER
This section focuses on methods that can be used to improve the security of the Internet and of systems, along with a description of firewalls and of encryption methods and technologies. The effectiveness and benefits of these methods are also described.
3.1 Internet Firewalls
Firewalls provide protection against the risks associated with the rapid growth of internetworking and the commercialisation of the Internet. Many people have heard of firewalls and some use them. However, the number of security incidents arising from Internet connections strongly suggests that not enough of the people who should have them are using them.
3.2 What Is a Firewall?
“A computer system that isolates another computer from the internet in order to prevent unauthorized access.” Collins English Dictionary, Harper Collins 5th edition 2000.
A firewall is a system designed to prevent unauthorised access to or from a private network. Firewalls can be implemented in hardware, in software, or in a combination of both. Firewalls are frequently used to prevent unauthorised Internet users from accessing private networks connected to the Internet, especially intranets. All messages entering or leaving the intranet pass through the firewall, which examines each message and blocks those that do not meet the specified security criteria.
There are several types of firewall techniques and in practice many firewalls use two or more of these techniques:
Packet filter: Looks at each packet entering or leaving the network and either accepts or rejects it based on user-defined rules. Packet filtering is fairly effective and transparent to users, but it is difficult to configure. Its major downside is that it is susceptible to IP spoofing.
Application gateway: Applies security mechanisms to specific applications, such as FTP and Telnet servers. This is very effective, but can degrade performance.
Proxy server: Intercepts all messages entering and leaving the network. The proxy server effectively hides the true network addresses.
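The packet filter described above can be sketched as a first-match rule table. This is only an illustration under stated assumptions: the `Rule` class, the `filter_packet` function and the example rule set are hypothetical names and values, not taken from any real firewall product. Note that the decision relies on the source address carried in the packet, which is why this technique is susceptible to IP spoofing.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str               # "accept" or "reject"
    src: str                  # source network in CIDR notation
    dst_port: Optional[int]   # destination port, or None for any port


# Hypothetical user-defined rules; the first matching rule wins.
RULES = [
    Rule("accept", "192.168.1.0/24", 25),  # internal hosts may reach the mail service
    Rule("reject", "0.0.0.0/0", 23),       # block Telnet from anywhere
    Rule("accept", "0.0.0.0/0", 80),       # allow web traffic from anywhere
]


def filter_packet(src_ip: str, dst_port: int, rules=RULES, default: str = "reject") -> str:
    """Return the action of the first rule matching the packet, else the default."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        in_network = addr in ipaddress.ip_network(rule.src)
        port_matches = rule.dst_port is None or rule.dst_port == dst_port
        if in_network and port_matches:
            return rule.action
    return default
```

Rejecting by default means that any traffic not explicitly permitted is blocked, which matches the stricter style of firewall policy.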
Some firewalls permit only e-mail traffic through them, thereby protecting the network against any attacks other than attacks against the e-mail service. Other firewalls provide less strict protection and block only those services that are known to be problematic.14
Firewalls are also important because they provide a single “choke point” where security and audit can be imposed. Unlike a situation in which a computer system is being attacked by someone dialling in with a modem, the firewall can act as an effective “phone tap” and tracing tool. This is very beneficial for IS and network managers or administrators, because they will be in a position to monitor and locate the problem and take immediate action to remedy the situation.
3.3 What Can a Firewall Not Do?
To clear up any misunderstandings: a firewall cannot protect against attacks that do not pass through it. Firewall policies must be realistic and reflect the level of security in the entire
14 http://www.deter.com/unix/papers/firewall_ranum.ps.gz Accessed on 26 January, 2005.
network, whatever the required specification is. For example, an organisation holding top secret or classified data does not need a firewall at all, because it should not be connected to the Internet in the first place; at the very least, the systems holding the really sensitive data should be kept isolated from the rest of the corporate network.
Firewalls cannot protect very well against viruses: there are several hundred (if not several thousand) ways of encoding binary files for transfer over networks, too many for a firewall to decode and scan them all. In general, a firewall cannot protect against a data-driven attack (an attack in which something is mailed or copied to an internal host, where it is then executed). Firewalls are also useless against inside attacks, which could come from a legitimate user inside an organisation, perhaps an angry employee.
3.4 Firewall Conclusion
There is a range of firewall types. Most are a piece of software installed on the company's router or on another host. Hardware firewalls also exist, in the form of an electronic board plugged inside the computer. Firewalls can play different roles: some act as packet-filtering routers, others as dual-homed gateways. There is also a wide range of firewalls for each operating system (UNIX, Novell NetWare, Windows NT, Linux), but in every case the main concern of the firewall is to enhance the security around the servers.
Today’s firewalls provide good resistance against hackers and are a recommended piece of software and hardware that all IS and network managers (administrators) need to consider as part of their organisation's defence, so that they improve and maintain their systems. However, a firewall that is not installed properly can be worse than having none: it creates a false sense of security while leaving gaps that potential hackers can exploit, which is bad for any company whose systems carry data transactions.
3.5 Password Protection
Password protection is one of the main problems of Internet security, and there are two major methods of improving it: shadow passwords and the generation of “secure” passwords. A shadow password system moves the encrypted passwords out of the world-readable password file into a separate file that is hidden from all users except root, hopefully stopping password-cracking attempts at the source. It gives the password file a good degree of robustness and should be utilised by IS and network managers so that their systems are more secure and less prone to hackers gaining unauthorised access.
3.6 Encryption
Encryption originates from the Egyptians, around 2000 B.C. Since then it has developed into a cryptographic technology with which computer users can send messages that are understood (decrypted) only by the intended recipient. It complements other measures, such as improving controls on the routing of messages over the Internet or an intranet and improving operating system quality to decrease program flaws and other security vulnerabilities. There are two main types of encryption method: asymmetric encryption (also called public-key encryption) and symmetric encryption.
3.7 Symmetric Encryption
Symmetric encryption is a type of encryption in which the same key is used to encrypt and decrypt the message. DES (Data Encryption Standard) is the most famous form of symmetric encryption. It is currently used by US administrations to send data through a network. However, they use public-key encryption to send the DES key to the recipient of the encrypted file.
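The defining property of symmetric encryption, that one shared key both encrypts and decrypts, can be illustrated with a toy cipher. The sketch below is not DES and must not be used to protect real data; it simply XORs the message with a keystream derived from the key using SHA-256. The function names `keystream` and `xor_crypt` are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the same keystream undoes itself."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


shared_key = b"example shared secret"            # hypothetical key known to both parties
ciphertext = xor_crypt(shared_key, b"meet at noon")
plaintext = xor_crypt(shared_key, ciphertext)    # same function, same key
```

Because both parties need the same key, the key itself has to be delivered securely, which is why the text notes that public-key encryption is used to send it.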
3.8 Pretty Good Privacy (PGP)
PGP is a program for encrypting messages, developed by Philip Zimmermann. It is one of the most common ways to protect messages on the Internet because it is effective, easy to use and free.
PGP is based on the public-key method, which uses two keys: one is a public key that you disseminate to anyone from whom you want to receive a message; the other is a private key that you use to decrypt the messages you receive.
The areas of security concern highlighted here, which are discussed in the following section, are:
- Firewalls;
- Encryption methods;
- Password protection;
- E-mail security;
- IP spoofing and restriction; and
- PGP.
4. FURTHER WAYS TO SECURE AND SOLUTIONS
This section recommends solutions to the problems raised in the discussion above, along with methods that IS and network managers or system administrators can employ to secure their ‘insecure’ systems. Examples are given to justify the points raised wherever appropriate.
4.1 Security through Obscurity
Security through obscurity is the view that any system can be secure so long as nobody outside its implementation group is allowed to find out anything about its internal mechanisms. A typical example is hiding account passwords in binary files or scripts on the presumption that “nobody will ever find it”.
This is a philosophy favoured by many bureaucratic US agencies. The main criticism of this technique is that it is pseudo-security: it does not solve the real security problems but merely hides them. It can also tie the manager into trusting a small group of people for as long as they live. If the employees get an offer of better pay from somewhere else, the knowledge goes with them, whether the knowledge is replaceable or not. Once the secret gets out, that is
the end of the security. However, this technique can complement other security steps.
4.2 Secure Passwords
First, it is interesting to see how many possible passwords there are. Many people worry that programs like “Crack” will eventually grow in power until they can do a completely exhaustive search of all possible passwords in order to break into a specific user’s account, usually root.
Valid passwords are created from a set of 62 alphanumeric characters [A-Z, a-z, 0-9], to which additional characters such as [#’/$%^&] can be added. Suppose a password is between 5 and 8 characters long. With only the 62 alphanumeric characters, the size of the set of all valid passwords is given by the sum below:
$62^5 + 62^6 + 62^7 + 62^8 \approx 2.2 \times 10^{14}$
A search space this large makes an exhaustive search slow with current technologies and makes life difficult for a hacker trying to uncover the password. Moreover, if some of the 95 non-control characters can be used in a password, the search space that a hacker/cracker must cover increases even further.
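The sum above, and the effect of widening the character set, can be checked with a few lines of Python. This is a minimal sketch; the function name `search_space` is an assumption made for illustration, and the 5 to 8 character length range is the one used in the text.

```python
def search_space(alphabet_size: int, min_len: int, max_len: int) -> int:
    """Number of distinct passwords with lengths from min_len to max_len inclusive."""
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))


alphanumeric = search_space(62, 5, 8)   # [A-Z, a-z, 0-9]
printable = search_space(95, 5, 8)      # all 95 non-control ASCII characters

print(f"62 characters: {alphanumeric:.1e} passwords")
print(f"95 characters: {printable:.1e} passwords")
print(f"growth factor: {printable / alphanumeric:.0f}x")
```

The alphanumeric total is roughly 2.2 × 10^14, in agreement with the figure quoted above.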
Any password derived from any dictionary word (or personal information), modified in any way, constitutes a potentially guessable password.
For example, a password based on:
- Login name: ee45212
- First name: Sean
- Last name: Murphy
- Backward words: htims, 21254ss, retupmoc
- Dictionary words: computer
- Capitalised words: Computer, CoMpuTer
- Words from a cracking dictionary: PORSCHE 911, 12345678, qwerty, abcxyz, mr.spoke
- Foreign language words: bonjour, gutentag
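The kinds of derivation listed above are easy to automate, which is what makes such passwords guessable. The following is a simplified sketch of the idea, not the algorithm of any particular cracking tool: it covers only case changes and reversal, and the function names and the small word list are assumptions made for illustration.

```python
def candidate_guesses(words):
    """Lower-cased forms of each word and of its reversal."""
    guesses = set()
    for word in words:
        lower = word.lower()
        guesses.add(lower)
        guesses.add(lower[::-1])   # backward words, e.g. "retupmoc"
    return guesses


def is_guessable(password, login, first_name, last_name, dictionary):
    """True if the password is a case variant or reversal of a known word."""
    words = [login, first_name, last_name, *dictionary]
    # Comparing in lower case catches capitalised variants such as "CoMpuTer".
    return password.lower() in candidate_guesses(words)


# Hypothetical cracking dictionary.
WORDS = ["computer", "qwerty", "12345678", "bonjour"]
```

For instance, `is_guessable("retupmoc", "ee45212", "Sean", "Murphy", WORDS)` is true because the password is simply "computer" reversed.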
4.3 Security Auditing Tools
A number of different tools are available through the Internet to check the security of a system. Some tools scan hosts for known vulnerabilities, SATAN being the most famous; other tools, such as Tripwire, check file integrity. IS or network managers are strongly advised to use these tools to maintain their systems and secure them against the threats that hackers pose.
PGP is such an effective encryption tool that the US government actually brought a lawsuit against Zimmermann for putting it in the public domain and hence making it available to enemies of the U.S. After a public outcry the lawsuit was dropped; however, it is still illegal to use PGP in many other countries. IS and network managers need to adopt such techniques to maintain and improve their computer based information systems, because encryption is one of the most effective methods of achieving data security and it ensures data integrity and confidentiality.
4.4 IP Restriction
IP restriction is very common and allows IS and network managers/administrators to limit a user to parts of the server. IS and network managers are strongly advised to employ some form of IP restriction within their network configuration: by allowing only a few IP addresses to reach specific parts of the server, they deny a hacker access to areas where s/he could be a nuisance and cause damage to the system.
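A minimal sketch of IP restriction is an allow-list that maps each part of the server to the networks permitted to reach it. The area names, address ranges and the function `may_access` below are hypothetical; a real server would express the same idea in its own configuration syntax.

```python
import ipaddress

# Hypothetical mapping from server area to the networks allowed to reach it.
ALLOWED = {
    "/admin": ["10.0.5.0/28"],     # a handful of administrator workstations
    "/reports": ["10.0.0.0/16"],   # the whole internal network
}


def may_access(client_ip: str, area: str, allowed=ALLOWED) -> bool:
    """True only if the client address lies in a network listed for the area."""
    addr = ipaddress.ip_address(client_ip)
    networks = allowed.get(area, [])   # unknown areas are denied by default
    return any(addr in ipaddress.ip_network(net) for net in networks)
```

As with packet filtering, the check trusts the client's claimed address, so it is best combined with other safeguards rather than relied on alone.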
4.5 Firewall Security
The typical firewall is an inexpensive micro-based UNIX box kept clean of critical data, with a cluster of modems and public network ports on it but just one carefully watched connection back to the rest of the internal network. The special precautions may include threat monitoring and call-back, both of which relate to security and risk.
Generally, firewalls are configured to protect against unauthorised interactive logins from the “outside” world. This, more than anything, helps prevent vandals/hackers from logging into machines on the internal network. More elaborate firewalls block traffic from the outside to the inside, but permit users on the inside to communicate freely with the outside or external network(s). These
reasons help to justify why IS and network managers should incorporate such technologies into their systems: firewalls act as a first line of defence, protecting internal networks from the wider threats of the external network, and they help protect computer based information systems against many types of network-borne attack.
4.6 Asymmetric or Public Key Encryption
This is a cryptographic system that uses two keys: a public key known to all recognised users and a private or secret key that is only known to the recipient of the message.
Example: When Jack wants to send a secure message to Jill, he uses Jill’s public key to encrypt the message. Jill then uses her private key to decrypt the message.
An important element of the public key system is that the public and private keys are related in such a way that only the public key can be used to encrypt messages and only the corresponding private key can be used to decrypt them. Moreover, it is virtually impossible to deduce the private key if you know the public key.
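The relationship between the two keys can be illustrated with textbook RSA using deliberately tiny primes. This is a sketch for intuition only: the primes and exponent are illustrative assumptions, real keys use primes hundreds of digits long, and practical systems add padding, so the code below must not be used for actual security.

```python
# Toy key generation (requires Python 3.8+ for the modular inverse via pow).
p, q = 61, 53                 # secret primes, far too small for real use
n = p * q                     # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent: e * d = 1 (mod phi)

public_key = (e, n)           # Jill publishes this
private_key = (d, n)          # Jill keeps this secret


def encrypt(message: int, key) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(cipher: int, key) -> int:
    exponent, modulus = key
    return pow(cipher, exponent, modulus)


# Jack encrypts with Jill's public key; only Jill's private key recovers it.
cipher = encrypt(65, public_key)
recovered = decrypt(cipher, private_key)
```

Deducing `d` from the public key requires factoring `n` into `p` and `q`, which is trivial here but infeasible for keys of realistic size.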
Public key systems, such as Pretty Good Privacy (PGP), are becoming popular for transmitting information via the Internet. The RSA system, used for securing payments through the World Wide Web, is now a standard. These systems are extremely secure and relatively simple to use. The only difficulty with public-key systems is that users need to know the recipient’s public key to encrypt a message for him or her. However, when paying through the WWW, the browser manages this task by itself, asking the remote server for its public key.
4.7 E-mail Security
Baltimore Technologies Secure Messaging Solution will allow IS and network managers to seamlessly incorporate PKI-based electronic security into existing e-mail applications (such as Microsoft Outlook and Lotus Notes) within their companies and businesses. This solution will allow IS and network managers (or administrators) to quickly and easily assign unique digital IDs to every e-mail user, and will enable individual users to digitally sign mail (ensuring that messages cannot be tampered with and that content is inextricably linked to individual users). The award-winning UniCERT PKI provides the foundation for such techniques, which will hinder an attack via e-mail. By utilising Baltimore’s Secure Messaging Solution, IS and network managers will find that it is one of the strongest forms of protection for e-mail messages, providing both security and non-repudiation.
The Secure Messaging Solution addresses the requirements of IS and network managers or system administrators as well as those of end users. End users who want to ensure message integrity can attach a digital signature to a message and/or encrypt it. A digital signature allows the message recipient to verify the originator. Mail programs can also send encrypted messages, preventing anyone but the intended recipient from reading the message in transit.
4.8 Education and Awareness
One of the major threats to the security of a system is the lack of awareness of users. “Lack of awareness” refers to how users of the system and the Internet are under the impression that the only way a hacker can break into their account or the system is through some secret “back door” left open by careless IS or network administrators. Another misunderstanding is the belief that if there is nothing of value in a user’s account, it is hardly worth the effort of someone trying to break in or gain unauthorised access. In fact, access to a single account can allow the intruder to gain root access via a hole in the operating system, or can be used as a gateway to hack other sites.
The user is then responsible for following these codes of conduct:
“A good step is to take strict measures to make users aware of the importance of their password by encouraging them to change their passwords either after the first login or randomly after a set period of time, for example, using grace login techniques, and do not revert to using the previous password over again.
Not disclosing or sharing their account with any other user of the system because they will be able to assume all of your access rights and privileges on the system (alternate and more secure methods of securing resources exist).
Protecting their password and take extra caution when typing in their password to log on to a machine and choose a password that can not be found in a dictionary - password cracking tools have in-built dictionaries and can match these within seconds.
Changing their password regularly and not writing them down so they are not forgotten. The result is that passwords can be found jotted down on post-it notes on monitors, underneath keyboards and in desk drawers, which can be found if someone is motivated to look.
Passwords also require changing after logging into the account from a remote machine.
Choose a secure password(s). Users should not assume that they are safe because they have some kind of password on their account. Although passwords are normally stored and transmitted in encrypted form, they are still vulnerable to attack. Someone will not try to crack passwords by sitting there and typing different combinations. Automated password cracking programs are freely available on the Internet and can do all the hard work on the impostor’s behalf.”15
Security will only be implemented and maintained effectively when individuals understand what to do about it. This applies throughout an organisation, and affects people at all levels, from end users to IS/network managers or system administrators.
For end users, it is necessary to instil awareness of security policy and compliance with good practice in relation to things such as passwords and viruses. From an organisational perspective, this may dictate a requirement for initiatives such as staff education and awareness programmes.
Users should be made aware of the tactics that hackers may employ in order to steal passwords and gain unauthorised access. This will hopefully ensure that they are less susceptible to methods such as social engineering. It is also essential for users to
15 Furnell, S., CyberCrime - Vandalising the Information Society, page 278, 2002.
know the acceptable bounds of their own access and activities, and to be made aware of relevant cybercrime laws, so that they do not inadvertently put themselves in the position of breaking the rules.
At the other end of the scale are IS/network managers, and here too there are requirements to acquire security skills and keep up to date. At the baseline level, it demands that administrators are familiar with the security facilities available within the operating systems and applications for which they are responsible, as well as how to properly configure them. Professional training courses are offered that address these specific issues, albeit at a price in many cases. However, the known cost of undertaking training is probably preferable to the unpredictable cost of a security breach.
CONCLUSIONS
IS/network managers or system administrators need to realise that there is not, and never will be, a foolproof secure network, and that they will never be able to protect their systems totally from the problems associated with the Internet and hackers. As the Internet continues to grow in popularity, so too will the statistics for fraud, unauthorised access to systems and plain mischief from cyber-criminals who feel they have a point to prove. The advice offered here is intended to help IS and network managers improve and secure their networks. The methods discussed are:
- BS and ISO standards;
- User authentication and passwords;
- Auditing and monitoring;
- Keeping backups;
- Anti-virus measures; and
- System administration and maintenance.
“Unless dramatic changes are made, it seems probable that the problem of security vulnerabilities will not only remain, but will become worse. The reasons for this are twofold. First, as new software emerges, offering more complex functionality, the potential for unforeseen vulnerabilities is almost inevitable. Second, the increasing proliferation of Internet systems means that computers incorporating such vulnerabilities will be more widespread, thereby offering more opportunities for
automated analysers to be used.”16 It should be in an organisation's interest to have adequate security, in order to protect its systems and data. Experience has shown that Internet systems may be made the unwitting participants in hacker activity, for example, providing an entry point from which hackers may attack other systems, or permitting software to be installed that allows them to be used in denial of service attacks upon other systems.
Security can represent an overhead in terms of cost and performance, but in view of the problems that can arise without it, it would be detrimental to dismiss the issue as unnecessary. Given the growing dependence upon information networks, IS/network managers need a proactive attitude so that they introduce appropriate security measures, which will help protect their systems against much more than just cybercrime. This section identifies a number of key areas in which security ought to be considered by IS/network managers.
BS/ISO Standards
A very good reference for IT security that could be used by IS/network managers, which is gaining international recognition and acceptance, is BS 7799 - the British Standard in Information Security. The standard has been updated to account for advances in technology and following the addition of a security management specification in 1998, it now comes in two parts:
ISO/IEC 17799:2000 (Part 1) - a code of practice for information security management.
BS7799-2:2004 (Part 2) - a standard specification for an information security management system, providing a means by which an organisation can monitor and control its security.
As the reference number suggests, Part 1 has been accepted as more than just a British Standard and has been internationally endorsed by the ISO, the International Organization for Standardization. Because of their renowned reputation, IS/network managers need to incorporate
16 Emerald Fulltext Journal - Security analysers: administrator assistants or hacker helpers? 2001, Information Management and Computer Security, Vol. 9, No. 2, pp. 93-101.
these codes of practice because they have increasingly become recognised on a worldwide basis. The top-level headings encapsulate a total of 127 security guidelines, which then further break down into over 500 individual security controls that IS/network managers should consider implementing in order to achieve baseline protection (a minimum level of protection).
Organisations that have particularly sensitive systems and data, such as healthcare, banking and government will require more significant levels of protection.
IS/network managers have to tackle the problem of cybercrime as part of a wider security strategy, rather than attempting to solve it in isolation. Protection countermeasures should be introduced in order to contribute to a defined security policy, and appropriate measures should be selected by IS/network managers on the basis of formal risk assessment. IS/network managers need to make an assessment beforehand, so that they are sure that attention is being devoted to safeguarding the correct assets and that an appropriate level of protection is being provided.
In addition to adopting internationally recommended guidelines, IS/network managers can demonstrate their compliance by seeking certification from national accreditation bodies. This is valuable in the sense that it provides a basis for mutual trust in business and other relationships. If a prospective partner is certified as compliant with ISO/IEC 17799, then one can have some confidence that a relationship with them will not risk compromising one's own organisation’s security or data.
User Authentication and Passwords
Another principal issue that should be considered by IS/network managers when securing their systems is user authentication and passwords.
Given that a large number of abuses have been shown to occur as a result of hackers masquerading as other users, IS/network managers should give consideration to ensuring sufficient point-of-entry authentication. If people cannot gain access, it narrows the range of damage that they can cause. The most common form of authentication is the password, whereby a claimed identity is verified by
the possession of secret knowledge. Other methods of authentication exist, such as the use of physical tokens (e.g. swipe cards and keys) or the assessment of biometric characteristics (e.g. fingerprint, face or voice recognition), but the simplicity and low cost of passwords make them ideal to help improve and maintain systems.
IS/network managers have to persuade their staff to select passwords that hackers could not determine even if they were willing to do a bit of research. They have to ensure that poor passwords are not chosen (for example, the words “qwerty” and “fred”, because of the close proximity of the keys) and that staff avoid using the name of a spouse or pet, or a car registration plate. The importance of choosing an appropriate password needs to be stressed because a poor choice makes the account more vulnerable to automated cracker tools such as L0phtCrack. A poor password makes it easy for someone to gain unauthorised access to the system, which not only threatens the account holder’s own data but also jeopardises the security of other users. As a guideline, IS/network managers should require passwords to be at least eight characters long and to include at least one non-alphabetic character. This will increase the number of possible character permutations that an attacker using the brute force approach would have to try, hopefully frustrating such an attack.
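The guideline above (at least eight characters, at least one non-alphabetic character) is simple enough to enforce automatically when a password is set. The sketch below checks only those two conditions; the function name `meets_guideline` is an assumption, and a real policy would also reject dictionary words and personal information.

```python
def meets_guideline(password: str, min_length: int = 8) -> bool:
    """Check the two conditions of the guideline: length and a non-alphabetic character."""
    long_enough = len(password) >= min_length
    has_non_alphabetic = any(not ch.isalpha() for ch in password)
    return long_enough and has_non_alphabetic


for candidate in ["fred", "qwerty", "password", "pass4word!"]:
    verdict = "accepted" if meets_guideline(candidate) else "rejected"
    print(f"{candidate}: {verdict}")
```

Passing this check is necessary but not sufficient: a password such as "12345678" satisfies both conditions yet appears in every cracking dictionary.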
Auditing and Monitoring
A good way for IS/network managers to identify whether anything is amiss in their systems is to watch what is actually going on in them. Routine auditing allows them to record a log of security-relevant events for later inspection. Audit mechanisms should be employed at the operating system level and by certain applications; they provide the added benefit of revealing signs of external penetration attempts as well as internal misuse of the system. A variety of tools can be used to simplify this analysis and create suitable reports for system administrators.
Audit data should be collected by IS/network managers because it will provide valuable evidence to system administrators, especially in the event of a security breach. It must also be protected, because careful attackers will try to cover evidence of their activities by modifying or deleting log entries that may record such details. A more proactive measure, increasingly essential for Internet-accessible systems, is the use of Intrusion Detection Systems (IDS), which actively watch for signs of hacker activity by monitoring network traffic and host-based activity. The advantage is that, if potential problems are identified, the IDS can alert IS/network managers and, in some cases, invoke automatic responses to contain the situation.
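As a very small illustration of audit-log analysis, the sketch below counts failed logins per source address and reports the sources that reach a threshold. It is far simpler than a real IDS; the log line format, the `LOGIN_FAILED` marker and the function name `suspicious_sources` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical log format: "<date> <time> LOGIN_FAILED user=<name> src=<address>"
FAILED_LOGIN = re.compile(r"LOGIN_FAILED user=(\S+) src=(\S+)")


def suspicious_sources(log_lines, threshold: int = 3):
    """Return the sorted source addresses with at least `threshold` failed logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group(2)] += 1
    return sorted(src for src, count in failures.items() if count >= threshold)


SAMPLE_LOG = [
    "2005-01-26 10:15:02 LOGIN_FAILED user=root src=10.0.0.9",
    "2005-01-26 10:15:04 LOGIN_FAILED user=root src=10.0.0.9",
    "2005-01-26 10:15:07 LOGIN_FAILED user=admin src=10.0.0.9",
    "2005-01-26 10:16:00 LOGIN_OK user=sean src=10.0.0.4",
    "2005-01-26 10:17:30 LOGIN_FAILED user=sean src=10.0.0.4",
]
```

With the sample data, `suspicious_sources(SAMPLE_LOG)` flags only the address that failed three times, while the single mistyped password from the other address is ignored.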
Keeping Backups
If a hacker or a virus corrupts your system data, a very good safeguard that all IS/network managers should have is a backup copy of it from which things can be restored. Backups are vital as more than just a safeguard against cybercrime - there are many other security threats that may result in loss of data, such as equipment failure and physical hazards like fire and flood.
To be of real use, backups must be made regularly and frequently, so that if the occasion arises to use them, they will enable the majority of data to be recovered. IS/network managers should also consider testing their ability to recover from backups, in order to ensure that the process is known and effective if a genuine requirement should ever arise.
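One simple way to test a backup, sketched below, is to compare a checksum of every source file with its copy in the backup and list anything missing or different. This assumes the backup is a plain directory copy; the function names and the directory layout are hypothetical, and a full recovery test would also restore the data onto a separate machine.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """SHA-256 digest of a file, read in chunks to handle large files."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir: Path, backup_dir: Path) -> list:
    """Return relative paths that are missing from, or differ in, the backup."""
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = src.relative_to(source_dir)
        copy = backup_dir / relative
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            problems.append(str(relative))
    return problems
```

An empty result means every source file has an identical copy in the backup at the time of the check.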
Anti-Virus Measures
Protection against viruses is unfortunately essential in the IT environment, and system administrators should make use of anti-virus (AV) software to provide effective safeguards for their systems. AV software can scan memory, files and incoming e-mail messages for signs of infection, which will help IS/network managers improve and maintain their systems.
IS/network managers should also configure their systems to automatically run a virus scanner at
start-up, without offering the user the option to bypass the process. The AV package should then run transparently in the background throughout the time that the system is in use, with the added benefit that the system is constantly monitored for viruses. Users should also be explicitly instructed by IS/network managers not to disable the scanner; if they do, their system should be excluded from the network, thereby reducing the chances of a successful viral attack.
As new virus definitions become available, it is important that IS/network managers keep scanning software up to date; if they fail to do so, the safeguard will diminish over time as it ceases to provide protection against new strains. Many anti-virus packages now offer the facility to take care of this automatically and can download vendor-issued updates over the Internet.
Even with anti-virus software installed, programs and documents that are obtained or received from unknown or un-trusted sources should be handled with appropriate care. As incidents such as ‘Melissa’ and the ‘Love Bug’ have shown, even things that appear to come from friends may not be exactly what they seem.
In view of these risks, IS/network managers should take special care with:
- Programs and other materials downloaded from un-trusted Internet sites;
- Unexpected e-mail attachments; and
- Floppy disks of unknown origin.
From an organisational perspective, staff should be educated about the risks, and appropriate controls introduced to prevent them from downloading materials from the Internet or bringing them in from home in an unrestricted manner.
System Administration and Maintenance
Many security breaches result from the fact that systems have not received appropriate attention in terms of fairly straightforward safeguards. As such, IS/network managers have a number of key responsibilities in addition to co-ordinating many of the aforementioned activities. IS/network managers must ensure that:
Any operating system and application level security safeguards are enabled. Although these are frequently available, in many cases they are disabled in a default installation.
The latest software updates and patches are installed. Hackers tend to attack systems running older versions of software in order to exploit known vulnerabilities. Many vendors make patches freely available to IS/network managers via their web sites, and have electronic mailing lists that are used to advise subscribers of new releases.
Anti-virus software is in use and up to date.
A separation is maintained, if possible, between those systems that hold sensitive data and those that are publicly accessible via the Internet and World Wide Web.
Appropriate restrictions are imposed upon Internet access, to prevent unauthorised traffic from entering and leaving the organisation. This can be achieved using firewall technologies, which can inspect network packets and selectively accept or block them based upon the security policy.
The effectiveness of protection is tested and maintained. The former can be ensured by commissioning an independent assessment, as well as by penetration testing (subjecting the system to the same assault as it would be expected to face from hackers). The requirement for maintenance recognises that, once implemented, security can not just be forgotten about. New threats and forms of attack are guaranteed to emerge, and if protection remains static, a system will not be safe against them.
It should be noted that the recommendations presented are indicative rather than exhaustive considerations, and organisational users in particular are advised to consider more comprehensive guidelines, such as those of BS7799. However, even following BS7799 will not give IS/network managers 100 per cent protection against cybercrime - indeed, there is no such thing as 100 per cent security. Being secure is like trying to hit a moving target, but it is possible for all parties to significantly improve their chances by following good practice.
So, by using hacker techniques, will systems be maintained and improved? Considering only those hackers who would claim that their motivation is the exploration of systems, the probable answer is
“Yes”. It is certainly preferable for security vulnerabilities to be established through unauthorised exploration, as opposed to being discovered as a result of malicious attack or commercial espionage. It can be argued that the existence of hackers prevents complacency on the part of software vendors, and motivates them to give security due consideration in product design. The prospect of hackers constantly nipping away at their systems is also likely to have a positive effect upon the vigilance of IS/network managers. It can also be argued that the success of these hackers in exposing security vulnerabilities serves to “raise the bar” and put the ability to break into systems beyond the reach of the casual masses. However, there are two caveats to this. Firstly, the ability to actually penetrate a system is not a prerequisite for causing trouble. Secondly, while some hackers content themselves with exposing vulnerabilities, others occupy their time by creating toolkits that automate the task and thereby allow the masses to get their feet in the door again. Another problem is that the benign explorers represent only a small subset of the overall hacker community. Others may be interested in financial gain or outright vandalism of the penetrated system. Until their motivations become clear, there is nothing to indicate the level of danger that unauthorised users may pose, as the methods that they use to gain access will be the same regardless.
ECAI 2005 - International Conference - First Edition
Electronics, Computers and Artificial Intelligence
1-
Bluetooth: Profiles and Applications
BOGDAN Ion
Technical University „Gheorghe Asachi”
Blvd. D. Mangeron nr. 59, 700050
bogdani@etc.tuiasi.ro
Keywords: Bluetooth, profiles, applications, communication security, smart
home
Abstract. Bluetooth technology, developed initially for short-range
communications between devices in ad-hoc networks as a replacement for
wired connections, has applications in various domains. This paper reviews
the main parameters, underlines the fundamental role of Bluetooth profiles,
includes a classification of the approved applications and presents the
design of a home environment application as an implementation of the
“smart home” concept.
1. BLUETOOTH BASICS
Bluetooth specifications define short-range
radio links (10 to 100 meters) for voice or data
transmissions using channels with rates up to 720
kb/s. Seventy-nine channels of 1 MHz bandwidth are
defined in the ISM frequency band of 2.4 to 2.4835
GHz. The frequency hopping spread spectrum technique,
with 1600 hops per second, is used in order to cope
with the local interference generated by other
systems. The transmitted power should not exceed 0
dBm (1 mW) for short-range communications (10
meters) or 20 dBm (100 mW) for medium-range
communications (100 meters).
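The figures above can be checked with a short sketch. It lists the channel centre frequencies, the dwell time per hop and the power limits in milliwatts. The first channel frequency (2402 MHz) is an assumption taken from the Bluetooth radio specification rather than from this paper, and all function names are illustrative only.

```python
# Sketch of the Bluetooth channel plan and hop timing (illustrative names).
NUM_CHANNELS = 79          # from the text
CHANNEL_SPACING_MHZ = 1    # from the text
HOPS_PER_SECOND = 1600     # from the text
FIRST_CHANNEL_MHZ = 2402   # assumption: Bluetooth radio specification, not stated in the paper


def channel_frequency_mhz(k: int) -> int:
    """Centre frequency in MHz of RF channel k, with 0 <= k < 79."""
    if not 0 <= k < NUM_CHANNELS:
        raise ValueError("channel index out of range")
    return FIRST_CHANNEL_MHZ + k * CHANNEL_SPACING_MHZ


def dwell_time_us() -> float:
    """Time spent on one frequency between two hops, in microseconds."""
    return 1e6 / HOPS_PER_SECOND


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


if __name__ == "__main__":
    print(channel_frequency_mhz(0), channel_frequency_mhz(78))
    print(dwell_time_us())
    print(dbm_to_mw(0), dbm_to_mw(20))
```

Under these assumptions all 79 channels fit inside the 2.4 GHz ISM band, and each hop lasts one time-slot.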
A Bluetooth device can use up to 3 symmetrical,
synchronous, time-division duplex channels for voice
communications, each at 64 kb/s. For data
transmissions, asynchronous and possibly asymmetrical
time-division duplex channels can be used.
The maximum data rate is 433.9 kb/s in each direction
for symmetrical communications; for asymmetrical
communications it is 723.2 kb/s in one direction and
57.6 kb/s in the other.
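These rates follow from simple slot arithmetic, sketched below. The payload sizes (27 bytes for a one-slot DH1 packet, 339 bytes for a five-slot DH5 packet, 30 bytes for an HV3 voice packet sent every sixth slot) are assumptions taken from the Bluetooth 1.x baseband specification and are not given in this paper.

```python
# Sketch: user data rates derived from payload size and slot usage.
SLOT_S = 1 / 1600  # one time-slot lasts 625 microseconds (1600 hops per second)

# Assumptions from the Bluetooth 1.x baseband specification, not from the paper.
DH1_PAYLOAD_BYTES = 27    # unprotected data packet occupying 1 slot
DH5_PAYLOAD_BYTES = 339   # unprotected data packet occupying 5 slots
HV3_PAYLOAD_BYTES = 30    # voice packet, one per 6 slots in each direction


def rate_kbps(payload_bytes: int, slots_per_cycle: int) -> float:
    """Rate obtained when payload_bytes are delivered once every slots_per_cycle slots."""
    return payload_bytes * 8 / (slots_per_cycle * SLOT_S) / 1000


symmetric = rate_kbps(DH5_PAYLOAD_BYTES, 5 + 5)  # DH5 in both directions
forward = rate_kbps(DH5_PAYLOAD_BYTES, 5 + 1)    # DH5 one way, DH1 on the return
reverse = rate_kbps(DH1_PAYLOAD_BYTES, 5 + 1)
voice = rate_kbps(HV3_PAYLOAD_BYTES, 6)          # one synchronous voice channel

if __name__ == "__main__":
    print(symmetric, forward, reverse, voice)
```

With these payload sizes the sketch gives about 433.9, 723.2, 57.6 and 64 kb/s, which agrees with the quoted figures to within rounding.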
Bluetooth devices within radio range of
each other can establish an ad-hoc network by
exchanging specific messages. Such a network is
called a piconet. One device in a piconet provides
the reference clock for all the other devices in the
piconet and is the master; the other devices are
slaves. A master can communicate simultaneously
with at most 7 slaves. Voice and data traffic is
transmitted only on point-to-point connections.
Common control data are broadcast (point-to-
multipoint connections). A master device can
simultaneously act as a slave in another
piconet. A set of (at most 10) piconets that share
such devices forms a scatternet (Fig. 1).
Piconets in a scatternet use different time references
and different frequency hopping sequences, both
based on their own master's clock and address.
Fig. 1. Piconets and scatternet
Communications in a piconet are
synchronized and have a very simple organization
(Fig. 2): the master sends a packet to the intended
slave starting at the beginning of an even-
numbered time-slot, and the slave sends its answer at
the beginning of the next odd-numbered time-slot.
The time-slot number increments by one at
each new time-slot and returns to 0 after
134217727. A numbering cycle lasts for about 24
hours. The frequency hops at each new time-slot.
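The slot numbering described above can be sketched as follows. The wrap value 134217727 equals 2^27 - 1, so the counter is treated as a 27-bit number. The sketch assumes single-slot packets only, and the function names are illustrative.

```python
# Sketch: time-slot numbering in a piconet (single-slot packets assumed).
SLOT_US = 625             # slot duration in microseconds (1600 hops per second)
SLOT_COUNTER_BITS = 27    # 2**27 - 1 = 134217727, the last slot number quoted in the text


def next_slot(n: int) -> int:
    """Slot number following n, wrapping to 0 after 2**27 - 1."""
    return (n + 1) % (1 << SLOT_COUNTER_BITS)


def transmitter(n: int) -> str:
    """The master transmits in even-numbered slots, the addressed slave in odd ones."""
    return "master" if n % 2 == 0 else "slave"


def cycle_hours() -> float:
    """Duration of one complete numbering cycle, in hours."""
    return (1 << SLOT_COUNTER_BITS) * SLOT_US / 1e6 / 3600


if __name__ == "__main__":
    print(next_slot(134217727), transmitter(0), transmitter(1))
    print(round(cycle_hours(), 2))
```

The computed cycle is a little over 23 hours, consistent with the "about 24 hours" stated above.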
Fig. 2. Time-slots
In order to increase the transmission rate and
to allow for asymmetrical traffic, a packet is allowed
to cover more than one time-slot, but only an odd
number of them: 3 or 5. The frequency does not hop
during a packet transmission (Fig. 3).
Transmission security
The Bluetooth Specifications contain security
procedures in order to prevent unauthorized
reception of information or its intentional alteration
by a third party.
Fig. 3. Multi-slot packets
There are three techniques:
- Using a challenge-response routine for authentication at
connection establishment;
- Enciphering the data stream during the
transmission;
- Changing the enciphering key from one
communication session to another and even
during the same session.
The strength of the security techniques resides
in using the Bluetooth device address (48 bits),
which is world-wide unique, as the base for
calculating the enciphering keys, deriving a new
device-specific key (128 bits) at the initialization of
a new session, and using a locally generated random
number (128 bits) in the process.
Communication security is enhanced by
the limited time a transmission is present on a given
frequency (it hops pseudo-randomly from one time-slot
to another) and by the limited range over which a
Bluetooth transmission can be received.
Interoperability
A Bluetooth device implements only a restricted
set of the protocols defined in the Specifications,
chosen according to the particular application it is
intended for. Profiles are defined so that devices
from different manufacturers can communicate with
each other. A profile selectively groups coherent
parts of the Specifications. Any Bluetooth device has
to implement one or more Profiles of the developer's
choice, but it has to comply fully with the selected
Profiles. In this way, devices using the same
Profile can communicate with each other,
irrespective of the manufacturer, the type of device
or the number of other Profiles they comply with.
Hardware architecture
A physical Bluetooth device consists of an
analogue radio part and a baseband digital part
denoted as the Host Controller (Fig. 4). The radio
part has the classical structure of a transceiver: it
transfers the baseband information into the radio
frequency band by frequency or phase
modulation and performs the reverse transition by
detection. Filtering, matching, amplification, etc. are
auxiliary functions implemented in any
transceiver. For a Bluetooth device, frequency
hopping is a further function of the radio part. The
Host Controller includes a CPU core, a Link
Controller and interfaces with the host device.
Fig. 4. Bluetooth hardware architecture
Software architecture
The Bluetooth core system covers the four
lowest layers and associated protocols defined by
the Bluetooth Specification, as well as one common
service layer protocol, the Service Discovery
Protocol (SDP), and the overall profile, the Generic
Access Profile (GAP).
Fig. 5 shows the four lowest layers, each with
its associated communication protocol. The lowest
three layers are sometimes grouped into a subsystem
known as the Bluetooth controller. This is a
common implementation involving a standard
physical communications interface between the
Bluetooth controller and the remainder of the
Bluetooth system, known as the Bluetooth host.
Although this interface is optional, the architecture is
designed to allow for its existence and
characteristics.
The Bluetooth core system protocols are the
Radio (RF) protocol, Link Control (LC) protocol,
Link Manager (LM) protocol and Logical Link
Control and Adaptation protocol (L2CAP). In
addition the Service Discovery Protocol (SDP) is a
service layer protocol required by all Bluetooth
applications.
Fig. 5. Bluetooth core system architecture
The Bluetooth core system offers services
through a number of service access points. These
services consist of the basic primitives that control
the Bluetooth core system. The services can be split
into three types. There are device control services
that modify the behavior and modes of a Bluetooth
device, transport control services that create, modify
and release traffic bearers (channels and links), and
data services that are used to submit data for
transmission over traffic bearers.
A service interface to the Bluetooth controller
sub-system is defined such that the Bluetooth
controller may be considered a standard part. In this
configuration the Bluetooth controller operates the
lowest three layers, and the L2CAP layer is
contained with the rest of the Bluetooth application
in a host system. The standard interface is called the
Host to Controller Interface (HCI). Implementation of
this standard service interface is optional.
2. PROFILES
A profile offers a clear description of how to
use the Specifications in order to implement a given
function. A profile should allow a minimum
number of options so that applications
exhibit identical functionality, its parameters
should be defined so that all applications behave
identically, it must contain mechanisms for combining
different standards, and it should include guidelines
for user interface implementation.
Fig. 6 shows the Bluetooth profiles and their
grouping based on the functionalities they inherit
from each other: a profile having its box included in
the box of another profile inherits all the properties
of the latter one.
- Generic Access Profile (GAP)
  - Telephony Control Profile (TCP)
    - Cordless Telephony
    - Intercom
  - Service Discovery Profile (SDP)
  - Serial Port Profile (SPP)
    - Dial-up Networking (DUN)
    - FAX
    - Headset
    - LAN Access
    - Generic Object Exchange (OBEX)
      - File Transfer (FTP)
      - Object Push
      - Synchronization
Fig. 6. Bluetooth Profiles
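The inheritance rule of Fig. 6 can be sketched as a parent table. The nesting used here is an assumption based on the Bluetooth 1.1 profile specification, since the boxes of the figure are not recoverable from this text; the function name is illustrative.

```python
# Sketch: profile inheritance as a parent table (nesting assumed from the
# Bluetooth 1.1 profile specification, not read from the figure itself).
from __future__ import annotations

PARENT: dict[str, str | None] = {
    "Generic Access": None,
    "Telephony Control": "Generic Access",
    "Cordless Telephony": "Telephony Control",
    "Intercom": "Telephony Control",
    "Service Discovery": "Generic Access",
    "Serial Port": "Generic Access",
    "Dial-up Networking": "Serial Port",
    "FAX": "Serial Port",
    "Headset": "Serial Port",
    "LAN Access": "Serial Port",
    "Generic Object Exchange": "Serial Port",
    "File Transfer": "Generic Object Exchange",
    "Object Push": "Generic Object Exchange",
    "Synchronization": "Generic Object Exchange",
}


def inherited_from(profile: str) -> list[str]:
    """Profiles whose properties the given profile inherits, nearest first."""
    chain = []
    parent = PARENT[profile]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


if __name__ == "__main__":
    print(inherited_from("File Transfer"))
```

Every chain ends at the Generic Access Profile, which reflects the statement that all other profiles are built upon GAP.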
Generic Access Profile (GAP)
GAP is the basic profile and it aims at
establishing and maintaining the baseband
connections. All other profiles are built upon GAP.
In order to fulfil its role, GAP defines:
- Requirements for the functions that must be
implemented in all Bluetooth devices;
- Generic procedures for discovering a Bluetooth
device;
- Link management features for connecting
Bluetooth devices;
- Common format requirements for the device
parameters accessible at the user interface (naming
conventions).
GAP also defines which operational modes
are mandatory and which are optional. The operational
modes of a Bluetooth device concern discoverability,
connectivity, pairing, and security level.
Serial Port Profile (SPP)
SPP emulates the RS-232 wired serial
interface; thus, legacy applications need not be
modified in order to communicate with a Bluetooth
device: they view the Bluetooth connection as a
wired serial link.
3. APPLICATIONS
There are a great number of applications in
various domains. They are summarized in
Table 1.
TABLE 1. Approved Bluetooth applications
Domain No. of products
Gaming 6
Medical equipment 8
Music 10
Positioning (GPS) 10
Office equipment 12
Audio and visual 14
Home environment 16
Access point 19
Keyboard and mouse 21
Automotive 22
Printers 23
Services 32
Unique products 44
Portable equipment 55
Headset 69
PC 71
Components 79
Mobile telephony 99
Others 439
4. BLUETOOTH BASED “SMART
HOME”
A Bluetooth scatternet is built in order to
monitor and control all the Bluetooth enabled
devices in a residential space as well as to control
physical access to the residence. The scatternet
architecture is presented in Fig. 7. It includes a main
controller, which is the master of the piconet of
Bluetooth enabled home appliances, and a home
access controller, which is a slave in that piconet and
the master of the piconet of home access sensors.
[Fig. 7 node labels: Main controller, Microwave oven, Freezer, Washing machine, Keyboard, Cell phone, Home access controller, Main door, Garage, Back door, Entrance]
Fig. 7. Designed scatternet architecture
The main controller monitors the states of the
Bluetooth enabled appliances, transmits control
signals when they reach preset states and records
the evolution of their states. It also exchanges messages
with the home access controller. The latter monitors
the states of the sensors at the residence entrances
and triggers alarm actions when necessary. In
complex implementations a sensor could be a
video camera associated with a (half-)duplex voice
system, allowing video monitoring of the main
entrance and short dialogues between the person in
the residence and those in front of the main
entrance. The presence of a video camera allows
significant events to be recorded in the main
controller or the home access controller during the
owners’ absence. The home access controller sends
appropriate control messages to the entrance sensors
when a person carrying a Bluetooth enabled hardware
key approaches the corresponding entrance.
Security aspects
The Bluetooth enabled sensors could face
attacks from unauthorized devices, and measures
should be taken to resist such attacks and to record
the relevant details. In order to increase the
resistance to attack, the sensor piconet is designed as
follows:
- The sensors are programmed to operate as slaves
only in the piconet controlled by the home access
controller; thus they cannot answer Inquiry
messages from other (unfriendly) devices.
- The sensors are programmed to look for
connections only with Bluetooth devices having a
restricted set of addresses, namely those in the
possession of the family members.
- The Bluetooth devices associated with the
hardware keys are made undiscoverable; thus
their addresses cannot be learned by other (possibly
unfriendly) devices that try to establish connections
with them.
- The code in the hardware key and the address of
the associated Bluetooth device are changed
periodically.
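The design rules above amount to an address allow-list on each sensor. The following is a minimal sketch of that policy only; it does not use the Widcomm SDK, and the class, its methods and the address strings are all hypothetical.

```python
# Sketch: connection policy of an entrance sensor (hypothetical names, no real Bluetooth API).
from dataclasses import dataclass, field


@dataclass
class SensorPolicy:
    master_address: str                              # the home access controller
    allowed_keys: set = field(default_factory=set)   # hardware keys of the family members

    def answers_inquiry(self, sender: str) -> bool:
        """The sensor is a slave only of its own master, so other inquiries are ignored."""
        return sender == self.master_address

    def accepts_connection(self, peer: str) -> bool:
        """Only the master and the allow-listed hardware keys may connect."""
        return peer == self.master_address or peer in self.allowed_keys

    def rotate_key(self, old: str, new: str) -> None:
        """Periodic change of the address associated with a hardware key."""
        self.allowed_keys.discard(old)
        self.allowed_keys.add(new)


if __name__ == "__main__":
    policy = SensorPolicy("HOME-ACCESS-CTRL", {"KEY-A"})
    print(policy.accepts_connection("KEY-A"), policy.accepts_connection("UNKNOWN"))
```

After `rotate_key`, the old address is rejected, which is the intent of changing the key address periodically.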
The communications in the scatternet are
defined as client-server message exchanges. The
design used the Widcomm development kit and is
based mainly on a modified version of the chat
profile included in this development environment.
CONCLUSIONS
Bluetooth technology allows for complex, but
cost-effective implementations in various domains.
The “smart home” concept could greatly benefit
from its features, as the presented design shows.
ACKNOWLEDGEMENT
This work was supported in part by Grant INF134/2004, MEC, and Grant 415/2005, CNCSIS.
SPEECH USER INTERFACE FOR
ROMANIAN LANGUAGE
MUNTEANU Doru
Military Technical Academy-83, 050141, Bucharest, ROMANIA
munteanud@mta.ro
Keywords: speech user interface, automatic speech recognition, voice activity
detection, hidden Markov models
Abstract. Speech recognizers are not very popular nowadays because
their performance is still poor and they need a lot of computational resources
(time, memory). A dictation system with 90% accuracy requires a 1 GHz
processor, 512 MB of RAM, a high quality microphone and a quiet room. This
paper describes a speech user interface for the Romanian language. The
automatic speech recognition system [2] is based on Hidden Markov Models
and uses a very popular toolkit [3]. A voice activity detector (VAD) is
embedded. The interface allows the user to: create an environment profile, tune
decoding parameters (beam width, word insertion penalty), select the speech
mode (isolated/continuous), select different vocabularies, and see the speech
waveform, the recognized text and the processing times.
1. INTRODUCTION
The speech user interface relies on a phone-
based continuous speech recognizer which has been
built for the Romanian language. The automatic speech
recognition system [2] is based on statistical
modeling of time-varying speech sequences with a
well known and effective tool, the Hidden Markov
Model (HMM).
Each Romanian phoneme was modeled with
a three-state HMM with a left-right topology, using
continuous multiple-mixture Gaussian output
distributions (Figure 1).
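A left-right topology means that each state can only loop on itself or move to the next state. The sketch below builds such a transition matrix for three emitting states, with non-emitting entry and exit states as is usual in HTK-style models. The self-loop value 0.6 is an arbitrary starting value chosen for illustration (such values are normally re-estimated in training), and indices here start at 0.

```python
# Sketch: transition matrix of a left-right HMM (illustrative initial values).
def left_right_transitions(n_emitting: int = 3, self_loop: float = 0.6):
    """Return an (n_emitting + 2) x (n_emitting + 2) transition matrix.

    State 0 is a non-emitting entry state, states 1..n_emitting emit
    observations, and the last state is a non-emitting exit state.
    """
    n = n_emitting + 2
    a = [[0.0] * n for _ in range(n)]
    a[0][1] = 1.0                        # the entry state always moves to the first emitting state
    for i in range(1, n_emitting + 1):
        a[i][i] = self_loop              # stay in the same state
        a[i][i + 1] = 1.0 - self_loop    # advance to the next state
    return a


if __name__ == "__main__":
    for row in left_right_transitions():
        print(row)
```

No entry below the diagonal is ever non-zero, so the model cannot move backwards in time.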
The covariance matrices are diagonal in order
to reduce the resources required for the output
probability computation. In order to improve
recognition rates, a context dependent modeling
approach has been adopted. It is known that context
is the most important source of speech variability.
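With diagonal covariance matrices the output probability reduces to a sum over feature dimensions, with no matrix inversion. The sketch below is a generic illustration of that computation for a Gaussian mixture and is not taken from the recognizer described here; the function names are illustrative.

```python
# Sketch: log output probability of a mixture of diagonal-covariance Gaussians.
import math


def log_gaussian_diag(o, mean, var):
    """Log density of a Gaussian with diagonal covariance; the cost is linear in the dimension."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (x - m) ** 2 / v
        for x, m, v in zip(o, mean, var)
    )


def log_output_prob(o, weights, means, variances):
    """log b_j(o) for one state, combining the mixture components with log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(o, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


if __name__ == "__main__":
    print(log_output_prob([0.0, 0.0], [0.5, 0.5],
                          [[0.0, 0.0], [1.0, 1.0]],
                          [[1.0, 1.0], [1.0, 1.0]]))
```

The log-sum-exp step is a common numerical safeguard against underflow when component likelihoods are very small.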
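[Figure 1 labels: states 1, 2, 3, 4, ..., N; transition probabilities a_12, a_22, a_23, a_33, a_34, a_44, a_4N; output probabilities b_2(o_t), b_3(o_t), b_4(o_t); observed data o_1 ... o_8 along the time axis t]
Figure 1. Phone-based HMM in a
left-right topology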
The context dependent models ensure better
modeling accuracy, but the number of models
increases heavily and, as a consequence, much less
training data is available for each model. In large
vocabulary speech recognition, many contexts have
only a few occurrences in the training data,
insufficient for a robust estimation of the parameters
of the corresponding models; there are also contexts
that have no occurrence in the training set, the so-called
“unseen contexts”. To handle these problems, state
tying proved to be extremely efficient [2]. In our
approach, the contextually equivalent sets of HMM
states are determined by means of phonetic
decision trees.
2. VOICE ACTIVITY DETECTION
The speech user interface uses an energy-
based voice activity detector (VAD). It uses a two-
level algorithm which first classifies each frame of
data as either speech or silence and then applies a
heuristic to determine the start and end of each
utterance [3].
The detector classifies each frame as speech or
silence based only on the log energy of the signal. A
frame has a length of 20 ms (320 samples at a
16 kHz sampling rate). When the
energy value exceeds a threshold, the frame is
marked as speech, otherwise as silence. The
threshold is made up of two components, both of
which are adjusted automatically in a
calibration procedure in which the detector tunes its
parameters to the current acoustic environment
just prior to the speech recognition process itself.
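The frame-level decision can be sketched as follows. The sketch assumes non-overlapping frames and a single combined threshold, which simplifies the two-component threshold described above; the small floor added before taking the logarithm is also an assumption, used only to avoid log(0).

```python
# Sketch: frame-level speech/silence classification by log energy.
import math

FRAME_LEN = 320  # 20 ms at a 16 kHz sampling rate, as in the text


def frame_log_energies(samples, frame_len=FRAME_LEN):
    """Log energy of each complete, non-overlapping frame (assumption: no overlap)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame)
        energies.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return energies


def classify_frames(log_energies, threshold):
    """Label a frame 'V' (speech) if its log energy exceeds the threshold, else 'L' (silence)."""
    return ["V" if e > threshold else "L" for e in log_energies]


if __name__ == "__main__":
    signal = [0.0] * FRAME_LEN + [1.0] * FRAME_LEN
    print(classify_frames(frame_log_energies(signal), threshold=0.0))
```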
Once each frame has been classified as speech
(V) or silence (L) according to its log energy, the
frames are grouped into windows consisting of N_FR
consecutive frames. For each window, the number
of speech frames is counted.
[Figure 2 labels: axes Frequency [Hz] (0 to 8000) and Time [s]; blocks VAD and ASR; levels Speech and Silence]
Figure 2. Real-time voice
activity detector applied to
speech signal
When the number of frames marked as silence
within a window falls below a glitch count, the
whole window is classed as speech.
Two separate glitch counts are used: N_START
before speech is detected and N_STOP whilst searching
for the end of the utterance. This allows the
algorithm to take account of the tendency for the
end of an utterance to be somewhat quieter than the
beginning. Finally, a top-level heuristic is used to
determine the start and end of the utterance. The
heuristic defines the start of speech as the beginning
of the first window classified as speech. The actual
start of the processed utterance is a few windows
(W_PRE) before the detected start of speech, to ensure
that recognition accuracy is not affected when the
speech detector triggers slightly late. Once the
start of the utterance has been found, the detector
searches for a number of consecutive windows (W_POST)
all classified as silence and sets the end of speech to be
the end of the last window classified as speech.
Once again the processed utterance is extended by
W_POST windows to ensure that, if the silence
detector has triggered slightly early, the whole of the
speech is still available for further processing.
Figure 3 shows an example of the
speech/silence detection process. The waveform
data is first automatically classified as speech or
silence at frame level and then at window level
according to the rules presented above (a window
is speech when its number of speech frames
reaches N_START). In this example, audio input starts
when the number of speech frames in a window
equals 5 and stops when the number of speech
frames per window is less than 3 and at least three
consecutive windows are marked as silence. Note
that even if at some point during speech acquisition
the number of speech frames per window is less
than 3, the device does not stop because the three
consecutive silent windows condition is not
met.
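[Figure 3 data, one value per window along the time axis]
Level I (frame), speech frames per window:   0 1 0 3 5 8 7 8 2 0 3 6 8 7 2 0 0 0
Level II (window), speech (V) / silence (L): L L L L V V V V L L V V V V L L L L
Signal: Silence, Speech signal, Silence. Figure annotations: N_START = 5, N_DECL = 3, START, STOP, W_PRE, W_POST.
Figure 3. End point detection by marking frames and windows as speech (V)
or silence (L)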
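The window-level heuristic can be sketched as follows, working on the number of speech frames per window as in the Figure 3 example. The sketch assumes that a window is speech when its count is at least N_START before speech is detected and at least N_STOP afterwards, and that the utterance ends after W_POST consecutive silent windows. This reading is inferred from the example, and the value W_PRE = 2 used in the demonstration is hypothetical.

```python
# Sketch: window classification and end point detection (thresholds inferred from Figure 3).
def window_counts(frame_labels, n_fr):
    """Number of speech ('V') frames in each complete window of n_fr frames."""
    return [
        frame_labels[i:i + n_fr].count("V")
        for i in range(0, len(frame_labels) - n_fr + 1, n_fr)
    ]


def detect_utterance(counts, n_start, n_stop, w_pre, w_post):
    """Return (labels, first, last): window labels and the window range to process.

    first and last are None when no complete utterance is found.
    """
    labels = []
    start = last_speech = None
    silent_run = 0
    done = False
    for i, count in enumerate(counts):
        searching_for_end = start is not None and not done
        threshold = n_stop if searching_for_end else n_start
        is_speech = count >= threshold
        labels.append("V" if is_speech else "L")
        if done:
            continue
        if is_speech:
            if start is None:
                start = i                 # first window classified as speech
            last_speech, silent_run = i, 0
        elif start is not None:
            silent_run += 1
            if silent_run >= w_post:      # enough consecutive silent windows
                done = True
    if not done:
        return labels, None, None
    first = max(0, start - w_pre)                      # extend backwards by W_PRE windows
    last = min(len(counts) - 1, last_speech + w_post)  # extend forwards by W_POST windows
    return labels, first, last


if __name__ == "__main__":
    counts = [0, 1, 0, 3, 5, 8, 7, 8, 2, 0, 3, 6, 8, 7, 2, 0, 0, 0]
    labels, first, last = detect_utterance(counts, n_start=5, n_stop=3, w_pre=2, w_post=3)
    print("".join(labels), first, last)
```

With the counts of Figure 3, the two silent windows in the middle of the utterance do not end it, because fewer than three consecutive windows are silent.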
3. SPEECH INTERFACE
DESCRIPTION AND
IMPLEMENTATION DETAILS
The speech user interface is presented in
Figure 4. It consists of three modules: speech
activity detector, decoding parameters and results display.
a. Speech activity detector (upper left corner)
– the user can create profiles for different
environmental conditions. After pushing
“ ”, the user enters a label
for the new conditions and 4 seconds of
silence are recorded. The mean and variance
of the frame log-energy are computed and a
default threshold is established. The user may
adjust this threshold by means of a
slider, tuning the VAD sensitivity.
b. Decoding parameters (upper right corner)
– the user can select one of the three
recognition vocabularies: Romanian digits
(10 words), Romanian districts (40 words)
and 3,000 words (chosen arbitrarily). The
beam width and word insertion penalty
decoding parameters may be customized.
For language modeling, only two situations
are taken into account: continuous and
isolated speech. The user may switch
between these two styles. As expected,
better recognition performance (accuracy,
computation time, memory usage) is
obtained for isolated speech.
c. Results display (bottom half) – the user may see
both the speech waveform and the recognized text.
Computation times for speech parameterization and
decoding are also displayed.
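The calibration step of module (a) can be sketched as below. The paper states only that the mean and variance of the frame log-energy over the recorded silence are used to establish a default threshold. The rule "mean plus k standard deviations" and the value k = 3 are assumptions made for illustration, with k playing the role of the user-adjustable sensitivity.

```python
# Sketch: default VAD threshold from a silence recording (the rule itself is an assumption).
import math


def default_threshold(silence_log_energies, k=3.0):
    """Threshold = mean + k * standard deviation of the silence log energies."""
    n = len(silence_log_energies)
    mean = sum(silence_log_energies) / n
    var = sum((e - mean) ** 2 for e in silence_log_energies) / n
    return mean + k * math.sqrt(var)


if __name__ == "__main__":
    # 4 seconds of silence at 20 ms per frame give 200 frame log energies.
    silence = [2.0 + 0.1 * ((i % 5) - 2) for i in range(200)]
    print(default_threshold(silence))
```

Raising k makes the detector less sensitive to background noise, at the cost of possibly missing low-energy speech.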
The recognizer is based on procedures that
are used for off-line tests; the VAD sub-system is
what allows the recognizer to be used in real time. We
have tested the speech detector in an environment
with a signal-to-noise ratio of 16 to 23 dB. After some
VAD tuning, the system was evaluated by uttering 300
phrases containing either one isolated word or
several connected words, depending on the
grammar. Only 4 phrases were rejected, which gives a
1.3% detection error rate (4/300). No false phrase
acceptance was detected. The rejections occurred
when very short words were spoken, containing
mostly consonants, which are known to have low
energy. That was the case for the Romanian word “opt”
(eight). For a practical system, such detection errors
could easily be overcome by repeating the word
louder, a procedure used quite often in real life in
human-to-human dialog.
Figure 4. Speech user interface
4. CONCLUSIONS
We have presented here a speech user
interface based on a recognizer built for the Romanian
language. The core of the recognizer is not
implemented from scratch (we have used the well-
known HTK [3]), but the VAD procedure and the
interface were implemented in MATLAB. For real-
time applications, speech detection is a very
important issue because it is the interface to the speech
decoder. If it does not work well, the overall
recognition performance may degrade significantly.
Although we have presented here only a speech user
interface, building the automatic recognition system
behind it is not an easy task. Previous
work [2] revealed many difficulties in training
context dependent phoneme-based models for the
Romanian language. The off-line experiments have
been extended to the real-time application framework
presented herein.
REFERENCES
[1]. X. HUANG, et al., Spoken Language Processing,
Prentice Hall, (2001)
[2]. E. OANCEA, et al., Continuous Speech
Recognition for Romanian Language based on
Context-Dependent Modeling, Conf.
COMMUNICATIONS (2004) 221-224
[3]. P.C. WOODLAND, et al., Large Vocabulary
Continuous Speech Recognition Using HTK, Proc.
of International Conference on Acoustics, Speech
and Signal Processing, (1994) 125-128