SJCSE

Volume-1, Issue-1

SREEPATHY JOURNAL OF COMPUTER SCIENCE & ENGINEERING

Published by

Department of Computer Science and Engineering

Sreepathy Institute of Management and Technology, Vavanoor

Palakkad - 679 533

June 2014


Contents

Secret Sharing of Color Images by N out N POB System, Ganesh P, G Santhosh Kumar, A Sreekumar. In this paper, a new secret sharing scheme for color images is proposed. Our scheme uses a method to construct an N out of N secret sharing scheme for color images based on a new number system called the Permutation Ordered Binary (POB) number system. This scheme is an efficient way to hide secret information in different shares. Furthermore, the size of the shares is less than or equal to the size of the secret. This method reconstructs the secret image with original quality. . . . 1

Locating Emergency Responders in Disaster Area Using Wireless Sensor Network, Aparna M. A worldwide increase in the number of natural hazards is causing heavy loss of human life and infrastructure. An effective disaster management system is required to reduce the impacts of natural hazards on common life. The first-hand responders play a major role in effective and efficient disaster management. Locating and tracking the first-hand responders is necessary to organize and manage real-time delivery of medical and food supplies for disaster-hit people. This requires effective communication and information processing between various groups of emergency responders in harsh and remote environments. Locating, tracking, and communicating with emergency responders can be achieved by devising a body sensor system for the emergency responders. This work discusses an algorithm and its implementation for localization of emergency responders in a disaster-hit area. The indoor and outdoor experimentation results are also presented. . . . 5

Brute Force Attack Defensing With Online Password Guessing Resistant Protocol, Jyothis K P, Padmadas M P, Krishnan N. Brute force and dictionary attacks on password-only remote login services are now widespread and ever increasing. Enabling convenient login for legitimate users while preventing such attacks is a difficult problem. Automated Turing Tests (ATTs) continue to be an effective, easy-to-deploy approach to identify automated malicious login attempts with reasonable cost of inconvenience to users. In this paper, we discuss the inadequacy of existing and proposed login protocols designed to address large-scale online dictionary attacks (e.g., from a botnet of hundreds of thousands of nodes). We propose a new Password Guessing Resistant Protocol (PGRP), derived upon revisiting prior proposals designed to restrict such attacks. While PGRP limits the total number of login attempts from unknown remote hosts to as low as a single attempt per username, legitimate users in most cases (e.g., when attempts are made from known, frequently-used machines) can make several failed login attempts before being challenged with an ATT. We analyze the performance of PGRP with two real-world data sets and find it more promising than existing proposals. PGRP accommodates both graphical user interfaces (e.g., browser-based logins) and character-based interfaces (e.g., SSH logins), while the previous protocols deal exclusively with the former, requiring the use of browser cookies. PGRP uses either cookies or IP addresses, or both, for tracking legitimate users. . . . 9

Intelligent Image Interpreter, Jayasree N Vettath. The intrinsic information present in an image is very hard for a computer to interpret. There are many approaches focused on finding the salient object in an image. Here we propose a novel architectural approach to find the relation between salient objects using local and global analysis. The local analysis focuses on salient object detection with efficient relation mining in the context of the image being processed. For an effective global analysis we created an ontology tree by considering a wide set of natural images. From these natural images we create an affinity-based ontology graph; with the help of this local and global contextual graph we construct an annotated parse tree. The tree so formed is helpful in large image search. Our proposal will thus give new heights to content-based image retrieval and image interpretation problems. . . . 16


Self Interpretive Algorithm Generator, Jayanthan K S. Thinking is a complex procedure which is necessary to deal with a complex world. Machines that help us handle this complexity can be regarded as intelligent tools, tools that support our thinking capabilities. The need for such intelligent tools will grow as new forms of complexity evolve in our increasingly globally networked information society. Algorithms are considered the key component of any problem-solving activity. In order for machines to be capable of intelligently predigesting information, they should perform in a way similar to the way humans think. Humans have developed sophisticated methods for dealing with the world's complexity, and it is worthwhile to adapt some of them. Sometimes the principles of human thinking are combined with the functional principles of biological cell systems. Here the consideration of human thinking is extended to the interpretive capacity of the machine, measured as the capacity of the machine to generate algorithms. Many definitions of algorithm are based on the single idea that input is converted to output in a finite number of steps. An algorithm is a step-by-step procedure to complete a task: any set of detailed instructions which results in a predictable end state from a known beginning. This project is an attempt to generate algorithms by machine without human intervention. Evolutionary computation is a major tool which is competitive with humans in many different areas of problem solving, so here a Genetically Evolved Cellular Automaton is used to give the system the self-interpretive capacity to generate algorithms. . . . 20

Natural Language Generation from Ontologies, Manu Madhavan. Natural Language Generation (NLG) is the task of generating natural language text suitable for human consumption from a machine representation of facts, which can be pre-structured in some linguistically amenable fashion, or completely unstructured. An ontology is a formal explicit description of concepts in a domain of discourse, and is considered a formal knowledge repository which can be used as a resource for NLG tasks. A domain ontology provides the input for the content determination and micro-planning of the NLG task; a linguistic ontology can be used for lexical realization. The logically structured manner of knowledge organization within an ontology makes it possible to perform reasoning tasks like consistency checking, concept satisfiability, concept subsumption and instance checking. These types of logical inferencing actions are applied to derive descriptive texts as answers to user queries on ontologies. Thus a simple natural-language-based Question Answering system can be implemented, guided by robust NLG techniques that act upon ontologies. Some of the tools for constructing ontologies (Protege, NaturalOWL, etc.) and their combination with the NLG process are also discussed. . . . 24

Speech Synthesis using Artificial Neural Network, Anjali Krishna C R, Arya M B, Neeraja P N, Sreevidya K M, Jayanthan K S. Text-to-speech conversion is a large area which has shown very fast development in the last few decades. Our goal is to study and implement the specific tasks involved in text-to-speech conversion, namely text normalization, grapheme-to-phoneme conversion, phoneme concatenation, and speech engine processing. Using a neural network for grapheme-to-phoneme conversion provides more accuracy than a normal corpus- or dictionary-based approach. Usually, grapheme-to-phoneme conversion in text-to-speech is performed using a dictionary-based method. The main limitation of this technique is that it cannot give the phoneme of a word which is not in the dictionary, and achieving higher efficiency in phoneme generation requires a large collection of word-pronunciation pairs; a large dictionary also requires large storage space. This limitation can be overcome using a neural network. The main advantage of this approach is that it can adapt to unknown situations, i.e., it can predict the phoneme of a grapheme which is not defined so far. The neural network system requires less memory than a dictionary-based system and performed well in tests. The system will be very useful for illiterate and vision-impaired people, who face many problems in their day-to-day life due to differences in script systems, to hear and understand content. . . . 34

Cross Domain Sentiment Classification, S. Abilasha, C. H. Chithira, Fazlu Rahman, P. K. Megha, P. Safva Mol, Manu Madhavan. Sentiment analysis refers to the use of natural language processing and machine learning techniques to identify and extract subjective information in source material such as product reviews. Due to revolutionary developments in web technology and social media, reviews can span so many different domains that it is difficult to gather annotated training data for all of them. Cross-domain sentiment analysis involves the adaptation of information learned in some (labeled) source domain to an unlabeled target domain. The method proposed in this project uses an automatically created sentiment-sensitive thesaurus for domain adaptation. Based on a survey of the related literature, we identified L1-regularized logistic regression as a good binary classifier for our area of interest. In addition to the previous work, we propose the use of SentiWordNet and adjective-adverb combinations for effective feature learning. . . . 41


Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014

Secret Sharing of Color Images by N out N POB System

Ganesh P1, G Santhosh Kumar2, A Sreekumar3
1Dept. of Computer Science and Engg., SIMAT, Vavanoor, Palakkad
[email protected]
3Department of Computer Applications, Cochin University of Science & Technology, Kochi

Abstract—In this paper, a new secret sharing scheme for color images is proposed. Our scheme uses a method to construct an N out of N secret sharing scheme for color images based on a new number system called the Permutation Ordered Binary (POB) number system. This scheme is an efficient way to hide secret information in different shares. Furthermore, the size of the shares is less than or equal to the size of the secret. This method reconstructs the secret image with original quality.

Keywords—Visual secret sharing, visual cryptography, POB number system.

I. INTRODUCTION

SECRET sharing is a method of distributing a secret among a group of participants, each of whom is allocated a share of the secret. The secret can be reconstructed only when the shares are combined; individual shares are of no use on their own. Secret sharing was introduced by Shamir [1] in 1979. Shamir's solution is based on polynomial interpolation in a finite field. Several secret sharing schemes have since been proposed, but most of them need a lot of computation to decode the secret. In 1994, Naor and Shamir [2] introduced the visual cryptography scheme (VCS), which allows visual information (pictures, text, etc.) to be encrypted in such a way that the decryption can be performed by the human visual system, without the aid of computers. The drawback of the scheme is that it works only with black and white images. In 1997, Verheul and Van Tilburg [7] used the concept of arcs to construct a colored visual cryptography scheme. The major disadvantage is that the number of colors and the number of subpixels determine the resolution of the recovered secret image. F. Liu et al. (2008) [12] present different work done to improve the quality of the recovered images. The major disadvantages of these schemes are that the size of the shares increases with the number of participants and the quality of the recovered secret is lower than that of the original secret.

The rest of the paper is organized as follows. Section II briefly reviews the permutation ordered binary number system and the algorithms related to the N out of N threshold scheme. Our approach for color images is given in Section III. Results and discussion are given in Section IV. Section V concludes the work.

II. PERMUTATION ORDERED BINARY (POB) NUMBER SYSTEM

The POB number system [14] is defined with two non-negative integral parameters n and r, where n ≥ r, and is denoted POB(n, r). The system represents all decimal integers in the range 0, 1, ..., nCr − 1 as a binary string B = b_{n−1} b_{n−2} ... b_0 of length n having exactly r 1s. The binary string B is known as a POB number. For every POB number B there exists a POB value, denoted V(B), calculated using the following formula:

V(B) = Σ_{j=0}^{n−1} b_j · jC_{p_j}, where p_j = Σ_{i=0}^{j} b_i.

The POB(9, 4) system is used in this paper, so each binary string B is of length 9 with exactly four 1s. There are 126 POB values, in the range [0, 125]. For the POB(9, 4) number system the smallest POB number is 000001111 (decimal value 15, POB value 0C1 + 1C2 + 2C3 + 3C4 = 0) and the largest POB number is 111100000 (decimal value 480, POB value 5C1 + 6C2 + 7C3 + 8C4 = 5 + 15 + 35 + 70 = 125).

For example, the POB value of (111100000)_POB is 5C1 + 6C2 + 7C3 + 8C4 = 5 + 15 + 35 + 70 = 125; the binary representation of 125 is 1111101, so any POB(9, 4) value fits in 7 bits.
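To make the value computation concrete, here is a short Python sketch (an illustrative helper, not part of the paper) that evaluates V(B) directly from the definition above; it reproduces the two worked examples:

```python
from math import comb

def pob_value(bits):
    """V(B) = sum of jC(p_j) over positions j with b_j = 1,
    where p_j is the number of 1s among b_0..b_j (Section II)."""
    v, ones = 0, 0
    for j, bit in enumerate(reversed(bits)):  # bits[-1] is b_0
        ones += bit == "1"                    # running count p_j
        if bit == "1":
            v += comb(j, ones)                # add jC(p_j)
    return v

assert pob_value("000001111") == 0    # smallest POB(9,4) number
assert pob_value("111100000") == 125  # largest POB(9,4) number
```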

A. N out of N Construction using POB [14]

The N out of N secret sharing scheme encrypts the secret image into N shares so that only someone with all N shares can decrypt the image, while any N−1 shares reveal no information about the original image. Like traditional visual cryptography, the POB scheme uses only XOR operations for encryption and decryption of the secret. The N out of N construction is based on the following theorem.

Theorem: Let T be a binary string of even parity having length 9. Then we can find two binary strings A and B, each having exactly four 1s and five 0s, such that T = A ⊕ B.

Proof: We can assume without loss of generality that the leading 2m digits of T are 1s, where 0 ≤ m ≤ 4, and the remaining 9 − 2m digits are 0s. Now let A = PQ be the binary string obtained by concatenating strings P and Q, where P is a string of length 2m having exactly m 1s and m 0s, and Q has exactly 4 − m 1s and 5 − m 0s. Then the choice B = P′Q, where P′ is the complement of P, proves the theorem.
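The proof is constructive. A minimal Python sketch of the construction for an arbitrary even-parity T (generalizing the WLOG ordering by working on positions rather than prefixes; names are illustrative):

```python
def split_even_parity(t):
    """Split a 9-bit even-parity string T into shares A and B, each with
    exactly four 1s, such that T = A XOR B (sketch of the theorem)."""
    ones  = [i for i, c in enumerate(t) if c == "1"]   # 2m positions
    zeros = [i for i, c in enumerate(t) if c == "0"]   # 9 - 2m positions
    m = len(ones) // 2
    # A covers m of T's 1-positions plus (4 - m) of T's 0-positions
    a_pos = set(ones[:m]) | set(zeros[:4 - m])
    a = "".join("1" if i in a_pos else "0" for i in range(9))
    b = "".join(str(int(x) ^ int(y)) for x, y in zip(t, a))  # B = T XOR A
    return a, b

a, b = split_even_parity("110000000")
assert a.count("1") == 4 and b.count("1") == 4
assert "".join(str(int(x) ^ int(y)) for x, y in zip(a, b)) == "110000000"
```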

B. Recovery Procedure

To reconstruct the secret image/information from the N shares, first find the decimal equivalent of each 7-bit binary string, divide the second POB value by 14, and store the quotient as r. Then calculate the POB number of each decimal value and perform the XOR operation on these POB numbers. The result is stored in an array T of size 9. In order to get the secret information K, the r-th bit of T is dropped. The details of the N out of N construction using the POB scheme are discussed in [14].
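A compact sketch of this recovery for a single secret byte, assuming each share stores a POB(9, 4) value in [0, 125]; the bit-indexing convention used when dropping the r-th bit is an assumption here, and [14] fixes the exact details:

```python
from itertools import combinations
from math import comb

def pob_value(bits):
    """V(B) from the definition in Section II."""
    v, ones = 0, 0
    for j, bit in enumerate(reversed(bits)):
        ones += bit == "1"
        if bit == "1":
            v += comb(j, ones)
    return v

# value -> POB(9,4) number lookup table (126 entries)
POB_NUMBER = {pob_value(s): s for s in
              ("".join("1" if i in ones else "0" for i in range(9))
               for ones in combinations(range(9), 4))}

def recover_byte(share_values):
    r = share_values[1] // 14             # quotient of the 2nd POB value by 14
    t = 0
    for v in share_values:                # XOR the POB numbers of all N shares
        t ^= int(POB_NUMBER[v], 2)
    bits = format(t, "09b")               # the array T of size 9
    return int(bits[:r] + bits[r + 1:], 2)  # drop the r-th bit -> secret byte
```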

III. PROPOSED METHOD

The objective of the proposed scheme is to generate an (N, N) scheme by applying the POB(9, 4) system to color images. The aim is to get better quality decrypted images with the same size as the original image. Figure 1 shows the flowchart of the encryption algorithm. In this algorithm a color image is decomposed into three channels and each channel is considered as a gray-level image. Halftoning is applied to each gray-level image and a single gray scale image is generated. The POB(9, 4) encryption scheme is then performed to create the N shares. Figure 2 shows the flowchart of decryption, where we superimpose (using the XOR operation) the shares of each channel to get the decrypted image of each channel. These three decrypted channels are combined to get the decrypted color image.

Fig. 1: Encryption Algorithm.

A. Encryption

In the encryption algorithm the shares are generated from the color image. The color image is decomposed into CR, CG and CB channels. From these channels the shares are created using the following steps:

1) The values of the CR and CG channels are mapped into 8 levels and the values of the CB channel into 4 levels, as shown in Table 1 and Table 2.

2) Generate a single gray scale image from the R, G and B components obtained from the level description using

Fig. 2: Decryption Algorithm.

a halftone method defined by the equation: gray scale image = 32*R + 4*G + B. The value of each element of the gray scale image is in the range [0, 255] (a sketch of this packing follows the list).

3) The N out of N POB(9, 4) encoding method described in Section II-A is used for creating N encrypted shares. The encrypted shares are ES1, ES2, ..., ESN.
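The halftone packing of step 2 can be sketched as follows; since the paper's level description tables (Table 1 and Table 2) are not reproduced here, uniform quantization stands in for them:

```python
import numpy as np

def to_gray(img):
    """Pack an (h, w, 3) uint8 color image into one 8-bit gray image:
    3 bits of red level, 3 bits of green level, 2 bits of blue level."""
    r = img[..., 0].astype(int) // 32   # 8 levels (stand-in for Table 1)
    g = img[..., 1].astype(int) // 32   # 8 levels (stand-in for Table 1)
    b = img[..., 2].astype(int) // 64   # 4 levels (stand-in for Table 2)
    return (32 * r + 4 * g + b).astype(np.uint8)   # stays in [0, 255]
```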

B. Decryption

In the decryption algorithm the color image channels are reconstructed by stacking the shares of the channels. These color image channels are combined to get the secret color image.

1) Find the 9-bit POB numbers corresponding to the POB values (elements) represented by the encrypted shares ES1, ES2, ..., ESN and store them in A1, ..., AN respectively. Perform the XOR operation on A1, ..., AN and store the result in T. T is an m×n matrix of POB numbers, where m is the width and n is the height of the share image.

2) Divide each element of share ES2 by 14 and store the quotient in Q.

3) Shift method: Let r = Qij, where 1 ≤ i ≤ m, 1 ≤ j ≤ n. Delete the bit in the r-th position of Tij, shifting the remaining bits left by 1. The resultant T represents the recovered gray scale image.

4) For each element of the gray scale image, the first three bits represent the red component level, the next three bits represent the green component level, and the remaining bits represent the blue component level. Select random numbers in the value range of the corresponding level of each component, based on the level description table, and assign them to the corresponding channel values, as sketched below. Combine the channels to get the decrypted color image.
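Step 4 can be sketched symmetrically to the encryption packing; again, a random intensity within each level's range stands in for the paper's level description tables:

```python
import numpy as np

def gray_to_color(gray, rng=np.random.default_rng()):
    """Unpack the recovered gray image into R, G, B channels (sketch)."""
    r_level, g_level, b_level = gray // 32, (gray // 4) % 8, gray % 4
    r = r_level * 32 + rng.integers(0, 32, size=gray.shape)  # 8 red levels
    g = g_level * 32 + rng.integers(0, 32, size=gray.shape)  # 8 green levels
    b = b_level * 64 + rng.integers(0, 64, size=gray.shape)  # 4 blue levels
    return np.stack([r, g, b], axis=-1).astype(np.uint8)
```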

IV. RESULTS AND DISCUSSION

As described in Section III, three color channels are extracted from the color image. The halftoning method with the level descriptor table is applied to produce a single gray-scale image. The 3-out-of-3 POB(9, 4) scheme is then applied to the resultant gray scale image to produce 3 shares. The standard Lena color image of size 256×256 is used for the experiment. The results are shown in Fig. 3 (a)-(h). It is clear that the secret is revealed only when the three shares are combined. The clarity of the obtained image is comparable with the original image. Table 3 presents the size of the decrypted image using different (n, n) color visual cryptography schemes with c colors. In all other methods the size of the decrypted image is increased, whereas in the proposed method the size remains the same as that of the original image.

A. Analysis of Attack

We have analyzed the effects of attack on the proposed scheme. In the construction under the POB(9, 4) number system, there are 126 possible shares corresponding to one byte of secret. The probability of correctly guessing a share is 1/126 per byte

Fig. 3: (a) Secret image; (b)-(d) shares of participants; (e) reconstructed image using share 1 and share 2; (f) reconstructed image using share 1 and share 3; (g) reconstructed image using share 2 and share 3; (h) reconstructed image using shares 1, 2 and 3.

of secret. This means that for a secret of m bytes, the probability of correctly guessing a share is as low as (1/126)^m, which tends to 0 as m becomes large.

V. CONCLUSION

In this paper we have proposed a POB(9, 4) scheme for color images which uses halftoning on the color channels. The XOR operation is used in stacking, which produces better image quality, and there is no expansion in the size of the decrypted image. The quality of the decrypted image is shown to be better than that of the other schemes. In the basic (n, n) threshold visual cryptography scheme used for color images, the size of the share image is 2^(n−1) times that of the original. By optimizing the storage of POB values in memory it is possible to reduce the size of the shares. To improve the quality of the images, the POB number system can be applied directly to each of the RGB components of the image. These are treated as future work.

REFERENCES

[1] Shamir, Adi (1979). "How to share a secret". Communications of the ACM 22 (11): pp. 612-613. doi:10.1145/359168.359176.

[2] Moni Naor, Adi Shamir, Visual Cryptography, EUROCRYPT 1994, pp. 1-12.

[3] Ateniese, G., Blundo, C., De Santis, A., and Stinson, D. R. (1996), Constructions and Bounds for Visual Cryptography, 23rd International Colloquium on Automata, Languages and Programming (ICALP 96), Lecture Notes in Computer Science, Vol. 1099, pp. 416-428, Springer-Verlag, Berlin.

[4] Ateniese, G., Blundo, C., De Santis, A., and Stinson, D. R. (1996), Visual Cryptography for General Access Structures, Information and Computation, vol. 129(2), pp. 86-106.

[5] Atici, M., Stinson, D. R. and Wei, R., A New Practical Algorithm for the Construction of a Perfect Hash Function, final version, 1995.

[6] Ateniese, G., Blundo, C., De Santis, A., and Stinson, D. R. (1996), Extended Schemes for Visual Cryptography, submitted to Discrete Applied Mathematics.

[7] Verheul, E., and Tilborg, H. V., Constructions and Properties of k out of n Visual Secret Sharing Schemes, Designs, Codes and Cryptography, vol. 11, no. 2, 1997, pp. 179-196.


[8] Hsien-Chu Wu, Hao-Cheng Wang and Rui-Wen Yu, Color Visual Cryptography Scheme using Meaningful Shares, Eighth International Conference on Intelligent Systems Design and Applications, Volume 3, pp. 173-178, 2008.

[9] Kiran Kumari, Shalini Bhatia, Multi-pixel Visual Cryptography for Color Images with Meaningful Shares, International Journal of Engineering Science and Technology, Vol. 2(6), 2010, pp. 2398-2407.

[10] Nagaraj V. Dharwadkar, B. B. Amberker, Sushil Raj Joshi, Visual Cryptography for Color Image using Color Error Diffusion, ICGST-GVIP Journal, vol. 10, issue 1, February 2010.

[11] B. Sai Chandana, S. Anuradha, A New Visual Cryptography Scheme for Color Images, International Journal of Engineering Science and Technology, vol. 2(6), 2010, pp. 1997-2000.

[12] F. Liu, C. K. Wu, X. J. Lin, Colour visual cryptography schemes, IET Information Security, 2008, Vol. 2, No. 4, pp. 151-165. doi: 10.1049/iet-ifs:20080066.

[13] Chin-Chen Chang, Chwei-Shyong Tsai, Tung-Shou Chen, A New Scheme for Sharing Secret Color Images in Computer Network, ICPADS '00: Proceedings of the Seventh International Conference on Parallel and Distributed Systems, pp. 21-27.

[14] Sreekumar, A., Babu Sundar, S., An Efficient Secret Sharing Scheme for n out of n scheme using POB-number system, IIT Kanpur Hackers Workshop 2009.


Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014

Locating Emergency Responders in Disaster Area Using Wireless Sensor Network

Aparna M
Dept. of Computer Science and Engg., SIMAT, Vavanoor
[email protected]

Abstract—A worldwide increase in the number of natural hazards is causing heavy loss of human life and infrastructure. An effective disaster management system is required to reduce the impacts of natural hazards on common life. The first-hand responders play a major role in effective and efficient disaster management. Locating and tracking the first-hand responders is necessary to organize and manage real-time delivery of medical and food supplies for disaster-hit people. This requires effective communication and information processing between various groups of emergency responders in harsh and remote environments. Locating, tracking, and communicating with emergency responders can be achieved by devising a body sensor system for the emergency responders. In phase 1 of this research work, we have developed an enhanced trilateration algorithm for static and mobile wireless sensor nodes. This work discusses an algorithm and its implementation for localization of emergency responders in a disaster-hit area. The indoor and outdoor experimentation results are also presented.

Keywords—Wireless sensor networks, localization, disaster area, emergency responders.

I. INTRODUCTION

THE advent of Wireless Sensor Networks (WSNs) has marked an era in the sensing and monitoring field. The technology has made it possible to monitor otherwise remote and inaccessible areas such as active volcanoes, avalanche zones and so on. WSNs are widely used in various areas such as environmental monitoring, medical care, and disaster prevention and mitigation. This paper details yet another application of WSNs in the post-disaster scenario and comes up with an algorithm for localization. When a disaster has struck an area, it is important to act immediately to rescue and give first-line help, in the form of medical aid, food and so on, to the people in that area. Thus the role of first-line emergency responders becomes a vital part of the post-disaster scenario. In a disaster scenario, locating and tracking the first-hand responders is essential to organize and manage real-time delivery of medicine and food to disaster-hit people. This requires effective communication and information processing between various groups of emergency responders in harsh and remote environments.

Locating, tracking, and communicating with emergency responders can be achieved by devising a body sensor system for the emergency responders. This project aims to locate the emergency responders in different locations. In a disaster-hit area the whole communication network may be damaged, making communication between responders impossible; localization of responders is therefore very difficult with technologies other than WSN.

This research project is an application of Wireless Sensor Networks in disaster management. The paper addresses the development of an algorithm that can perform precise localization and tracking of the responders with indirect line-of-sight. Responders will be randomly located in the area, so an ad-hoc network is formed between the sensor nodes. The following sections brief the role of responders and the location tracking algorithms used.

II. RELATED WORK

Many indoor localization algorithms have been developed using the Received Signal Strength Indicator (RSSI). A method for distance measurement using RSSI is discussed in [7]. The accuracy of the RSSI method can be strongly influenced by multi-path, fading, non-line-of-sight (NLoS) conditions and other sources of interference [11]. In a disaster-prone area these effects are more pronounced, and as such the RSSI method cannot be used in our study. A coordinate estimation method of localization using the principle of GPS has been suggested in [1]. Since GPS modules can be costly, it has been discarded for our study.

In this project we focused on a category of localization methods which estimate coordinates based on distance measurement. Instead of considering signal strength, the time of arrival of each packet is considered, which increases the accuracy of the location. The algorithm is based on the Time Difference of Arrival (TDOA) method, whose accuracy is much higher compared to RSSI, and we also present optimization methods to decrease the error of estimating the location using TDOA.

III. SYSTEM ARCHITECTURE

As suggested in [13], creating a common operating picture for all responders in an emergency situation is essential for appropriate action in the disaster-hit area and for the safety of the responders. Protective suits used by the responders to keep safe from hazardous materials create unique problems for response teams, because the suits often make it difficult to read instrument screens, and if subject-matter experts are not on scene, responders must find ways to relay the information back to them.

The aim is to develop an algorithm that can perform precise localization of sensor nodes with indirect line-of-sight by utilizing location information and distance measurements over multiple hops. To achieve this goal, nodes use their ranging sensors to measure the distance to their neighbors and share their measurement and location information with their neighbors to collectively estimate their locations. Multilateration is a suitable method for localization in outdoor navigation. As described in [10], when the receiver sends the signal to locate itself, it finds at least three nearest anchor nodes which know their positions. The receiver then calculates the distance between one satellite and itself. If the distance is X, it draws an imaginary sphere with X as the radius from the receiver to the satellite and the node as the centre [13]. The same process is repeated for the next two nodes. Thus three spheres are drawn, with just two possible positions. Of these, one point will be in space and the other will be the location of the receiver; thus the exact position of the receiver is found. Usually receivers try to locate more than four satellites so as to increase the accuracy of the location. The Earth is taken as the fourth sphere, so that two points converge with the imaginary spheres of the other three satellites. This method is commonly called the 3-D trilateration method. In this paper we have tried to implement a modified version of the above-mentioned method.

Fig. 1 shows the entire architecture of the system. The wireless sensor network is formed by the required number of MicaZ motes. Each MicaZ mote carries the program for localization, tracking and monitoring the position of the unknown node, and also the program for time synchronization.

Fig. 1: System Architecture

TinyDB plays an important role here. TinyDB extracts query information from a network of motes. It provides a simple Java API for writing PC applications that query and extract data from the network; it also comes with a simple graphical query builder and result display that uses the API.

IV. ALGORITHM DESIGN

The algorithm used here is a trilateration algorithm, implemented using TinyOS in the NesC language. This is the first step towards responder localization.

Trilateration is the method of using the relative positions of nearby objects to calculate the exact location of the object of interest. Instead of using a known distance and an angle measurement as in normal triangulation methods, we use three known distances to perform the calculation. Here we

Fig. 2: TinyDB Architecture

use a GPS receiver to find the coordinate points of three nodes.

Algorithm: Localization
1. Set N nodes in the field and synchronize all the nodes.
2. Each node continuously sends RF signals with a packet containing a field for a time stamp, i.e., whenever an RF signal is sent by an unknown node, the unknown node adds the sending time to the packet it broadcasts.
3. In addition, there are B beacon nodes.
4. Each node n_i, 1 ≤ i ≤ N, estimates its distance d_ij from each beacon j, 1 ≤ j ≤ M, where M is the number of beacon nodes in its transmission range (assume M = 3).
5. Each beacon node calculates the distance from the node: Distance = speed × time.
6. Making this distance the radius, each beacon creates a circle.
7. The intersecting point of these 3 circles is the position of the unknown node (a sketch of this step follows the list).
8. Go to Step 5 if the unknown node is moving, and track its position at regular instants of time.
9. Stop.
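Steps 5-7 reduce to intersecting three circles. A small Python sketch (illustrative, not the NesC implementation): subtracting pairs of circle equations yields a linear system in the unknown (x, y), and the range itself comes from Distance = speed × time (step 5):

```python
import numpy as np

SPEED = 3e8   # propagation speed of the RF signal, m/s

def trilaterate(beacons, distances):
    """Solve for (x, y) from three beacon positions and measured ranges
    by subtracting pairs of circle equations (2-D trilateration)."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances
    A = np.array([[2 * (x2 - x1), 2 * (y2 - y1)],
                  [2 * (x3 - x1), 2 * (y3 - y1)]])
    b = np.array([d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2,
                  d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2])
    return np.linalg.solve(A, b)

tof = [1.0e-7, 1.2e-7, 0.8e-7]          # example times of flight, seconds
ranges = [SPEED * t for t in tof]        # Distance = speed x time
print(trilaterate([(0, 0), (30, 0), (0, 40)], ranges))
```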

Fig. 3: Graphical Representation of Trilateration

For example (see Fig. 3), point B is where the object of interest is located. The distances to the nearby objects P1 and P2 are known. From geometry, it can be concluded that only two possible locations, A and B, satisfy the criteria. To avoid ambiguity, the distance to a third nearby object is introduced, and now there is only one point B that could possibly exist [14].

If we apply the concept of 2-D trilateration to a GPS application, which exists in 3-D space, the circles in Fig. 3 become spheres.

In order for the receiver to calculate the distance to B, B sends RF signals with time stamps. The speed of an RF signal is known to be 3×10^8 m/s, and the sending time is known from the time-stamp value. From these, the distance can be measured using equation (1):

Distance = speed × time (1)

Using this calculated distance as radius and p1 as center, draw a circle. Repeat the same steps for p2 and p3 to get three circles. Solving the three equations of the circles gives one intersecting point. Consider three circles with centers (x1, y1, z1), (x2, y2, z2) and (x3, y3, z3):

X = (4k1 + 3k2 − 2k3)/6 (2)

Y = (2k1 + 3k2 − k3)/6 (3)

Z = (k1 + k2 − k3)/2 (4)

where k1, k2 and k3 are constants obtained from the three circle equations, and X, Y and Z are the coordinates of the unknown node.

V. IMPLEMENTATION

Field testing was carried out with the entire setup. The readings from the motes were taken outdoors and tabulated, as shown in Table 1. The setup includes one unknown node and three beacon nodes. The beacon nodes calculate the distance of the unknown node from themselves and compute the intersection point of the three circles formed by the calculated distances. The actual distance, the time stamp and the speed of the wave are tabulated in Table 1.

The beacon node uses the time stamp to calculate the time of arrival. Knowing the speed of the RF waves, the distance can be computed as discussed in Section IV. This distance is calculated in the beacon node, which localizes the unknown node. The data from the beacon node is viewed using XSniffer (Fig. 4).

A second method was tried out in which the unknown node itself acts as a sink node. The beacon nodes calculate the distance and send it to the unknown node, which performs the localization calculations. This avoids the use of a larger number of nodes, and it was found that the accuracy of the calculation increases. The XSniffer output is shown in Fig. 5. The actual distance is measured using measuring

Fig. 4: Distance measurement as viewed using XSniffer

Fig. 5: Distance measurement

instruments. Using the MicaZ motes we calculated the distance and viewed the result using XSniffer. We analyzed the actual distance against the calculated distance, plotted in Fig. 6, and came up with the following three conclusions:

1) There are small differences between the actual and calculated values.
2) The error rate always remains in the range 0.025%-0.5%.
3) The error rate is minimum when the unknown node and the beacon node are close; when they are far from each other, the error rate increases due to interference and delay.

Multilateration with a number of beacon nodes is a good method to reduce error in the calculations. Multilateration, also known as hyperbolic positioning, is the process of locating an object by accurately computing the time difference of arrival (TDOA) of a signal emitted from that object to three or more receivers. It also refers to the case of locating a receiver by measuring the TDOA of a signal transmitted from three or more synchronized transmitters. In practice, errors in the measurement of the time of arrival of pulses mean that enhanced accuracy can be obtained with more than four receivers. In general, N receivers provide N − 1 hyperboloids. When there are N > 4 receivers, the N − 1 hyperboloids should, assuming a perfect model and measurements, intersect at a single point. In reality, the surfaces rarely intersect because of various errors. In this case, the location problem can be posed as an optimization problem and solved using, for example, the least squares method or an extended Kalman filter. Additionally, the TDOA of multiple transmitted pulses from the emitter can be averaged to improve accuracy.
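A sketch of the least-squares option mentioned above, simplified to operate on range estimates rather than raw time differences (a Gauss-Newton refinement; a true TDOA solver would fit hyperboloids instead):

```python
import numpy as np

def multilaterate(anchors, ranges, iters=20):
    """Refine a position estimate from N >= 3 noisy range measurements
    by iterative least squares (Gauss-Newton)."""
    anchors = np.asarray(anchors, float)
    d = np.asarray(ranges, float)
    p = anchors.mean(axis=0)               # initial guess: anchor centroid
    for _ in range(iters):
        diff = p - anchors
        r = np.linalg.norm(diff, axis=1)   # predicted ranges ||p - a_i||
        J = diff / r[:, None]              # Jacobian of r_i w.r.t. p
        step, *_ = np.linalg.lstsq(J, d - r, rcond=None)
        p = p + step
    return p

# four receivers; averaging ranges over multiple pulses reduces the noise
print(multilaterate([(0, 0), (30, 0), (0, 40), (30, 40)],
                    [25.1, 25.0, 24.8, 25.2]))
```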

VI. ADVANTAGES

The main advantage of the system is its use of distance measurements, in contrast to the RSSI method, which is based on measuring signal strength. Because signal strength fluctuates, accurate localization of the object in the plane may not be possible with RSSI. In the algorithm described above, the relationship between speed and time is used to measure the distance: the TimeStamping interface keeps track of the time, and knowing the speed of the RF signal it is possible to calculate the distance.

VII. CONCLUSION AND FUTURE WORKS

This project develops a proper localization method for first responders. The trilateration algorithm used for localization was implemented and tested; it extracts the time value from the TimeStamping interface, and from this the distance is measured, so there is no ranging problem. When distance is measured using the strength of the radio signal, accuracy suffers because the signal strength is reduced by various factors; instead, we used the relationship between time and speed to calculate the distance. This method helps to locate the real coordinates of an object in the plane.

This work could be extended by implementing the algorithm in a tag which has the capability to perform the functions of a GPS receiver, so that the exact location can be identified with great accuracy in an outdoor environment.

The developed algorithm could also be extended to multilateration with minor changes. As the number of nodes increases, the accuracy increases. Also, the unknown node can act as a sink node and reduce the complexity of the network.

REFERENCES

[1] Hu, L., Evans, D.: Localization of mobile sensor networks. In: IEEE InfoCom 2000 (March 2000)

[2] Sabatto, S.Z., Elangovan, V., Chen, W., Mgaya, R.: Localization Strategies for Large-Scale Airborne Deployed Wireless Sensors. IEEE, Los Alamitos (2009)

[3] Kang, Li, X.: Power-Aware Markov Chain Based Tracking. IEEE Computer 37(8), 41-49 (2004)

[4] Park, J.Y., Song, H.Y.: Multilevel Localization for Mobile Sensor Network Platforms. Proceedings of the IMCSIT 3 (2008)

[5] http://www.tinyos.net/tinyos-1.x/doc/tutorial

[6] Ramesh, M.V., Kumar, S., Rangan, P.V.: Wireless Sensor Network for Landslide Detection. In: Proceedings of the Third International Conference on Sensor Technologies

[7] Bahl, P., Padmanabhan, V.: RADAR: An In-Building RF-based User Location and Tracking System. In: Proc. of IEEE INFOCOM 2000, pp. 775-784 (2000)

[8] Biswas, P., Ye, Y.: Semidefinite Programming for Ad Hoc Wireless Sensor Network Localization. In: 3rd International Symposium on Information Processing

[9] Bulusu, N., Estrin, D., Girod, L., Heidemann, J.: Scalable Coordination for Wireless Sensor Networks: Self-Configuring Localization Systems. In: Proceedings of the 6th IEEE International Symposium.

[10] Kannan, A., Mao, G., Vucetic, B.: Simulated Annealing based Localization in Wireless Sensor Network. In: Vehicular Technology Conference (2006), ieeexplore.ieee.org.

[11] Mao, G., Fidan, B., Anderson, B.D.O.: Wireless Sensor Network Localization Techniques. The International Journal of Computer and Telecommunications Networking (2007).

[12] Sichitiu, M.L., Ramadurai, V.: Localization of Wireless Sensor Networks with a Mobile Beacon. In: Proceedings of the IEEE ICMASS (2004).

[13] Ladd, M., Bekris, K.E., Rudys, A., Kavraki, L.E., Wallach, D.S.: Robotics-Based Location Sensing Using Wireless Ethernet. In: International Conference on Mobile Computing and Networking (2002).

[14] Simic, S.N., Sastry, S.: Distributed Localization in Wireless Ad Hoc Networks. Tech. report, UC Berkeley, 2002, Memorandum No. UCB/ERL M02/26 (2002).

[15] Evans, D., Hu, L.: Localization for mobile sensor networks. Proc. IEEE FGCN 2007 workshop chairs (2007).


Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014

Brute Force Attack Defensing With Online Password Guessing Resistant Protocol

1Jyothis K P, 2Padmadas M P, 3Krishnan N
1Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology
[email protected]
2Research Scholar, Centre for Information Technology and Engineering, M.S. University, Tirunelveli, India
[email protected]
3Professor, Centre for Information Technology and Engineering, M.S. University, Tirunelveli, India
[email protected]

Abstract—Brute force and dictionary attacks on password-only remote login services are now widespread and ever increasing. Enabling convenient login for legitimate users while preventing such attacks is a difficult problem. Automated Turing Tests (ATTs) continue to be an effective, easy-to-deploy approach to identify automated malicious login attempts with reasonable cost of inconvenience to users. In this paper, we discuss the inadequacy of existing and proposed login protocols designed to address large-scale online dictionary attacks (e.g., from a botnet of hundreds of thousands of nodes). We propose a new Password Guessing Resistant Protocol (PGRP), derived upon revisiting prior proposals designed to restrict such attacks. While PGRP limits the total number of login attempts from unknown remote hosts to as low as a single attempt per username, legitimate users in most cases (e.g., when attempts are made from known, frequently-used machines) can make several failed login attempts before being challenged with an ATT. We analyze the performance of PGRP with two real-world data sets and find it more promising than existing proposals. PGRP accommodates both graphical user interfaces (e.g., browser-based logins) and character-based interfaces (e.g., SSH logins), while the previous protocols deal exclusively with the former, requiring the use of browser cookies. PGRP uses either cookies or IP addresses, or both, for tracking legitimate users.

Keywords—Online password guessing attacks, brute force attacks, password dictionary, ATTs.

I. INTRODUCTION

THE online guessing attacks on password-based systems are inevitable and commonly observed against web applications and SSH logins. In a recent report, SANS identified password guessing attacks on websites as a top cyber security risk. As an example of SSH password guessing attacks, one experimental Linux honeypot setup has been reported to suffer on average 2,805 malicious SSH login attempts per computer per day. Interestingly, SSH servers that disallow standard password authentication may also suffer guessing attacks, e.g., through the exploitation of a lesser known/used SSH server configuration called keyboard-interactive authentication. However, online attacks have some inherent disadvantages compared to offline attacks: attacking machines must engage in an interactive protocol, thus allowing easier detection; and in most cases, attackers can try only a limited number of guesses from a single machine before being locked out, delayed, or challenged to answer Automated Turing Tests (ATTs, e.g., CAPTCHAs). Consequently, attackers often must employ a large number of machines to avoid detection or lockout.

Fig. 1: Use Case Diagram for Password Attack

One effective defense against automated online password guessing attacks is to restrict the number of failed trials without ATTs to a very small number (e.g., three), limiting automated programs (or bots) as used by attackers to three free password guesses for a targeted account, even if different machines from a botnet are used.

II. LITERATURE SURVEY

Even the best current guidelines for designing password-composition policies, for instance, are based on theoretical estimates or small-scale laboratory studies (e.g., [12, 20]). What makes designing an appropriate password-composition policy even trickier is that such policies affect not only the passwords users create, but also users' behavior. For example, certain password-composition policies that lead to more-difficult-to-predict passwords may also lead users to write down their passwords more readily, or to become more averse to changing passwords because of the additional effort of memorizing the new ones. Such behavior may also affect an adversary's ability to predict passwords and should therefore be taken into account when selecting a policy. For instance, we compared two password-composition policies: one required only that passwords be at least 16 characters long; the other required at least eight characters but also an uppercase letter, a number, a symbol, and a dictionary check. According to the best available guidelines, these two policies should result in passwords of approximately the same entropy. We find, however, that the 16-character policy yields significantly less predictable passwords, and that it is, by several metrics, less onerous for users. We believe this and other findings will be useful both to security professionals seeking to establish or review password-composition policies, and to researchers interested in examining how to make passwords more secure and usable. Notably, OpenID shows little evidence of user adoption [10]; object-based passwords [10] have emerged recently and little is known about them yet; and there are many other schemes [10]. C. Herley and P.C. van Oorschot argue that no silver bullet will meet all requirements, and that not only will passwords be with us for some time, but in many instances they are the solution which best fits the scenario of use [10]. C. Herley, P.C. van Oorschot, J. Bonneau and F. Stajano show that the scheme authors selected for the benefit analysis in their survey paper are not only optimistic but also incomplete, using the framework they have defined. Cormac Herley argues that most security advice simply offers a poor cost-benefit trade-off to users, and that users' rejection of the security advice they receive is entirely rational from an economic perspective [10]. So the password security community has looked back over the research history of password security and usability and come up with new paradigms of solutions, and yet it appears that the text password practice will not end soon.

III. ONLINE PGRP

In this section, we present the PGRP protocol, including its goals and design choices.

A. Goals, Operational Assumptions and Overview

1) Protocol Goals: Our objectives for PGRP include the following:

• The login protocol should make brute force and dictionary attacks ineffective even for adversaries with access to large botnets (i.e., capable of launching the attack from many remote hosts).
• The protocol should not have any significant impact on usability (user convenience). For example, for legitimate users, any additional steps besides entering login credentials should be minimal. Increasing the security of the protocol must have minimal effect in decreasing login usability.
• The protocol should be easy to deploy and scalable, requiring minimal computational resources in terms of memory, processing time, and disk space.

2) Assumptions: We assume that adversaries can solve a small percentage of ATTs, e.g., through automated programs, brute force mechanisms, and low-paid workers (e.g., Amazon Mechanical Turk). Incidents of attackers using IP addresses of known machines and cookie theft for targeted password guessing are also assumed to be minimal. Traditional password-based authentication is not suitable for any untrusted environment (e.g., a keylogger may record all keystrokes, including passwords, in a system and forward them to a remote attacker). We do not prevent such existing attacks in untrusted environments, and thus essentially assume that any machines legitimate users use for login are trustworthy. The data integrity of cookies must be protected (e.g., by a MAC using a key known only to the login server).

3) Overview: The general idea behind PGRP is that, except for the following two cases, all remote hosts must correctly answer an ATT challenge prior to being informed whether access is granted or the login attempt is unsuccessful:

1) when the number of failed login attempts for a given username is very small; and
2) when the remote host has successfully logged in using the same username in the past (however, such a host must pass an ATT challenge if it generates more failed login attempts than a prespecified threshold).

In contrast to previous protocols, PGRP uses either IP addresses, cookies, or both to identify machines from which users have been successfully authenticated. The decision to require an ATT challenge upon receiving incorrect credentials is based on the received cookie (if any) and/or the remote host's IP address. In addition, if the number of failed login attempts for a specific username is below a threshold, the user is not required to answer an ATT challenge, even if the login attempt is from a new machine for the first time (whether the provided username-password pair is correct or incorrect).

B. Data Structure and Function Description

1) Data Structures: PGRP maintains three data structures:1) W. A list of source IP address, username pairs such

that for each pair, a successful login from the source IPaddress has been initiated for the username previously.

2) FT. Each entry in this table represents the numberof failed login attempts for a valid username, un. Amaximum of k2 failed login attempts are recorded.Accessing a nonexisting index returns 0.

3) FS. Each entry in this table represents the number offailed login attempts for each pair of (srcIP, un). Here,srcIP is the IP address for a host in W or a host witha valid cookie, and un is a valid username attemptedfrom srcIP

A maximum of k1 failed login attempts arerecorded;crossing this threshold may mandate passingan ATT (e.g.,depending on FT12un ). An entry is set to 0after a successful login attempt. Accessing a nonexisting indexreturns 0. Each entry in W, FT, and FS has a “write-expiry“interval such that the entry is deleted when the given period oftime (t1, t2, or t3) has lapsed since the last time the entry was

Sreepathy Journal of Computer Sc. & Engg. Vol.1, Issue.1, June-2014

10

Page 15: SJCSE

Jyothis K P, et. al., Brute Force Attack Defensing With Online Password Guessing Resistant Protocol

inserted or modified. There are different ways to implementwrite-expiry intervals (e.g., hashbelt ).

A simple approach is to store a timestamp of the insertion time with each entry, updated whenever the entry is modified. Any time the entry is accessed, if the delta between the access time and the entry timestamp is greater than the data structure's write-expiry interval (i.e., t1, t2, or t3), the entry is deleted.
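A minimal sketch of this timestamp approach (illustrative Python; the intervals chosen for t1, t2, and t3 below are placeholders, not values from the paper):

```python
import time

class WriteExpiryTable:
    """A table whose entries expire a fixed interval after their last write;
    reading a missing or expired key returns 0, as PGRP requires."""
    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.data = {}                      # key -> (value, last_write_time)

    def get(self, key):
        value, stamp = self.data.get(key, (0, None))
        if stamp is not None and time.time() - stamp > self.interval:
            del self.data[key]              # expired: drop and report 0
            return 0
        return value

    def put(self, key, value):
        self.data[key] = (value, time.time())  # writing refreshes the stamp

W  = WriteExpiryTable(30 * 24 * 3600)   # t1: known (srcIP, username) pairs
FT = WriteExpiryTable(24 * 3600)        # t2: failed attempts per username
FS = WriteExpiryTable(24 * 3600)        # t3: failed attempts per (srcIP, un)
```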

2) Functions: PGRP uses the following functions (IN denotes input and OUT denotes output):

1) ReadCredential(OUT: un, pw, cookie): Shows a login prompt to the user and returns the entered username and password, and the cookie received from the user's browser (if any).

2) LoginCorrect(IN: un, pw; OUT: true/false): If the provided username-password pair is valid, the function returns true; otherwise, it returns false.

3) GrantAccess(IN: un, cookie): The function sends the cookie to the user's browser and then enables access to the specified user account.

C. Cookies versus Source IP Addresses

Similar to the previous protocols, PGRP keeps track of user machines from which successful logins have been initiated previously. Browser cookies seem a good choice for this purpose if the login server offers a web-based interface. Typically, if no cookie is sent by the user's browser to the login server, the server sends a cookie to the browser after a successful login to identify the user on the next login attempt. However, if the user uses multiple browsers or more than one OS on the same machine, the login server will be unable to identify the user in all cases. Cookies may also be deleted by users, or automatically, as enabled by the private browsing mode of most modern browsers. Moreover, cookie theft (e.g., through session hijacking) might enable an adversary to impersonate a user who has been successfully authenticated in the past. In addition, using cookies requires a browser interface (which, e.g., is not applicable to SSH). Alternatively, a user machine can be identified by the source IP address. Relying on source IP addresses to trace users may result in inaccurate identification, for various reasons including:

• The same machine might be assigned different IP addresses over time (e.g., through the network DHCP server or dial-up Internet).
• A group of machines might be represented by a small number of, or even a single, Internet-addressable IP address if a NAT mechanism is in place. However, most NATs serve few hosts, and DHCPs usually rotate IP addresses on the order of several days (also, techniques to identify machines behind a NAT exist).

Drawbacks of identifying a user by means of either a browser cookie or a source IP address include:
◦ Failing to identify a machine from which the user has authenticated successfully in the past.
◦ Wrongly identifying a machine the user has not authenticated from before.

Case 1 decreases usability, since the user might be asked to answer an ATT challenge for both correct and incorrect login credentials. Case 2 affects security, since some users/attackers may not be asked to answer an ATT challenge even though they have not logged in successfully from those machines in the past. However, the probability of launching a dictionary or brute force attack from these machines appears to be low. First, for identification through cookies, a directed attack to steal users' cookies is required by an adversary. Second, for identification through IP addresses, the adversary must have access to a machine in the same subnet as the user. Consequently, we choose to use both browser cookies and the source IP address (or only one of them if the other is not applicable) in PGRP to minimize user inconvenience during the login process. Also, by using IP addresses only, PGRP can be used in character-based login interfaces such as SSH. An SSH server can be adapted to use PGRP using text-based ATTs (e.g., textcaptcha.com); for example, a prototype of a text-based CAPTCHA for SSH is available as a source code patch for OpenSSH. The security implications of mistakenly treating a machine as one that a user has previously successfully logged in from are limited by a threshold, such that after a specific number of failed login attempts (k1 in Fig. 2), an ATT challenge is imposed. For identification through a source IP address, the condition FS[srcIP, un] < k1 in line 4 (for correct credentials) and in line 16 (for incorrect credentials) limits the number of failed login attempts an identified user can make without answering ATTs. The function Valid(cookie, un, k1, true) in line 4 updates a counter in the received cookie; the cookie is considered invalid once this counter hits or exceeds k1. This function is also called in line 16 to check this counter in case of a failed login attempt.

D. Decision Function for Requesting ATTs

Below we discuss issues related to ATT challenges as provided by the login server in Fig. 2. The decision to challenge the user with an ATT depends on two factors:

• whether the user has authenticated successfully from the same machine previously;
• the total number of failed login attempts for a specific user account (see the definitions of W, FT, and FS above).

Fig. 2: PGRP: Password Guessing Resistant Protocol

1) Username-Password Pair Is Valid: As in the condition in line 4, upon entering a correct username-password pair, the user will not be asked to answer an ATT challenge in the following cases:

• A valid cookie is received from the user machine (i.e., the function Valid returns true) and the number of failed login attempts from the user machine's IP address for that username, FS[srcIP, un], is less than k1 over a time period determined by t3.
• The user machine's IP address is in the whitelist W and the number of failed login attempts from this IP address for that username, FS[srcIP, un], is less than k1 over a time period determined by t3.
• The number of failed login attempts from any machine for that username, FT[un], is below a threshold k2 over a time period determined by t2.

The last case enables a user who tries to log in from a new machine/IP address for the first time, before k2 is reached, to proceed without an ATT. However, if the number of failed login attempts for the username exceeds the threshold k2 (default 3), this might indicate a guessing attack, and hence the user must pass an ATT challenge.

2) Username-Password Pair Is Invalid: Upon entering an incorrect username-password pair, the user will not be asked to answer an ATT challenge in the following cases:
• A valid cookie is received from the user machine (i.e., the function Valid returns true) and the number of failed login attempts from the user machine's IP address for that username, FS[srcIP, un], is less than k1 (line 16) over a time period determined by t3;
• The user machine's IP address is in the whitelist W and the number of failed login attempts from this IP address for that username, FS[srcIP, un], is less than k1 (line 16) over a time period determined by t3;
• The username is valid and the number of failed login attempts (from any machine) for that username, FT[un], is below a threshold k2 (line 19) over a time period determined by t2.

A failed login attempt from a user with a valid cookie or in the whitelist W will not increase the total number of failed login attempts in the FT table, since it is expected that legitimate users may potentially forget or mistype their password (lines 16-18). Nevertheless, if the user machine is identified by a cookie, the corresponding counter of failed login attempts in the cookie will be updated. In addition, the FS entry indexed by the (source IP address, username) pair will also be incremented (line 17). Once the cookie counter or the corresponding FS entry hits or exceeds the threshold k1 (default value 30), the user must correctly answer an ATT challenge.
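Taken together, the two cases above amount to a small decision function. The following Python sketch distills the conditions under stated assumptions: W is a set of whitelisted source IP addresses, FS maps (srcIP, un) pairs and FT maps usernames to failure counts, username_exists() is a hypothetical helper, and valid() is the cookie sketch given earlier; expiry of table entries (t2, t3) is left out for brevity. This is a reading of the rules above, not the protocol's literal pseudocode.

def att_required(src_ip, un, cookie, creds_ok, W, FS, FT, k1, k2):
    known = valid(cookie, un, k1, not creds_ok) or src_ip in W
    if known and FS.get((src_ip, un), 0) < k1:
        return False          # known machine with fewer than k1 failures
    if creds_ok and FT.get(un, 0) < k2:
        return False          # unknown machine, but under k2 total failures
    if not creds_ok and username_exists(un) and FT.get(un, 0) < k2:
        return False          # valid username, likely a mistyped password
    return True               # otherwise, challenge with an ATT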

3) Output Messages: PGRP shows different messages in the case of an incorrect {username, password} pair (lines 21 and 24) and an incorrect answer to the given ATT challenge (lines 14 and 26). While showing a human that the entered {username, password} pair is incorrect, an automated program unwilling to answer the ATT challenge cannot confirm whether it is the pair or the ATT that was incorrect. However, while this is more convenient for legitimate users, it gives more information to the attacker about the answered ATTs. PGRP can be modified to display only one message in lines 14, 21, 24, and 26 (e.g., "login fails" as in the PS and VS protocols) to prevent such information leakage.

4) Why Not to Black-List Offending IP Addresses: We choose not to create a blacklist of IP addresses making many failed login attempts, for the following reasons:
• such a list may consume considerable memory;
• legitimate users from blacklisted IP addresses could be blocked (e.g., users of compromised machines); and
• hosts using dynamic IP addresses seem more attractive targets (compared to hosts with static IP addresses) for adversaries to launch their attacks from (e.g., spammers).

If the cookie mechanism is not available for the login server, PGRP can operate by using only source IP addresses to keep track of user machines.

IV. METHODOLOGY

A. Existing Method

Existing ATT-based login protocols typically require an ATT challenge once a failed login attempt occurs; however, this inconveniences the legitimate user, who must then answer an ATT on the next login attempt. Several other techniques are deployed in practice, including: allowing login attempts without ATTs from a different machine when a certain number of failed attempts occur from a given machine; allowing more attempts without ATTs after a time-out period; and time-limited account locking. Many existing techniques and proposals involve ATTs, with the underlying assumption that these challenges are sufficiently difficult for bots and easy for most people. However, users increasingly dislike ATTs as these are perceived as an (unnecessary) extra step; see Yan and Ahmad [28] for usability issues related to commonly used CAPTCHAs.

B. Disadvantages

Due to successful attacks that break ATTs without human solvers, ATTs perceived to be more difficult for bots are being deployed. As a consequence of this arms race, present-day ATTs are becoming increasingly difficult for human users, fueling a growing tension between the security and the usability of ATTs. Therefore, we focus on reducing user annoyance by challenging users with fewer ATTs, while at the same time subjecting bot logins to more ATTs, to drive up the economic cost to attackers. Two well-known proposals for limiting online guessing attacks using ATTs are Pinkas and Sander (herein denoted PS) and van Oorschot and Stubblebine (herein denoted VS). For convenience, a brief review of these proposals is given in the next subsection.

C. Proposed Method

The PS proposal reduces the number of ATTs sent to legitimate users, but at some meaningful loss of security; for example, in an example setup (with p = 0.05, the fraction of incorrect login attempts requiring an ATT), PS allows attackers to eliminate 95 percent of the password space without answering any ATTs. The VS proposal reduces this, but at a significant cost to usability; for example, VS may require all users to answer ATTs in certain circumstances. The proposal in the present paper, called Password Guessing Resistant Protocol (PGRP), significantly improves the security-usability trade-off, and can be more generally deployed beyond browser-based authentication. PGRP builds on these two previous proposals. In particular, to limit attackers in control of a large botnet (e.g., comprising hundreds of thousands of bots), PGRP enforces ATTs after a few (e.g., three) failed login attempts are made from unknown machines. On the other hand, PGRP allows a high number (e.g., 30) of failed attempts from known machines without answering any ATTs. We define known machines as those from which a successful login has occurred within a fixed period of time. These are identified by their IP addresses saved on the login server as a whitelist, or by cookies stored on client machines. A whitelisted IP address and/or client cookie expires after a certain time.
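A minimal sketch of the expiring known-machines whitelist described above follows; keying entries by source IP alone and the 30-day default lifetime are our illustrative choices rather than values fixed by the protocol.

import time

class Whitelist:
    def __init__(self, lifetime=30 * 24 * 3600):    # e.g., 30 days
        self.lifetime = lifetime
        self.entries = {}                           # src_ip -> expiry time

    def add(self, src_ip):                          # after a successful login
        self.entries[src_ip] = time.time() + self.lifetime

    def __contains__(self, src_ip):                 # supports: src_ip in W
        expiry = self.entries.get(src_ip)
        if expiry is None or time.time() > expiry:
            self.entries.pop(src_ip, None)          # lazily drop stale entries
            return False
        return True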

D. Advantages

PGRP accommodates both graphical user interfaces (e.g., browser-based logins) and character-based interfaces (e.g., SSH logins), while the previous protocols deal exclusively with the former, requiring the use of browser cookies. PGRP uses either cookies or IP addresses, or both, for tracking legitimate users. Tracking users through their IP addresses also allows PGRP to increase the number of ATTs for password guessing attacks and meanwhile to decrease the number of ATTs for legitimate login attempts. Although NATs and web proxies may (slightly) reduce the utility of IP address information, in practice the use of IP addresses for client identification appears feasible [4]. In recent years, the trend of logging in to online accounts through multiple personal devices (e.g., PCs, laptops, smart phones) is growing. When used from a home environment, these devices often share a single public IP address (i.e., a simple NAT address), which makes IP-based history tracking more user friendly than cookies. For example, cookies must be stored, albeit transparently to the user, on all devices used for login.

V. EXPERIMENTAL RESULTS

In this section, we provide the details of our test setup, empirical results, and analysis of PGRP on two different data sets. PGRP results are also compared to those obtained from testing the PS and VS protocols on the same data sets.

VI. DATA SETS

We used two data sets from an operational university network environment. Each data set logs the events of a particular remote login service, over a one-year period each. SSH Server Log: The first data set was a log file for an SSH server serving about 44 user accounts. The SSH server recorded details of each authentication event, including: date, time, authentication status (success, failed, or invalid username), username, source IP address, and source port. Log files were for the period of January 4, 2009 to January 22, 2010 (thus, slightly over one year). Fig. 3 shows that the majority of the login events (95 percent) are for invalid usernames, suggesting that most login attempts are due to SSH guessing attacks. Note that attack login attempts involving valid usernames are not distinguishable from incorrect logins by legitimate users, since there is no indication whether the source is malicious or benign. However, there were only a few failed login attempts for valid usernames, either over short bursts or over the whole log capture period.

Fig. 3: Dataset

The number of invalid usernames that appear to be mistyped valid usernames represents less than one percent. Email Server Log (web interface): The second data set consisted of log files of a Horde IMP email client for the period of January 15, 2009 to January 25, 2010. The Horde email platform is connected to an IMAP email server in a university environment. For each authentication event, a log entry contained: date, time, authentication status (success, failed, or invalid username), username, and source IP address. Although the number of registered user accounts on this server is 1,758, only 147 accounts were accessed.

Compared to the SSH log, Fig. 3 shows that malicious login attempts are far less prevalent, at only about one percent. Login attempts with valid usernames generated by guessing attacks are, as above, not distinguishable. We were unable to determine the percentage of misspelled valid usernames since the log file data, including the usernames, was anonymized.


A. Simulation Method and Assumptions

We performed a series of experiments with a Python-based implementation of PGRP with different settings of the configuration variables (k1, k2, t1, t2, and t3). The login events in each data set are ordered according to date (older entries first). Each event is processed by PGRP as if it runs in real time, with protocol tables updated according to the events. Since entries in the tables W, FT, and FS have write-expiry intervals, they get updated at each login event according to the date/time of the current event (i.e., the current time of the protocol is the time of the login event being processed). We assume that users always answer ATT challenges correctly. While some users will fail to answer some ATTs in practice (see, e.g., [3]), the percentage of failed ATTs depends on the mechanism used to generate the ATTs, the chosen challenge degree of difficulty (if configurable), and the type of the service and its users. The number of ATTs generated by the server can be updated accordingly; for example, if the probability of answering an ATT correctly is p, then the total number of generated ATTs must be multiplied by a factor of 1/p. Since no browser cookie mechanism was implemented in our tests, in either service of the data sets, the function Valid(cookie, un, k1, status) always returns false. In the absence of a browser cookie mechanism, a machine from which a user has previously logged in successfully would not be identified by the login system if the machine uses a different IP address that is not in W (see Section 3.3 for further discussion). Such legitimate users will be challenged with ATTs in this case.

Fig. 4: Result Table

For a comparative analysis, we also implemented the PS and VS protocols under the same assumptions. The cookie mechanism in these protocols is replaced by IP address tracking of user machines, since cookies are not used in either data set. The probability p of the deterministic function is set to 0.05, 0.30, and 0.60 in each experiment. For VS, b1 and b2 are both set to 5 (the authors suggested 10 as an upper bound for both b1 and b2).
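The write-expiry tables used in this replay can be sketched as follows; the class is illustrative rather than our actual test harness, with the event's own timestamp passed in so a year-long log is processed as if in real time, and the final helper shows the 1/p correction for imperfect ATT solvers mentioned above.

class ExpiryTable:
    def __init__(self, interval):
        self.interval = interval
        self.data = {}                         # key -> (count, last_write)

    def incr(self, key, now):
        count, _ = self.data.get(key, (0, now))
        self.data[key] = (count + 1, now)      # a write refreshes the expiry

    def get(self, key, now):
        entry = self.data.get(key)
        if entry is None or now - entry[1] > self.interval:
            return 0                           # expired entries read as zero
        return entry[0]

def expected_atts(observed_atts, p):
    return observed_atts / p                   # scale ATT counts by 1/p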

B. Analysis of Results

In Fig. 4, we list the protocol parameter settings of eight experiments. For both the SSH and email data sets, the total number of ATTs that would be served over the log period, and the maximum number of entries in the W, FT, and FS tables, are reported. In the first five experiments, we change the parameter k2 from 0 to 4. k2 bounds the number of failed login attempts after which an ATT challenge will be triggered for the following login attempt. Note that the total number of ATTs served over the log period decreases slightly with a larger k2 for both data sets. Other parameters have minor effects on the number of ATTs served. The number of entries in W in the email data set is larger than in the SSH data set since there are more email users. Note that although the number of failed login attempts is larger in the SSH data set, the number of entries in FT is smaller than in the email data set, because the number of usernames is smaller in the SSH data set, with very few common usernames (e.g., common first or last names that can be used in brute force attacks). Given that the protocol requires an ATT for each failed login attempt from a source not in W (and with no valid cookie) when k2 is set to 0, the FT table is empty in the first experiment for both data sets (as the second condition in line 19 is always false).

VII. CONCLUSION AND FUTURE WORKS

Online password guessing attacks on password-only systems have been observed for decades (see, e.g., [21]). Present-day attackers targeting such systems are empowered by having control of thousand- to million-node botnets. In previous ATT-based login protocols, there exists a security-usability trade-off with respect to the number of free failed login attempts (i.e., with no ATTs) versus user login convenience (e.g., fewer ATTs and other requirements). In contrast, PGRP is more restrictive against brute force and dictionary attacks while safely allowing a large number of free failed attempts for legitimate users. Moreover, the adversary is expected to need to correctly answer about N/2 ATTs in order to guess a password correctly, as opposed to about pN/2 in the PS protocol. Our empirical experiments on two data sets (of one-year duration) gathered from operational network environments show that while PGRP is apparently more effective in preventing password guessing attacks (without answering ATT challenges), it also offers a more convenient login experience, e.g., fewer ATT challenges for legitimate users even if no cookies are available. However, we reiterate that no user testing of PGRP has been conducted so far.
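As a rough numerical illustration (using the pN/2 figure as reconstructed above): with a password space of N = 10^6 and p = 0.05, an attacker guessing until success would face about N/2 = 500,000 ATTs under PGRP, but only about pN/2 = 25,000 under PS, i.e., a twenty-fold increase in the attacker's ATT-solving cost.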

However, PGRP appears suitable for organizations with both small and large numbers of user accounts. The required system resources (e.g., memory space) are linearly proportional to the number of users in a system. PGRP can also be used with remote login services where cookies are not applicable (e.g., SSH and FTP).

REFERENCES

[1] Amazon Mechanical Turk, https://www.mturk.com/mturk/, June 2010.
[2] S.M. Bellovin, "A Technique for Counting NATted Hosts," Proc. ACM SIGCOMM Workshop Internet Measurement, pp. 267-272, 2002.
[3] E. Bursztein, S. Bethard, J.C. Mitchell, D. Jurafsky, and C. Fabry, "How Good Are Humans at Solving CAPTCHAs? A Large Scale Evaluation," Proc. IEEE Symp. Security and Privacy, May 2010.
[4] M. Casado and M.J. Freedman, "Peering through the Shroud: The Effect of Edge Opacity on IP-Based Client Identification," Proc. Fourth USENIX Symp. Networked Systems Design and Implementation (NSDI '07), 2007.


[5] S. Chiasson, P.C. van Oorschot, and R. Biddle, "A Usability Study and Critique of Two Password Managers," Proc. USENIX Security Symp., pp. 1-16, 2006.
[6] D. Florencio, C. Herley, and B. Coskun, "Do Strong Web Passwords Accomplish Anything?," Proc. USENIX Workshop Hot Topics in Security (HotSec '07), pp. 1-6, 2007.
[7] K. Fu, E. Sit, K. Smith, and N. Feamster, "Dos and Don'ts of Client Authentication on the Web," Proc. USENIX Security Symp., pp. 251-268, 2001.
[8] P. Hansteen, "Rickrolled? Get Ready for the Hail Mary Cloud!," http://bsdly.blogspot.com/2009/11/rickrolled-get-ready-for-hail-mary.html, Feb. 2010.
[9] Y. He and Z. Han, "User Authentication with Provable Security against Online Dictionary Attacks," J. Networks, vol. 4, no. 3, pp. 200-207, May 2009.
[10] T. Kohno, A. Broido, and K.C. Claffy, "Remote Physical Device Fingerprinting," Proc. IEEE Symp. Security and Privacy, pp. 211-225, 2005.
[11] M. Motoyama, K. Levchenko, C. Kanich, D. McCoy, G.M. Voelker, and S. Savage, "Re: CAPTCHAs: Understanding CAPTCHA-Solving Services in an Economic Context," Proc. USENIX Security Symp., Aug. 2010.
[12] C. Namprempre and M.N. Dailey, "Mitigating Dictionary Attacks with Text-Graphics Character CAPTCHAs," IEICE Trans. Fundamentals of Electronics, Comm. and Computer Sciences, vol. E90-A, no. 1, pp. 179-186, 2007.
[13] A. Narayanan and V. Shmatikov, "Fast Dictionary Attacks on Human-Memorable Passwords Using Time-Space Tradeoff," Proc. ACM Computer and Comm. Security (CCS '05), pp. 364-372, Nov. 2005.
[14] Nat'l Inst. of Standards and Technology (NIST), "Hash-belt," http://www.itl.nist.gov/div897/sqg/dads/HTML/hashbelt.html, Sept. 2010.
[15] "The Biggest Cloud on the Planet Is Owned by... the Crooks," NetworkWorld.com, http://www.networkworld.com/community/node/58829, Mar. 2010.
[16] J. Nielsen, "Stop Password Masking," http://www.useit.com/alertbox/passwords.html, June 2009.
[17] B. Pinkas and T. Sander, "Securing Passwords against Dictionary Attacks," Proc. ACM Conf. Computer and Comm. Security (CCS '02), pp. 161-170, Nov. 2002.
[18] D. Ramsbrock, R. Berthier, and M. Cukier, "Profiling Attacker Behavior following SSH Compromises," Proc. 37th Ann. IEEE/IFIP Int'l Conf. Dependable Systems and Networks (DSN '07), pp. 119-124, June 2007.
[19] SANS.org, "Important Information: Distributed SSH Brute Force Attacks," SANS Internet Storm Center Handlers Diary, http://isc.sans.edu/diary.html?storyid=9034, June 2010.
[20] "The Top Cyber Security Risks," SANS.org, http://www.sans.org/top-cyber-security-risks/, Sept. 2009.
[21] C. Stoll, The Cuckoo's Egg: Tracking a Spy through the Maze of Computer Espionage, Doubleday, 1989.
[22] "Botnet Pierces Microsoft Live through Audio CAPTCHAs," TheRegister.co.uk, http://www.theregister.co.uk/2010/03/22/microsoft_live_captcha_bypass/, Mar. 2010.
[23] P.C. van Oorschot and S. Stubblebine, "On Countering Online Dictionary Attacks with Login Histories and Humans-in-the-Loop," ACM Trans. Information and System Security, vol. 9, no. 3, pp. 235-258, 2006.
[24] L. von Ahn, M. Blum, N. Hopper, and J. Langford, "CAPTCHA: Using Hard AI Problems for Security," Proc. Eurocrypt, pp. 294-311, May 2003.
[25] M. Weir, S. Aggarwal, M. Collins, and H. Stern, "Testing Metrics for Password Creation Policies by Attacking Large Sets of Revealed Passwords," Proc. 17th ACM Conf. Computer and Comm. Security, pp. 162-175, 2010.
[26] Y. Xie, F. Yu, K. Achan, E. Gillum, M. Goldszmidt, and T. Wobber, "How Dynamic Are IP Addresses?," SIGCOMM Computer Comm. Rev., vol. 37, no. 4, pp. 301-312, 2007.
[27] J. Yan and A.S.E. Ahmad, "A Low-Cost Attack on a Microsoft CAPTCHA," Proc. ACM Computer and Comm. Security (CCS '08), pp. 543-554, Oct. 2008.
[28] J. Yan and A.S.E. Ahmad, "Usability of CAPTCHAs or Usability Issues in CAPTCHA Design," Proc. Symp. Usable Privacy and Security (SOUPS '08), pp. 44-52, July 2008.


Intelligent Image Interpreter

Jayasree N V,
Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad

[email protected]

Abstract—The intrinsic information present in an image is very hard for a computer to interpret. There are many approaches focused on finding the salient object in an image. Here we propose a novel architectural approach to find the relations between salient objects using local and global analysis. The local analysis focuses on salient object detection with efficient relation mining in the context of the image being processed. For an effective global analysis, we create an ontology tree by considering a wide set of natural images. From these natural images we create an affinity-based ontology graph; with the help of this local and global contextual graph we construct an annotated parse tree. The tree thus formed is helpful in large-scale image search. Our proposal will therefore give new heights to content-based image retrieval and image interpretation problems.

Keywords—Ontology graph, Intelligent interpretation, Local analysis, Global analysis, Global knowledge base.

I. INTRODUCTION

INTERPRETATION is the act of analyzing and making conclusions based on given data. Nowadays we have to work with a huge amount of data, including text, video, audio, etc. The fundamental issue is to make decisions based on the appearance, context, or any other factors present in the raw data. The symbolic interpretation of an image is a challenging task. There is a large body of literature on the application of symbolic computation, such as natural language processing, to decision-making systems. Here we present a state-of-the-art solution to the above-mentioned decision systems, based on a step-by-step refinement of an image or video. The refinement process includes all the symbolic interpretations of the image, such as object detection, grammatical arrangement of objects, semantic annotation of objects, and global analysis of objects based on a global knowledge base.

II. LITERATURE REVIEW

The UFO approach [7] proposes a novel algorithm for salient region detection by integrating three important visual properties of an image: uniqueness, focusness and objectness (UFO). Uniqueness captures the appearance-derived visual contrast; focusness reflects the fact that salient regions are often photographed in focus; and objectness helps to consider the completeness of the detected salient region. While uniqueness has long been used for saliency detection, integrating focusness and objectness for this purpose is new. In fact, focusness and objectness both provide important saliency information complementary to uniqueness.

Humans can prioritize external visual stimuli and quickly localize their main interest in a scene. How to simulate such human capability with a computer, i.e., how to identify the most salient pixels or regions in a digital image which attract humans' first visual attention, has become an important task in computer vision. Further, the results of saliency detection can be used to facilitate other computer vision tasks such as image resizing, thumbnailing, image segmentation and object detection. Due to its importance, saliency detection has received intensive research attention, resulting in many recently proposed algorithms. The majority of those algorithms are based on low-level features of the image, such as appearance uniqueness at the pixel or superpixel level. One basic idea is to derive the saliency value from the local contrast of various channels, e.g., in terms of uniqueness. While uniqueness often helps generate good saliency detection results, it sometimes produces high values for non-salient regions, especially for regions with complex structures. As a result, it is desirable to integrate complementary cues to address the issue.

Detecting visually salient regions in images is one of the fundamental problems in computer vision [2]. A second approach decomposes an image into large-scale perceptually homogeneous elements for efficient salient region detection, using a soft image abstraction representation. By considering both the appearance similarity and the spatial distribution of image pixels, the proposed representation abstracts out unnecessary image details, allowing the assignment of comparable saliency values across similar regions, and producing perceptually accurate salient region detection. The approach was evaluated on the largest publicly available dataset with pixel-accurate annotations.

This soft image abstraction approach captures large-scale perceptually homogeneous elements, thus enabling effective estimation of global saliency cues. Unlike previous techniques that rely on superpixels for image abstraction, it uses histogram quantization to collect appearance samples for a global Gaussian Mixture Model (GMM) based decomposition. Components sharing the same spatial support are further grouped to provide a more compact and meaningful presentation. This soft abstraction avoids the hard decision boundaries of superpixels, allowing abstraction components with very large spatial support. This allows the subsequent global saliency cues to uniformly highlight entire salient object regions. Finally, the two global saliency cues, Global Uniqueness (GU) and Color Spatial Distribution (CSD), are integrated by automatically identifying which one is more likely to provide the correct identification of the salient region.
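As a rough illustration of the abstraction step reviewed here, the sketch below quantizes pixel colors with a histogram and fits a global GMM so every pixel receives soft component assignments. scikit-learn is used for convenience, and the bin and component counts are illustrative assumptions, not the authors' settings.

import numpy as np
from sklearn.mixture import GaussianMixture

def soft_abstraction(image, n_components=8, bins=12):
    """image: H x W x 3 uint8 array; returns H x W x n_components
    soft assignments of each pixel to the GMM components."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Histogram quantization: one representative sample per occupied bin
    quantized = np.floor(pixels / 256.0 * bins)
    _, sample_idx = np.unique(quantized, axis=0, return_index=True)
    gmm = GaussianMixture(n_components=n_components).fit(pixels[sample_idx])
    return gmm.predict_proba(pixels).reshape(h, w, n_components)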

Semantic-based image retrieval has attracted great interest in recent years [6]. One proposal is a region-based image retrieval system with high-level semantic learning, whose key features are:

1) It supports both query by keyword and query by region of interest. The system segments an image into different regions and extracts low-level features of each region. From these features, high-level concepts are obtained using a proposed decision tree-based learning algorithm named DT-ST. During retrieval, a set of images whose semantic concept matches the query is returned. Experiments on a standard real-world image database confirm that the proposed system significantly improves retrieval performance compared with a conventional content-based image retrieval system.

2) The proposed decision tree induction method DT-ST for image semantic learning differs from other decision tree induction algorithms in that it makes use of semantic templates to discretize continuous-valued region features, avoiding the difficult image feature discretization problem. Furthermore, it introduces a hybrid tree simplification method to handle the noise and tree fragmentation problems, thereby improving the classification performance of the tree.

Fig. 1: Architecture of our Automatic Image Annotation process.

Another work proposes a new approach for automatic image annotation (AIA), in order to automatically and efficiently assign linguistic concepts to visual data such as digital images, based on both numeric and semantic features [8]. The method first computes multi-layered active contours. The first-layer active contour corresponds to the main object or foreground, while the next layers' active contours delineate the object's subparts. Then, visual features are extracted within the regions segmented by these active contours and are mapped into semantic notions. Next, decision trees are trained based on these attributes, and the image is semantically annotated using the resulting decision rules. Experiments carried out on several standard datasets have demonstrated the reliability and the computational effectiveness of this AIA system.

This fully automatic image annotation method is thus based on efficiently implemented active contours and decision trees: the automatic recursive segmentation of the image in multiple layers using multi-feature active contours, and the automatic semantic labeling of the image based on decision trees. While being an unsupervised segmentation technique, the multi-layered multi-feature active contour approach does not use any prior knowledge about the foreground, unlike top-down segmentation methods, and reaches a semantically coherent segmentation of the objects more accurately than bottom-up segmentation techniques and faster than combined ones. The active contours are used in order 1) to precisely and automatically segment the image into background and semantically meaningful foreground regions, and 2) to extract the coherent and semantically meaningful sub-regions of the extracted main object. On the other hand, the segmentation method also provides the background region.

However, in this work, only the information about the main object and its subparts is exploited, in order to process the training of the corresponding decision trees and the automatic labeling of the dataset images in a more computationally efficient way than background-based systems.

The AIA system illustrated in Fig. 1 above performs both the automatic visual segmentation of the image and its automatic semantic annotation. The main steps of the process are: the multi-layered partition of the image in terms of background, foreground and the foreground's semantically meaningful sub-regions; the extraction of the corresponding metric features from these delineated regions, as well as the definition of the semantic attributes based on the visual features; and the labeling of the image, followed by the final online annotation of the image using offline-trained decision trees.

A new framework for annotating images automatically using ontologies [11] has also been described. An ontology is constructed holding characteristics from multiple information sources, including text descriptions and low-level image features. Image annotation is implemented as a retrieval process by comparing an input (query) image with representative images of all classes. Handling uncertainty in class descriptions is a distinctive feature of SIA. Average Retrieval Rank (AVR) is applied to compute the likelihood of the input image belonging to each one of the ontology classes. SIA is a complete prototype system for image annotation. Given a query image as input, SIA computes its description, consisting of a class name and the description of this class. This description may be augmented by class (ontology) properties depicting its shape, size, color, texture (e.g., "has long hair", "small size", etc.). The system consists of several modules.

The image ontology has two main components, namely the class hierarchy of the image domain and the descriptions hierarchy [5]. Various associations between concepts or features between the two parts are also defined. Class hierarchy: The class hierarchy of the image domain is generated based on the respective nouns hierarchy of WordNet. In this work, a class hierarchy for dog breeds is constructed (e.g., dog, working group, Alsatian). The leaf classes in the hierarchy represent the different semantic categories of the ontology (i.e., the dog breeds). A leaf class (i.e., a dog breed) may also be represented by several image instances for handling variations in scaling and posing. For example, in SIA the leaf class Labrador has 6 instances. Descriptions hierarchy: Descriptions are distinguished into high-level and low-level descriptions. High-level descriptions are further divided into concept descriptions (corresponding to the glosses of WordNet categories) and visual text descriptions (high-level narrative information). The latter are actually descriptions that humans would give to images, and are further specialized based on animal shape and size properties (i.e., small, medium and big). The low-level descriptions hierarchy represents features extracted by 7 image descriptors. Because an image class is represented by more than one image instance (6 in this work), each class is represented by a set of 7 features for each image instance. An association between image instances and low-level features is also defined, denoting the existence of such features (e.g., "hasColorLayout", "hasCEDD"). Fig. 2 illustrates part of the SIA ontology (not all classes and class properties are shown).

Fig. 2: Part of the SIA ontology.

The input image may contain several regions, of which some may be more relevant to the application than others. In this work, the dog's head is chosen as the most representative part of a dog image for further analysis. This task is implemented by manual Region of Interest (ROI) placement (the user drags a rectangle around a region), followed by background subtraction by applying GrabCut, and noise reduction.

III. PROPOSED METHOD

The different modules here are:
• Image Acquisition
• Salient Object Detection
• Syntax Tree Generation
• Ontology Tree Generation
• Semantic Tree Generation

The architecture modules of the Intelligent Image Interpreter system can be viewed much the same as those of a compiler, which includes lexical analysis, syntax analysis, semantic analysis, intermediate code generation, etc., as shown in Fig. 3.

A. Image Acquisition

In the image acquisition phase, the image to be annotated is captured using a digital camera and converted into machine-readable form. This is the input collection stage, similar to a program that is given as the input of a compiler.

Fig. 3: System Architecture

The program will first be converted into a machine-friendly form, such as ASCII characters.

B. Salient Object Detection

Humans have the capability to quickly prioritize external visual stimuli and localize their main interest in a scene. How to simulate such human capability with a computer, i.e., how to identify the most salient pixels or regions in a digital image which attract humans' first visual attention, has become an important task in computer vision. In this phase, the captured image is analyzed and the salient objects in it are identified and returned. This can be compared to the lexical phase of a compiler, which returns tokens (lexemes) from the input program; here, the salient objects are identified and returned from the input image.

C. Syntax Tree Generation

This phase can be compared with the syntax analysis phase of a compiler, which generates a parse tree from the lexemes. It checks only the syntax; no semantics are analyzed here. We create a syntax tree from the image input by analyzing the image locally. The local analysis is done with the salient objects returned from phase I. This module returns a syntax tree, which is much the same as the parse tree in the second phase of a compiler.

D. Ontology Tree Generation

In this phase, the first part of the global analysis is completed by creating a knowledge base from a set of possible natural images. Salient regions are detected in these images, and an ontology tree is constructed based on a set of natural images already stored in the system knowledge base. It returns an ontology tree which can be used as the input to phase 5. It shows the relationships between the salient objects by performing data mining operations.

E. Semantic Tree Generation

The final phase is to create an annotated parse tree based on the syntax tree and the ontology tree created in the previous phases. The syntax tree returned from phase 3 is analyzed with the help of the ontology tree returned from phase 4, the meaning is interpreted from this, and an annotated parse tree is constructed.

IV. PROPOSED ALGORITHM

• Capture the image (I) using a high-quality digital camera.
• Detect the salient objects using an edge detection algorithm and deep-level segmentation. Salient objects: P1, P2, P3. (A rough sketch of this step is given after this list.)
• Create a grammatical rule based on fuzzy inference: IF X P1 && X MEM P2 THEN P1 RE.
• Perform local analysis on the image to get a meaningful structure of objects and the relations existing among them.
• Create an ontology tree for a set of images using global analysis.
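A rough sketch of the salient-object step referenced above follows; Canny edges, morphological closing and contour-area filtering (OpenCV 4) are our illustrative stand-ins for the unspecified edge detection and deep-level segmentation, not the paper's exact method.

import cv2

def detect_salient_objects(path, min_area=500):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(gray, 100, 200)                    # edge map
    # Close small gaps so object boundaries form connected regions
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Keep large regions as candidate salient objects P1, P2, P3, ...
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]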

V. CONCLUSION

In this work, the problem of image interpretation using semantics is investigated, and a new method is proposed for interpreting the semantics of a given image and constructing a semantic tree structure based on the salient regions in the image. The output of the system is an annotated semantic tree from which we can find relationships between objects, and this can be extended to implement security systems. The intelligent image interpreter can be implemented in systems used for security purposes in ATMs, safety lockers, etc.

REFERENCES

[1] Ying Liu, Dengsheng Zhang, Guojun Lu, Wei-Ying Ma, "A Survey of Content-Based Image Retrieval with High-Level Semantics," 2012.
[2] Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook, "Efficient Salient Region Detection with Soft Image Abstraction," Vision Group, Oxford Brookes University, 2011.
[3] Radhakrishna Achanta, Sheila Hemami, Francisco Estrada, and Sabine Susstrunk, "Frequency-Tuned Salient Region Detection," 2010.
[4] Hao Fu, Guoping Qiu, "Fast Semantic Image Retrieval Based on Random Forest," 2012.
[5] Ming-Ming Cheng, Guo-Xin Zhang, "Global Contrast Based Salient Region Detection," 2011.
[6] Ying Liu, Dengsheng Zhang, Guojun Lu, "Region-Based Image Retrieval with High-Level Semantics Using Decision Tree Learning," 2007.
[7] Peng Jiang, Haibin Ling, Jingyi Yu, Jingliang Peng, "Salient Region Detection by UFO: Uniqueness, Focusness and Objectness," 2013.
[8] Joanna Isabelle Olszewska, "Semantic, Automatic Image Annotation Based on Multi-Layered Active Contours and Decision Trees," 2013.
[9] Dengsheng Zhang, Md Monirul Islam, Guojun Lu and Jin Hou, "Semantic Image Retrieval Using Region Based Inverted File," 2009.
[10] Wei Wang, Yuqing Song and Aidong Zhang, "Semantics Retrieval by Content and Context of Image Regions," 2012.


Self Interpretive Algorithm Generator

Jayanthan K S,
Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad

[email protected]

Abstract—Thinking is a complex procedure which is necessary to deal with a complex world. Machines that help us handle this complexity can be regarded as intelligent tools, tools that support our thinking capabilities. The need for such intelligent tools will grow as new forms of complexity evolve in our increasingly globally networked information society. Algorithms are considered the key component of any problem-solving activity. In order for machines to be capable of intelligently predigesting information, they should perform in a way similar to the way humans think. Humans have developed sophisticated methods for dealing with the world's complexity, and it is worthwhile to adapt some of them, sometimes combining the principles of human thinking with the functional principles of biological cell systems. Here the consideration of human thinking is extended to the interpretive capacity of the machine, measured as the capacity of the machine to generate algorithms. Many definitions of algorithm are based on the single idea that input is converted to output in a finite number of steps. An algorithm is a step-by-step procedure to complete a task: any set of detailed instructions which results in a predictable end-state from a known beginning. This project is an attempt to generate algorithms by machine, without human intervention. Evolutionary computation is a major tool which is competitive with humans in many different areas of problem solving. Here, genetically evolved cellular automata are used to give the system the self-interpretive capacity to generate algorithms.

Keywords—Genetically evolved cellular automata, Genetic algorithm, Symbolic knowledge base, Problem solving

I. INTRODUCTION

COMPUTER science is mainly concerned with the development of algorithms to solve complex problems. An algorithm is a step-by-step procedure to complete a task: any set of detailed instructions which results in a predictable end-state from a known beginning. Algorithms are only as good as the instructions given; the result will be incorrect if the algorithm is not defined properly. In algorithm development, a huge amount of human effort is needed. Automation is the use of information technology to reduce human effort. Every computer program is simply a series of instructions, which may vary in complexity, listed in a specific order and designed to perform a specific task. Mathematics also uses algorithms to solve equations by hand, without the use of a calculator. One good example is the human brain: most conceptions of the human brain define all behavior, from the acquisition of food to falling in love, as the result of a complex algorithm. The capabilities of today's machines are still far from the capabilities of humans, and natural computing has not yet reached its limits: existing approaches combined with new concepts and ideas promise significant progress within the next few years. The field of artificial intelligence encompasses several different approaches to model natural thinking. They include semantic networks, Bayesian networks, neural networks and, most prominently, cellular automata. Cellular automata are tools to compute complex situations. The idea behind cellular automata, or cellular machines, is quite natural: neighboring objects influence each other. A cellular automaton uses a large number of so-called cells in a regular geometrical arrangement. The cells usually have discrete states that are influenced by their relationships with their neighbors. Like many natural processes that are spatially extended, cellular automata configurations often organize over time into spatial regions that are dynamically homogeneous. Sometimes in space-time diagrams these regions are obvious to the eye as domains: regions in which the same pattern appears. These domain patterns are described using deterministic finite automata. In the computer world, problems may be classified by the dimensionality of the input; here a set of numbers is considered as one-dimensional input, and an image or graph as two-dimensional input. The system will accept the input and output as cellular automata entities, and find the emergence from input to output using a genetically evolved cellular automata rule. Previous experiments have demonstrated that genetic algorithms can perform better than human-designed rules.

II. LITERATURE SURVEY

A cellular automaton (CA) is a mathematical machine or tool that lends itself to some very remarkable and beautiful ideas [12]. A cellular automaton is a discrete dynamical system that performs computations in a finely distributed fashion on a spatial grid. Cellular automata are often described as a counterpart to partial differential equations, which have the capability to describe continuous dynamical systems. "Discrete" means that space, time and the properties of the automaton can have only a finite, countable number of states. The basic idea is not to try to describe a complex system from "above" using difficult equations, but to simulate the system by the interaction of cells following easy rules. The basic element of a CA is the cell [28]. A cell is a kind of memory element that stores states. In the simplest case, each cell can have the binary states 1 or 0. In a more complex simulation the cells can have more different states. It is even thinkable that each cell has more than one property or attribute, and each of these properties or attributes can have two or more states. The cells are the common elements which give the CA wide applicability. These cells are arranged in a spatial web, a lattice. The simplest one is the one-dimensional lattice, meaning that all cells are arranged in a line like a string of pearls. The most common CAs are built in one or two dimensions, whereas the one-dimensional CA has the big advantage that it is very easy to visualize: the states of one time step are plotted in one dimension, and the dynamic development can be shown in the second dimension, so a flat plot of a one-dimensional CA shows the states from time step 0 to time step n. Consider a two-dimensional CA: a two-dimensional plot can evidently show only the state of one time step, so visualizing the dynamics of a 2D CA is for that reason more difficult. For these reasons, and because 1D CAs are generally easier to handle, most theoretical papers available deal with properties of 1D CAs, where the rules are comparably simple. The idea behind GAs is to extract optimization strategies that nature uses successfully, known as Darwinian evolution, and transform them for application in mathematical optimization theory to find the global optimum in a defined phase space. One could imagine a population of individual explorers sent into the optimization phase space. Each explorer is defined by its position inside the phase space, which is coded in its genes, and each explorer has the duty to evaluate the quality of its position in the phase space. Natural Language Processing is a subfield of artificial intelligence and linguistics devoted to making computers understand statements written in human languages. A natural language is a language spoken or written by humans for general-purpose communication [24]. The models developed by Natural Language Processing are useful for writing computer programs to do useful tasks involving language processing, and thereby for a better understanding of human communication. To some extent these goals are complementary; a better understanding of human communication is the major goal of all Natural Language Processing systems. The process of building computer programs that understand natural language involves three major problems: the first relates to the thought process, the second to the representation and meaning of the linguistic input, and the third to world knowledge. Thus, an NLP system may begin at the word level to determine the morphological structure and nature of the word, then move on to the sentence level to determine the word order, grammar and meaning of the entire sentence, and then to the context and the overall environment or domain.

III. METHODOLOGY

A GA was used to evolve CAs for two computational tasks: density classification and synchronization. In both cases, the GA discovered rules that gave rise to sophisticated emergent computational strategies [3]. The GA worked by evolving the CA rule table and the number of iterations that the model was to run. After the final chromosomes were obtained for all shapes, the CA model was allowed to run, starting with a single cell in the middle of the lattice, until the allowed number of iterations was reached and a shape was formed. In all cases, mean fitness values of evolved chromosomes were above 80%. The Density Classification Task (DCT) is one of the most studied examples of collective computation in cellular automata [10]. The goal is to find a binary CA rule that can best classify the majority state in a randomized initial configuration (IC). If the majority of cells in the IC are in the quiescent (active) state, then after a number of time steps M the lattice should converge to a homogeneous state where every cell is in the quiescent (active) state. Since the outcome could be undecidable in lattices with an even number of cells (N), this task is only applicable to lattices with an odd number of cells. Devising CA rules that perform this task is not trivial, because cells in a CA lattice update their states based only on local neighborhood information; this particular task requires that information be transferred across time and space in order to achieve a correct global classification. The definition of the DCT used in our studies is the same as the one by Mitchell et al. The density classification task has been studied for many years. In this task, a one-dimensional binary CA is initialized with a random initial configuration (IC) and iterated for a maximum number of steps I, or until a fixed point is reached [27]. If the IC contains more ones than zeros, the CA is deemed to have solved the task if a fixed point of all ones is reached, and vice versa; the situation where the IC contains an equal number of ones and zeros is excluded by the odd lattice size. This is a difficult task for a CA because a solution requires coordinating the global state of the system while using only the local communication between cells provided by the neighborhood. For this reason, the density classification task is widely used as a standard test function to explore CA behavior [12]. The ability of a particular Evolutionary Cellular Automaton (EvCA) to solve the density classification task depends on the IC. Intuitively, ICs containing many ones or zeros are closer in Hamming distance to one of the solution fixed points, making it easier for a CA to iterate to the correct fixed point compared to an IC containing a more or less equal mix of ones and zeros. For this reason, the performance of a CA on the density classification task is estimated by sampling many ICs generated from a known distribution; performance is then the fraction of times the CA achieves the correct fixed point. It has been proven that no binary CA exists that solves the density classification task for all possible ICs [19], [20]. Thus, a binary CA can only solve the problem for specific ICs, or to a particular degree over multiple ICs [21]. Generating ICs using an equal probability of each cell being in the one or zero state creates a binomial distribution. The bitmap problem and the checkerboard problem are widely discussed by Breukelaar et al. [23]. The bitmap problem is defined as: given an initial state and a specific desired end state, find the rule that iterates from the initial state to the desired end state in less than I iterations.
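The task setup can be fixed with a small sketch: a 1-D binary CA with radius r = 3 (as in the Mitchell et al. experiments) applies a rule table of 2^(2r+1) = 128 bits until a fixed point is reached, and is scored on whether it settles to the majority state. The Python below is an illustration (the 320-step budget is an assumption), not the authors' implementation; a GA fitness for a candidate rule table would then be the fraction of random odd-length ICs it classifies correctly.

import numpy as np

def ca_step(state, rule_table, r):
    """rule_table: 0/1 NumPy array of length 2**(2*r + 1)."""
    idx = np.zeros_like(state)
    for offset in range(-r, r + 1):           # build each neighborhood index
        idx = (idx << 1) | np.roll(state, -offset)
    return rule_table[idx]

def classifies_density(rule_table, ic, r=3, max_steps=320):
    state = np.asarray(ic, dtype=np.int64)    # 0/1 cells; odd length: no ties
    majority = int(state.sum() * 2 > state.size)
    for _ in range(max_steps):
        nxt = ca_step(state, rule_table, r)
        if np.array_equal(nxt, state):        # reached a fixed point
            break
        state = nxt
    return bool(np.all(state == majority))    # correctly classified?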

IV. ALGORITHM

Algorithm: cellular automata based algorithm
Input: input image I(x, y) and output image O(x, y)
Output: the transformation rule or algorithm

1. Convert the input and output to bit patterns by comparing I(x, y) and O(x, y) with a threshold, converting each image into a binary image: I(x, y), O(x, y) ≡ B(I(x, y)), B(O(x, y)), where B(I(x, y)) and B(O(x, y)) are the binary image values.


2. Generate the genetically evolved population of cellular automata rules: Rand[(0, 1), Rulesize], where Rulesize = 2^n and n is the number of neighbors; the total number of possible rules is 2^Rulesize.

Fig. 1: Initial Stage.

3. Find the transformation for each rule using the Moore neighborhood (see Fig. 2); the selection is based on the work of Oliveira et al. [11]. I(x, y) depends on I(x+1, y), I(x, y+1), I(x, y−1), I(x−1, y), I(x+1, y+1), I(x−1, y−1), I(x+1, y−1), I(x−1, y+1).
4. Apply the cellular automata rule on the input image until it converges to the output image in a minimum number of iterations:
   while B(I(x, y)) != B(O(x, y)): apply new transformations
5. Interpret the transformations using a symbolic knowledge base: ∀ rule(rulebit) ⇒ interpretation of transformations.
6. Using the intelligently interpreted transformations, transform the rules formed above into a natural-language algorithm: apply mathematical tools ∀ transformation to generate the algorithm. (A sketch of steps 3-4 is given after Fig. 2.)

Fig. 2: Moore Neighborhood.
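The sketch promised above covers steps 3-4: repeatedly apply a candidate rule over the Moore neighborhood until the binary input converges to the binary target, or an iteration budget runs out. Python/NumPy is used purely for illustration (the authors' prototype uses MATLAB and Prolog), and the 2^9-entry lookup table and the budget of 100 iterations are assumptions.

import numpy as np

def evolve_to_target(rule, b_in, b_out, max_iters=100):
    """rule: 0/1 NumPy array of length 2**9; b_in, b_out: 0/1 integer 2-D arrays."""
    state = b_in.copy()
    for it in range(max_iters):
        if np.array_equal(state, b_out):
            return it                         # rule reproduces B(O(x, y))
        idx = np.zeros_like(state)
        for dx in (-1, 0, 1):                 # centre plus 8 Moore neighbours
            for dy in (-1, 0, 1):
                shifted = np.roll(np.roll(state, dx, axis=0), dy, axis=1)
                idx = (idx << 1) | shifted
        state = rule[idx]
    return None                               # no convergence within budget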

V. IMPLEMENTATION

The proposed method is implemented using MATLAB and Prolog. The cellular automata are implemented using the RBN toolbox. The symbolic knowledge base is implemented with the help of Prolog, and the Mathematica tool is used for generating the algorithm from the transformations.

Fig. 3: Algorithm Generation.

Fig. 4: GUI for input and output image.

Fig. 5: Plot for Cellular automata rule

Fig. 6: Plot for rule transformation

VI. CONCLUSION

This paper proposes a method which gives an insight into symbolic and non-symbolic knowledge. In this paper, genetically evolved cellular automata outperform other soft computing paradigms, and better rule transformations can give more accurate results. Here the Prolog knowledge base acts in the role of a compiler. This paper focuses on the intelligent interpretation of systems, which traditionally uses human

Fig. 7: Rule Generator.

knowledge for its working. The proposed method can be used in robot navigation, stock exchange signal valuation and other real-world problems with specified input and output. Future work will focus mainly on the wide area of application-specific method generation.

REFERENCES

[1] Adam Callahan, "Genetic Algorithm for Evolving CA to Solve the Majority Problem," August 6, 2009.
[2] Adriana Popovici and Dan Popovici, "Cellular Automata in Image Processing."
[3] Ajith Abraham, Nadia Nedjah, Luzia de Macedo Mourelle, "Evolutionary Computation: from Genetic Algorithms to Genetic Programming," Springer-Verlag Berlin Heidelberg, 2006.
[4] A.M. Turing, "Computing Machinery and Intelligence," Mind, New Series, vol. 59, no. 236 (Oct. 1950), pp. 433-460.
[5] Carlos A. Coello Coello, "An Introduction to Evolutionary Algorithms and Their Applications," CINVESTAV-IPN Evolutionary Computation Group.
[6] Christopher R. Houck, Jeffery A. Joines, Michael G. Kay, "A Genetic Algorithm for Function Optimization: A Matlab Implementation."
[7] David Andre, Forrest H. Bennett, John R. Koza, "Discovery by Genetic Programming of a Cellular Automata Rule that is Better than any Known Rule for the Majority Classification Problem," Stanford University.
[8] Eric Cantu-Paz, "A Summary of Research on Parallel Genetic Algorithms," University of Illinois Genetic Algorithms Laboratory.
[9] Franciszek Seredynski, Pascal Bouvry, "Multiprocessor Scheduling Algorithms Based on Cellular Automata Training."
[10] G. Binnig, M. Baatz, J. Clenk, G. Schmidt, "Will Machines Start to Think Like Humans? Artificial versus Natural Intelligence."
[11] Gina M. B. Oliveira, Luiz G. A. Martins, Laura B. de Carvalho, Enrich Fynn, "Some Investigations About Synchronization and Density Classification Tasks in One-Dimensional and Two-Dimensional Cellular Automata Rule Spaces," Elsevier Electronic Notes in Theoretical Computer Science 252 (2009), pp. 121-142.
[12] Herald Niesche, "Introduction to Cellular Automata," Organic Computing, SS 2006.
[13] James P. Crutchfield, Melanie Mitchell, "The Evolution of Emergent Computation," National Academy of Sciences.
[14] Jan Paredis, "Coevolving Cellular Automata: Be Aware of the Red Queen," University of Maastricht.
[15] John R. Koza, Forrest H. Bennett, David Andre, Martin A. Keane, "Four Problems for which a Computer Evolved by Genetic Programming is Competitive with Human Performance," IEEE, 1996.
[16] Lu Yuming, Li Ming, Li Ling, "Cellular Genetic Algorithms with Evolutional Rule," IEEE, 2009.
[17] Melanie Mitchell, James P. Crutchfield, Rajarshi Das, "Evolving Cellular Automata with Genetic Algorithms: A Review of Recent Work," EvCA'96.
[18] Michael O'Neill, Leonardo Vanneschi, Steven Gustafson, Wolfgang Banzhaf, "Open Issues in Genetic Programming," Springer, 14 May 2010.
[19] Niloy Ganguly, Pradipta Maji, Sandip Dhar, Biplab K. Sikdar, P. Pal Chaudhuri, "Evolving Cellular Automata as Pattern Classifier," Springer-Verlag Berlin Heidelberg, 2002.
[20] Niloy Ganguly, Biplab K. Sikdar, Andreas Deutsch, Geoffrey Canright, P. Pal Chaudhuri, "A Survey on Cellular Automata," Centre for High Performance Computing, 2004.
[21] Rajarshi Das, Melanie Mitchell, James P. Crutchfield, "A Genetic Algorithm Discovers Particle-Based Computation in Cellular Automata," Parallel Problem Solving from Nature (PPSN), Berlin: Springer-Verlag.
[22] R. Breukelaar, Th. Back, "Using a Genetic Algorithm to Evolve Behavior in Multi-Dimensional Cellular Automata," GECCO '05, June 25-29, ACM, 2005.
[23] Ron Breukelaar, Thomas Back, "Evolving Transition Rules for Multi-Dimensional Cellular Automata," ACRI 2004, LNCS 3305, pp. 182-191, Springer-Verlag Berlin Heidelberg, 2004.
[24] Sang Ho Shin, Kee-Young Yoo, "Analysis of 2-State, 3-Neighborhood Cellular Automata Rules for Cryptographic Pseudorandom Number Generation," IEEE Computer Society Press, 2009.
[25] Stuart Bain, John Thornton, and Abdul Sattar, "Methods of Automatic Algorithm Generation," Institute for Integrated and Intelligent Systems.
[26] Sinisa Petric, "Painterly Rendering Using Cellular Automata," SIGMAPI DESIGN.
[27] S. Wolfram, "A New Kind of Science," Wolfram Media, Inc., 2002.
[28] Arun P.V., S.K. Katiyar, "Automatic Object Extraction from Satellite Images using Cellular Automata Based Algorithm," IEEE-TGRS, 50(3), 2012, pp. 92-102.

Sreepathy Journal of Computer Sc. & Engg. Vol.1, Issue.1, June-2014

23

Page 28: SJCSE

Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014

Natural Language Generation from Ontologies

Manu Madhavan,

Dept. of Computer Science and Engg., SIMAT, Vavanoor, Palakkad

[email protected]

Abstract—Natural Language Generation is the task of generating natural language text suitable for human consumption from a machine representation of facts, which can be pre-structured in some linguistically amenable fashion or completely unstructured. An ontology is a formal, explicit description of concepts in a domain of discourse. An ontology can be considered a formal knowledge repository which can be used as a resource for NLG tasks. A domain ontology provides the input for the content determination and micro-planning stages of the NLG task, while a linguistic ontology can be used for lexical realization.

The logically structured manner of knowledge organization within an ontology makes it possible to perform reasoning tasks like consistency checking, concept satisfiability, concept subsumption and instance checking. These types of logical inferencing actions are applied to derive descriptive texts as answers to user queries over ontologies. Thus a simple natural language based Question Answering system can be implemented, guided by robust NLG techniques that act upon ontologies.

Some of the tools for constructing ontologies (Protege, NaturalOWL, etc.) and their combination with the NLG process are also discussed.

Keywords—Natural Language Generation (NLG), Ontology, Protege, Question Answering.

I. INTRODUCTION

NATURAL Language Generation (NLG) is a young and fascinating area of Computational Linguistics. It is the intelligent process of generating information in some natural language. The most important capability that makes humans intelligent is common sense. It has been accepted that without substantial bodies of background information concerning commonsense, everyday knowledge about the world, or detailed information concerning particular domains of application, it will not be possible to construct systems that can support the use of natural language. Hence the development of NLG systems has reached the stage where concentrated efforts are necessary in the area of representing more 'abstract', more 'knowledge'-related bodies of information. Systems need to represent concrete details of the 'worlds' that their texts describe: for example, the resolution of anaphors, the induction of text coherence by recognizing regularities present in the world and not in the text, the recognition of plans by knowing what kinds of plans make sense for speakers and hearers in real situations, etc. all require world modeling to various depths.

This need creates two interrelated problem areas[3]. The first problem is how knowledge of the world is to be represented. The second problem is how such organizations of knowledge are to be related to linguistic levels of organization such as grammar and lexicons. For both problem areas the concept of ontologies for NLG has been suggested as a potential solution. With the advent of the "Semantic Web Vision"[6], ontologies have become the formalism of choice for knowledge representation and reasoning.

Usually, ontologies are authored to represent real world knowledge in terms of concepts, individuals and relations in some variety of Description Logic. With suitable interpretation, one can consider an ontology to be an organized knowledge source repository which can serve as an input to NLG systems[9].

The remaining part of this section gives a general introduction to NLG, ontology and QA systems. Section 2 gives the details of the literature survey and some related works. The detailed concept of ontology and the tools and approaches for developing an ontology are described in Section 3. The task of generating natural language from an ontology and extending it into a QA system are described in the next two sections. Finally, a summary of the discussion and some future works are mentioned.

A. Natural Language Generation

Natural language generation is the process of converting an input knowledge representation into an expression in natural language (either text or speech) according to the application. The input to the system is a four-tuple[7]: (K, C, U, D), where K is the knowledge source, a database of world knowledge; C is the communication goal, specified independently of the language being used; U is the user model on which the system's behaviour is based (probabilistic models are most commonly used in the generation process); and D is the discourse history, which deals with the ordering of information in the output text. The output is natural language text, which can be passed to a speech synthesizer according to the application.

In most systems, the process of NLG is achieved by a pipeline of tasks: document planning, micro-planning and surface realization. This architecture is represented in Figure 1. In the first stage, the system identifies the message from a non-linguistic representation of the concept and represents it as a text plan. The micro-planning module selects the lexical items that can be used to represent the message in natural language. It also performs the necessary sentence aggregation and pronominalization to improve readability. The final step applies grammar rules to the micro-plan to produce a syntactically and semantically valid output sentence.
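To make the pipeline concrete, the following is a minimal Python sketch of the three stages; the function names and the toy fact format are illustrative assumptions, not part of any existing NLG package.

# A minimal sketch of the three-stage NLG pipeline described above.
# All names and the fact format are illustrative assumptions.

def document_plan(facts, goal):
    # Content determination: keep only the facts relevant to the goal,
    # in order, as a (very simple) text plan.
    return [f for f in facts if f["about"] == goal]

def micro_plan(text_plan):
    # Lexicalization: map each fact onto a subject-verb-object clause.
    return ["{} {} {}".format(f["subject"], f["verb"], f["object"])
            for f in text_plan]

def surface_realize(sentences):
    # Surface realization: sentence-initial capital and a full stop.
    return " ".join(s[0].upper() + s[1:] + "." for s in sentences)

facts = [{"about": "pizza", "subject": "Margherita",
          "verb": "has topping", "object": "tomato"}]
print(surface_realize(micro_plan(document_plan(facts, "pizza"))))
# -> Margherita has topping tomato.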

Fig. 1: NLG Architecture

B. Ontology

The concept of ontology is adapted from philosophy. Considering it the first philosophy, Aristotle defined ontology as an explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them. According to Webster's Revised Unabridged Dictionary[18], the word ontology means: "That department of the science of metaphysics which investigates and explains the nature and essential properties and relations of all beings, as such, or the principles and causes of being".

An ontology is an explicit specification of a conceptualization. A conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose. The common application of ontology in NLG is domain modeling. The ontology acts as the knowledge base for generating messages. It represents different objects in a world by a set of hierarchies and relations. In CL applications, this provides a common vocabulary for the domain of interest.

C. Question Answering Systems

Question Answering is the task of automatically deriving an answer to a question posed in natural language. A good definition of a QA system is as follows: "A Question Answering system is an information retrieval application whose aim is to provide inexperienced users with flexible access to information, allowing them to write a query in natural language and obtaining not a set of documents that contain the answer, but the concise answer itself" [6].

In this work, the queries put by the user in natural language are used as the content determiner for NLG. The output is the answer to the user query, in natural language. So, it involves all the processes of natural language understanding, ontology and NLG to perform the task of QA.

II. RELATED WORK

The use of ontology for natural language processing is an interesting area in Knowledge Representation. Researchers started to construct domain ontologies from the beginning of the 1980s. The success of PENMAN[13], a text generation system, is a monument in this development. The PENMAN system consists of a knowledge acquisition system, a text planner and a language generation system which uses a large systemic grammar of English.

ONTOGENERATION[5] uses the concept of reusing domain and linguistic ontologies for text generation. This article proposes a general approach to reuse domain and linguistic ontologies with natural language generation technology, describing a practical system for the generation of Spanish texts in the domain of chemical substances. It uses the Generalized Upper Model (GUM) for NLG.

The OntoSum project aims at developing new methods and techniques for ontology learning; ontology coordination, mapping and merging; and ontology-based summarization[5]. The OntoSum methodology consists of the following steps: specification of the theoretical framework; development of methods and of a prototype system; specification of the evaluation methodology; exploitation and evaluation of the developed methods and system in two case studies ((a) in the context of the e-Centric EXODUS platform for document management, and (b) in biomedical applications being developed in DEVLAB of Dartmouth College); and dissemination of results.

The approach of ontology based systems towards question answering is discussed by Gyawali[9], who also describes a generalized architecture for the same. The thesis identifies a set of factoid questions that can be asked of a domain ontology. The ideas for developing a sample ontology are taken from [14].

An ontology based multilingual sentence generator [15] for English, Spanish, Japanese and Chinese was a combination of example based, rule based and statistical components. It also provided application-driven generation of sentences by feature based grammars.

SimpleNLG [1] is a realization engine for English which aims to provide simple and robust interfaces to generate syntactic structures and linearize them. John A. Bateman suggested a Systemic Functional Grammar (SFG) for representing the semantic features of a sentence. Based on SFG, a system network was developed and used as an internal representation for NLG[15]. In automatic story generation using ontology[10], a rule based model generates the language and an ontology based internal representation checks the semantic and pragmatic validity of the generated sentence.

A. Applications of NLG

• Canned Text Generation: Sometimes the general form of the sentences or their constructions in a text is sufficiently invariant that it can be predetermined and stored as a text string. For example, a compiler shows the line in which an error occurred. This approach to generation is called canned text. But this is the simplest case of generation, where the text is produced not by the system but by the author of the program[9].

• Weather Forecasting Systems: Generate textual weather forecasts from representations of graphical weather maps.

• Machine Translation: NLG can be considered a translation process which converts an input non-linguistic representation to language specific output. So, if the system can generate this input representation from a source language, multilingual translation can be achieved efficiently[13].

• Authoring Tools: NLG technology can also be used to build authoring aids, systems which help people create routine documents.

• Text Summarization: Applications of NLG can be extended to automatic summary generation in the medical field, news analysis, etc.

• Question Answering: QA is the task of automatically answering a question posed in natural language. To generate an answer to a question, a QA computer program may use either a pre-structured database or a collection of natural language documents.

III. ONTOLOGY ENGINEERING AND TOOLS

This section discusses the formal definition of ontology, the approaches for constructing ontologies, and ontology engineering tools.

A. Definition

Formally, an ontology is defined by a seven-tuple as follows[9]:

O = (C, H_C, R_C, H_R, I, R_I, A)

An ontology O consists of the following. The concepts C of the schema are arranged in a subsumption hierarchy H_C. Relations R_C exist between concepts. Relations (properties) can also be arranged in a hierarchy H_R. Instances I of a specific concept are interconnected by property instances R_I. Additionally, one can define axioms A which can be used to infer new knowledge from already existing knowledge.
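As a concrete, simplified illustration, the seven-tuple can be written down directly as a Python structure; the pizza fragment and the helper function below are hypothetical examples, not taken from an actual ontology file.

# A sketch of O = (C, H_C, R_C, H_R, I, R_I, A) as a plain structure.
ontology = {
    "concepts":           {"Food", "Pizza", "Margherita"},              # C
    "concept_hierarchy":  {("Pizza", "Food"),
                           ("Margherita", "Pizza")},                    # H_C: (child, parent)
    "relations":          {"hasTopping": ("Pizza", "PizzaTopping")},    # R_C: domain/range
    "relation_hierarchy": set(),                                        # H_R
    "instances":          {"myLunch": "Margherita"},                    # I
    "relation_instances": {("myLunch", "hasTopping", "tomato1")},       # R_I
    "axioms":             ["DisjointClasses(Pizza, PizzaTopping)"],     # A
}

def subsumers(concept, hierarchy):
    # Follow H_C upward to collect all (stated) ancestors of a concept.
    parents = {p for c, p in hierarchy if c == concept}
    for p in set(parents):
        parents |= subsumers(p, hierarchy)
    return parents

print(subsumers("Margherita", ontology["concept_hierarchy"]))
# -> {'Pizza', 'Food'}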

With a formal Description Logic based knowledge representation scheme and the support of complex reasoners that can perform inference on the knowledge specified within an ontology, the uses of ontologies have broadened from the purely theoretical inquiry initially carried out within the area of Artificial Intelligence to encompass practical applications by domain experts across heterogeneous fields.

Classes are the focus of most ontologies. Classes describe concepts in the domain. For example, a class of wines represents all wines. Specific wines are instances of this class. The Bordeaux wine in the glass in front of you while you read this document is an instance of the class of Bordeaux wines. A class can have subclasses that represent concepts that are more specific than the superclass.

B. Ontology and NLP

Most systems that currently deal with NLP already adopt some kind of ontology for their more abstract levels of information. However, theoretical principles for the design and development of ontologies meeting the goals of generality and detail remain weak. This is due not only to a lack of theoretical accounts at these more abstract levels of information, but also to the co-existence of a range of, sometimes poorly differentiated, functions such bodies of information are expected to fulfill.

The following list gives an idea of the range of functions adopted in NLP. Ontologies are often expected to fulfill at least one (and often more) of the following[4]:
• organizing 'world knowledge'
• organizing the world itself
• organizing 'meaning' or 'semantics' of natural language expressions
• providing an interface between system-external components, domain models etc. and the linguistic components
• ensuring expressibility of input expressions
• offering an interlingua for machine translation
• supporting the construction of 'conceptual dictionaries'

C. Types of Ontologies

NLP applications use ontology for the representation of world knowledge in a formal way, without including the linguistic details. Another kind of ontology explicitly contains the linguistic details for the system. Each of these variants has been adopted in some system where a concrete ontology has been attempted. This gives rise to three distinct kinds of ontology that can be found in NLP work[4]:
• Conceptual Ontology: an abstract organization of real-world knowledge (commonsense or otherwise) that is essentially non-linguistic.
• Mixed Ontology: an abstract semantico-conceptual representation of real-world knowledge that also functions as a semantics for the use of grammar and lexis.
• Interface Ontology: an abstract organization underlying our use of grammar and lexis that is separate from the conceptual, world-knowledge ontology, but which acts as an interface between grammar and lexis and that ontology.

D. Ontology Representation

1) OWL: An ontology language is a formal language used to encode the ontology[14]. Ontology languages are usually declarative languages, are almost always generalizations of frame languages, and are commonly based on either first-order logic or description logic. The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. The languages are characterised by formal semantics and RDF/XML-based serializations for the Semantic Web. Current ontologies are authored in one of several varieties of OWL-based representation, namely OWL Full, OWL DL and OWL Lite. The choice among these varieties for representing a knowledge base will eventually affect the expressiveness of the ontology and whether or not reasoning algorithms will be able to guarantee completeness and/or decidability.

2) RDF: RDF (Resource Description Framework)[17] can be used to describe ontology metadata. The RDF data model is similar to classic conceptual modeling approaches such as entity-relationship or class diagrams, as it is based upon the idea of making statements about resources (in particular Web resources) in the form of subject-predicate-object expressions. These expressions are known as triples in RDF terminology. The subject denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between the subject and the object.
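Since the implementation section later relies on the Python package rdflib, a small sketch of the triple model in rdflib terms may help; the example namespace and triples are invented for illustration.

# Building and inspecting RDF triples with rdflib (illustrative data).
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/pizza#")
g = Graph()
g.add((EX.Margherita, EX.hasTopping, EX.TomatoTopping))   # one triple
g.add((EX.Margherita, RDFS.label, Literal("Margherita")))

for s, p, o in g:          # a graph is simply a set of triples
    print(s, p, o)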

3) SPARQL: The predominant query language for RDF graphs is SPARQL[17]. SPARQL is an SQL-like language, and a recommendation of the W3C as of January 15, 2008. SPARQL allows users to write unambiguous queries. For example, the following query returns the names and emails of every person in the dataset:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
    ?person a foaf:Person .
    ?person foaf:name ?name .
    ?person foaf:mbox ?email .
}

This query can be distributed to multiple SPARQL endpoints (services that accept SPARQL queries and return results), computed, and the results gathered, a procedure known as federated query.
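The same kind of query can also be run locally with rdflib's built-in SPARQL engine, which the implementation section uses. In this sketch a tiny FOAF dataset is inlined just to make the example self-contained and runnable.

from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.org/alice> a foaf:Person ;
    foaf:name "Alice" ;
    foaf:mbox <mailto:alice@example.org> .
""", format="turtle")

# rdflib parses and executes the SPARQL query over the local graph.
rows = g.query("""
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?email
    WHERE { ?person a foaf:Person ;
                    foaf:name ?name ;
                    foaf:mbox ?email . }
""")
for name, email in rows:
    print(name, email)    # -> Alice mailto:alice@example.org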

E. Ontology Tools

Protege, OilEd, Apollo, RDFedt, OntoLingua, OntoEdit, WebODE, KAON, ICOM, DOE and WebOnto are some of the ontology development tools available for research work. Medius Visual Ontology Modeler, LinKFactory Workbench and K-Infinity are some commercially available ontology tools.

1) Protege: Protege[17] is an open-source tool developed at Stanford Medical Informatics. Like most other modeling tools, the architecture of Protege is cleanly separated into a "model" part and a "view" part. The Protege model is the internal representation mechanism for ontologies and knowledge bases. Protege's view components provide a user interface to display and manipulate the underlying model.

Protege's model is based on a simple yet flexible metamodel, which is comparable to object-oriented and frame-based systems. It can represent ontologies consisting of classes, properties (slots), property characteristics (facets and constraints), and instances. Protege provides an open Java API to query and manipulate models. An important strength of Protege is that the Protege metamodel itself is a Protege ontology, with classes that represent classes, properties, and so on. For example, the default class in the Protege base system is called :STANDARD-CLASS, and has properties such as :NAME and :DIRECT-SUPERCLASSES. This structure of the metamodel enables easy extension and adaptation to other representations.

Protege 3.4 alpha is used in this work for demonstration.

IV. NLG FROM ONTOLOGY

A less conventional approach to utilizing ontologies in the computational linguistics field is in the area of Natural Language Generation (NLG). In this approach, an ontology is considered as a formal knowledge repository which can be utilized as a resource for NLG tasks. The objective, then, is to generate a linguistically interesting and relevant descriptive text summarizing parts or all of the concisely encoded knowledge within the given ontology. It has been argued that ontologies contain linguistically interesting patterns in the choice of the words they use for representing knowledge, and this in itself makes the task of mapping from ontologies to natural language easier. It is along this line of thought that this work builds. The present work aims at utilizing ontologies for the sake of NLG and seeks to justify the motive and rationale for doing so. Further, it identifies a set of generic questions that are suitable to be asked concerning an input ontology. The logically structured manner of knowledge organization within an ontology enables reasoning tasks like consistency checking, concept satisfiability, concept subsumption and instance checking[9]. These types of logical inferencing actions motivate the proposal of a set of natural language based questions which can be asked. NLG, in turn, is applied to derive descriptive texts as answers to such queries. This eventually helps in implementing a simple natural language based Question Answering system guided by robust NLG techniques that act upon ontologies.

A. Architecture

The architecture of the present work combines the pipeline of NLG discussed in the Introduction with some additional steps. NLG here is achieved mainly through the following four steps[9]:
• Extracting the knowledge from the ontology
• Document planning
• Micro planning
• Reasoning

The detailed architecture is shown in Figure 2. In this work, the ontology is constructed using the Protege tool.

The sample ontology used for the explanation is pizza.owl, which represents the details of different available pizzas. The structure of the pizza ontology is given in Figure 3. From Protege, the abstract syntax of OWL can be obtained, which is in RDF format. The abstract syntax representation is semi-linguistic in nature, which helps in extracting the content for the NLG task.

Fig. 2: NLG Architecture

Fig. 3: Pizza Ontology

B. Extracting Data from Ontology

The input to the NLG system comes from the knowledge base. In this case, the input is obtained from the ontology (pizza.owl). From the Protege-OWL interface, the abstract syntax of the ontology is obtained, which is represented in RDF. As mentioned earlier, this representation is semi-linguistic. The purpose of this stage is to extract axioms from the RDF representation. A sample RDF representation is given in Figure 4. The corresponding axiom will be SubClassOf(<Pizza>, <Food>), which is more intuitive, cleaner and easier for further processing.
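A minimal sketch of this extraction step is given below, assuming the pizza.owl file is available locally and using rdflib to pull out rdfs:subClassOf triples; anonymous restriction classes, which need OWL-level handling, are simply skipped here.

# Extract named SubClassOf axioms from the RDF serialization.
from rdflib import Graph, BNode
from rdflib.namespace import RDFS

g = Graph()
g.parse("pizza.owl", format="xml")     # OWL files are RDF/XML

for child, _, parent in g.triples((None, RDFS.subClassOf, None)):
    if isinstance(child, BNode) or isinstance(parent, BNode):
        continue                       # skip anonymous restriction classes
    print("SubClassOf(<{}>, <{}>)".format(
        str(child).rsplit("#", 1)[-1], str(parent).rsplit("#", 1)[-1]))
# e.g. SubClassOf(<Pizza>, <Food>)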

Then, the available axioms in the ontology are classified into the following categories, and all of the axioms corresponding to each category are retrieved[9].

Fig. 4: RDF Representation

1) Subsumers
   a) Stated Subsumers with Named Concept
   b) Stated Subsumers with Property Restriction
   c) Implied Subsumers with Named Concept
2) Equivalents
   a) Stated Equivalents with Property Restriction
   b) Stated Equivalents with Enumeration
   c) Stated Equivalents with Cardinality Restriction
   d) Stated Equivalents with Set Operator
   e) Implied Equivalents with Named Concept
3) Disjoints
4) Siblings

Subsumer axioms are the axioms which state that a concept (the child concept) inherits properties from some other concept (the parent concept) in the ontology. Typically, such axioms begin with the word "SubClassOf". For a given concept, its subsumer can either be a named concept in the ontology or an anonymous concept (an anonymous concept is an unnamed concept which represents a new set of individuals that can be obtained after the specified property restrictions are exercised over the specified named concept in the ontology). Further, in addition to the explicitly stated subsumers (both named and anonymous) for a given concept in the ontology, we use the reasoner to infer its additional named subsumers.

Equivalent axioms are the axioms which state that the concepts involved describe exactly the same set of individuals. Typically, such axioms begin with the word "EquivalentClasses". For a given concept, its equivalent concept can either be a named concept in the ontology or an anonymous concept. Here, the anonymous concepts can be defined in terms of property restrictions, cardinality restrictions, enumeration or set operations over some named concepts in the ontology.

Disjoint axioms are the axioms which state that the concepts involved have no individuals in common. Typically, such axioms begin with the word "DisjointClasses". Concepts are considered siblings of each other when they are classified as subconcepts under the same parent concept, i.e., if a concept X has children Y and Z, then Y is a sibling of Z and vice versa (also Y and Z are siblings of themselves).

C. Document Planning

The task of the document planner is to determine 'what to say' and 'how to say it'. This is achieved through two subtasks: content determination and document structuring. The input to this stage is the set of axioms from the ontology, and the output is a text (document) plan.


1) Content Determination: Content determination is the process of selecting axioms from the ontology according to the goal. Identifying the content for the generation is not an easy task. In an ontology based model, the axioms act as the messages for the NLG system. There are two approaches, top-down and bottom-up, for content selection. The "top-down problems" need to identify specific contents that can address a specific goal, while the "bottom-up problems" have a more diffuse goal of identifying contents that can produce a general expository or descriptive text.

2) Document Structuring: This stage organizes the different messages (axioms) identified in the previous subsection. The output thus obtained is the structure of the text to be generated, which is called a text plan. The text plan should consider the rhetorical relations between the message units. Pragmatic and discourse structure are also considered in the text plan. The text plan is represented as a tree, where the leaves are axioms and the intermediate nodes are the category names of the axioms. A sample text plan is given in Figure 5.

Fig. 5: Text Plan

D. Micro Planning

Micro planning comprises two stages: lexicalization and aggregation.

1) Pre-Lexicalization: In NLG, lexicalization is the task of identifying lexical items (words of natural language) that will serve to build up natural language sentences. The lexical items are the vocabulary for the sentence. During the lexicalization phase, we identify the lexical items that will help in mapping the factual knowledge within each category of OWL axioms to natural language text. In this approach, lexicalization is carried out in two phases: pre-lexicalization and lexicalization proper.

In the Pre-Lexicalization stage, an initial plan of the sentence structure is prepared, and a preliminary choice of lexical items for each category of the statements presented in Section 4.2 is made. It is easy to notice that the statements retrieved by the content determiner are semi-linguistic in nature. The statements contain concepts and relations that are either legitimate words of English (e.g., Pizza, Food, Country etc.) or are concatenations of legitimate words of English (e.g., hasBase, MeatyPizza, PizzaTopping etc.). Based on similar observations, it is feasible to derive lexical items from the semi-linguistic concept and relation names in the ontology itself. Further, the semantics of the predicates being used in the statements guide us in identifying what linguistic roles such lexical items play in the output sentence to be generated. Let us consider the following statement, for example:

SubClassOf(<Margherita> ObjectSomeValuesFrom(<hasTopping> <TomatoTopping>))

The concept name "Margherita" is a legitimate word of English, and the concept name "TomatoTopping" is formed by the concatenation of two English words, "Tomato" and "Topping". Likewise, the relation "hasTopping" is formed by the concatenation of the two legitimate words "has" and "Topping". Additionally, the semantics of the predicate SubClassOf guides us in mapping the concept "Margherita" to the subject and the concept "TomatoTopping" to the object of the sentence that the developer intends to verbalize from the axiom. Similarly, the relation "hasTopping" is a good candidate for the verb in the sentence to be generated; since relationships in OWL act as binders between concepts, they can serve to identify the linguistic roles of subject and object in the output sentence to be generated. The pre-lexicalization stage exploits such semantic information to plan the output sentence structure for each category of the statements presented in Section 4.2.

With regard to the task of identifying lexical items to represent concepts, the concept names that are legitimate words of English are directly approved as lexical items for our task; we shall refer to such concept names as "Simplified Concept Names". The other concept names, which are formed by the concatenation of two or more English words, are broken down into possible Simplified Concept Names, which then serve as lexical items.

The strategy used to identify the simplified concept name from a complex concept is as follows. Since there can be multiple super concepts for a concept in the ontology (either stated or inferred via reasoning), it is possible to check for such a possibility among a number of super concepts. The superconcept that satisfies such requirements is then chosen and designated as the "Best Parent". This allows us to model the features describing a concept (in our feature structure representation) in terms of a base form and its modifier. The base form is set to the value of the "Best Parent" name and the modifier is set to the value of the Simplified Concept Name. For example, for the concept "CheeseyPizza", we have its base form set to "Pizza" and its modifier set to "Cheesey". This strategy of generating the base form and modifier for a given concept is referred to as the "Concept Lexicalisation Algorithm"[9]. The idea can be represented as below:

Extracted axiom: SubClassOf(<Concept_X> <Concept_Y>)
Feature Set:
    Subject: &Concept_X
    Object: &Best_Parent_of_Concept_Y
    ObjectModifier: &Simplified_Concept_Y

The following example shows how verbs can be identified from the axiom structure:

Extracted axiom: SubClassOf(<Concept_X> ObjectAllValuesFrom(<Relation_A> <Concept_Y>))
Feature Set:
    Subject: &Concept_X
    Verb: &Relation_A
    Object: &Concept_Y
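A small Python sketch of this idea follows. The regular-expression splitter is an assumption about how concatenated names are formed, and the Best Parent is passed in directly rather than computed from a reasoner, as the full algorithm would do.

# Sketch of the Concept Lexicalisation idea: split concatenated names
# into English words and derive base form / modifier from the parent.
import re

def simplify(concept_name):
    # "CheeseyPizza" -> ['Cheesey', 'Pizza']; "hasTopping" -> ['has', 'Topping']
    return re.findall(r"[A-Z][a-z]*|[a-z]+", concept_name)

def lexicalise(concept_name, best_parent):
    words = simplify(concept_name)
    modifier = " ".join(w for w in words if w != best_parent)
    return {"base": best_parent, "modifier": modifier}

print(lexicalise("CheeseyPizza", "Pizza"))
# -> {'base': 'Pizza', 'modifier': 'Cheesey'}
print(simplify("hasTopping"))
# -> ['has', 'Topping']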

2) Aggregation: Aggregation is the task of grouping two or more simple structures to generate a single sentence, a frequent phenomenon in natural languages. There has been a lot of work on aggregation, mostly concentrating on the syntactic level. This work concentrates on the aggregation of various feature structures at the semantic level.

Figure 6 shows an example of aggregation. For the concept "Napoletana" in the pizza ontology, the concept "NamedPizza" is stated to be its subsumer, and the concepts "InterestingPizza", "CheeseyPizza", "RealItalianPizza" and "NonVegetarianPizza" are inferred to be its subsumers. Subsequently, during the Pre-Lexicalisation phase, the following feature structures represent them, respectively.

Fig. 6: Feature Structures

Now, during the aggregation phase, the following new feature structure is generated and preserved for further processing, eliminating the above five feature structures, as in Figure 7. The aggregation should also consider the categories of the axioms to be grouped together. This results in better readability of the output text.

Fig. 7: Aggregation of Feature structures
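The following sketch shows one way such semantic-level aggregation could be realized in Python: feature structures sharing the same subject and verb are merged so their objects form a single coordinated list, as in the Napoletana example. The dictionary-based feature structures are an illustrative simplification.

# Merge feature structures that share subject and verb.
def aggregate(feature_structures):
    merged = {}
    for fs in feature_structures:
        key = (fs["subject"], fs["verb"])
        merged.setdefault(key, []).append(fs["object"])
    return [{"subject": s, "verb": v, "object": objects}
            for (s, v), objects in merged.items()]

fss = [{"subject": "Napoletana", "verb": "is", "object": o}
       for o in ["NamedPizza", "InterestingPizza", "CheeseyPizza",
                 "RealItalianPizza", "NonVegetarianPizza"]]
print(aggregate(fss))
# -> one structure whose object lists all five subsumers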

3) Lexicalization Proper: The completion of the Aggregation phase opens up an opportunity to generate further lexical items, which it was otherwise infeasible or unsuitable to obtain during the Pre-Lexicalisation phase. In particular, the process attempts to generate lexical items for the Verb features (whenever present) in the feature structures; identifying the Verb lexical item facilitates augmenting the information pertaining to the Object feature in those feature structures. The need to postpone such activities until the aggregation phase has been completed stems from the fact that the Aggregation task is highly dependent on the values assigned to the Verb features in the feature structures; they serve as a criterion for judging whether an aggregation task should be carried out on the available set of feature structures or not. Thus it is desirable that those values come directly from the name (string) representing the relation in the axiom and remain "intact" throughout the aggregation phase. Let us consider an example:

Subject: &Margherita
Verb: &hasTopping
Object: &TomatoTopping

is represented as

Subject: &Margherita
Verb: &has
ObjectDescriptor: &topping
Object: &Tomato

E. Reasoning

Realisation is the task of generating actual natural language sentences from the intermediary (syntactic) representations obtained during the Micro Planning phase. For an NLG developer, a number of general purpose realisation modules are available to achieve such functionality. In particular, such modules facilitate transforming syntactic information into natural language text by taking care of the various syntactic (for example, the arrangement of Subject, Verb and Object in a sentence), morphological (for example, the generation of inflected forms of words, when required, such as the plural of child being children and not childs) and orthographical (for example, placement of appropriate punctuation marks in the sentence, such as placing a comma to describe an aggregation of things) transformations that the contents from the Micro Planning phase need to adhere to for generating grammatically valid sentences.

SimpleNLG[1] has Java classes which allow a programmer to specify the content (such as Subject, Verb, Object, Modifiers, Tense, Preposition phrase etc.) of a sentence by setting values for the attributes in the classes. Once such attributes are set, methods can be executed to generate output sentences; the package takes care of the linguistic transformations and ensures grammatically well formed sentences. An example is shown in Figure 8[9].

Fig. 8: Reasoning


V. TOWARDS QUESTION ANSWERING

This section discusses how the technique of NLG from an ontology can be utilized to develop a QA system based on an ontology. The general architecture, implementation tools and examples are also explained.

A. Question Answering

Question Answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP) which is concerned with building systems that automatically answer questions posed by humans in a natural language. A QA implementation, usually a computer program, may construct its answers by querying a structured database of knowledge or information, usually a knowledge base.

The basic idea of this implementation is to generate a natural language answer to the query given by the user, using the ontology as the knowledge base. The system determines a clue for content determination from the user query. Then the NLG technique described in Section IV is used to generate the output answer. A common architecture is explained in the next section.

B. Architecture

The architecture[9] shown in Figure 9 is an augmentation of the architecture discussed in the previous section. In the QA system, the ontology serves as the knowledge base. The question asked by the user is analyzed to identify axioms for content determination. These axioms are given as input to the NLG system. The NLG pipeline, after grammatical reasoning, generates the output, which is the answer to the user query.

Fig. 9: Architecture of QA system with NLG from Ontology

The output of the system should be structured according to the query asked. From an analysis of the question asked, the mode of answer expected can be identified. Some ideas in this context are given below in Table 1[8]. These tags of expected answers can be used for identifying the axiom category, the rhetorical structuring of the features and the final grammatical realization.

TABLE I: Classification of Questions and Answers

Question                       Expected Answer
What did you buy?              [THING]
Where is my coat?              [PLACE]
Where did they go?             [PLACE]
What happened next?            [EVENT]
How did you cook the eggs?     [MANNER]

Since an ontology is domain specific, the possible set of questions (the types of questions) asked by the user can be predicted. This observation helps to limit unnecessary searches in the ontology and improves the results. For example, the question "What is RealItalianPizza?" put to the pizza ontology will give the following result:
RealItalianPizza is a pizza that falls under the class of thin and crispy pizza. RealItalianPizza can have base of thinandcrispy only. However, it might be the case that some instances of RealItalianPizza don't have any base at all. RealItalianPizza has Italy as its countryoforigin.

C. Implementation Tools

The tools used for the implementation of the system include Python, the package rdflib, SPARQL, and NLG reasoning systems. Figure 10 shows the implementation stages.

Fig. 10: Implementation tools

The ontology can be developed using Protege[17], which provides a Java interface for ontology development. Protege has options to export an RDF file of the ontology. The NLP related work is then done by Python programs. Python has a library package, rdflib, which can be used to interact with the RDF file. The package rdflib is also equipped with SPARQL query processing capability. The query result is then processed by reasoners like SimpleNLG to produce the output.

D. Example

The concepts discussed are illustrated with a practical example of an M.Tech ontology. The ontology contains the details of M.Tech courses conducted in Kerala state. The hierarchy of ontology classes is shown in Figure 11. The whole ontology is considered a subclass of the class Thing (the default class). The classes in the hierarchy are Institutions, Branches, Specializations and Universities. All these classes have subclasses. Some relations (properties) in the ontology are IsOfferedBy (course IsOfferedBy college), ApprovedBy (college ApprovedBy University) and HasSpecilizaion (Branch HasSpecilizaion course). An RDF code snippet is shown in Figure 12.

Fig. 11: Example: M.Tech Ontology

The processing of the user query 'which college offer M.Tech CL?' can be explained as follows. The sentence is split into words to identify the keyword which selects an axiom; here it is 'offered', which selects the axiom OfferedBy. Figure 13 shows the SPARQL query for this question.

Fig. 12: Code segment from RDF File

The final output will be: GECSKP offer M.Tech in CL.

Fig. 13: SPARQL Query

The system is capable of answering around five types of questions, such as "which are the colleges offering course –", "which university approve – (course or college)", etc. The algorithms can be extended by adding more semantic features like synonyms and hyponyms to handle more complex queries.
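A rough Python sketch of this keyword-to-query step is given below. The prefix, property names, template and file name are guesses at the ontology's vocabulary, and the keyword spotting is deliberately crude; they stand in for the fuller analysis the paper describes.

from rdflib import Graph

# Hypothetical SPARQL templates keyed by spotted keywords.
TEMPLATES = {
    "offer": """
        PREFIX mt: <http://example.org/mtech#>
        SELECT ?college WHERE {
            ?course mt:IsOfferedBy ?college .
            FILTER regex(str(?course), "%s")
        }""",
}

def answer(question, graph):
    words = question.rstrip("?").split()
    for keyword, template in TEMPLATES.items():
        if keyword in words:
            course = words[-1]            # crude content clue: last word
            rows = graph.query(template % course)
            return ["{} offer M.Tech in {}".format(
                str(row[0]).rsplit("#", 1)[-1], course) for row in rows]
    return ["No matching axiom category found."]

g = Graph()
g.parse("mtech.owl", format="xml")        # hypothetical ontology file
print(answer("which college offer M.Tech CL?", g))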

VI. CONCLUSIONS & FUTURE SCOPE

NLG has good real-time applications, which are currently limited by the lack of world knowledge. The construction of domain specific ontologies will solve this problem. This work discusses the approaches for combining ontologies and NLG systems. The method discussed here classifies the axioms in the ontology into different categories and performs the NLG pipeline over them. With the necessary modifications, an ontology supported NLG system can be extended into a QA system, and there are a handful of tools for efficient implementation of the approach. The development of such a QA system is explained with a sample ontology of M.Tech courses. The system performance can be improved by applying more linguistic techniques to analyse the natural language input efficiently. Such systems have an important role in the web age of semantic search.

REFERENCES

[1] Gatt A and Reiter E, "SimpleNLG: A realization engine for practical applications," in Proceedings of the 12th European Workshop on Natural Language Generation, 2009.

[2] Allen J, Natural Language Understanding. Benjamin/Cummings Publishing Company, California, 1988.

[3] Bateman J A, "Sentence generation and systemic grammar: an introduction," in Iwanami Lecture Series: Language Sciences, Volume 8. Tokyo: Iwanami Shoten Publishers, 1997.

[4] Bateman J A, "The Theoretical Status of Ontologies in Natural Language Processing," in Proceedings of the Workshop on 'Text Representation and Domain Modelling: Ideas from Linguistics and AI', Technical University Berlin, 1991.

[5] Bateman J A, "Ontology Construction and Natural Language," in Proceedings of the Workshop on Formal Ontology in Conceptual Analysis and Knowledge Representation, Padova, 1993.

[6] Davis J, Studer R, and Warren P, Semantic Web Technologies. John Wiley & Sons Ltd, England, 2006.

[7] Reiter E, "Building Applied Natural Language Generation Systems," in Proceedings of the Applied Natural Language Processing Conference, Washington DC, 1997.

[8] Ghorka W, Bownik L, Piasecki A, "Information System Based on Natural Language Generation from Ontology," in Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 357-364, 2007.

[9] Gyawali B, "Answering Factoid Questions via Ontologies: A Natural Language Generation Approach," M.Sc. Dissertation, Dept. of Intelligent Computer Systems, University of Malta, 2011.

[10] Jaya A and Uma G V, "A Novel Approach for Construction of Sentences for Automatic Story Generation Using Ontology," in Proceedings of the International Conference on Computing, Communication and Networking (ICCCN), 2008.

[11] Jurafsky D and Martin J H, Speech and Language Processing. Prentice Hall Inc., 2008.

[12] Bontcheva K, "Generating tailored textual summaries from ontologies," in ESWC, Lecture Notes in Computer Science, vol. 3532, pp. 531-545, Springer, 2005.

[13] Mann W C, "An Overview of the Penman Text Generation System," in AAAI-83 Proceedings, pp. 261-265, 1983.

[14] Noy N F and McGuinness D L, "Ontology Development 101: A Guide to Creating Your First Ontology," Stanford University, Stanford, 2005.

[15] Aikawa T, Melero M, Schwartz L, and Wu A, "Multilingual Sentence Generation," in Proceedings of the 8th European Workshop on Natural Language Generation, 2001.

[16] http://protege.stanford.edu, referred April 2012.

[17] http://code.google.com/p/rdflib, referred April 2012.

[18] Webster's Revised Unabridged Dictionary: http://machaut.uchicago.edu/websters, referred April 2014.


Speech synthesis using Artificial Neural Network

Anjali Krishna C R, Arya M B, Neeraja P N, Sreevidya K M, Jayanthan K S

Dept. of Computer Science and Engg., SIMAT, Vavanoor, Palakkad

[email protected]

Abstract—Text to speech conversion is a large area which has shown very fast development in the last few decades. Our goal is to study and implement the specific tasks involved in text to speech conversion, namely text normalization, grapheme to phoneme conversion, phoneme concatenation and speech engine processing. Using a neural network for grapheme to phoneme conversion provides more accuracy than a normal corpus or dictionary based approach. Usually, in text to speech systems, grapheme to phoneme conversion is performed using a dictionary based method. The main limitation of this technique is that it cannot give the phoneme of a word which is not in the dictionary, and for higher efficiency in phoneme generation a large collection of word-pronunciation pairs is required. Using a large dictionary also requires large storage space. This limitation can be overcome using a neural network. The main advantage of this approach is that it can adapt to unknown situations, i.e., it can predict the phoneme of a grapheme which has not been defined so far. The neural network system requires less memory than a dictionary based system and performed well in tests. The system will be very useful for illiterate and vision impaired people to hear and understand content, as they face many problems in their day-to-day life due to differences in script systems.

I. INTRODUCTION

TEXT to speech synthesizer is a computer based system that can read text aloud automatically from a source text. Very fast improvement has been made in this field over a couple of decades, and a lot of TTS systems are now available for commercial use. A text to speech system converts written text into speech. Speech is often based on the concatenation of speech units taken from natural speech and put together to form a word or a sentence. Concatenative speech synthesis has become very popular in recent years due to its improved quality compared with other methods. Many TTS systems have been developed based on the principle of corpus-based speech synthesis. Although many speech systems have been developed, none of them fully addresses quality [1]. Speech is the most used and natural way for people to communicate. From the beginning of man-machine interface research, speech has been one of the most desired mediums to interact with computers. Therefore, speech recognition and speech synthesis have been studied to make communication with machines more human-like. In order to increase the naturalness of oral communication between humans and machines, all speech aspects must be involved. Speech does not only transmit ideas and concepts, but also carries information about the attitude, emotion and individuality of the speaker.

The different applications of TTS in our day-to-day life include the following [9]:

• Telephony: Automation of telephone transactions (e.g., banking operations), automatic call centers for information services (e.g., access to weather reports), etc.

• Automotive: Information released by in-car equipment such as the radio, the air conditioning system, the navigation system, the mobile phone (e.g., voice dialing), embedded telemetric systems, etc.

• Multimedia: Reading of electronic documents (web pages, emails, bills) or scanned pages (output of an Optical Character Recognition system).

• Medical: Assistance for disabled people: personal computer handling, domotics, mail reading.

• Industrial: Voice-based management of control tools, drawing operators' attention to important events divided among several screens.

Evolution of TTS: In 1779 the Danish scientist Christian Kratzenstein, working at the Russian Academy of Sciences, built models of the human vocal tract that could produce the five long vowel sounds a, e, i, o and u. In 1791 an Austrian scientist developed a system based on the previous one that included a tongue, lips and "mouth" made of rubber and a "nose" with two nostrils, and was able to pronounce consonants. In 1837, Joseph Faber developed a system which implemented a pharyngeal cavity and was used for singing; it was controlled by a keyboard. Bell Labs developed the VOCODER, a clearly intelligible, keyboard-operated electronic speech analyzer and synthesizer. In 1939, Homer Dudley developed the VODER, which was an improvement over the VOCODER. The Pattern Playback was built by Dr. Franklin S. Cooper and his colleagues at Haskins Laboratories. The first electronics-based TTS system was designed in 1968. The concatenation technique was developed by the 1970s. Many computer operating systems have included speech synthesizers since the early 1980s. From the 1990s there was progress in unit selection and diphone synthesis. A lot of development is still taking place in this area.

A. Problem definition

Most existing systems use a dictionary based method for converting graphemes to phonemes. Phoneme generation using this technique is not efficient. In the dictionary based method, the entries of the dictionary are tuples of word-pronunciation pairs. This technique cannot give the phoneme of a grapheme which is not in the dictionary, and for higher efficiency we require a large dictionary with a large number of word-pronunciation pairs, which in turn requires large storage space. We can avoid these limitations using a neural network method.

II. REQUIREMENT ANALYSIS

The requirements of the TTS include nltk, numpy, python and the speech generating tool MBROLA. Python is free and open source software and a widely used general-purpose, high-level programming language. NLTK is a leading platform for building Python programs to work with human language data, and it provides easy-to-use interfaces. NumPy is an extension to the Python programming language and is also an open source tool. So all the requirements for the speech synthesizer are cost effective.

A. Existing System

Most text to speech engines can be categorized by the method that they use to translate phonemes into audible sound. Some TTS systems are listed below:

• Prerecorded: In this kind of TTS system we maintain a database of prerecorded words. The main advantage of this method is good quality of voice. But limited vocabulary and the need for large storage space make it less efficient.

• Formant: Here voice is generated by simulating the behavior of the human vocal cords. Unlimited vocabulary, low storage requirements and the ability to produce multiple featured voices make it highly efficient, but it gives a robotic voice, which is sometimes not appreciated by the users.

• Concatenated: In this kind of TTS system, text is phonetically represented by the combination of its syllables. These syllables are concatenated at run time to produce the phonetic representation of the text. Key features of this technique are unlimited vocabulary and good voice. But it cannot produce multiple featured voices and needs large storage space. The implementation of this TTS is done using the concatenation method.

Some existing speech softwares are:
• eSpeak[9]: eSpeak uses a "formant synthesis" method. This allows many languages to be provided in a small size. The speech is clear and can be used at high speeds, but is not as natural or smooth as larger synthesizers which are based on human speech recordings. eSpeak is available as:
  ◦ A command line program (Linux or Windows) to speak text from a file or from stdin.
  ◦ A shared library version for use by other programs (on Windows this is a DLL).
  ◦ A SAPI5 version for Windows, so it can be used with screen-readers and other programs that support the Windows SAPI5 interface.
  ◦ Ports to other platforms, including Android, Mac OSX and Solaris.
Its features include:
  ◦ Different voices, whose characteristics can be altered.
  ◦ Can produce speech output as a WAV file.
  ◦ SSML (Speech Synthesis Markup Language) is supported (not completely), and also HTML.
  ◦ Compact size: the program and its data, including many languages, total about 2 Mbytes.
  ◦ Can be used as a front-end to MBROLA diphone voices (see mbrola.html); eSpeak converts text to phonemes with pitch and length information.
  ◦ Can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.
  ◦ Potential for other languages: several are included in varying stages of progress, and help from native speakers for these or other languages is welcome.
  ◦ Development tools are available for producing and tuning phoneme data. Written in C.

• Festival[9]: The Festival Speech Synthesis System is a general multi-lingual speech synthesis system originally developed by Alan W. Black at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh. It offers a full text to speech system with various APIs, as well as an environment for development and research of speech synthesis techniques. It is written in C++ with a Scheme-like command interpreter for general customization and extension. Festival usage: when you pass a text file to Festival, it converts the contents of the text file into voice. For example, if you want to read a letter (mail) residing in a text file (say letter.txt), you can let Festival read it out loud as follows: $ festival --tts letter.txt. Advantages:
  ◦ Available for free under an open source license.
  ◦ The quality of the voice and the pronunciations are very good.
  ◦ Supports 3 languages: English, Spanish, and Welsh.

All the above mentioned softwares have a lot of limitations, some of which are:
• No emotions in speaking styles.
• Needs improvement in collaboration between linguists and technologists.
• Text to speech output should audibly communicate information to the user.
• The sound produced is not natural.

B. Proposed System

Currently existing systems have many limitations. Here we try to improve the performance of the existing system by doing the phonetic analysis using a neural network approach. Artificial intelligence and NN have been used more and more in recent decades, and the potential in this area is huge. NN are used in cases where the rules or criteria for searching for an answer are not clear (that is why NN are often called a black box: they can solve the problem, but at times it is hard to explain how the problem was solved). Some applications of neural networks are[6]:

• Character Recognition: The idea of character recognition has become very important as handheld devices like the Palm Pilot have become increasingly popular. Neural networks can be used to recognize handwritten characters.

• Image Compression: Neural networks can receive and process vast amounts of information at once, making them useful in image compression. With the Internet explosion and more sites using more images, using neural networks for image compression is worth a look.

• Stock Market Prediction: The day-to-day business of the stock market is extremely complicated. Many factors weigh in on whether a given stock will go up or down on any given day. Since neural networks can examine a lot of information quickly and sort it all out, they can be used to predict stock prices.

• Traveling Salesman's Problem: Interestingly enough, neural networks can solve the traveling salesman problem, but only to a certain degree of approximation.

• Medicine, Electronic Nose, Security, and Loan Applications: These are some applications that are in their proof-of-concept stage, with the exception of a neural network that decides whether or not to grant a loan, something that has already been used more successfully than many humans.

• Miscellaneous Applications: These are some very interesting (albeit at times a little absurd) applications of neural networks. The advantages of NN are that they can adapt to new scenarios, are fault tolerant and can deal with noisy data.

The speech synthesizer has mainly four modules, given below:

1) Text normalization

2) Grapheme to phoneme conversion, using neural networks

3) Phoneme concatenation

4) Speech engine processing

The main advantage of the system is its efficiency in finding the phoneme corresponding to the input. By using a neural network (along with the back propagation algorithm) for finding the phoneme, we do not need to use the corpus of a dictionary like CMU. NN are used in cases where the rules or criteria for searching for an answer are not clear. Currently existing speech systems include Festival, eSpeak etc. One limitation of these kinds of tools is that they cannot produce emotions, feelings etc. Another limitation is that they produce voices only for words which are predefined in the corpus, which sound more artificial and cannot give smooth voice output. We are trying to overcome the last mentioned problem using a neural network, thus improving the efficiency of the existing system. Fig. 1 indicates the different phases of a text to speech system.

Fig. 1: Text to Speech Conversion

III. MODULE DESCRIPTION

The important challenge that needs to be addressed in anyTTS systems is how to produce natural sounding speech fora plain text given as its input.It is not feasible to solve thisby recording and storing all the words of a language andthen concatenating the words in the given text to producethe corresponding speech.The TTS systems first convert theinput text into its corresponding linguistic or phonetic rep-resentations and then produce the sounds corresponding tothose representations.With the input being a plain text, thegenerated phonetic representations also need to be augmentedwith information about the intonation and rhythm that thesynthesized speech should have.This task is done by a textanalysis module in most speech synthesizers.The transcriptionfrom the text analysis module is then given to a signalprocessing module that produces synthetic speech .In speech synthesizers commonly two actions are takingplace.The front end receives the text as input and outputsa symbolic linguistic representation.The back-end - receivesthe symbolic linguistic representation as input and outputs thespeech.These two tasks are divided into four modules.They are

1) Text Normalization
2) Grapheme to Phoneme Conversion
3) Phoneme Concatenation
4) Speech Engine Processing

Text normalization is the first phase of text to speech conversion. It is the process of transforming text into a single canonical form. Normalizing text before storing or processing it allows for separation of concerns, since input is guaranteed to be consistent before operations are performed on it. Text normalization requires being aware of what type of text is to be normalized and how it is to be processed afterwards. Grapheme to phoneme conversion is done by using a neural network, and MBROLA is used for speech generation.

IV. TEXT NORMALIZATION

Text normalization is the first fundamental component of the text to speech system. In this phase we also do the text analysis, which analyses the input text and organizes it into a manageable list of words. Text normalization is the transformation of text to a pronounceable form, and it is performed before the text is processed in any other way. The main objective of this process is to identify the punctuation marks and the pauses between the words. Usually the text normalization process also converts all letters to lowercase or uppercase and removes accent marks, stopwords, etc.


We know that the input is taken from the user, and the user can input any text they would like to hear as voice. A normal user may input strings with some punctuation and acronyms. Consider an example input string "hello how are you?": when we produce the voice corresponding to this input, we do not want to read out the question mark. So we have to remove the tokens which have no pronunciation in the corpus; in the above case we have to remove the punctuation from the input before we can move on to the further phases. Similarly, we have to create a corpus which contains acronyms and their expansions. When we see an acronym in the input string, we have to replace it with the corresponding expansion to get an accurate result from the system. For example, when the user inputs a string such as "HIV", we have to convert it to "aitch eye vee" and produce the voice for that. In our system we incorporate Python code which can remove the punctuation from the input string. For that we created a list called pun which contains the punctuation to be removed from the text. Each word is checked against the members of the pun list: if the word is not in the list it is written to a file, otherwise it is not written. For example:

Input: how r ? u?
Output: how r u
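A minimal sketch of this normalization step, assuming the pun list of the description above; the exact punctuation set and the file handling of the real implementation may differ:

pun = ['.', ',', '?', '!', ';', ':', '"', "'", '(', ')']

def normalize(text):
    words = []
    for token in text.split():
        # strip any characters that appear in the pun list
        cleaned = ''.join(ch for ch in token if ch not in pun)
        if cleaned:  # drop tokens that were pure punctuation
            words.append(cleaned)
    return ' '.join(words)

print(normalize('how r ? u?'))  # prints: how r u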

Fig. 2 explains the input-output relation of the text normalization phase. The input is a normal text which contains punctuation. This text is given to the normalization unit; we then get a normalized text, and each word in that text is also known as a grapheme.

Fig. 2: Text to Graphemes Conversion

V. GRAPHEME TO PHONEME CONVERSION

This task is performed with the help of a neural network. Neural Networks (NN) are important data mining tools used for classification and clustering. A NN is an attempt to build a machine that will mimic brain activities and be able to learn. NN usually learn by examples: if a NN is supplied with enough examples, it should be able to perform classification and even discover new trends or patterns in data. A basic NN is composed of three layers: input, output and hidden. Each layer can have a number of nodes; nodes from the input layer are connected to the nodes from the hidden layer, and nodes from the hidden layer are connected to the nodes from the output layer. Those connections represent weights between nodes.

Figure 3 represents the architecture of a simple NN. It is made up from an input layer, an output layer and one or more hidden layers. Each node from the input layer is connected to a node from the hidden layer, and every node from the hidden layer is connected to a node in the output layer; there is usually some weight associated with every connection. The input layer represents the raw information that is fed into the network; this part of the network never changes its values, and every single input to the network is duplicated and sent down to the nodes in the hidden layer. The hidden layer accepts data from the input layer; it uses the input values and modifies them using some weight value, and this new value is then sent to the output layer, where it will also be modified by some weight from the connection between the hidden and output layer. The output layer processes the information received from the hidden layer and produces an output; this output is then processed by an activation function.

Fig. 3: Simple Neural Network
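As an illustration of this forward flow, a minimal sketch follows; the layer sizes, random weights and sigmoid activation here are illustrative, not taken from the paper:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x  = np.array([1.0, 1.0])   # input layer values (never changed)
W0 = np.random.rand(2, 2)   # weights on input -> hidden connections
W1 = np.random.rand(2, 1)   # weights on hidden -> output connections

hidden = sigmoid(x @ W0)    # hidden layer modifies inputs by the weights
output = sigmoid(hidden @ W1)   # output layer applies the activation function
print(output)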

A. Number of Nodes and Layers

Choosing the number of nodes for each layer will depend on the problem the NN is trying to solve, the type of data the network is dealing with, the quality of the data and some other parameters. The number of input and output nodes depends on the training set in hand. Larose argued that choosing the number of nodes in the hidden layer can be a challenging task: if there are too many nodes in the hidden layer, the number of possible computations that the algorithm has to deal with increases, while picking just a few nodes in the hidden layer can deprive the algorithm of its learning ability. The right balance needs to be found. It is very important to monitor the progress of the NN during its training; if results are not improving, some modification to the model might be needed. The way to control a NN is by setting and adjusting the weights between nodes. Initial weights are usually set to some random numbers and are then adjusted during NN training.


Phonemer can be logically split into two parts: preprocessing / feature selection, and classification. The feature set we used is quite simple. Our features for each grapheme of a word are: a) the character before, b) the character afterward, c) the current character, d) the character 2 steps afterward, and e) the class of the character according to the Soundex algorithm. For the last feature, several characters such as h and y do not belong to any of the classes, so they got their own class; all vowels, however, were grouped into one class. Phonemer can be expanded, and more features could be added with ease for later trials. Some examples for future review could be the number of vowels before and after the current character, or even whether this character is a duplicate of the previous one, such as with the character t in the word bottle. Pronunciation classes apart from the Soundex class could be used here as well, but we ignored these and instead focused on simple and widely used features to put emphasis on the abilities of neural networks. The feature set generated for each character is then turned into a binary input vector for the neural network, where each entry represents a possibility for each feature class.
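A sketch of this per-character feature extraction; the function and dictionary names are illustrative, and the final binary-vector encoding is omitted:

# Soundex groupings follow the standard algorithm; vowels share one class
# ('V'), and unclassified characters such as h and y get their own class.
SOUNDEX = {}
for chars, cls in [('bfpv', '1'), ('cgjkqsxz', '2'), ('dt', '3'),
                   ('l', '4'), ('mn', '5'), ('r', '6'), ('aeiou', 'V')]:
    for c in chars:
        SOUNDEX[c] = cls

def features(word, i):
    pad = lambda j: word[j] if 0 <= j < len(word) else '_'
    ch = word[i]
    return {
        'prev':  pad(i - 1),            # a) the character before
        'next':  pad(i + 1),            # b) the character afterward
        'curr':  ch,                    # c) the current character
        'next2': pad(i + 2),            # d) the character 2 steps afterward
        'class': SOUNDEX.get(ch, ch),   # e) Soundex class (own class if none)
    }

print(features('bottle', 2))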

B. Dataset

The data set we used was provided by Sproat with his work on the pmtools toolkit for word pronunciation modeling. The data set has almost 50,000 entries, which we split up into a dev set and a final set at an 80 and 20 percent randomized split. With neural nets, the amount of training data severely affects accuracy, so to avoid this testing error we could not test the validity of our application by just training on a percentage of the final set and then testing with the rest. Instead, for the final review, we trained on all of dev and then tested using final; for incremental tests, however, we trained using 80% of dev and tested on the remainder. The dataset was already aligned by character, so we could ignore the problem of character alignment; the phoneme classes, however, were fairly complex. They included stresses, as well as silent and added pronunciation. As mentioned above, silent pauses were indicated using special characters.

C. Back Propagation (BP) Algorithm

One of the most popular NN algorithms is the back propagation algorithm. Rojas claimed that the BP algorithm could be broken down to four main steps. After choosing the weights of the network randomly, the back propagation algorithm is used to compute the necessary corrections. The algorithm can be decomposed into the following four steps:

1) Feed-forward computation.
2) Back propagation to the output layer.
3) Back propagation to the hidden layer.
4) Weight updates.

The algorithm repeats these steps and stops when the value of the error function has become sufficiently small. This is a very rough and basic formulation of the BP algorithm; there are some variations proposed by other scientists, but Rojas' definition seems to be quite accurate and easy to follow. The BP algorithm will be explained using Fig. 4, shown below.

Fig. 4: Example

D. Worked Example

The NN of Fig. 4 has two nodes (N0,0 and N0,1) in the input layer, two nodes in the hidden layer (N1,0 and N1,1) and one node in the output layer (N2,0). The input layer nodes are connected to the hidden layer nodes with weights (W0,1-W0,4), and the hidden layer nodes are connected to the output layer node with weights (W1,0 and W1,1). The values given to the weights are taken randomly and will be changed during the BP iterations. A table with input node values and the desired output, together with the learning rate and momentum, is also given in the figure. The sigmoid function formula is f(x) = 1.0/(1.0 + exp(−x)). Shown are calculations for this simple network (only the calculation for example set 1 is going to be shown, i.e. input values of 1 and 1 with output value 1). In NN training all example sets are calculated, but the logic behind the calculation is the same.
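A minimal sketch of the four BP steps on this 2-2-1 network; the learning rate and iteration count are illustrative, and momentum is omitted for brevity:

import numpy as np

def f(x):                      # sigmoid activation f(x) = 1/(1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W0 = rng.random((2, 2))        # input -> hidden weights (W0,1..W0,4), random
W1 = rng.random(2)             # hidden -> output weights (W1,0 and W1,1)
x, target, lr = np.array([1.0, 1.0]), 1.0, 0.5   # example set 1

for _ in range(1000):
    # 1) feed-forward computation
    h = f(x @ W0)
    y = f(h @ W1)
    # 2) back propagation to the output layer
    delta_out = (target - y) * y * (1 - y)
    # 3) back propagation to the hidden layer
    delta_hid = delta_out * W1 * h * (1 - h)
    # 4) weight updates
    W1 += lr * delta_out * h
    W0 += lr * np.outer(x, delta_hid)

print(y)   # approaches the desired output 1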

E. Advantages and Disadvantages

Artificial intelligence and NN have been used more and more in recent decades, and the potential in this area is huge. Here are some NN advantages and disadvantages, and industries where they are being used. NN are used in cases where the rules or criteria for searching an answer are not clear (that is why NN are often called a black box: they can solve the problem, but at times it is hard to explain how the problem was solved). They have found their way into a broad spectrum of industries, from medicine to marketing and the military, just to name a few. The financial sector has been known for using NN in classifying credit ratings and market forecasts. Marketing is another field where NN have been used for customer classification (groups that will buy some product,


identifying new markets for certain products, relationships between customer and company). Many companies use direct marketing (sending offers by mail) to attract customers; if NN could be employed to raise the percentage of responses to direct marketing, it could save companies a lot of revenue. At the end of the day, it is all about the money. Post offices are known to use NN for sorting post (based on postal code recognition). Those were just a few examples of where NN are being used. NN advantages are that they can adapt to new scenarios, they are fault tolerant and can deal with noisy data. The time needed to train a NN is probably its biggest disadvantage; NN also require very large sample sets to train a model efficiently, and it is hard to explain the results and what is going on inside a NN.

VI. PHONEME CONCATENATION

The third phase of the system is phoneme concatenation. The second phase of the system was Grapheme to Phoneme conversion, done with the help of the neural network; at the end of that phase we get as output the phonemes of the given graphemes (i.e. of a word). The input of this phase is obtained from the previous phase: the word, separated by spaces, and the corresponding phonemes, also separated by spaces. At this point the word is separated into its constituent phonetic units. In this phase the separated phoneme syllables are concatenated to reconstruct the desired words. To implement this phase we wrote Python code which produces the concatenated phonemes, i.e. the phonemes corresponding to the letters in the word.
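A minimal sketch of this step, assuming the per-letter G2P output format described above; the '-' marker for silent letters and all names here are illustrative:

def concatenate(g2p_output):
    """g2p_output: list of (word, space-separated phonemes) pairs."""
    result = {}
    for word, phonemes in g2p_output:
        # drop silent-letter markers and keep the phoneme sequence per word
        result[word] = [p for p in phonemes.split() if p != '-']
    return result

print(concatenate([('hello', 'h @ l - @U')]))
# {'hello': ['h', '@', 'l', '@U']}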

VII. SPEECH ENGINE PROCESSING

A speech engine is a generic entity that either processes speech input or produces speech output. Each type of speech engine has a well-defined set of states of operation, and well-defined behavior for transitions between states. MBROLA is an algorithm for speech synthesis, and software which is distributed at no financial cost but in binary form only. The MBROLA project provides diphone databases for a large number of spoken languages. The MBROLA software is not a complete text-to-speech system for all those languages; the text must first be transformed into phoneme and prosodic information in MBROLA's format, and separate software to do this is available for some but not all of MBROLA's languages and can require extra setup. Although diphone-based, the quality of MBROLA's synthesis is considered to be higher than that of most diphone synthesisers, as it preprocesses the diphones, imposing constant pitch and harmonic phases, which enhances their concatenation while only slightly degrading their segmental quality. MBROLA is a time-domain algorithm, like PSOLA, which implies very low computational load at synthesis time. Unlike PSOLA, however, MBROLA does not require a preliminary marking of pitch periods. This feature has made it possible to develop the MBROLA project around the MBROLA algorithm, through which many speech research labs, companies, and individuals around the world have provided diphone databases for many languages and voices (the number of which is by far a world record for speech synthesis, although there are some notable omissions such as Chinese).

MBROLA is a speech synthesizer based on the concatenation of diphones. It takes a list of phonemes as input, together with prosodic information (duration of phonemes and a piecewise linear description of pitch), and produces speech samples on 16 bits (linear), at the sampling frequency of the diphone database. It is therefore not a Text-To-Speech (TTS) synthesizer, since it does not accept raw text as input. In order to obtain a full TTS system, you need to use this synthesizer in combination with a text processing system that produces phonetic and prosodic commands.
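An illustrative driver for this step: each .pho line gives a phoneme, its duration in ms, and optional (position %, pitch Hz) pairs. The voice name "en1", the SAMPA transcription and the file names are assumptions, not from the paper:

import subprocess

pho_lines = [
    "_ 50",            # leading silence
    "h 60 50 120",     # phoneme, duration (ms), one pitch point
    "@ 80 50 125",
    "l 70",
    "@U 120 50 110",
    "_ 50",            # trailing silence
]

with open("hello.pho", "w") as f:
    f.write("\n".join(pho_lines) + "\n")

# mbrola <voice database> <input .pho> <output .wav>
subprocess.run(["mbrola", "en1", "hello.pho", "hello.wav"], check=True)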

A. Related Terms

• Diphone:- In phonetics, a diphone is an adjacent pair of phones. The term is usually used to refer to a recording of the transition between two phones.

• Prosody:- Prosody reflects various features of the speaker or the utterance: the emotional state of the speaker; the form of the utterance (statement, question, or command); the presence of irony or sarcasm; emphasis, contrast, and focus; or other elements of language that may not be encoded by grammar or choice of vocabulary. The MBROLA voices are cost-free but are not open source. eSpeak can be used as a front-end to MBROLA: it provides the spelling-to-phoneme translation and intonation, which MBROLA then uses to generate speech sound. To use an MBROLA voice, eSpeak needs information to translate from its own phonemes to the equivalent MBROLA phonemes; this has been set up for only some voices so far.

VIII. SYSTEM ARCHITECTURE

Fig. 5 shows the complete phases of a TTS system, and also indicates the input and output at each phase.

Fig. 5: Steps in text to speech conversion


IX. SUMMARY OF RESULTS

Each module was tested individually, and after integration the framework was tested for the expected result. In text normalisation the expected output was obtained: all punctuation associated with the input text was removed. The expected phonemes for the given input were also obtained using the neural network, and phoneme concatenation was successful. Since the speech engine works using MBROLA, almost 90 percent accuracy was obtained; less accuracy was found in pronouncing words beginning with S.

X. CONCLUSION AND FUTURE WORK

As per the goal of this project, an attempt is made to show how the computer speaks out English text. The user is given the provision to input English text, and he or she can listen to that text. The present system just pronounces the English characters; the naturalness of the synthetic speech needs to be improved for implementing the expressions of human beings. By developing such systems, the relationship between human and computer becomes much closer, and thus it helps in overcoming the problem of the DIGITAL DIVIDE.

This work can be extended into an interactive communication system that communicates with human beings emotionally. Since the face and its expressions play the most important role in natural communication, we are considering developing a face robot that can express facial expressions similar to human beings and produce life-like behaviour and gestures. The program should use the ability of neck and head movement, corresponding to the synthesized text, to create a more life-like imitation. Emotion detection and emotion mimicry: in the future, a system that uses language processing to detect emotions in a given text and react with the appropriate facial expression, and perhaps tone, could be integrated. An additional speech analyzing system may be developed in order to detect feelings by analyzing the tone of the speaker and other parameters.

REFERENCES

[1] A. Black, H. Zen and K. Tokuda, "Statistical parametric speech synthesis", in Proc. ICASSP, Honolulu, HI, 2007, vol. IV, pp. 1229-1232.

[2] Deana L. Pennell, "Normalization of Informal Text for Text-to-Speech", supervisory committee: Dr. Yang Liu (co-chair), Dr. Vincent Ng (co-chair), Dr. John H. L. Hansen, Dr. Haim Schweitzer.

[3] Francesc Alias, Xavier Sevillano, Joan Claudi Socoro and Xavier Gonzalvo, "Towards High-Quality Next-Generation Text-to-Speech Synthesis: A Multidomain Approach by Automatic Domain Classification", IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 7, September 2008.

[4] G. Bailly, N. Campbell and B. Mobius, "ISCA special session: Hot topics in speech synthesis", in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 37-40.

[5] Gopalakrishna Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R. N. V. Sitaram, D. P. Kishore, "Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems".

[6] Mirza Cilimkovic, "Neural Networks and Back Propagation Algorithm", Institute of Technology Blanchardstown, Blanchardstown Road North, Dublin 15, Ireland.

[7] M. Ostendorf and I. Bulyko, "The impact of speech recognition on speech synthesis", in Proc. IEEE Workshop on Speech Synthesis, Santa Monica, 2002, pp. 99-106.

[8] Richard Sproat, "Pmtools: A Pronunciation Modeling Toolkit", in Proc. of the Fourth ISCA Tutorial and Research Workshop on Speech Synthesis, Blair Atholl, Scotland, 2001.

[9] Jaibatrik Dutta, "Text To Speech Synthesis", a knol.

[10] Qing Guo, Jie Zhang, Nobuyuki Katae, Hao Yu, "High Quality Prosody Generation in Mandarin Text-to-Speech System", Fujitsu Sci. Tech. J., vol. 46, no. 1, pp. 40-46, 2010.


Cross Domain Sentiment Classification

S. Abilasha, C. H. Chithira, Fazlu Rahman, P. K. Megha, P. Safva Mol, Manu Madhavan

Dept. of Computer Science and Engg., SIMAT, Vavanoor, Palakkad

[email protected]

Abstract—Sentiment analysis refers to the use of natural language processing and machine learning techniques to identify and extract subjective information in a source material such as product reviews. Due to the revolutionary development in web technology and social media, reviews can span so many different domains that it is difficult to gather annotated training data for all of them. Cross domain sentiment analysis invokes adaptation of learned information from a (labeled) source domain to an unlabeled target domain. The method proposed in this project uses an automatically created sentiment sensitive thesaurus for domain adaptation. Based on the survey conducted on the related literature, we identified L1 regularized logistic regression as a good binary classifier for our area of interest. In addition to the previous work, we propose the use of SentiWordNet and adjective-adverb combinations for effective feature learning.

Keywords—Sentiment Classification, Opinion Mining, Logistic Regression, Cross Domain.

I. INTRODUCTION

Cross domain sentiment classification is a method of classifying sentiments as positive or negative. Sentiment analysis refers to the use of text analysis for extracting subjective information. Sentiment classification has been applied in numerous tasks such as opinion mining, opinion summarization, contextual advertising and market analysis. Sentiments in this system refer to reviews in various domains. Users express opinions about products or services they consume in blog posts, shopping sites, or review sites, and it is useful for both consumers and producers to know what the general public thinks about a particular product or service. Automatic document level sentiment classification is the task of classifying a given review with respect to the sentiment expressed by its author. For example, a sentiment classifier might classify a user review about a movie as positive or negative depending on the sentiment expressed in the review. We define cross domain sentiment classification as the problem of learning a binary classifier (i.e. positive or negative sentiment) given a small set of labeled data for the source domain, and unlabeled data for both source and target domains; in particular, no labeled data is provided for the target domain. In this proposed system, we describe a cross-domain sentiment classification method. In this work, the lexical elements (unigrams or bigrams) in a review are taken and the score of each lexical element is calculated using SentiWordNet. Using this, a training dataset is created, and the test data is classified against this trained data [3]. A logistic regression based algorithm is used for sentiment classification.

The remaining part of this paper is organized as follows. Section 2 describes the analysis of the related literature. Section 3 contains the details of design and implementation. The test results and experiments are presented in Section 4, and Section 5 gives the conclusion and future work of this system.

II. RELATED WORKS

Our requirement is to develop a cross domain sentiment classifier for classifying reviews from different domains as either positive or negative; this system thus helps to analyze a review. In previous work [4], various methods have been used for classification in a single domain. Some of the classification methods involve Bayesian classification, entropy based methods, Support Vector Machines, Structural Correspondence Learning, etc. They are described as follows:

1) Domain Adaptation with Structural Correspondence Learning: Structural Correspondence Learning [6] is one of the first algorithms for domain adaptation. Many NLP tasks suffer from a lack of training data in the domain. To face this challenge, a possible solution is to adapt a source domain (a known domain) to a target domain (a new domain); this is called Domain Adaptation. Structural correspondence learning (SCL) is a general technique (a domain adaptation algorithm) which can be applied to feature based classifiers, proposed by Blitzer [6]. The key idea of SCL is to identify correspondences among features from different domains by modeling their correlations with pivot features. Pivot features are features which behave in the same way for discriminative learning in both domains. Structural correspondence learning involves a source domain and a target domain: both domains have ample unlabeled data, but only the source has labeled training data. The SCL algorithm involves selecting pivot features and training a binary classifier for every pivot feature. The simplest criterion for selecting a pivot feature is that it should occur frequently in the unlabeled data of both domains. The binary classifier here acts as a prediction function.

These binary classification problems can be trained from the unlabeled data, since they merely represent properties of the input. If the features are represented as a binary vector x, they can be solved by using m linear predictors:

f_l(x) = sgn(w_l · x),  l = 1, ..., m

Since each instance contains features which are totally predictive of the pivot feature, we never use these features when making the binary prediction; that is, we do not use any feature derived from the right word when solving the right-token pivot predictor. We then arrange the pivot predictor weight vectors in a matrix W, apply Singular Value Decomposition to W, and select the h top left singular vectors. A new model is then trained on the source data augmented with the projected features. Singular Value


Decomposition (SVD) decomposes a matrix A of order m × n into the product of three matrices: A = L S V^T, where L is an orthonormal matrix of order m × m, S is a diagonal matrix of order m × n, and V is an orthogonal matrix of order n × n.
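A minimal sketch of this projection step; the shapes, the number of pivots m and the dimension h are illustrative, and random weight vectors stand in for trained pivot predictors:

import numpy as np

n_features, m, h = 1000, 50, 25

# Each column of W would be the weight vector w_l of the pivot predictor
# f_l(x) = sgn(w_l . x), trained on unlabeled data from both domains;
# random vectors stand in for the trained predictors here.
W = np.random.randn(n_features, m)

U, S, Vt = np.linalg.svd(W, full_matrices=False)
theta = U[:, :h].T                       # h x n_features projection matrix

x = np.random.rand(n_features)           # feature vector of one instance
x_aug = np.concatenate([x, theta @ x])   # original features + shared subspace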

2) Sentiment Classification Using Machine Learning Techniques: This work [2] mainly examines the effectiveness of applying machine learning techniques to the sentiment classification problem. A challenging aspect of this problem that seems to distinguish it from traditional topic-based classification is that while topics are often identifiable by keywords alone, sentiment can be expressed in a more subtle manner. Sentiment classification would be helpful in business intelligence applications and recommender systems, where user input and feedback could be quickly summarized. The main aim of this work was to examine whether it suffices to treat sentiment classification simply as a special case of topic based categorization, with the two topics being positive sentiment and negative sentiment, or whether special sentiment-categorization methods need to be developed. Three standard algorithms, Naive Bayes Classification, Maximum Entropy Classification, and Support Vector Machines, are experimented with in this work.

Naive Bayes Classification is an approach to text classification that assigns to a given document d the class c* = argmax_c P(c|d). The Naive Bayes classifier can be derived by observing the Bayes rule:

P(c|d) = P(c) P(d|c) / P(d)

where P(d) plays no role in selecting c*. To estimate the term P(d|c), Naive Bayes decomposes it by assuming the f_i are conditionally independent given d's class:

P_NB(c|d) = P(c) ( ∏_{i=1}^{m} P(f_i|c)^{n_i(d)} ) / P(d)

where f_i is a predefined feature that can appear in a document and n_i(d) is the number of times f_i appears in d. The training method consists of relative-frequency estimation of P(c) and P(f_i|c), using add-one smoothing. Naive Bayes is optimal for certain problem classes with highly dependent features.

Maximum Entropy Classification is an alternative technique which has proven effective in a number of natural language processing applications. Its estimate of P(c|d) takes the following exponential form:

P_ME(c|d) = (1/Z(d)) exp( ∑_i λ_{i,c} F_{i,c}(d, c) )

where Z(d) is a normalization function and F_{i,c} is a feature/class function for feature f_i and class c, defined as follows:

F_{i,c}(d, c') = 1 if n_i(d) > 0 and c' = c, and 0 otherwise.

Maximum Entropy makes no assumptions about the relationships between features, and so might potentially perform better when conditional independence assumptions are not met.

Support vector machines (SVMs) have been shown to be highly effective at traditional text categorization, generally outperforming Naive Bayes. In the two-category case, the basic idea behind the training procedure is to find a hyperplane, represented by vector w, that not only separates the document vectors in one class from those in the other, but for which the separation, or margin, is as large as possible. This search corresponds to a constrained optimization problem; letting c_j ∈ {1, −1} (corresponding to positive and negative) be the correct class of document d_j, the solution can be written as:

w = ∑_j α_j c_j d_j,  α_j ≥ 0

where the α_j are obtained by solving a dual optimization problem. Those d_j such that α_j is greater than zero are called support vectors, since they are the only document vectors contributing to w. Classification of test instances consists simply of determining which side of w's hyperplane they fall on.

From these observations, it can be concluded that the results produced via machine learning techniques are quite good in comparison to the human-generated baselines. In terms of relative performance, Naive Bayes tends to do the worst and SVMs tend to do the best, although the differences are not very large. On the other hand, none of these methods was able to achieve accuracies on the sentiment classification problem comparable to those reported for standard topic-based categorization, despite the several different types of features tried.
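For reference, a toy sketch of the three classifiers compared above using scikit-learn equivalents (MultinomialNB, LogisticRegression as a maximum entropy model, LinearSVC); the data and library choice are illustrative, not from the paper:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression   # maximum entropy model
from sklearn.svm import LinearSVC

train = ["excellent broad survey", "great and useful",
         "terrible waste of time", "boring and bad"]
labels = [1, 1, 0, 0]                  # 1 = positive, 0 = negative

vec = CountVectorizer().fit(train)     # bag-of-words features
Xt = vec.transform(train)

for clf in (MultinomialNB(), LogisticRegression(), LinearSVC()):
    clf.fit(Xt, labels)
    print(type(clf).__name__, clf.predict(vec.transform(["broad and great"])))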

A. Problem Identification

With the rapid growth of the Web, more and more people write reviews for all types of products and services and place them online. It is becoming common practice for a consumer to learn how others like or dislike a product before buying, or for a manufacturer to keep track of customer opinions on its products to improve user satisfaction. However, as the number of reviews available for any given product grows, it becomes harder and harder for people to understand and evaluate what the prevailing opinion about the product is. This can be illustrated with an example: in the case of the camera Nikon D70, which gathers user reviews from several sites, searching "Nikon D70 user review" in Google yields over about 759,000 reviews. This demonstrates the need for algorithmic sentiment classification in order to digest this huge repository of hidden reviews (example taken from [4]).

B. Technical Background

The cross domain sentiment classification of reviews is a data mining approach. Data mining (the analysis step of the "Knowledge Discovery and Data Mining" process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis


step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

Classification is one of the techniques used in data mining, and the proposed system is a classification technique. Classification is a supervised method of learning: in classification, a training dataset is provided. This system uses as training data lexical elements labeled as positive or negative; when unlabeled test data is provided, the classification algorithm classifies it.

C. Language Tools

1) Language Used: Python is the language used for implementing this system. Python is an object oriented language, and it is found to be very effective for machine learning and artificial intelligence. Programming in Python is simple and easy. We mainly concentrate on string manipulation, since sentiment classification is mainly concerned with text reviews. Python uses data types like boolean, string, set, list, dictionary, etc. As in other programming languages, a boolean takes two values, true and false. Strings in Python can be created using single quotes, double quotes and triple quotes; when we use triple quotes, strings can span several lines without using the escape character.

2) Tools Used: The various tools in Python used for this method involve NLTK (Natural Language Toolkit), SciPy and NumPy; these tools are described here in detail. The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language. NLTK includes graphical demonstrations and sample data. NLTK is intended to support research and teaching in NLP and closely related areas, including empirical linguistics, cognitive science, artificial intelligence, information retrieval, and machine learning. NLTK has been used successfully as a teaching tool, as an individual study tool, and as a platform for prototyping and building research systems. Corpus reader functions in NLTK can be used to read documents from a corpus; they are named based on the type of information they return. Some common examples, and their return types, are:

• words(): list of str
• paras(): list of (list of (list of str))
• tagged_words(): list of (str, str) tuples
• tagged_sents(): list of (list of (str, str))

NumPy is an extension to the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays. Because Python is currently implemented as an interpreter, mathematical algorithms written in it often run slower than compiled equivalents. NumPy seeks to address this problem for numerical algorithms by providing multidimensional arrays and functions and operators that operate efficiently on arrays; thus any algorithm that can be expressed primarily as operations on arrays and matrices can run almost as quickly as the equivalent C code.

The core functionality of NumPy is its "ndarray" (n-dimensional array) data structure. In contrast to Python's built-in list data structure, these arrays are homogeneously typed: all elements of a single array must be of the same type. The basic data structure in SciPy is the multidimensional array provided by the NumPy module. NumPy provides some functions for linear algebra, Fourier transforms and random number generation, but not with the generality of the equivalent functions in SciPy. NumPy can also be used as an efficient multi-dimensional container of data with arbitrary data-types, which allows NumPy to seamlessly and speedily integrate with a wide variety of databases. Older versions of SciPy used Numeric as an array type, which is now deprecated in favor of the newer NumPy array code.
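A small illustration of these tools together; the movie_reviews corpus is one of NLTK's sample corpora, and fetching it first with nltk.download is assumed:

from nltk.corpus import movie_reviews
import numpy as np

# Assumes the corpus was fetched once with nltk.download('movie_reviews').
pos_ids = movie_reviews.fileids('pos')          # documents labeled positive
first_words = movie_reviews.words(pos_ids[0])   # words(): list of str

scores = np.array([0.25, -0.5, 0.75])           # homogeneously typed ndarray
print(len(pos_ids), first_words[:5], scores.mean())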

III. DESIGN AND IMPLEMENTATION

Cross Domain Sentiment Classification is a method of classification applied when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains. It focuses on the challenge of training a classifier on one or more domains (the source domains) and applying the trained classifier to a different domain (the target domain). A cross-domain sentiment classification system must overcome two main challenges: first, it must identify which source domain features are related to which target domain features; second, it requires a learning framework to incorporate the information regarding the relatedness of source and target domain features.

A. Input Design

We use labeled data from multiple source domains and unlabeled data from source and target domains to represent the distribution of features. Our first step: given a labeled or an unlabeled review, we first split the review into individual sentences; this is done in the preprocessing stage of our process, with the given review as the input to the first stage. On moving to the next stage, the attained POS tagged words are fetched to it; hence the input given to SentiWordNet will be the POS tagged sentences. The score obtained from this stage will be the input for the next stage; that is, the input given to the logistic regression will be the score calculated using SentiWordNet.

B. Output Design

After the preprocessing stage of our execution we get the output in the form of POS tagged sentences. This output is given to the next stage, SentiWordNet, as input; at this stage the score is calculated, and this score will be the output. This score is given to the next stage, logistic regression, and the output of this stage will be the prediction of whether the given sentences or review is positive or not.

C. Module Description

We describe a sentiment classification method that is applicable when we do not have any labeled data for a target


domain but have some labeled data for multiple other domains, designated as the source domains. The proposed system has mainly four modules:

• Preprocessing: In this module, we first select the lexical elements that co-occur within a review sentence as features. Second, from each labeled review sentence in the source domain, we create sentiment features by appending the label of the review to each lexical element we generate from that review; we use the notation *P to indicate positive sentiment features and *N to indicate negative sentiment features. In addition to word-level sentiment features, we replace words with their POS tags to create POS-level sentiment features; POS tags generalize the word-level sentiment features, thereby reducing feature sparseness. We then apply a simple word filter based on POS tags to select content words (nouns, verbs, adjectives, and adverbs).

The preprocessing stage of a sentence is illustrated in Table I.

TABLE I: Generating lexical elements and sentiment features.

sentence: Excellent and broad survey of the development of civilization.

POS tags: Excellent/JJ and/CC broad/JJ survey/NN1 of/IO the/AT development/NN1 of/IO civilization/NN1

lexical elements (unigrams): excellent, broad, survey, development, civilization

lexical elements (bigrams): excellent+broad, broad+survey, survey+development, development+civilization

sentiment features (lemma): excellent*P, broad*P, survey*P, excellent+broad*P, broad+survey*P

sentiment features (POS): JJ*P, NN1*P, JJ+NN1*P

• SentiWordNet: SentiWordNet is a lexical resource for opinion mining [1]. SentiWordNet assigns to each synset of WordNet two sentiment scores: positivity and negativity. SentiWordNet is an online dictionary, and it provides positive and negative scores for each lexical element (a scoring sketch is given after this list).

• Logistic Regression: In this module, the input obtained from the previous module, i.e. the score obtained using SentiWordNet, together with the labeled elements, is given as input. This input becomes the training set, and when we give the test data the module can predict whether it is positive or not.

• Cross Domain: Until now, reviews from a single domain were classified; the classification is now extended to multiple domains.
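A minimal sketch of the SentiWordNet lookup referenced in the module list above, using NLTK's sentiwordnet interface; taking the first synset is a simplifying assumption:

from nltk.corpus import sentiwordnet as swn

# Assumes nltk.download('sentiwordnet') and nltk.download('wordnet').
def senti_score(word, pos='a'):
    synsets = list(swn.senti_synsets(word, pos))
    if not synsets:
        return 0.0                       # unknown words score neutral
    s = synsets[0]                       # first synset as a simplification
    return s.pos_score() - s.neg_score()

print(senti_score('excellent'))          # strongly positive
print(senti_score('terrible'))           # strongly negative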

D. Implementation

1) System Architecture: We use labeled data from multiple source domains and unlabeled data from source and target domains to represent the distribution of features. Our first step: given a labeled or an unlabeled

review, we first split the review into individual sentences; this is done in the preprocessing stage of our process, with the review as the input to the first stage. We select the lexical elements that co-occur within a review sentence as features, and from each labeled source domain review we create sentiment features by appending the label of the review to each lexical element generated from that review, using *P for positive and *N for negative sentiment features. In addition to word-level sentiment features, we replace words with their POS tags to create POS-level sentiment features, which generalize the word-level features and reduce feature sparseness; a simple POS-based word filter then selects content words (nouns, verbs, adjectives, and adverbs). The output of the preprocessing stage, the POS tagged sentences, is given as input to SentiWordNet, which assigns to each synset of WordNet a positivity and a negativity score; the score calculated here is the output of this stage. This score, together with the labeled elements, is then given to the logistic regression module, where it becomes the training set; when test data is given, the classifier predicts whether it is positive or not. The implementation has so far been done in a single domain; it is now to be extended to the cross domain setting.
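Putting the pipeline together, a minimal sketch under stated assumptions: two aggregate SentiWordNet-style scores per review stand in for the real feature set, the data is illustrative, and the L1 penalty matches the classifier identified in the survey:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [aggregate positive score, aggregate negative score] of a review,
# as would be produced by the SentiWordNet stage; values are illustrative.
X_train = np.array([[0.8, 0.1],
                    [0.7, 0.0],
                    [0.1, 0.9],
                    [0.2, 0.6]])
y_train = [1, 1, 0, 0]              # 1 = positive review, 0 = negative

# L1-regularized logistic regression (liblinear supports the l1 penalty).
clf = LogisticRegression(penalty='l1', solver='liblinear')
clf.fit(X_train, y_train)
print(clf.predict([[0.6, 0.2]]))    # predicted label for a new review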

IV. EXPERIMENTS AND RESULTS

To evaluate our method we use the cross-domain sentiment classification dataset. This dataset consists of Amazon product reviews for three different product types: books, electronics and movies; for testing the classifier in the cross domain setting, we provide another domain, phones. This benchmark dataset has been used in much previous work on cross-domain sentiment classification, and by evaluating on it we can directly compare our method against existing approaches. The accuracy of sentiment classification in the single movie domain was calculated by taking a maximum of 1000 documents for training and an average of 100 documents for testing. Each time, the number of documents in the training set is varied in steps of 100 while keeping the number of


Fig. 1: Accuracy in movie domain

test documents fixed, and a graph is plotted. Figure 1 shows the graph of accuracy in the movie domain. It was noted from the graph that as the amount of training data increases, accuracy shows an increasing trend; the average accuracy was found to be 80 percent. The same experiment was repeated in the other two domains, which also show an increasing trend of accuracy with more training data. The accuracy is computed as the percentage of correctly classified target domain reviews out of the total number of reviews in the target domain. For testing the classifier in the cross domain setting, we provide

Fig. 2: Accuracy in cross domain

a set of test reviews from the domain phones, and the average accuracy in classifying these test data is found to be 75 percent. The experiment is conducted with 1000 documents taken from all three domains; the graph in Figure 2 shows the result. Sometimes there is

a chance of misclassification of reviews due to various reasons, such as adaptation failure of the previously trained logistic regression classifier for a given new test item, whether from the same domain or a cross domain, or the absence of the relevant synsets in SentiWordNet.

V. CONCLUSION AND FUTURE WORK

Sentiment analysis is found to be a method of classifying huge numbers of reviews by analysing their opinion strength. We have implemented a system that classifies reviews from a single domain; the L1 regularized logistic regression method is used for this classification.

REFERENCES

[1] Andrea Esuli, Fabrizio Sebastiani, "SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining", in Proc. of the 5th Conf. on Language Resources and Evaluation (LREC'06), 2006.

[2] Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan, "Thumbs up? Sentiment Classification using Machine Learning Techniques", in EMNLP, pp. 79-86, 2002.

[3] Bo Pang and Lillian Lee, "Opinion mining and sentiment analysis", Foundations and Trends in Information Retrieval, 2(1-2):1-135, 2008.

[4] Danushka Bollegala, David Weir, John Carroll, "Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification", in Proc. of the 49th Annual Meeting of the ACL: Human Language Technologies, Vol. 1, pp. 132-141, 2011.

[5] Farah Benamara, Carmine Cesarano, Diego Reforgiato, et al., "Sentiment Analysis: Adjectives and Adverbs are Better than Adjectives Alone", in Proc. of Int. Conf. on Weblogs and Social Media, 2007.

[6] John Blitzer, Ryan McDonald, and Fernando Pereira, "Domain Adaptation with Structural Correspondence Learning", in EMNLP, 2006.

[7] Sinno Jialin Pan, Xiaochuan Ni, Jian-Tao Sun, Qiang Yang, and Zheng Chen, "Cross-Domain Sentiment Classification via Spectral Feature Alignment", in WWW, 2010.

[8] Steven Bird, et al., Natural Language Processing with Python, O'Reilly Media Inc., 2009.
