

A Framework to Improve Classification of Positive and

Negative Opinions in Roman Urdu-English Code

Switching Environment

Ph.D Thesis Submitted by

Muhammad Awais Hassan

2009-Phd-CS-01

SUPERVISOR

Prof. Dr. Muhammad Shoaib

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

UNIVERSITY OF ENGINEERING AND TECHNOLOGY LAHORE


A Framework to Improve Classification of Positive and

Negative Opinions in Roman Urdu-English Code Switching

Environment

This dissertation is submitted to the Faculty of Electrical Engineering, Department of

Computer Science and Engineering, University of Engineering and Technology, Lahore,

in partial fulfillment of the requirements for the Degree of

Doctor of Philosophy

in

Computer Science

by

Muhammad Awais Hassan (2009-PhD-CS-01)

__________________________________

Supervisor (Internal Examiner)

Prof. Dr Muhammad Shoaib

Department of Computer Science and

Engineering UET, Lahore

__________________________________

Chairman

Prof. Dr Muhammad Shahbaz

__________________________________

External Examiner

Dr Shoaib Farooq

Associate Professor, University of

Management and Technology, Lahore

__________________________________

Dean

Faculty of Electrical Engineering

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

UNIVERSITY OF ENGINEERING AND TECHNOLOGY LAHORE


Declaration

I declare that the work contained in this thesis is my own, except where explicitly stated

otherwise. In addition, this work has not been submitted to obtain another degree or professional

qualification.

Name: Muhammad Awais Hassan

Signed: ______________________

Dated: _______________________


List of Publications

Opinion within Opinion: Segmentation Approach for Urdu Sentiment Analysis, Muhammad

Awais Hassan and Muhammad Shoaib, International Arab Journal of Information Technology,

IF: 0.60 (Accepted)

Role of Discourse Information in Urdu Sentiment Classification: A Rule-Based Method and A

Machine Learning Technique, Muhammad Awais Hassan and Muhammad Shoaib, IPM,

Elsevier, IF: 1.40 (Final Review)

Lafafa Journalism: Biased and Un-Biased Media Identification with Sentiment Analysis of News,

Muhammad Awais Hassan and Muhammad Shoaib, International Arab Journal of Information

Technology, IF: 0.60 (Submitted)

Are You Happy: Identification of National Happiness Index of Urdu Speaking Countries,

Muhammad Awais Hassan and Muhammad Shoaib

Auto identification of Emotional State from Urdu Drama Literature

Muhammad Awais Hassan and Muhammad Shoaib


Acknowledgement

I want to thank ALLAH for bestowing His blessings upon me to achieve this prestigious

academic level. My gratitude then goes to my supervisor, Prof. Dr Muhammad Shoaib, who

always kept me on the right track and strengthened me whenever a difficult situation confronted

me during the research work. He always treated me with affection and adoration; simply, he is

my hero in this journey. I do not have words to explain his role in this achievement.

However, no one can fully acknowledge one's parents, especially a mother who always prays for

you. Whatever I am in my life, the credit goes to my mother; she was always there when I needed

her. My wife also facilitated my studies and kept the worries of the household away from me.

Last but not least, I am thankful to Dr Muhammad Ali Maud, Mr Umer Younas and Mr Nouman

Ali for facilitating my work and supporting me morally during this voyage.


Dedication

Dedicated to my Nana, Nani, and my children, Maryam Awais and Arsal Awais.

Their memories and love are the assets of my life.


ABSTRACT

CHAPTER 1. INTRODUCTION

OVERVIEW

MOTIVATION, RESEARCH GAP AND RESEARCH OBJECTIVES

1.2.1 Complex Sentiments

1.2.2 Research Gap

1.2.3 Problem Statement and Research Questions

CONTRIBUTION

TERMINOLOGY

ORGANIZATION

CHAPTER 2. LITERATURE SURVEY

SUBJECTIVITY INSIDE WRITTEN TEXT

SENTIMENT ANALYSIS OF ENGLISH LANGUAGE

LITERATURE REVIEW OF ARABIC, HINDI AND PERSIAN

LITERATURE REVIEW OF URDU LANGUAGE

NOVELTY OF WORK

LITERATURE REVIEW SUMMARY

CHAPTER 3. CONJUNCTIONS IN URDU

TYPES OF CONJUNCTIONS IN URDU

3.1.1 Haroof Jars

3.1.2 Haroof Ataf

3.1.3 Haroof Tahsees

3.1.4 Haroof Fajy

TYPES OF HAROOF ATAF

3.2.1 Haroof Wasl (حروف وصل)

3.2.2 Haroof Tardeed

3.2.3 Haroof Astadark

3.2.4 Haroof Astashna

3.2.5 Haroof Shaart u Jaza

3.2.6 Haroof Elaat

MULTIPLE USAGES OF URDU HAROOF ATAF

3.3.1 When these words are used as prepositions (Case 1)

3.3.2 Multiple Haroof Ataf in a Sentence (Case 2)

CHAPTER 4. DISCOURSE URDU SENTIMENT CLASSIFICATION (DUSC)

EXTRACTION OF DISCOURSE INFORMATION

4.1.1 Sub-Opinions within the Sentiment

4.1.2 Polarity Assignment to Each Sub-Opinion

4.1.3 Polarity Relation between Sub-Opinions

4.1.4 Identification of Discourse Relations

RULE BASED CLASSIFIER

4.2.1 Rules Identification

4.2.2 Rule Based Algorithm

PROPOSED SUPERVISED LEARNING TECHNIQUE

4.3.1 Feature Identification

4.3.2 Feature Selection

4.3.3 Model Training

CHAPTER 5. IMPLEMENTATION

DATASETS

EXPERIMENTAL SETUP

EXPERIMENTS

CASE STUDY

CANDIDATE EXAMPLE

CHAPTER 6. RESULTS AND DISCUSSION

PERFORMANCE METRICS

PERFORMANCE OF BAG-OF-WORD (BOW)

PROPOSED RULE-BASED CLASSIFIER

SUPERVISED LEARNING FOR URDU SENTIMENT CLASSIFICATION

PERFORMANCE COMPARISON

SIGNIFICANCE TEST

6.6.1 Performance Difference between BoW and Rule Based Algorithm

6.6.2 Performance Difference between Supervised Learning Techniques

6.6.3 Second Opinion Significantly Leads the Polarity of Sentiment

PERFORMANCE OF ENGLISH LANGUAGE ALGORITHM ON URDU

CHAPTER 7. CONCLUSIONS AND FUTURE WORK

REFERENCES

APPENDIXES

LIST OF FIGURES

LIST OF TABLES

LIST OF ALGORITHMS

LIST OF FLOWCHARTS

LIST OF EXAMPLES

LIST OF HAROOF ATAF

INDEX


Abstract

In computational linguistics, sentiment analysis facilitates the classification of an opinion into a

positive or a negative class. In the last decade, sentiment analysis of the English language has

been explored extensively with different techniques that have improved overall performance.

Urdu is the language of sixty-six million people and is largely spoken in the South Asian

subcontinent. It is also the national language of Pakistan, the world's sixth most populous country

according to the United Nations Population Division. Sentiment analysis of the Urdu language is

an important tool for understanding the behavioural aspects, cultural values and social habits of

the people living in this part of the world. Opinion mining is also crucial for governments, policy

makers, business owners and brand ambassadors to make their decisions in accordance with the

sentiment of the public. However, sentiment analysis of Urdu is not as well explored as that of

English. Urdu sentiment analysis has been performed with the simple Bag-of-Words (BoW)

method and machine learning (ML) techniques with a limited set of features. The BoW method is

not sufficient to handle complex opinions, and the accuracy of ML techniques with legacy

features is not comparable to the sentiment classification of other languages. For English,

discourse information (sub-sentence level information) has boosted the performance of both the

BoW method and ML techniques. This thesis proposes a theory for Urdu sentiment analysis that

extracts and uses discourse information at the sub-sentence level, and suggests a computational

model to achieve more accurate and better results than the simple Bag-of-Words approach. The

proposed solution segments the sentiment into two sub-opinions, extracts discourse information

(discourse relation and polarity relation), proposes an extended BoW method (a rule-based

method), and suggests a new small subset of features for ML techniques. The results significantly

enhance (p < 0.001) recall, precision and accuracy by 37.25%, 8.46% and 24.75% respectively.

The current research targets sentiments with two sub-opinions, which works well as long as the

opinions are short messages such as tweets, forum comments or Facebook status posts. The

proposed technique can be extended to sentiments with more than two sub-opinions, such as

blogs, reviews, and TV talk shows.



Chapter 1. Introduction

Overview

A sentiment [1] or opinion represents the emotional state of a person regarding an entity or an

event. This document refers to a sentiment as a text sentence that conveys positive, negative, or

neutral emotion; these positive, negative and neutral emotions are also called the sentiment

polarity [2]. A sentiment may consist of multiple sub-sentences, each of which has its own

polarity. The scope of this research is limited to the classification of sentiments that contain one

or two opinions about single or multiple entities. Sentiment classification [3] usually refers to the

classification of a sentiment into a positive or a negative class.

Due to the proliferation of 3G/4G networks [4], the rate of sharing and expressing sentiments on

social media websites has increased exponentially. In many cases, these discussions and opinions

set the trends in both print and digital media. These opinions are also crucial for governments,

policy makers, companies and brand ambassadors to make their decisions in accordance with the

sentiment of the public. Over the last decade [5], efforts have been made to analyse sentiment

automatically and prepare statistics for the stakeholders.

Urdu is the language of sixty-six million people, largely spoken in the South Asian subcontinent,

and it is the national language of Pakistan, the world's sixth most populous country [6].

Hindustani, a sociolinguistic variant of Urdu, is spoken largely in India [7]. Sentiment analysis of

the Urdu language is an important tool for understanding the behavioural aspects, cultural values

and social habits of the people living in this part of the world. Urdu borrows a considerable

number of words from other languages such as Sanskrit, Persian, Arabic, and English [8]; this

unique behaviour of the language results in a rich morphology and a complex grammar. Thus, the

computational tools available for other languages are not sufficient for Urdu [9], [10], and the

language demands a more specialized set of tools, especially tools for sentiment analysis.

Motivation, Research Gap and Research Objectives

Sentiment analysis of the English language includes different methods such as the Bag-of-Words

(BoW) approach, the supervised learning approach, the rule-based approach, and the discourse-

based approach [5]. The literature contains very restricted research on sentiment analysis of the

Urdu language: the field is dominated by BoW-based models [11]–[13] and a few supervised

learning (SL) techniques [14], [15] with very limited features. BoW-based methods are not

sufficient for handling complex opinions, and SL techniques require a different and more

intuitive feature set to show better results.

The BoW model [12] counts the positive and negative words of a sentiment with the help of a

dictionary. The dictionary contains an entry for each word of the sentiment, its part-of-speech

(POS) tag, and its orientation (whether the word is positive or negative). Let Ptotal be the total

number of positive words and Ntotal the total number of negative words in a sentiment; BoW

classifies the sentiment into the positive, negative or neutral class using the following function.

Equation 1:

class(Ptotal, Ntotal) =
    positive   if Ptotal > Ntotal
    negative   if Ptotal < Ntotal
    neutral    if Ptotal = Ntotal

Example 1: (وہ ایک اچھا اور بہدارلڑکا ہنے)

wo aik acha (+) aur bahdar (+) larka ha.

(He is a good and brave boy)



In Example 1, both acha (+) and bahdar (+) are positive words, so Ptotal = 2 and Ntotal = 0; BoW

estimates positive polarity for the sentiment.
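The BoW decision in Equation 1 can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's implementation: the dictionary contents and the function name `bow_classify` are assumptions made for the example.

```python
# Minimal sketch of the Bag-of-Words classifier from Equation 1 (illustrative only).
# The dictionary maps a word to its orientation: +1 (positive) or -1 (negative).
DICTIONARY = {"acha": 1, "bahdar": 1, "fit": 1, "kamal": 1, "tanqeed": -1, "jatwa": -1}

def bow_classify(sentiment):
    words = sentiment.lower().split()
    p_total = sum(1 for w in words if DICTIONARY.get(w) == 1)   # Ptotal
    n_total = sum(1 for w in words if DICTIONARY.get(w) == -1)  # Ntotal
    if p_total > n_total:
        return "positive"
    if p_total < n_total:
        return "negative"
    return "neutral"

# Example 1: "wo aik acha aur bahdar larka ha" -> two positive words, none negative
print(bow_classify("wo aik acha aur bahdar larka ha"))  # positive
```

Words absent from the dictionary simply contribute nothing to either count, which mirrors how the BoW method ignores everything but dictionary-listed sentiment words.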

BoW-based models work well for a sentiment whose polarity follows the frequency of its

positive or negative words. However, the BoW model fails to classify complex sentiments; the

next section defines this type of sentiment.

1.2.1 Complex Sentiments

The research categorized a sentiment as complex sentiment if it is belonged to one of the

following three types.

Type 1: Total posititve and total negative words are equal, but the sentiment class is not neutral.

Type 2: Total positive words are greater than total negative words, but the sentiment class is

negative.

Type 3: Total negative words are greater than total positive words, but the sentiment class is

positive.

Both Example 2 and Example 3 are complex sentiments, and the BoW model fails to classify

them.

Example 2: (آفریدی ایک اچھا اور فٹ الرونڈر ہے مگر اس کا کیا فایدہ اگر وہ پاکستان کو میچ نہیں جتوا سکے)

Afradi aik acha (+) aur fit (+) allrounder ha magar ess ka kaya fida (+) agar wo Pakistan

ko match nahi jatwa (-) sakta.

(Afridi is a good and fit all-rounder, but what is his use if he cannot win a match for

Pakistan?)

Example 3: )اگرچہ پاکستان کرکٹ ٹیم کو میڈیا تنقید کا نشانہ بناتا رہتا ہےلکین اس مرتبہ دبئی ٹور میں اس نے کمال کر دیا)



Agarcha pakistan cricket team ko media Tanqeed (-) ka nashna banata rahta haa.

Magar ess martaba dubi tour maan ess naa Kamal (+) ker daya.

(Although the media criticizes the Pakistan cricket team all the time, this time the team

gave an excellent performance during the Dubai tour.)

Example 2 consists of three positive words and one negative word; the BoW model classifies

this sentiment into the positive class, but the actual polarity of the sentiment is negative.

Example 3 contains one negative and one positive word; the BoW model assigns the neutral

class, but the actual polarity of the sentiment is positive.
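The three complexity types above reduce to a simple comparison between the BoW word counts and the gold label. The sketch below is illustrative; the function name `complex_type` and its interface are assumptions made for the example.

```python
# Sketch of the three "complex sentiment" types: a sentiment is complex when
# the BoW word counts disagree with its actual (gold) polarity. Illustrative only.
def complex_type(p_total, n_total, actual):
    if p_total == n_total and actual != "neutral":
        return 1  # Type 1: counts tie, but the sentiment is not neutral
    if p_total > n_total and actual == "negative":
        return 2  # Type 2: positive words dominate, but the class is negative
    if n_total > p_total and actual == "positive":
        return 3  # Type 3: negative words dominate, but the class is positive
    return None   # not complex; the BoW counts agree with the label

# Example 2: three positive words, one negative, gold label negative -> Type 2
print(complex_type(3, 1, "negative"))  # 2
# Example 3: one positive word, one negative, gold label positive -> Type 1
print(complex_type(1, 1, "positive"))  # 1
```

Note that Example 3, with its balanced counts, is a Type 1 complex sentiment: the BoW tie-break yields neutral while the true polarity is positive.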

1.2.2 Research Gap

Natural language processing of Urdu comes with many challenges. First, there is no formal word

bank for Urdu that indicates the polarity and POS tag of a word. Second, parsing Urdu is itself a

very challenging task, and the current state of the art is not mature. Third, there is no sentiment

corpus available for Urdu that can be used to train supervised learning algorithms. Beyond these

general language processing challenges, a more specific challenge for sentiment analysis is that

users tend to write reviews and opinions in Roman Urdu, which makes the task more complicated

because resources for Roman Urdu are rare. These opinions also tend to contain English words

very frequently, and code switching is very common. All these issues make sentiment analysis

very challenging for Urdu, especially when opinions are given in Roman Urdu in a code

switching environment.

Almost all techniques for sentiment analysis of the Urdu language follow the simple BoW

approach: a dictionary of adjectives and valence shifters is used that contains words, their related

POS tags, polarities and intensities. The existing supervised learning algorithms use a legacy

feature set which is not effective for model training [14], [16].

The BoW model fails to classify complex opinions and sarcastic sentiments. Thus, new

techniques are required to extend the BoW capabilities to classify complex opinions.

These techniques are not sufficient for complex and ambiguous opinions, especially in the case

of Urdu, which does not have rich resources available [14]. In the literature, discourse

information [5-24] has certainly improved the sentiment analysis task for English, but discourse

information has not been explored for Urdu sentiment analysis. It is important to understand

whether discourse information can extend the capabilities of the simple BoW model and SL

techniques.

A set of predefined syntactic patterns that identify the discourse relationship can increase the

efficiency of the sentiment analysis task when only a small dataset is available for training.

However, discourse-level sentiment analysis is itself a very challenging task; the main task can

be divided into the following sub-tasks: i) discourse segmentation, ii) identification of the

relationship between two opinions, and iii) a computational model that uses the relationship

information to classify the opinion. This research focuses on all three sub-tasks, but its scope is

limited to opinions that contain at most two sub-opinions.
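The three sub-tasks can be sketched as a small pipeline skeleton. Everything here is an assumption made for illustration, not the thesis's actual design: the contrast-marker list, the function boundaries, and the rule that the second opinion leads the polarity under a contrast relation (a pattern the thesis examines for Urdu).

```python
# Illustrative skeleton of the three sub-tasks (assumed structure, not thesis code).
CONTRAST_MARKERS = ["magar", "lakin", "lekin"]  # assumed Roman Urdu contrast conjunctions

def segment(sentiment):
    """(i) Discourse segmentation: split into at most two sub-opinions at a marker."""
    words = sentiment.lower().split()
    for marker in CONTRAST_MARKERS:
        if marker in words:
            i = words.index(marker)
            return [" ".join(words[:i]), " ".join(words[i + 1:])]
    return [sentiment]

def relate(sub_opinions):
    """(ii) Relation identification: two segments here imply a contrast relation."""
    return "contrast" if len(sub_opinions) == 2 else "single"

def classify(sub_polarities, relation):
    """(iii) Computational model: under contrast, let the second opinion lead."""
    if relation == "contrast":
        return sub_polarities[-1]
    return sub_polarities[0]

parts = segment("afridi acha allrounder ha magar match nahi jatwa sakta")
print(relate(parts))  # contrast
```

A real segmenter would need POS tags and the full Haroof Ataf inventory of Chapter 3; this skeleton only fixes the shape of the pipeline.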

1.2.3 Problem Statement and Research Questions

The importance of sentiment analysis and the limitations of the currently available techniques for

Urdu are the main driving forces behind this research. This dissertation proposes "A theory for

Urdu sentiment analysis that extracts and uses the discourse information at sub-sentence level

and also suggests a computational model to achieve more accurate and better results than the

simple Bag-of-Words approach". More specifically, the following three questions are addressed:

1 How can discourse information be extracted from Urdu sentiments?

2 How can the capabilities of the BoW model be extended with discourse information?

3 How can the performance of SL techniques be improved with discourse information?

Contribution

1 A corpus of 800 sentiments written in Roman Urdu and a dictionary of 5000 tokens

with POS tag and polarity information are developed.

2 An opinion segmentation method is proposed which segments a large complex sentiment

into smaller sentiments.

3 A stack-based computational model is proposed to calculate the polarity of a sentiment.

The algorithm handles both forward and backward negations.

4 Five discourse relations and four polarity relations are identified.

5 A method is suggested to extract the discourse information in terms of discourse

relation and polarity relation.

6 Four manual rules based on discourse information are proposed, and an algorithm is

suggested which selects and applies these rules to a sentiment to classify it.

7 A small subset of features is identified which showed improved accuracy over a

large number of unigram features.

8 A computational model is developed that will assist future research on sentiment

analysis.

9 The performance has been improved significantly for both the rule-based method and SL

techniques: precision improved from 78.76 to 86.45, recall from 47.83 to

73.47, and accuracy from 68.90 to 83.79.

10 Statistical tests are performed on the proposed methods to check their significance.



Terminology

This section defines the terms used in the thesis.

Sentiment: S is the input sentiment with n words in linear order:

S = {W1, W2, W3, ..., Wn | Wi ∈ D}

Dictionary: D is a file that defines the orientation and part-of-speech (POS) tag of words. The

orientation of a word tells whether the word conveys a positive (+1), negative (-1) or neutral (0)

thought.

Orientation(w) is a function that finds the orientation of w in D and returns +1 or -1.

Orientation Score is an integer value calculated with the simple BoW method: the total score

is obtained by summing up the orientation of each adjective [12], [13].

Let A be the subset of S whose members are adjectives:

A = {A1, A2, ..., An | A ⊆ S, POS(Ai) = adjective}

The orientation score is calculated with the help of the following equation:

score = Σ (i = 1 to n) Orientation(Ai), where Ai ∈ A

Polarity(score): for a given score, this function returns positive, negative or neutral polarity.

polarity(score) =
    positive   if score > 0
    negative   if score < 0
    neutral    if score = 0

POS (W) is a function that takes a word as input and returns its POS tag as defined in D.
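The definitions above can be sketched as a minimal BoW scorer. The dictionary D below is a toy stand-in for the sentiment-annotated lexicon used in the thesis, and the Roman-Urdu words are illustrative assumptions only:

```python
# Minimal sketch of the Orientation, POS, score and polarity functions
# defined above. D is a toy dictionary mapping a word to its
# (orientation, POS tag) pair; the real lexicon is far larger.

D = {
    "acha":  (+1, "adjective"),   # "good"
    "bura":  (-1, "adjective"),   # "bad"
    "larka": (0,  "noun"),        # "boy"
}

def orientation(w):
    """Return the orientation of w (+1, -1 or 0) as defined in D."""
    return D.get(w, (0, None))[0]

def pos(w):
    """Return the POS tag of w as defined in D (None if unknown)."""
    return D.get(w, (0, None))[1]

def score(sentiment):
    """Sum the orientation of every adjective in the sentiment (the set A)."""
    return sum(orientation(w) for w in sentiment if pos(w) == "adjective")

def polarity(s):
    """Map an integer score onto positive / negative / neutral."""
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

S = ["wo", "acha", "larka"]    # "he is a good boy"
print(polarity(score(S)))      # positive
```

Note that this plain BoW scorer ignores negation entirely, which is exactly the weakness the next section addresses.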


However, a set of words reverses the orientation of an adjective and makes the above equation ineffective. The next section explains these words.

Orientation shifters are words that reverse the orientation of another word. In Urdu, there are two types of orientation shifters: forward negation and backward negation.

Forward negations reverse the orientation of the adjective that comes after them. The words maat (مت) and na (نا) are forward negations.

Backward negations reverse the orientation of the adjective that comes before them. The word nahi (نہیں) is an example of a backward negation.

Example 4: وہ ایک اچھا لڑکا نہیں ہے

Wo acha(+) larka nahi ha

He is not a good boy

Whether a word acts as a forward or a backward negation depends upon its usage within the text. However, for this research the negation type is fixed irrespective of its use in a sentence. The word nahi (نہیں) is shortlisted as a backward negation; maat (مت) and na (نا) as forward negations.
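The fixed negation lists above can be sketched as follows. The tiny adjective table and the "nearest adjective" flipping policy are illustrative assumptions for this sketch, not the thesis algorithm:

```python
# Sketch of orientation shifting with the fixed lists stated above:
# "nahi" as backward negation; "maat" and "na" as forward negation.

FORWARD_NEG = {"maat", "na"}    # flip the adjective that follows
BACKWARD_NEG = {"nahi"}         # flip the adjective that precedes

ADJ = {"acha": +1, "bura": -1}  # toy adjective orientations

def shifted_score(words):
    """Sum adjective orientations after applying negation shifters."""
    # index -> orientation of each adjective in the sentence
    orient = {i: ADJ[w] for i, w in enumerate(words) if w in ADJ}
    for i, w in enumerate(words):
        if w in FORWARD_NEG:
            # flip the nearest adjective occurring after the negation
            nxt = next((j for j in sorted(orient) if j > i), None)
            if nxt is not None:
                orient[nxt] = -orient[nxt]
        elif w in BACKWARD_NEG:
            # flip the nearest adjective occurring before the negation
            prv = next((j for j in sorted(orient, reverse=True) if j < i), None)
            if prv is not None:
                orient[prv] = -orient[prv]
    return sum(orient.values())

# Example 4: "wo acha larka nahi ha" -> acha(+1) flipped by "nahi"
print(shifted_score(["wo", "acha", "larka", "nahi", "ha"]))   # -1
```

Without the backward-negation rule, the plain BoW sum would score Example 4 as +1 and misclassify it as positive.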

Organization

The thesis has been organized into seven chapters. Chapter 2 discusses the relevant literature, which covers subjectivity within the sentiment, sentiment analysis in the English language, and research in sister languages such as Hindi, and finally lists the work that has been found for Urdu. Chapter 3 provides the basic background and details of sentence joining in the Urdu language. Chapter 4 presents the proposed model. First, it explains the algorithms that extract the discourse information. Then, it provides the details of the extracted rules. Finally, the


chapter gives the rule-based algorithm and introduces the selected feature set for the supervised learning technique. Chapter 5 provides the details of the datasets, experimental setup and implementation. Chapter 6 records the results and discusses them in detail. Chapter 7 concludes the work with future recommendations. Finally, the references, appendices and index are given.


Chapter 2. Literature Survey

The literature survey has been divided into four sections. The first section explains sentiment and subjectivity. The second section covers the most relevant work in the English language, and the third section lists the sentiment analysis research in sister languages like Hindi, Persian and Arabic. Finally, the last section discusses the literature review and identifies the knowledge gap for the Urdu language.

Subjectivity inside written text

Wiebe [17] distinguished sentences that express factual information (objectivity) from sentences that express subjective views and opinions (subjectivity). However, objective sentences sometimes contain an opinion, e.g., "We bought the laptop last month and the battery is not working." Wilson [18] also analyzed clauses, but analysis at the clause level is not sufficient.

According to Liu [5], opinions can be divided into different types.

Liu [19] has defined two types of regular opinion, direct and indirect. If an opinion refers directly to an entity or one of its aspects, it is called a direct opinion, for example, "The resolution of the camera is excellent." An indirect opinion is expressed indirectly on an entity or an aspect of an entity, for example, "After running a 3D game, my mobile's performance slowed down." This text indirectly gives a (negative) opinion or sentiment about the performance of the mobile. In this case, the entity is the mobile and the aspect is the effect on performance.

Jindal [20], [21] discussed comparative opinions: opinions that compare entities (based on their common aspects) or express that one entity is preferred over another.


Liu [5] defined two further types of opinions, explicit and implicit. If an opinion (regular or comparative) is expressed as a subjective sentence, it is called an explicit opinion, for example, "Resolution of Android2 tablets is the best" or "Apple is better than Android2." Similarly, if an opinion (regular or comparative) is expressed as an objective sentence, it is called an implicit opinion. These objective statements usually express an undesirable or desirable fact, e.g., "My mobile battery drained in just 5 hours" and "The battery life of HP laptops is longer than that of DELL laptops."

According to Wiebe [1], [22], an objective sentence gives factual information about an entity, while a subjective sentence expresses personal views, beliefs or feelings. An example of an objective sentence is "Afridi is a player of the T20 team," and an example of a subjective sentence is "I like Dell." Opinions, speculations, allegations, desires, suspicions, and beliefs are all forms of subjective expression. Subjective sentences do not always express an opinion; for example, "I think he is playing cricket" is a subjective sentence, but it does not contain any opinion.

Zhang [23] argued that objective sentences can carry opinions or sentiments due to desirable or undesirable facts. For example, the sentences "The keypad of this brand broke within a week" and "I bought a cycle for my aunt and its brake has failed" state facts and imply negative sentiments, because these facts are undesirable.

Chaudhuri [24] has discussed two types of evaluations, rational and emotional. Rational evaluations come from rational reasoning, concrete beliefs, and functional attitudes. For example, the sentences "The camera zoom is large" and "This laptop is worth the price" express rational evaluations. Emotional evaluations are non-tangible, emotional responses to entities that lie deep inside the mind. For example, the sentences "I love Nokia" and


“I am so disappointed with its performance” express emotional evaluations. Emotions are

subjective feelings and thoughts.

Parrott [25] identified six types of emotions, i.e., joy, love, anger, sadness, fear and surprise, which can be sub-divided into many secondary and tertiary emotions. Different emotions can have different intensities, and they are closely related to sentiments.

Sentiment Analysis of English Language

For the English language, different approaches have been explored for sentiment analysis. Here, the most relevant work is mentioned: discourse-based methods and SL techniques. Discourse structures and context-level information have improved the sentiment analysis task for the English language. Poria [26] segmented text into clauses, normalized the clauses using the Lancaster stemming algorithm [27] and identified the concepts within the sentiment. Next, the Hourglass of Emotions [28] was applied to identify the pleasantness, attention, sensitivity and aptitude within the sentiment. Finally, an artificial neural network (ANN) method was used to cluster the Affective Space. Ortigosa [29] used context information to classify sentiments in a noisy e-learning environment. Caro [30] introduced the context-based sentiment propagation method, which measured the distance between the target and the textual appraisal. Chenlo [31] studied the ability of RST [32] to reveal the positive or negative orientation of sentences within news articles. Mukherjee [33] incorporated discourse relations, connectives and conditionals to extend the capabilities of the BoW model: they used discourse information as a feature set and applied support vector machines (SVM) for classification. Thabit [34] proposed a new vector similarity method for schema matching; the method used a thesaurus to perform the mapping based on textual analysis [35]. The work in [36] has defined an approach for sentiment analysis using


discourse and rhetorical relations. The paper defined four top-level categories (reporting, judgment, advice and sentiment) and 20 sub-categories, each of which contains different verbs. On the basis of these categories, a sentence is first divided into opinion segments, and then the rhetorical relationships between these segments are identified. There could be five types of rhetorical relationship, including contrast, correction, explanation and result. Next, each opinion word that belongs to a category is represented with a shallow semantic feature structure (FS). After that, a discursive representation of the text is constructed and the different feature structures (FS) are combined: two low-level FSs are merged into a high-level FS using a set of dedicated rules.

RST remains useful for identifying the relations between sentence structures [37], [38], as well as for multimedia objects [39]. Zhou [40] proposed an RST-based scheme for sentiment analysis. The paper identified the 13 most frequent RST relations in a sentiment analysis corpus and grouped them into five categories: contrast, condition, continuation, cause and purpose. In the first module, a small set of cue-phrase-based patterns was used to collect discourse instances. Then, these instances were converted into a Semantic Sequential Representation (SSR) using some predefined rules. Common SSRs were generated from the individual discourse instances. Finally, an unsupervised SSR learner was adopted to generate weights and then filter high-quality new SSRs without cue phrases.

Somasundaran [41] introduced the opinion frame, which consists of two opinions. Each opinion can be about the same or a different target. An opinion frame consists of an opinion span, polarity, valence, target span, and link. With two opinion types (S: sentiment and A: arguing) and two polarities (P: positive and N: negative), there can be four types of opinion pair (SP, SN, AP and AN), and for


each of the two opinions and two possible relation types (same, alternative) there can be 4 × 4 × 2 = 32

opinion frames. Somasundaran [42] also used opinion frames [41] to suggest a graph-based approach and combined the opinion frames with a collective classification framework [43] to build a computational model. According to this scheme, two opinions are related in the discourse when they have related targets. A classifier was used to perform polarity classification locally at each node, and discourse-level links between nodes were then used. Thus, the opinion classification of a node depended on the local features, the class labels of the related opinions and the nature of these links. The opinion graph then provides a means to factor the related opinion information into the link classifier. This approach used the information in the nodes to achieve joint inference.
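The opinion-frame count described above (four opinion-type/polarity pairs per opinion, times two target relations) can be checked with a quick enumeration; the tuple encoding used here is an illustrative assumption, not Somasundaran's notation:

```python
# Quick combinatorial check of the 4 * 4 * 2 = 32 opinion frames:
# each of the two opinions takes one of four type/polarity labels
# (SP, SN, AP, AN), and the targets are related as same or alternative.
from itertools import product

opinion_types = ["SP", "SN", "AP", "AN"]    # sentiment/arguing x pos/neg
relations = ["same", "alternative"]         # target relation

frames = list(product(opinion_types, opinion_types, relations))
print(len(frames))   # 32
```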

Zirn [44] has presented a fully automatic framework that performs sentiment analysis at the sub-sentence level, combining multiple sentiment lexicons with neighborhood information. They also involved discourse relations, using the contrast and no-contrast relations, to address this challenge. The authors applied Markov logic to integrate the polarity scores from the different sentiment lexicons. The method utilized information about the relations between neighboring segments and was evaluated on product reviews.

Taboada [45] presented an approach to extracting sentiment from texts that makes use of contextual information. Using two different approaches, the algorithm extracts the most relevant sentences of a text and calculates the semantic orientation, weighing those sentences more heavily. The first approach uses Rhetorical Structure Theory (RST) and extracts nuclei as the relevant parts; the second approach uses a topic classifier built using support vector machines (SVM). Barbosa [46] proposed an algorithm based on traditional features of tweets for subjectivity classification of Twitter data. This includes features such as retweets, emoticons, hashtags, upper


case words, links, and question and exclamation marks. The same set of features was also used for sentiment classification of the subjective tweets. Huang [47] introduced a method for sentiment classification of subjective sentences that was similar to the research in [48]. This work used a large set of seed adjectives, instead of one positive and one negative seed word as in [48], and it used a modified log-likelihood measure. Two thresholds were chosen using the training data and applied to determine whether a sentence has a negative, positive, or neutral orientation.

Dragut [49] proposed a bootstrapping method using WordNet. The algorithm does not follow the dictionary blindly: it takes seed words with labeled sentiment orientations as input and produces sets of synonyms (synsets) with orientations.

Peng [50] used a constrained symmetric nonnegative matrix factorization (CSNMF) method for sentiment lexicon generation. The method first finds candidate sentiment words in a dictionary and then assigns polarities using a large corpus; it thus uses both a dictionary and a corpus. Xu [51] presented several integrated methods based on label propagation in a similarity graph [52]; this method uses dictionaries and corpora to find emotion words.

Ding [53] and Liu [54] extended the lexicon-based approach to assign the aspects of entities to preferred and not-preferred sets; the method assigned negative opinions to aspects of the entities in the not-preferred set and positive opinions to aspects of the entities in the preferred set. The lexicon used for this purpose was divided into two sets: Type 1 contains general-purpose comparative sentiment words like worse and better, which have domain-independent positive and negative sentiments; Type 2 contains comparatives formed by adding less, more, most or least to adverbs/adjectives.


The research community has used various supervised-learning methods for sentiment classification, including probabilistic classifiers (naïve Bayes, maximum entropy, and Bayesian networks), linear classifiers (SVM, neural networks), and decision trees. The literature contains abundant material on these techniques; however, only the most relevant work is listed. Kang [55] applied naïve Bayes (NB) to restaurant reviews; Ortigosa [56] introduced multi-dimensional classifiers based on the Bayesian network; and Bau [57] studied the maximum entropy classifier for predicting consumer reviews. Chen [58] introduced an SVM-based framework to derive market intelligence from product reviews. Van [59] used a neural network to identify the positive and negative relationships between two persons based on their autobiographies. Hu [60] used decision trees to classify documents based on sentiments. Wilson [61] and Raaijmakers [62] ran experiments with word n-grams, character n-grams and phoneme n-grams. Schapire [63] used BoosTexter as the learning algorithm. These experiments showed that character n-grams performed best, and the performance of phoneme n-grams was similar to that of word n-grams.

The literature also contains different hybrid methods [64–66] that include both supervised and unsupervised techniques; these classifiers are called meta-classifiers.

Liu [5] listed some common features used by the research community. These features are: terms (individual words) and their frequencies; terms as unigrams and their n-grams; and the part-of-speech (POS) of each word, e.g., adjectives are the most important sentiment words. Sentiment shifters are words that change the orientation of a sentiment indicator, e.g., not, don't, never. Some researchers also used dependency-based features generated through parsing or dependency trees.

Riloff [67] used rule-based methods for entity extraction instead of statistical techniques. Lee [68], [69] highlighted the issue of multiple names being used for a single entity during Named Entity


Recognition (NER). For example, "Imran Khan" may be written as "IM" or "IK." This method compares the similarity of the surrounding words of each candidate entity with the seed entities and then ranks the entities based on the similarity values.

Li [70] showed that the approach mentioned in [68], [69] was inaccurate and that learning from positive and unlabeled examples (PU learning) using the S-EM algorithm [71] was considerably better. This method uses the given seeds to automatically extract sentences that contain one or more of the seeds. The other surrounding words of each seed serve as the context of the seed, and the rest of the sentences are treated as unlabeled examples. S-EM outperformed the other supervised learning techniques.

Literature Review of Arabic, Hindi and Persian.

Bakliwal [72] presented a simple graph-based traversal algorithm that uses synonym and antonym relations to classify sentiments given in the Hindi language. Joshi [73] introduced a fall-back strategy that followed three approaches: supervised translation, in-language sentiment analysis and resource-based sentiment analysis. Namita [74] proposed a classifier that applies rules to handle negation and discourse relations; she used HindiSentiWordNet to calculate the polarity of words. Pooja [75] applied a semi-supervised approach to train a deep belief network with the help of HindiSentiWordNet.

Saree [76] developed a stemming-based method on selected features and applied the naïve Bayes algorithm to classify sentiments written in the Persian language. The model was tested on mobile phone products, and improved results were reported. Shams [77] applied automatic translation tools to convert English sentiment clues into Persian sentiment clues; the technique was used to remove the noise in the Persian clues, and LDA was then applied to classify documents.


Supervised learning and hybrid techniques have improved language processing for the Arabic language. For example, Selamat [78] applied a hybrid KNN to identify Arabic-script web pages. The study showed that k-nearest neighbours (KNN) with back propagation produced precise results when the data was clean; however, KNN with SVM showed more promising results on noisy datasets. In another study, Selamat [79] proposed hybrid decision tree neural networks, which outperformed the simple decision tree method. These results indicate that supervised learning techniques were improved by hybrid methods for language processing tasks.

Literature Review of Urdu Language.

The research work in [13] and [80] performs the sentiment analysis task for the Urdu language using adjectives. This approach combined adjectival phrases with polarity shifters and conjunctions to form sentiment expressions in the opinionated sentences. The main task was divided into two sub-tasks: creation of a sentiment-annotated lexicon [11] and building of the classification model.

Afraz [11] discussed the method and structure for manually annotated lexicon construction. The lexicon contains information about the subjectivity of an entry in addition to its phonological, orthographic, syntactic and morphological aspects. This approach recognizes the subjective entries in the lexicon through two attributes, i.e., orientation (either positive or negative) and intensity (the force of the orientation).

Afraz [13] divided the lexicon construction task into the following sub-tasks: define sentiment-oriented phrases/words in the Urdu language; identify grammatical rules (e.g., the use of modifiers, inflection or derivation); recognize and annotate the polarities of the entries; detect semantics; and differentiate between multiple POS tags for the same entries. The sentiment classification was performed in three phases: 1) preprocessing, 2) shallow parsing and 3) classification. The preprocessing phase prepares


the text for sentiment analysis; it is based on the method in [81] for the removal of punctuation and the stripping of HTML tags, with two additional tasks of normalization and segmentation. These sub-tasks were required due to Urdu's orthographic characteristics, such as ambiguity in word boundaries and the optional use of diacritics [82]. In normalization, the diacritics were removed, and in segmentation, the words were selected. In the next step, shallow parsing was applied to identify SentiUnits after preprocessing. At the same time, negation was considered when classifying the extracted SentiUnits. The polarities are calculated to classify the sentiment as positive or negative, and finally a classifier is applied for classification.

Afraz [12] discussed the effect of negation phrases on sentiment classification. The paper investigates phrase-level negation in sentiment analysis. A negation flips the polarity of a word. The presented approach (SentiUnits) focuses on the subjective phrases, building on the subjectivity work in [11], [13]. For a given Urdu-language review, the SentiUnit extraction and polarity computation take place in three phases. In the first phase, the normalized text is passed to the part-of-speech (POS) tagger, which assigns POS tags to all the terms. Along with this tagging, the word polarities are also annotated for the subjective words; this polarity annotation is performed with the help of the sentiment-annotated lexicon of Urdu text. In the second phase, these annotated subjective terms (adjectives) are taken as the headwords for the next phase, in which shallow parsing is applied for phrase chunking and the adjectival phrases are chunked out. These chunks are then converted into SentiUnits by attaching negations, modifiers, conjunctions, etc. In the last phase, the identified SentiUnits are analyzed for polarity computation. The polarity of the subjective terms is combined with the effect of the negation (if it exists in the SentiUnit).


Ahmed [83] provides a translation framework for Roman Urdu. The algorithm followed some spelling rules to translate a Roman word into Urdu script using a word list. The paper mentioned the issues that can arise during translation: 1) no standard spelling for Urdu words; 2) multiple Urdu letters for Roman consonants; 3) multiple Roman letters for Urdu vowels; 4) Y (ی) behaves as a consonant as well as a vowel; 5) double Roman letters for gemination; 6) vowel change around gol hay (ہ); 7) a vowel at the start of a word/syllable; 8) a short vowel at the end of a word; 9) bari-ye (ے) at the end of a word, syllable boundaries and Roman vowels in sequence. The method encodes the Urdu word and then the Roman word; among the encoded word(s) corresponding to the Roman word, the one that matched with the highest frequency is selected.

Malik [84] suggested an Urdu-to-Roman transliteration scheme using LFG grammar. The presented transliterator converts Unicode Urdu script into Unicode Roman letters using a cascaded sequence of modules. The paper deals with language-specific problems like multiple characters for one sound and diacritization. The transliteration pipeline architecture starts with normalization: in the Unicode standard, some characters of the Arabic script are written in composed and decomposed forms, and to avoid duplication, the process normalizes the input text to the composed character form. Urdu is normally written without any aerabs (vowel diacritics); the text is then converted from Unicode to UZT [85]. Finally, the transliteration component converts the number-based UZT notation to the Roman letter-based scheme. The rules are compiled into a finite-state transducer using the XFST [86] toolset.

Usman [87] introduced a Roman translation method that consists of three steps. In the first step, pre-processing removes HTML tags for the next layer. The second step performs mapping based on an analysis of the vowels and consonants, which returns all possible equivalent Urdu words as a trie for any given Roman-Urdu word. The last step uses trie-pruning to remove the false branches.


Iqbal [88] presented the task of converting Urdu Nastaliq to Roman Urdu. This work proposed a technique to handle many complexities in the conversion of Urdu names to Roman Urdu, including vowel sounds produced by a single character due to diacritics and the different voices of aa, ya and other such characters. When diacritics (zer, zabar, pesh, shad) are involved in a name, the conversion becomes more difficult: many characters produce different sounds at their initial, middle and end positions. For example, 'Ain' (ع) produces vowel sounds at its initial position and different sounds at the middle and end positions. The segmentation module returns a string consisting of the Roman name of each segmented Urdu character. This is done by matching the starting and ending characters first, and then the remaining characters. The name with a maximum difference of one character is output as the name transliterated from the Urdu name image.

Ahmed [89] introduced a method to develop an Urdu WordNet from the already existing Hindi WordNet using carefully designed transliterators. The research converts the Urdu input to the equivalent Roman word [84]. In the next step, the Roman-to-Hindi transliterator converts the Roman words into Devanagari script. Because different characters share the same sound, the conversion results in the loss of some information. Finally, each Urdu character is mapped onto a single Hindi character.

Mukand [90] presented a method for polarity classification of code-mixed data. The method was built on structural correspondence learning (SCL), a theory for domain adaptation. Two oracles were defined. The first oracle handles the issue of spelling variations: it converts each token to Roman Urdu using the double metaphone algorithm, and the words that have the same metaphone code in both the target language and the source language are added as pivot pairs. The second oracle performs translation between Urdu and English.


Javed [91] suggested a method of sentiment analysis for the code-switching environment of Urdu and English. The data set was taken from Twitter, and the tweets were related to the 2013 general election. The Twitter search API was used, and retrieval was based on keywords. Two iterations of classification were performed over the dataset: the first iteration discriminated between tweets with political and non-political content, and the second discriminated between English and Roman Urdu. A bilingual lexicon was constructed that is capable of providing sentiment strength for English as well as Roman-Urdu words used in tweets. In order to increase the coverage of this bilingual lexicon, WordNet was used to improve the performance on English tweets. English words from SentiStrength [92] were searched for their Roman-Urdu translations [93], and a bilingual sentiment repository (BLSR) was created that contains English words and their corresponding Roman-Urdu translations. Similarly, for Roman-Urdu tweets, a bigram-based cosine similarity was used to reduce the number of typographical errors and to perform string approximation, increasing the coverage. The paper addressed the dominance of political parties in Pakistan before the 2013 elections. After tokenization, a strength was assigned to each token with the help of SentiStrength and the BLSR. The strength of every single tweet was then computed on the basis of token frequency and polarity. The difference in the results of the English and Urdu tweets shows two separate clusters of population and their political affiliations.
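One plausible reading of the bigram-based cosine similarity used for string approximation can be sketched as follows; the exact tokenization and normalization used by Javed are not specified, so the details here are assumptions:

```python
# Sketch of bigram-based cosine similarity for matching Roman-Urdu
# spelling variants (e.g. "acha" vs "achaa"). Each word is represented
# as a vector of character-bigram counts; similar spellings share most
# bigrams and therefore score close to 1.
from collections import Counter
from math import sqrt

def bigrams(word):
    """Character-bigram counts of a word."""
    return Counter(word[i:i + 2] for i in range(len(word) - 1))

def cosine_similarity(a, b):
    """Cosine of the angle between the two bigram-count vectors."""
    va, vb = bigrams(a), bigrams(b)
    dot = sum(va[g] * vb[g] for g in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

print(cosine_similarity("acha", "achaa"))   # ≈ 0.87: spelling variant
print(cosine_similarity("acha", "bura"))    # 0.0: no shared bigrams
```

A threshold on this score (another assumption) would decide whether two tokens are treated as the same lexicon entry.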

Bilal [14] applied different SL techniques, including naïve Bayes, KNN and decision trees, to perform the classification task. The authors trained the models with the following features: term frequency, inverse document frequency, lower-case tokens and minimum term frequency.


The SEGMODEL [94] divided opinions into two sub-opinions and decided the polarity of a sentiment based on the second sub-opinion. However, the study ignored the relationship between the two sub-opinions and the cases in which the first sub-opinion determines the polarity of the sentiment.

Irvine [16] used a technique to process text given in Roman Urdu. The research converted informal Romanized Urdu messages into the native Arabic script and normalized the non-standard SMS language. Next, they implemented a hidden Markov model (HMM) to estimate the bigram probability of a word. This research requires a highly accurate data dictionary that maps Roman Urdu to the corresponding English terms.
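The bigram-probability estimate at the heart of such an HMM can be illustrated with maximum-likelihood counts; the toy Roman-Urdu corpus below is an assumption for demonstration, and smoothing and the full HMM decoding used by Irvine are omitted:

```python
# Maximum-likelihood bigram word probability over a toy corpus:
# P(word | prev) = count(prev, word) / count(prev).
from collections import Counter

corpus = [
    ["wo", "acha", "larka", "ha"],
    ["wo", "bura", "larka", "ha"],
    ["wo", "acha", "dost", "ha"],
]

unigrams = Counter(w for s in corpus for w in s)
bigram_counts = Counter(p for s in corpus for p in zip(s, s[1:]))

def bigram_prob(prev, word):
    """MLE bigram probability; 0.0 for unseen histories or pairs."""
    return bigram_counts[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

print(bigram_prob("wo", "acha"))   # count(wo, acha)/count(wo) = 2/3
```

In practice, an HMM decoder would combine such transition probabilities with emission probabilities over the noisy SMS tokens; this sketch shows only the transition estimate.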

Novelty of Work.

The use of discourse information for sentiment analysis of the Urdu language is rarely found in the literature, although discourse information has been used for English. The most closely related work is by Mukherjee [33]: the method extended the BoW model by increasing the weight of polarity terms based on their use in context. To verify whether Mukherjee's work replicates the same results for Urdu, we performed Experiment 6.7, in which the algorithm suggested by Mukherjee was applied to Urdu-language sentiments. The results of the algorithm were not very encouraging (Table 19): the overall accuracy of the system was even lower than that of the BoW model, which clearly suggests that a more specialized discourse-based theory is required for the Urdu language.

Our work also differs from Mukherjee's because it extracts the most influential sub-opinion instead of increasing the weight of each individual term. The research introduces a novel concept: a sentiment with more than one sub-opinion has a most dominant sub-opinion which determines the polarity of the whole sentiment, and this work extracts that sub-opinion. One more major difference is that Urdu has forward and backward negation, which is handled by the method proposed in this research.

Another closely related work that used discourse relations for sentiment analysis is by Mittal [30]. However, the steps given in that paper are not reproducible, as the authors do not provide the details of conj_infer and conj_follow. Due to this lack of reproducibility, the work could not be quantitatively compared with the current work. Qualitatively, Mittal's work identifies only two relations; our proposed work, on the other hand, identifies five sub-clause-level relations and also extracts the polarity relations between clauses.

Literature Review Summary.

Sentiment analysis, or opinion mining, is the area of NLP that classifies sentences into positive or negative classes. Opinions can be of different types, such as subjective and objective opinions, comparative opinions, direct and indirect opinions, explicit and implicit opinions, and rational and emotional opinions. Different techniques have been used for opinion mining; these techniques are based on supervised and unsupervised learning methods. Supervised learning techniques exploit language-specific features such as unigrams, bigrams, POS tags, and adjectives. Unsupervised learning techniques include the minimum-cut approach, linguistic knowledge, lexicon-based approaches, dependency-tree-based classification, hierarchical multi-classifiers, rule-based methods, and semi-supervised learning algorithms that use expectation maximization and Naive Bayes. Discourse structures such as opinion frames and graph-based relations have also been used for sentiment classification. For aspect-based sentiment classification, aspect extraction has been performed with four techniques: 1) using frequent nouns and noun phrases, 2) using opinion and target relations, 3) using supervised learning, and 4) using topic modeling.


Roman Urdu translation has been done by researchers using Hindi WordNet and some predefined phoneme-based rules. Different techniques have been used for sentiment analysis of the Urdu language; these include lexicon-based methods, shallow parsing, structural correspondence learning, and approaches to handle negation phrases.

Urdu is a diverse and distinct language which requires specialized computational tools [9]. The current research on Urdu sentiment analysis mainly explores BoW-type algorithms [11], [13], which rely heavily on the number of adjectives and negations inside the sentiment. The studies [90], [14] that applied supervised learning techniques are limited to a legacy set of features. Work that utilizes discourse information in SL techniques or in the BoW model for Urdu sentiment analysis is very rare.


Chapter 3. Conjunctions in Urdu

Words that join two sentences, phrases, or clauses are called conjunctions. In English, there are three types of conjunctions: coordinating, correlative, and subordinating. The proposed solution, given in Chapter 4, relies heavily on the concept of Urdu conjunctions. This chapter introduces the types of conjunctions and provides a short review of the grammar concepts necessary to understand the coming chapters.

Types of Conjunctions in Urdu

Every language has different types of conjunctions that combine sentences or words together; in Urdu, these conjunctions are called haroof (حروف). There are four types of haroof in Urdu: i) Haroof Jar (حروف جار), ii) Haroof Ataf (حروف عطف), iii) Haroof Tahsees (حروف تحصیص), and iv) Haroof Fajya (حروف فجایہ).

3.1.1 Haroof Jars

These haroof connect a noun with a verb, or a noun with another noun, for example:

Example 5: رفیق کا بھائی,

(Rafeeq ka bahi)

Brother of Rafeeq

In this sentence, ka (کا) is a haroof jar. Some other haroof jar are per, sa, taak, and laya.

3.1.2 Haroof Ataf

Haroof ataf connect two sentences or two clauses together, for example:

Example 6: آصف اور علی دونوں اسکول گئے تھے

Asif aur ali dono school gaya thay


Asif and Ali both went to school.

Example 7: اظہر آیا تھا پر روکانہیں

Azhar aya tha per roka nahi.

Azhar came but did not stop

There are different types of Haroof Ataf; these will be discussed in the next section.

3.1.3 Haroof Tahsees

These haroof particularize or emphasize a verb or noun.

Example: اکرم کی عادت ہی اسی ہے

Akram ki adat he asee ha

Akram has such an attitude.

Some haroof tahsees are he (ہی), tu (تو), bahi (بھی), har (ہر).

3.1.4 Haroof Fajy

Haroof fajya are used to express sudden happiness, sadness, and other emotions. For example:

Example: او ہو تم پھرواپس اگئے ھو

oo ho ! tum pher wapas a gaya

oo ho ! you have come back again.

Examples of these haroof are afsoos (افسوس), oo ho (اوہو), and aa ha (آہا).

Types of Haroof Ataf.

There are different types of haroof ataf; these types will be used in the proposed solution. The following are the types of haroof ataf.

3.2.1 Haroof Wasl حروف وصل


Haroof wasl are used to connect two words or two sentences together. These haroof are aur اور, wao و, pher پھر, neez نیز, ker کر, ka کے, yaan یاں, and kaya کیا.

Examples:

Example 8: ابھی تو اور کام پڑا ہے

Abhi tu aur kaam parha ha.

More work is left to do

Example 9:تم جلدی آو اور کام کرو

Tum jaldi ayo aur kaam karo

Come soon and do work

Example 10: اگ لگی پھر بھج گئی

Aag lagi pher bohj gae.

A fire broke out and then went out.

3.2.2 Haroof Tardeed

These haroof are used to select one of two choices. The haroof in this category are yaan یاں, hawa ہوا, chaho چاہو, ka کا, yaan tu یاں تو, and chahy چاہے.

Example 11: کیاتم مانو یاں نا مانو ہم کو اس سے

tm mano yaan na mano hum ko ess sa kaya

(Whether you agree or not, what is it to us?)

Example 12: ھمارے لیاسب برابر ہییں ےآ وہتم آو یاں

tm ayoo yaan wo ayaa hamray laya sub brabar haan

(Whether you come or he comes, it is all the same to us)

3.2.3 Haroof Astadark


Astadark means doubt. When the first sentence raises a doubt and the second sentence clarifies it, these words are used to connect the two sentences. These haroof are magar مگر, magar haan مگر ہاں, per پر, pa پہ, laken لکین, and albata البتہ.

Example 13: میں نے اس کو بہت سمجھایا مگر اس نے میری ایک نا مانی

Maan na ess ko boht samjhya magar us na mari aik na mani

(I counselled him a lot, but he did not listen to me at all)

Example 14: بات تو ٹھک ہ پر وہ مانتا نہیں

Baat tu tahk ha per wo manta nahi

(The point is valid, but he does not accept it)

محنت تو بہت کی لکین پاس ہونے کی امید کم ہے

Mehnat tu both ki laken pass honay ki umeed kaam ha

(He worked very hard, but there is little hope of passing)

3.2.4 Haroof Astashna

Haroof astashna are haroof that isolate one word or sentence from other words or sentences. Examples of these words are magar مگر, sawa سوا, ellawa علاوہ, laken لکین, and bagar بغیر.

Example 15:سب لوگ ماجود تھے مگر تم نہیں تھے

Sub loog majood thay magar tm na thay.

(Everyone was present except you)

Example 16: تمہارے عالوہ سب لوگ اس خبر سے خوش ہیں

Tmhary sawa sub loog ess khbar sa khush haan

(Everyone except you is happy with this news)

3.2.5 Haroof Shaart u Jaza

A set of two words that joins a conditional clause and a resultant clause is called Haruf Shart u Jaza; these words are a sub-type of Haruf-Ataf. Haruf-Shart (حرف شرط) indicates the conditional part, and Haruf-Jaza (حرف جزا) refers to the resultant clause. Examples of Haruf-Shart are agar (اگر), jo (جو), and agarcha (اگرچہ); examples of Haruf-Jaza are tu (تو), so (سو), tub (تب), aur (اور), and ess laya (اسلیے).

Examples:


Agar mehnat karoo ga tu kamyab ho gaya.

اگر محنت کرو گے تو کامیاب ھو گے

Here agar is the haroof shart and tu is the haroof jaza.

3.2.6 Haroof Elaat

Haroof that explain the cause of some effect or action are called haroof elaat. These are kyn ka کیونکہ, ess laya ka اسلیے کے, ess wasty ka اس واسطے کے, taa ka تاکہ, and choonka چونکے.

Example 17: میں نہیں آ سکتی کیونکہ میں مصروف ہوں

Maan nahi a sakti kyn ka maan masroof hun.

(I cannot come because I am busy)

Example 18: محنت کرواس لیے کے محنت میں برکت ہے

Mehnaat karoo ess laya ka mehnat maan barkat ha

(Work hard, because there is blessing in hard work)

Table 1: List of haroof Atafs.

Haroof Ataf Types Haroof

Haroof Wasl aur اور, wao و, pher پھر , neez نیز , ker کر, ka کے , yaan یاں , kaya کیا

Haroof tardeed yaan یاں , hawa ہوا , chaho چاہو , ka کا , yaan tu یاں تو, chahy چا ے

Haroof Astadark magar مگر, magar haan مگر ہاں, per پر, pa پہ, laken لکین, albata البتہ

Haroof Astashna magar مگر , sawa سوا, ellawa عالوہ , laken لکینand bagar بغیر

Haroof Shaart u Jaza Haroof shart: agar (اگر), jo (جو), agarcha (اگرچہ), jub taak جب تک, choon ka چوں کہ; Haroof jaza: tu (تو), tub (تب), ess laya (اسلیے)

Haroof Elaat kyn ka کیونکہ, ess laya ka اسلیے کے, ess wasty ka اس واسطے کے, taa ka تاکہ, choonka چونکے


Multiple usages of Urdu Haroof Atafs

As discussed, haroof ataf are used to connect two sentences, but their use is not limited to this. A set of words belonging to Haruf-Ataf also plays roles other than connecting two sentences. The following sections list some of the cases in which these haroof are used for purposes other than connecting two clauses.

3.3.1 When these words are used as prepositions (Case 1)

Example 19: وہ نیک اور بہادر لڑکا ہے

wo naak aur bahdar larka ha

(He is noble and brave boy)

Example 20: یہ کتاب میز پر رکھ دو

yea kitab maaz per rakh doo

(put this book on the table)

Example 21:اس کا کیمرہ ٹھیک کام نہیں کرتا اور بٹری بھی ٹھیک نہیں ہے

ess ka camera tahk kaam nahi kerta aur bettary b tahk nahi ha.

(Its camera and battery are not working)

Table 2: List of Stop Words

Root Stop Word Example forms

Aya (آیا) Ata (اتا), ati(اتی), ateen (آتیں), ayeen (آییں)

Chukay (چکے) Chaya (چایا), chukey (چکے),

Daya (دیا) Diya (دیا),dete (دتے)

Gaya (گیا) Gaya (گیا),gae (گے)

Ha (ہے) Ha (ہے),haan (ہیں),hoti (ہوتی),hogay (ہوگے) ,howay (ہووے),

Saktay (سکتے) Saktay (سکتے),sakti (سکتی)

Ja (جا) Ja (جا),jayan (جایں),jati (جاتی),


Kr (کر) Ka (کا),ker (کر),keran (کریں),kertay (کرتے), koi (کوئی)

Lga (لگا) lagty (لگتے), lagi (لگی)

Mila (مال) Mila (مال),mliay (ملے),mili (ملی)

Rahay (رہے) rahi (رہی),rahtay (رہے)

Tha (تھا) Thay (تھے),thi (تھی)

Wala (والا) Wali (والی), walay (والے)

In Example 19, aur (اور) connects two adjectives; in Example 20, per (پر) acts as a preposition; and in Example 21, aur (اور) connects two sentences. In Urdu, stop words can be used to check the role of a haroof ataf within a sentence: these words come at the end of a sentence and make it a complete thought. Table 2 provides the selected list of stop words; many of them are auxiliary verbs.
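This role check can be sketched in code. The following is an illustrative sketch, not the thesis implementation: the stop-word list is a small romanized subset of Table 2, and simple token lookup stands in for tagging.

```python
# Stop words (romanized subset of Table 2): a clause ending in one
# of these is treated as a complete thought.
STOP_WORDS = {"aya", "chukay", "daya", "gaya", "ha", "haan", "saktay",
              "ja", "kr", "lga", "mila", "rahay", "tha", "thay", "thi",
              "wala"}

def joins_two_clauses(tokens, i):
    """Return True if the haroof ataf at index i appears to join two
    clauses, i.e. the word just before it closes a complete thought."""
    return i > 0 and tokens[i - 1].lower() in STOP_WORDS

# Example 19: "aur" joins two adjectives, not two clauses.
print(joins_two_clauses("wo naak aur bahdar larka ha".split(), 2))   # False
# Example 7: "per" follows the auxiliary "thay", so it joins clauses.
print(joins_two_clauses("azhar aya thay per roka nahi".split(), 3))  # True
```

The same stop-word test underlies the treatment of aur and per in the segmentation hypotheses of Chapter 4.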

3.3.2 Multiple Haroof-Ataf in a Sentence (Case 2)

There are cases in which a sentiment contains more than one Haruf-Ataf; consider the following examples.

Example 22:اگر محنت کرو گے تو کامیاب ھو گے

Agar mehnat karoo ga tu kamyab ho gaya.

(If you will work hard, then you will succeed)

Example 23: یہ موبائل بہت اچھا ہوتا اگر اس کی بیٹری ٹائمنگ حراب نا ہوتی

yea mobile boht acha hota agar ess ki battery timing khrab na hoti

(This mobile would have been excellent if its battery timing were not bad)

In Example 22, agar (اگر) is the Haruf-Shart and tu (تو) is the Haruf-Jaza; clearly, agar (اگر) is not connecting two sentences. In Example 23, however, agar (اگر) connects two sub-opinions, so it is a segmentation word.


Chapter 4. Discourse Urdu Sentiment Classification (DUSC)

The proposed solution, Discourse Urdu Sentiment Classification (DUSC), takes a sentiment as input, extracts discourse information, and assigns a positive or negative polarity to the sentiment based on the extracted discourse information. The proposed solution consists of three modules.

A. Extraction of Discourse Information.

B. Rule-Based Method.

C. Supervised Learning Technique.

The first module extracts the discourse information, which includes segmentation of the sentiment, identification of the polarity relation, and assignment of the discourse relation.

Figure 1: Architecture of the Proposed Solution


This discourse information is utilized by both the rule-based and the supervised learning classifier. The rule-based classifier assigns polarity to a sentiment with a set of manual rules, drafted by the authors after detailed observation of the sentiments in the corpus. The supervised learning method selects the relevant feature set and trains the model. The two methods run independently of each other as well as in a hybrid mode. The rule-based method starts classifying sentiments without any prior information, which makes it domain independent; the supervised learning method, on the other hand, requires prior training, which makes it dependent on the type of training data. The architecture of the proposed solution is given in Figure 1, and the details of its components are provided in the upcoming sections.

Extraction of Discourse Information

This module extracts the following discourse information from a given sentiment.

1. Sub-opinions within the sentiment.

2. Polarity assignment to each sub-opinion.

3. Polarity relation between sub-opinions.

4. Discourse Relation.

4.1.1 Sub-Opinions within the Sentiment

The concept of a segmentation word is introduced to segment the sentiment.

4.1.1.1 Segmentation Word (SWi):

The segmentation word (SWi) is a word at position i within the sentiment S that segments the sentiment into two fragments. Let S be a sentiment with n words; SWi segments the sentiment into S1 and S2.


Equation 2:

S1 = W1, W2, W3, ..., Wi-1

S2 = Wi+1, Wi+2, Wi+3, ..., Wn

S = S1 SWi S2

Example 24: آفریدی ایک اچھا پلیئر ہنے مگر پاکستان کو میچ نہیں جتواسکتا

Afridi aik acha player ha magar Pakistan ko match nehi jatawa sakta

S1: “Afridi aik acha player ha”.(آفریدی ایک اچھا پلیئر ہنے )

S2: “Pakistan ko match nehi jatawa sakta”.(پاکستان کو میچ نہیں جتواسکتا )

In Example 24, the word magar (مگر) is the segmentation word (SWi), which segments the sentiment into the two fragments S1 and S2 above.

Example: میں اس نے کمال کر دیا ہے( ) اگرچہ پاکستانی ہاکی ٹیم گورنمنٹ لیول پر محرومیت کا شکار ہے لکین اس مرتبہ چیمپئن ٹرافی

Agarcha Pakistan hockey team government level per mahromyat (-) ka shahkar ha laken

ess martaba champions trophy maan ess naa Kamal (+) ker daya.

S1: Agarcha Pakistan hockey team government level per mahromyat (-) ka shahkar ha.

) اگرچہ پاکستانی ہاکی ٹیم گورنمنٹ لیول پر محرومیت کا شکار ہے(

(Although the government does not support Pakistan's hockey team)

S2: ess martaba champions trophy maan ess naa Kamal (+) ker daya.

اس مرتبہ چیمپئن ٹرافی میں اس نے کمال کر دیا ہے

(This time, the team performed wonderfully in the Champions Trophy event.)

In the above example, the word laken (لکین) is the segmentation word, and it segments the sentiment into the two fragments S1 and S2.

4.1.1.2 Sub-opinions:

The segmentation word SWi segments the sentiment into two fragments; each fragment is called a sub-opinion. A sub-opinion carries positive, negative, or neutral polarity. In Urdu, the Haruf-Ataf connect two sentences (Chapter 3); based on this property, Haruf-Ataf are used to segment an opinion into two sub-opinions [94].


4.1.1.3 Segmentation Algorithm

The following three hypotheses are postulated to select the segmentation word.

H1: Let Wi ∈ S; if POS(Wi) is Haruf-Ataf (حرف عطف), then Wi is selected as the segmentation word, provided both H2 and H3 are satisfied.

The H1 hypothesis selects a Haruf-Ataf as the segmentation word. Chapter 3 listed two cases in which these haroof are not used to connect sentences; H2 and H3 handle such cases.

Hypothesis H2 partially offers a solution to the problem; however, for this study only aur (اور) and per (پر) are considered in this category.

H2: Let Wi ∈ S and let Wi-1 be the word preceding Wi in sentiment S. If Wi = 'aur' or Wi = 'per', then Wi is a segmentation word (SWi) only if POS(Wi-1) = stop-word.

For sentiments with multiple Haruf-Ataf, a third hypothesis (H3) is suggested. However, this hypothesis only handles cases in which the part-of-speech tag of the first word is Haruf-Shart and that of the second word is Haruf-Jaza.

H3: Let Wa, Wb ∈ S with a < b. If POS(Wa) = Haruf-Shart and POS(Wb) = Haruf-Jaza, then Wb is the segmentation word (SWi).

Algorithm 1 explains the segmentation process based on the three hypotheses stated above. The process tokenizes the given sentiment and evaluates each token against the three hypotheses, which check whether the token can serve as the SW. If the token is a Haruf-Ataf, it is a candidate segmentation word (H1). If the token is 'aur' or 'per', the context of the token is checked, and if the context is other than connecting two sentences, the token is not selected as the SW (H2). If the token is a haroof shart, the algorithm searches for the related haroof jaza in the sentiment, and if one is found, the haroof jaza is selected as the SW; this handles multiple haroof within a single sentiment (H3).

Algorithm 1: Sentiment Segmentation
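The three hypotheses can be sketched as follows. This is a simplified illustration of the segmentation step, not the thesis code: the word lists are small romanized subsets of Tables 2 and 4, and set membership stands in for a POS tagger.

```python
# Sketch of the segmentation step under H1-H3 (illustrative subsets).
HARUF_SHART = {"agar", "jo", "agarcha"}
HARUF_JAZA = {"tu", "tub"}
HARUF_ATAF = {"aur", "per", "magar", "laken", "tu",
              "sawa", "kynka", "albata"}
STOP_WORDS = {"ha", "haan", "tha", "thay", "thi", "gaya", "kerta"}

def segment(tokens):
    """Split a sentiment into (S1, S2) at the segmentation word;
    return (tokens, None) when no SW is found."""
    shart_seen = False
    for i, w in enumerate(tokens):
        if w in HARUF_SHART:              # remember the conditional part
            shart_seen = True
            continue
        if shart_seen and w in HARUF_JAZA:            # H3
            return tokens[:i], tokens[i + 1:]
        if w in ("aur", "per"):                       # H2
            if i > 0 and tokens[i - 1] in STOP_WORDS:
                return tokens[:i], tokens[i + 1:]
            continue
        if w in HARUF_ATAF:                           # H1
            return tokens[:i], tokens[i + 1:]
    return tokens, None

# Example 24: "magar" segments the sentiment into S1 and S2.
s1, s2 = segment("afridi aik acha player ha magar pakistan ko "
                 "match nehi jatawa sakta".split())
```

On Example 22, the same function returns the conditional part as S1 and the resultant clause as S2, segmented at the Haruf-Jaza "tu" per H3.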

4.1.2 Polarity Assignment to Each Sub-Opinion.

This module is based on the work done by [12], [13] with some modification. Polarity

assignment module will take the sentiment or sub opinion and determine its polarity by summing

Page 50: Muhammad Awais Hassan 2009-Phd-CS-01prr.hec.gov.pk/jspui/bitstream/123456789/11361/1/Muhammad Awai… · Awais Hassan and Muhammad Shoaib, International Arab Journal of Information

41

up the orientations of each positive and negative adjective. If total score is less than zero, the

polarity will be negative. If the total score is greater than zero, the polarity will be positive and if

score is zero then polarity will be neutral.

FlowChart 1: Sentiment Segmentation

This module also handles polarity shifters: tokens that reverse the orientation of an adjective. There are two types of polarity shifters, Forward Negation and Backward Negation. If a Forward Negation is found, the polarity of the first adjective after the Forward Negation is reversed. In the case of a Backward Negation, the polarity of the first adjective before the Backward Negation is reversed. A stack-based algorithm is used to handle polarity shifters and to calculate the score of the sentiment. In Example 4 (وہ ایک اچھا لڑکا نہیں ہے), acha (+) (good) is a positive word, but nahi (نہیں) reverses its orientation.

4.1.2.1 Orientation Score Algorithm

In light of the above discussion, a stack-based algorithm (Algorithm 2) calculates the orientation score of a sentiment and also takes care of forward and backward negations.

Algorithm 2: Sentiment Orientation Score
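One possible reading of the stack-based scoring is sketched below; the tiny polarity lexicon and the negation list are illustrative assumptions, not the thesis resources.

```python
# Sketch of the stack-based orientation score. A negation token
# reverses the nearest preceding adjective already on the stack
# (backward negation) or, if none has been seen yet, the next
# adjective (forward negation).
POLARITY = {"acha": 1, "bahtar": 1, "pasand": 1, "mazbot": 1,
            "mahngi": -1, "khrab": -1, "johata": -1, "mehnga": -1}
NEGATIONS = {"nahi", "na"}

def orientation_score(tokens):
    stack, negate_next = [], False
    for w in tokens:
        if w in NEGATIONS:
            if stack:
                stack.append(-stack.pop())   # backward negation
            else:
                negate_next = True           # forward negation
        elif w in POLARITY:
            score = -POLARITY[w] if negate_next else POLARITY[w]
            negate_next = False
            stack.append(score)
    return sum(stack)

def polarity(tokens):
    s = orientation_score(tokens)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

# Example 4: nahi reverses acha (+), so the opinion is negative.
print(polarity("wo aik acha larka nahi ha".split()))  # negative
```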

4.1.3 Polarity Relation between Sub-Opinions

In this study, polarity relation comes as important discourse artefact. Polarity relation is the

relationship between polarities of two sub opinions.


FlowChart 2: Sentiment Orientation Score

If opinion1 has positive polarity and opinion2 has negative polarity, then the polarity relationship between them is Opposite Polarity (OP). For two sub-opinions with three possible polarity values (positive, negative, and neutral), there are nine possible polarity relations (Table 3).

4.1.3.1 Reduction of Polarity Relations

As the number of conditions increases, the design complexity [95] of an algorithm increases, and it is always desirable to lower it. With nine polarity relations, the design complexity of the algorithm for a single discourse relation is ten, and for five discourse relations it is greater than fifty. To lower the design complexity, the nine combinations were reduced to four polarity relations: same polarity (SP), same polarity with neutral opinion (SPN), opposite polarity (OP), and opposite polarity with neutral opinion (OPN). This reduced the complexity of the algorithm by fifty per cent. Lower complexity is preferred because greater complexity is less manageable and testable [96].

Table 3: Possible Polarity Relations between Sub-Opinions

Opinion 1 Opinion 2 Relation

Negative Positive OP

Negative Negative SP

Negative Neutral OPN

Positive Positive SP

Positive Negative OP

Positive Neutral OPN

Neutral Positive OPN

Neutral Negative OPN

Neutral Neutral SPN


Algorithm 3: Identification of Polarity Relation

The algorithm takes a sentiment S as input. First, it segments the sentiment into two sub-opinions with Algorithm 1. Then, it assigns a polarity to each sub-opinion with Algorithm 2 and, finally, based on this information, it determines the PR.
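The mapping from the polarities of two sub-opinions to one of the four reduced relations of Table 3 can be sketched as:

```python
# Sketch of the polarity-relation step of Algorithm 3, following
# Table 3 and the reduction to four relations.
def polarity_relation(p1, p2):
    """Map two sub-opinion polarities ('positive', 'negative',
    'neutral') to SP, SPN, OP, or OPN."""
    if p1 == "neutral" and p2 == "neutral":
        return "SPN"            # both neutral
    if p1 == "neutral" or p2 == "neutral":
        return "OPN"            # exactly one neutral
    return "SP" if p1 == p2 else "OP"

print(polarity_relation("positive", "negative"))  # OP
```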

4.1.4 Identification of Discourse Relations

The discourse relation is the fourth important piece of discourse information within the sentiment: the correlation between the two sub-opinions. In the Urdu literature, Haruf-Ataf (segmentation words) are sub-divided into different categories [7], [97].


FlowChart 3: Polarity Relation Algorithm

Out of all these categories, five were selected to identify the discourse relation between sub-opinions: wasal, elaat, shart-u-jaza, astadark, and astashna. Table 4 shows the most frequent segmentation words with the sub-type of Haruf-Ataf; the frequency represents their occurrence in the corpus. It is possible that an Urdu sentence contains a type of Haruf-Ataf other than the selected types. For example, Haruf-Tardeed (Table 4, row 9) is a sub-type of Haruf-Ataf but has not been selected as a discourse relation indicator. However, this study is limited to the selected types, which are explained in the next section.

4.1.4.1 Wasal Relation

In Urdu, wasal means 'to connect'; Haruf-Wasal (حرف وصل) is a sub-type of Haruf-Ataf that connects two sentences of equal weight or importance. When two opinions of equal weight about the same entity are joined through a Haruf-Wasal, the Wasal relation exists between these sub-opinions. In Example 25, 'aur' (اور) is a Haruf-Wasal, and it joins the two sub-opinions S1 and S2.

Example 25: ( کرتا اور بیٹری بھی خراب ہے اس کا کمیرہ ٹھیک کام نہیں )

Ess ka camera [tahk kaam nahi] (-) kerta aur battery b Khrab (-) haa.

(Its camera does not work correctly, and the battery is also out of order)

S1: Ess ka camera tahk [kaam nahi kerta] (-) (negative)

(اس کا کمیرہ ٹھک کام نہیں کرتا)

Its camera [does not work] (-) correctly

S2: battery b khrab ha (-) (negative)

(بیٹری بھی خراب ہے)

Battery is also [out of order] (-)


Table 4: List of Segmentation Words: the Haroof Ataf which divide the sentiment into two sub-opinions and also identify the discourse relation

Word Type Frequency

Laken لکین HaroofAstadark 128

Tu تو HaroofJaza 79

Per پر HaroofAstadark 73

Magarمگر HaroofAstashna 61

Aur اور HaroofWasl 54

ess laya اسلیے HaroofElaat 14

kyn kaکیونکہ HaroofElaat 12

Albataالبتہ HaroofAstadark 7

Yaan یاں *HaroofTardeed 7

Sawa سوا HaroofAstashna 5

Agarcha اگرچہ HaroofAstashna 5

*HaroofTardeed is a sub-type of Haroof-Ataf, but in this research it is not used as a segmentation word or as a discourse relation indicator.

4.1.4.2 Astadark Relation

Astadark means doubt; Haruf-Astadark (حرف استدراک) is a sub-type of Haruf-Ataf. A Haruf-Astadark joins two clauses when one clause contains an ambiguity and the second clause clarifies that ambiguity. When these words connect two sub-opinions, the resultant relation is Astadark. In Example 26, 'magar' (مگر) is the segmentation point, and it segments the sentiment into the two sub-opinions S1 and S2.

Example 26: ( گاڑیوں میں بہت پسند ہےھونڈاسٹی ہےمہنگی مگر مجھے ساری )

honda city ha mahngi (-) magar mja sare garion me bohat pasand (+) hai.

(Honda city is quite expensive but I like it most in all cars)

S1: Honda city ha mahngi (-) (negative)

(ھونڈاسٹی ہے مہنگی)

Honda city is quite expensive (-)

S2: mja sare garion me bohat pasand (+) hai. (positive)

)مجھے ساری گاڑیوں میں بہت پسند ہے(


4.1.4.3 Astashna Relation

Astashna means 'exception'; Haruf-Astashna (حرف استشنا) is a sub-type of Haruf-Ataf. This type of word connects a clause that contains general information about a group or set of entities with a clause that excludes some object or entity from that set. In a sentiment, these words exclude an object from the positive or negative opinion given earlier in the sentiment. In Example 27, 'sawa' (سوا) is the segmentation point, and it segments the sentiment into S1 and S2.

Example 27: (یہ سیٹ بہتر ہے اس کے سوا باقی سیٹ ٹھک کام نہیں کرتے)

Yea set bahtar (+) ha ess ka sawa baqi set [tahk kaam nahi] (-) kertay

(Except this set, which is better, all other sets do not work correctly.)

S1: Yea set bahtar (+) ha (positive)

( یہ سیٹ بہتر ہے )

This is better (+) set

S2: baqi set [tahk kaam nahi] (-) kertay (negative)

(باقی سیٹ ٹھک کام نہیں کرتے )

All other sets [do not work correctly] (-).

4.1.4.4 Shart-u-Jaza Relation

Shart means condition, and Jaza means reward; shart-u-jaza refers to a reward based on a condition. These two words occur together to connect two clauses: the first clause states the condition, and the second clause refers to the result. Haruf-Shart represents the conditional clause, and Haruf-Jaza points to the resultant clause. Haruf-Jaza segments the sentiment when both types of words coexist within it. In Example 28, 'agar' (اگر) is the Haruf-Shart and 'tu' (تو) is the Haruf-Jaza, so the resultant sub-opinions are S1 and S2.


Example 28: )اگر یہ فرنیچرمظبوط ہےتو مہنگا اور پیارا بھی ہے (

agar yea furniture mazbot (+) hae tu mehnga (-) aur peyara (+) bhi ha

(if this furniture is durable, it is also expensive and lovely)

S1: agar yea furniture mazbot (+) hae (positive)

اگر یہ فرنیچرمظبوط ہے()

(If this furniture is durable)

S2: mehnga (-) aur peyara (+) bhi ha (neutral)

مہنگا اور پیارا بھی ہے()

(it is expensive and lovely)

4.1.4.5 Elaat Relation

Elaat means the cause of or reason for something; Haruf-Elaat (حرف علت) is a sub-type of Haruf-Ataf. A Haruf-Elaat combines two clauses when the first clause describes an action and the second clause explains the reason for that action. When a Haruf-Elaat joins two sub-opinions, the discourse relation is Elaat. In Example 29, 'kynka' (کیونکہ) is the Haruf-Elaat.

Example 29: (بابر اچھا انسان نہیں ہے کیونکہ وہ جھوٹا ہے )

Babar [acha insan nahi] (-) ha kynka wo johata (-) ha

(Babar is not good person because he is a liar.)

S1: Babar acha insan nahi (-) ha. (negative)

(بابر اچھا انسان نہیں ہے (

Babar is [not good] (-) person

S2: kynka wo johata ha (negative)

(وہ جھوٹا ہے)

He is a liar (-)


4.1.4.6 Discourse Relation Algorithm

FlowChart 4: Discourse Relation Finder

This algorithm takes the sentiment as input. It then identifies the type of the segmentation word and, based on that type, returns the discourse relation. The algorithm selects one of the five discourse relations.


Algorithm 4: Discourse Relation Finder
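Given Table 4, the relation finder reduces to a lookup on the sub-type of the segmentation word. The sketch below uses romanized keys and an abridged copy of the table.

```python
# Sketch of the discourse-relation lookup (abridged from Table 4).
SW_TYPE = {"laken": "Astadark", "per": "Astadark", "albata": "Astadark",
           "magar": "Astashna", "sawa": "Astashna", "agarcha": "Astashna",
           "aur": "Wasal", "tu": "Shart-u-Jaza",
           "kynka": "Elaat", "ess laya": "Elaat"}

def discourse_relation(segmentation_word):
    """Return one of the five discourse relations, or None for a
    word not used as a relation indicator (e.g. Haruf-Tardeed)."""
    return SW_TYPE.get(segmentation_word.lower())

print(discourse_relation("magar"))  # Astashna
```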

Rule Based Classifier.

The rule-based model (RModel) defines a set of rules to classify sentiments. RModel selects the most influential sub-opinion based on the discourse relation and the polarity relation. In the next sections, the rules are explained first and then the algorithm is given.

4.2.1 Rules Identification

A set of rules was extracted and applied to sentiments for polarity calculation. These rules utilize the discourse relation, the polarity relation and the polarities of both sub-opinions. The first sub-section explains the rules; the second sub-section details the procedure that uses these rules to classify a sentiment.


4.2.1.1 Rules Extraction

These rules were extracted manually, based on the fundamental observation that the polarity of a sentiment (having multiple sub-opinions) depends on the polarity of one of its sub-opinions. The task is therefore to establish rules that identify the leading sub-opinion within the sentiment when discourse information is available. To extract the rules, more than 400 sentiments were categorized first by discourse relation and then by polarity relation, and this information was used to extract syntactic rules manually. Let P(O1) be the polarity of the first sub-opinion, P(O2) the polarity of the second sub-opinion, P(Oi) the polarity of either of the two sub-opinions and PNN(S1, S2) the polarity of the non-neutral sub-opinion. These rules are summarized in a 5x4 matrix (Table 5): the rows contain the name of the discourse relation, the columns represent the polarity relation, and the cells show the leading sub-opinion of the sentiment.

Table 5: Leading Sub-Opinion of Sentiment for the Discourse Relation and the Corresponding Polarity Relation

Discourse Relation   Same Polarity (SP)   Same Polarity Neutral (SPN)   Opposite Polarity (OP)   Opposite Polarity Neutral (OPN)
Wasal                P(Oi)                PNN(S1, S2)                   P(O2)                    PNN(S1, S2)
Astadark             P(O2)                P(O2)                         P(O2)                    P(O2)
Shart-u-Jaza         P(O2)                P(O2)                         P(O2)                    P(O2)
Astashna             P(O2)                PNN(S1, S2)                   P(O2)                    PNN(S1, S2)
Elaat                P(O1)                P(O1)                         P(O1)                    P(O1)

The following set of syntactic rules was extracted from the matrix (Table 5). Let “^” represent the logical AND operator, and “v” represent the logical OR operator.

Rule1a: Polarity(S) = P(Oi) when (DR=Wasal) ^ (PR=SP)

Rule1b: Polarity(S) = PNN(S1, S2) when (DR=Wasal) ^ (PR=OPN v PR=SPN)

Rule1c: Polarity(S) = P(O2) when (DR=Wasal) ^ (PR=OP)


Rule2: Polarity(S) = P(O2) when (DR=Astadark v DR=Shart-u-Jaza)

Rule3a: Polarity(S) = PNN(S1, S2) when (DR=Astashna) ^ (PR=OPN)

Rule3b: Polarity(S) = P(O2) when (DR=Astashna) ^ (PR=SP v PR=SPN v PR=OP)

Rule4a: Polarity(S) = PNN(S1, S2) when (DR=Elaat) ^ (PR=OPN)

Rule4b: Polarity(S) = P(O1) when (DR=Elaat) ^ (PR=SP v PR=SPN v PR=OP)

In these rules, the discourse relation provides the primary clue for sentiment classification; however, in certain cases, the polarity relation is also important. In the following sections, each rule is explained with examples.
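As a minimal illustration, the rule set can be sketched as a lookup over the discourse relation (DR) and polarity relation (PR). This is an illustrative Python sketch only (the thesis classifier is implemented in C#); the string names for relations and polarities are assumptions of this sketch.

```python
# Illustrative sketch of Rules 1a-4b: the leading sub-opinion's polarity
# is chosen from the discourse relation (DR) and polarity relation (PR).

def pnn(p1, p2):
    """Polarity of the non-neutral sub-opinion, PNN(S1, S2)."""
    return p1 if p1 != "neutral" else p2

def apply_rules(dr, pr, p1, p2):
    """dr: Wasal | Astadark | Shart-u-Jaza | Astashna | Elaat
    pr: SP | SPN | OP | OPN
    p1, p2: polarities of the first and second sub-opinions."""
    if dr == "Wasal":
        if pr == "SP":
            return p1                              # Rule 1a: P(Oi)
        if pr in ("OPN", "SPN"):
            return pnn(p1, p2)                     # Rule 1b
        return p2                                  # Rule 1c (OP)
    if dr in ("Astadark", "Shart-u-Jaza"):
        return p2                                  # Rule 2
    if dr == "Astashna":
        return pnn(p1, p2) if pr == "OPN" else p2  # Rules 3a / 3b
    if dr == "Elaat":
        return pnn(p1, p2) if pr == "OPN" else p1  # Rules 4a / 4b
    raise ValueError(f"unknown discourse relation: {dr}")

# Example 33: Astadark with opposite polarities -> second sub-opinion leads.
print(apply_rules("Astadark", "OP", "negative", "positive"))  # positive
```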

4.2.1.2 Rule 1a, Rule 1b and Rule 1c

Wasal Relation: The polarity relation (PR) decides the overall polarity of the sentiment when

two sub-opinions of a sentiment connect through a Wasal relation. The decision, based on PR,

leads to three subcases.

Case 1: When both sub-opinions have the same polarity (PR is SP), the sentiment polarity is the

same as the polarity of any sub-opinion.

Example 30: ( موبائل کی سمارٹنسس بہت اچھی ہے اور اس کا کمیرہ بھی بہت اچھا ہنے)

mobile ki smartness b achi ha aur is ka camera bhi bohat acha hai

Mobile is slim (+) and also has an excellent (+) camera.

Segmentation Word: aur

Sub opinion 1: mobile ki smartness(+) b achi (+) ha (positive).

Sub opinion 2: is ka camera bhi bohat acha(+) hai (positive).

Discourse Relation: Wasal

Polarity Relation: SP


Case 2: When one of the sub-opinions contains neutral polarity (PR = OPN or PR = SPN), the

polarity of sentiment is the polarity of the non-neutral sub opinion.

Example 31: کر رہی ہے( میں نےچارماہ پہلے یہ فون خریدا تھا اور اس کی بیٹری بہت اچھا کام)

main ne four months pehlay yeh phone khareeda tha aur is ki battery bohat acha

kaam ker rahi hai

I bought the phone four months ago and its battery is excellently (+) working.

Segmentation word: aur

Sub Opinion 1: main ne four months pehlay yeh phone khareeda tha (neutral)

Sub Opinion 2: is ki battery bohat acha (+) kaam kar rahi hai (positive)

Discourse Relation: Wasal

Polarity Relation: OPN

Case 3: When both sub-opinions hold opposite and non-neutral polarity, the sentiment polarity is

same as the polarity of the second sub-opinion.

Example 32: (ھونڈا دیکھنے میں بہت شاندار ہے لکین اندر سے اتنی ہی فضول ہے)

Honda amaze ka exterior shandaar hai aur interior utna ha fazool

(The Honda Amaze has a great exterior and the interior is just as useless.)

Segmentation Word: aur

Sub opinion 1: honda amaze ka exterior shandaar (+) hai (positive)

Sub opinion 2: interior utna he fazool ha (-) (negative)

Discourse Relation: Wasal

Polarity Relation: OP


4.2.1.3 Rule 2

This rule is related to two discourse relations: the Astadark relation and the Shart-u-Jaza relation. When the discourse relation is Astadark, the sub-opinions contradict each other; in this case, the second sub-opinion leads the polarity of the sentiment. In Example 33, the first sub-opinion carries negative polarity and the second sub-opinion holds positive polarity, so the second sub-opinion makes the polarity of the sentiment positive.

Example 33: ( لکین مال زمین کیلے ایک سہولت فراہم کرتی ہے میٹرو بس ایک مہنگا پروجیکٹ ہےپنجاب گورنمنٹ کی )

Punjab government ki metro bus aik mahnga project ha laken employees kay laye

aik sahuulat fraaham karti hae

Metro bus is an expensive (-) project of Punjab government but it [provides the

facility](+) to working class.

Segmentation Word: laken

Sub opinion 1: Punjab governments ki metro bus aik mahnga (-) project ha

(negative)

Sub opinion 2: Employees kay laye aik sahuulat (+) fraaham karti hae (positive)

Discourse Relation: Astadark

Polarity Relation: OP

In Example 34, the first sub-opinion contains positive polarity and the second sub-opinion has negative polarity; however, the polarity of the sentiment is negative (the polarity of the second sub-opinion).

Example 34: ( samsung کا پچھال ماڈل بہترین تھا لکین نیا ماڈلز اچھے نہیں ہیں )

Previous model best tha laken Samsung k naye models achay nahi hain.


Previous model of Samsung was best (+) but its new models are [not useful] (-).

Segmentation Word: laken

Sub Opinion 1: previous model best (+) tha (positive)

Sub Opinion2: Samsung k naye models [achay nahi] (-) hain. (negative)

Discourse Relation: Astadark.

Polarity Relation: OP

Shart-u-Jaza Relation: In the Shart-u-Jaza relation, the first sub-sentence states the precondition and the second sub-sentence represents the actual opinion of the sentiment. The relation works identically to the Astadark relation: when it occurs, the second sub-opinion leads the polarity of the sentiment. In Example 35, the second sub-opinion carries negative polarity and determines the overall polarity (negative) of the sentiment.

Example 35: ( lemo ایک لگژری کارہے مگر یہ بہت زیادہ آئل کھا تی ہے )

lemo luxury car ha magar es ki oil consumption bohat ziada hai.

(Lemo is a luxury car but its oil consumption is high.)

Segmentation Word: magar

Sub Opinion 1: lemo luxury (+) car ha (positive)

Sub Opinion 2: es ki oil consumption (-) bohat ziada hai. (negative)

Discourse Relation: Shart-u-Jaza

Polarity Relation: OP

In Example 36, the first sub-opinion holds negative polarity and the second sub-opinion carries positive polarity; the polarity of the sentiment is also positive.

Example 36: ( جھے پسند تو نہیں تھی مگر پہن کر بہت کمال کی لگییہ ٹی شرٹ م )

ye t-shirt mujhe pasand tu nahiin thi magar pahn kar boht kamal ki lgi


I did [not like] (-) this t-shirt but when I wear it, it looks great (+).

Segmentation Word: magar

Sub opinion 1: ye t-shirt mujhe [pasand tu nahiin] (-) thi (negative)

Sub opinion 2: pahn kar bri kamal (+) ki lgi (positive)

Discourse Relation: Shart-u-Jaza

Polarity Relation: OP

4.2.1.4 Rule 3a and Rule 3b

These rules deal with sentiments that have the Astashna relation.

Astashna Relation: When the discourse relation between two sub-opinions is Astashna, the polarity relation (PR) determines the polarity of the sentiment. The decision based on the polarity relation leads to two sub-cases.

Case 1: When the polarity of one sub-opinion is neutral (polarity relation is either OPN or SPN),

the sentiment polarity is the same as the polarity of the non-neutral sub-opinion. In Example 37,

the first segment holds a neutral polarity and the second segment contains negative polarity; the

polarity of the non-neutral sub-opinion sets the tone (negative) of the sentiment.

Example 37: ( sunsilk وہ باقی شمپپو بیک ار ہیںکے عالا )

sunslik k sawa sarey shampo bey kaar haen

The shampoos other than Sunsilk are useless(-).

Segmentation Word: sawa

Sub Opinion 1: Sunsilk k (neutral)

Sub Opinion 2: sarey shampoos bey kaar (-) haen (negative)

Discourse Relation: Astashna


Polarity Relation: OPN

Case 2: When both sub-opinions have non-neutral polarity, the polarity of the second sub-

opinion determines the polarity of the sentiment.

In Example 38, the first sub-opinion holds positive polarity and the second sub-opinion contains negative polarity; the polarity of the sentiment is negative, identical to the polarity of the second sub-opinion.

Example 38: (وہ کوی اور کوالٹی نہیں ہے )اس کار میں اچھا نظر آنے کے عالا

iss car meyn achi look k alawa aur koi achi quality nahiin hae

This car does [not have any other quality](-) except good looks(+)

Segmentation Word: alawa

Sub Opinion 1: iss car meyn achi (+) look k (positive)

Sub Opinion 2: koi aur achi quality nahiin (-) hae (negative)

Discourse Relation: Astashna

Polarity Relation: OP

4.2.1.5 Rule 4a and Rule 4b

These rules are related to the Elaat relation. In this relation, the first sub-sentence holds the main concept and the second sub-sentence explains that concept. In Example 39, the first segment contains positive polarity and the second segment has negative polarity; however, the polarity of the sentiment is positive (the polarity of the first sub-opinion) because the second segment only explains the reason and does not affect the polarity of the sentiment. When the Elaat relation exists between two sub-opinions, the polarity of the sentiment depends on the polarity of the first sub-opinion.


Example 39: ( مہنگی ہے فیراری کی سپیڈ اچھی ہے اسلیے تو )

Ferrari ki speed achi hae iss laye tu maehngi hae

Ferrari is a high-speed (+) car that’s why its [price is high](-).

Segmentation Word: iss laye

Sub Opinion 1: Ferrari ki speed achi (+) hae (positive)

Sub Opinion 2: tu maehngi (-) hae (negative)

Discourse Relation: Elaat

Polarity Relation: OP

4.2.2 Rule-Based Algorithm

The proposed classifier involves three main modules to classify a given sentiment; the architecture of the classifier is shown in Figure 2, and the detailed working of each component is provided below. The first module (Algorithm 1) searches the sentiment for a word of type Haruf-Ataf and splits the sentiment into two sub-opinions around that word. The second module calculates the polarity relation: the polarity of each sub-opinion is computed with the previously proposed BoW algorithm [94]. For completeness, the code is listed in Algorithm 3, which uses a stack-based approach and handles forward and backward negations.
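A much-simplified sketch of this polarity computation is shown below. The lexicon entries, the negation-word lists and the decision of which side each negation word attaches to are illustrative assumptions, not the dictionary or the exact stack-based Algorithm 3 of the thesis.

```python
# Simplified bag-of-words polarity sketch handling negation on both sides
# of an adjective. Word lists here are illustrative assumptions.

LEXICON = {"acha": +1, "behtreen": +1, "shandaar": +1,
           "fazool": -1, "mahnga": -1}
NEG_AFTER = {"nahi"}   # negation written after the adjective ("acha nahi")
NEG_BEFORE = {"na"}    # negation written before the adjective

def bow_polarity(tokens):
    """Sum adjective orientations, flipping signs around negation words."""
    scores = []
    flip_next = False
    for tok in tokens:
        if tok in NEG_BEFORE:
            flip_next = True
        elif tok in NEG_AFTER and scores:
            scores[-1] = -scores[-1]       # flip the preceding adjective
        elif tok in LEXICON:
            scores.append(-LEXICON[tok] if flip_next else LEXICON[tok])
            flip_next = False
    return sum(scores)

print(bow_polarity("babar acha insan nahi ha".split()))  # -1 (negative)
```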


Figure 2: Architecture of Rule-Based Classifier. [The figure shows the flow: a sentiment S enters the Sentiment Segmentation module, which uses the list of Haroof-Ataf and their types to find the segmentation word SW and the sub-opinions. If SW is null, polarity is assigned directly with BoW; otherwise, the Polarity Relation and Discourse Relation Extractor derives PR and DR (using the polarity indicators and polarity shifters), and the Sentiment Polarity Assignment module produces the final polarity.]


Algorithm 5: Polarity Assignment to Sentiment

Then, these polarities are used to calculate the polarity relation with Algorithm 3. This module extracts the discourse relation with the help of Algorithm 4. Finally, the set of rules (Table 5) is applied to the discourse information to calculate the sentiment polarity. When the first module fails to find a segmentation point, the classifier calculates the polarity of the sentiment with a simple BoW approach and skips the other modules.
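The overall flow, including the BoW fallback, can be condensed into a short sketch. This is illustrative Python only (the thesis implementation is C#); the word lists and the collapsed rule logic are assumptions of this sketch.

```python
# End-to-end sketch of the rule-based flow: segment on a Haruf-Ataf word,
# score each sub-opinion, apply the rules, and fall back to plain BoW
# when no segmentation word exists. Word lists are illustrative.

HARUF_ATAF = {"aur": "Wasal", "laken": "Astadark", "magar": "Shart-u-Jaza"}
LEXICON = {"acha": +1, "shandaar": +1, "fazool": -1, "mahnga": -1}

def polarity(tokens):
    score = sum(LEXICON.get(t, 0) for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def classify_sentiment(sentiment):
    tokens = sentiment.split()
    sw = next((t for t in tokens if t in HARUF_ATAF), None)
    if sw is None:                      # no segmentation point: plain BoW
        return polarity(tokens)
    i = tokens.index(sw)
    p1, p2 = polarity(tokens[:i]), polarity(tokens[i + 1:])
    if HARUF_ATAF[sw] in ("Astadark", "Shart-u-Jaza"):
        return p2                       # Rule 2: second sub-opinion leads
    # Wasal (Rules 1a-1c): non-neutral sub-opinion wins, else the second
    if "neutral" in (p1, p2):
        return p2 if p1 == "neutral" else p1
    return p1 if p1 == p2 else p2

print(classify_sentiment("exterior shandaar hai aur interior fazool hai"))  # negative
```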


FlowChart 5: Polarity Assignment to Sentiment

4.3 Proposed Supervised Learning Technique

This section identifies the possible feature sets for the supervised learning technique, discusses the important features and provides details on training the model.


4.3.1 Feature Identification

In this section, different sets of features are identified and discussed.

4.3.1.1 Unigram

A sentence consists of words, and the feature based on these words is called the unigram. The unigram is a legacy feature, used widely in the literature as a baseline binary feature in the classification of text documents. A sparse matrix (Figure 3) was developed in the R language with the help of RTextTools [98]. The columns of the matrix represent all unique words in the corpus, and each row represents a sentiment. When a sentiment contains a word, the intersection of the sentiment's row and the word's column has the value 1; otherwise, it is 0.
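The construction of this matrix can be sketched as follows. The thesis built it in R with RTextTools; this Python version and its toy sentiments are purely illustrative.

```python
# Sketch of the binary unigram (document-term) sparse matrix: one row per
# sentiment, one column per unique corpus word, cell = 1 when the word
# occurs in that sentiment.

def unigram_matrix(sentiments):
    vocab = sorted({w for s in sentiments for w in s.split()})
    rows = [[1 if w in set(s.split()) else 0 for w in vocab]
            for s in sentiments]
    return vocab, rows

vocab, rows = unigram_matrix(["ye phone acha hai", "ye phone fazool hai"])
print(vocab)  # ['acha', 'fazool', 'hai', 'phone', 'ye']
print(rows)   # [[1, 0, 1, 1, 1], [0, 1, 1, 1, 1]]
```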

Figure 3: Partial View of Sparse Matrix of Unigram Features

A classifier was trained with the svmRadial kernel in the R language [39]. All unigram features were used to train the model. The varImp function of the R caret package calculates the importance of each variable using ROC curve analysis on each predictor: for each cut-off, the sensitivity and specificity are computed, and the series of cut-offs is applied to the predictors to identify the class. The trapezoidal rule is used to compute the area under the ROC curve, and this area serves as the measure of variable importance [99]. In Figure 4, the vertical axis shows the unigrams and the horizontal axis shows the importance values.

Figure 4: Most Important Unigram Features

The most important unigram is “hain”. The prefix “u_” denotes that a feature is a unigram. The details of these top unigram features are listed in Table 6. The results reveal that only one unigram feature in the top list (Table 6, Row 6) carries polarity information; all other top unigram features are neutral.

Table 6: List of most important unigram features with their POS and polarity information

Unigram    Part-of-Speech     Polarity
Hain       Auxiliary Verb     Neutral
Say        Postposition       Neutral
Tou        Haruf-Ataf         Neutral
Word       Noun               Neutral
Market     Noun               Neutral
Behtreen   Adjective          Positive
Phone      Noun               Neutral
Magr       Haruf-Ataf         Neutral
Kya        Prq                Neutral
Mai        Personal Pronoun   Neutral


4.3.1.2 TF-IDF Unigram

Instead of using the binary unigram feature, the term frequency (TF-Unigram) feature counts the total occurrences of each word within the document and uses this count as the feature value; in other words, it gives more weight to terms that occur more frequently. However, some terms naturally occur more often than others; for example, the term "an" is a common word. To discount such common terms, an inverse document frequency (IDF) factor is added, which reduces the weight of terms that occur very frequently across documents. The combined feature is called TF-IDF. We used the TF-IDF unigram feature to classify sentiments into the positive and negative classes.
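The weighting can be sketched with the common tf * log(N/df) form; the exact variant used in the thesis may differ, and the toy sentiments are assumptions.

```python
# Sketch of TF-IDF weighting: term frequency within a sentiment, scaled
# down by how many sentiments contain the term.
import math

def tf_idf(sentiments):
    docs = [s.split() for s in sentiments]
    n = len(docs)
    df = {}                      # document frequency of each term
    for d in docs:
        for w in set(d):
            df[w] = df.get(w, 0) + 1
    return [{w: d.count(w) * math.log(n / df[w]) for w in set(d)}
            for d in docs]

weights = tf_idf(["ye phone acha hai", "ye phone fazool hai"])
print(weights[0]["ye"])    # 0.0 -- appears in every document
print(weights[0]["acha"])  # log(2) -- appears in one document only
```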

4.3.1.3 Bigram

The bigram is a binary feature that uses two adjacent words instead of one. For the following sentiment, the corresponding bigrams are given:

S1: He is good boy

He_is is_good good_boy

Similarly, the bigrams were calculated for each sentiment and a common set of bigrams was identified. A sparse matrix was then constructed to represent this binary feature.
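The bigram extraction shown in the example above can be sketched as:

```python
# Sketch of the bigram feature: adjacent word pairs joined with "_",
# as in the "He is good boy" example above.

def bigrams(sentence):
    words = sentence.split()
    return [f"{a}_{b}" for a, b in zip(words, words[1:])]

print(bigrams("He is good boy"))  # ['He_is', 'is_good', 'good_boy']
```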

4.3.1.4 Polarity Score

The BoW algorithm calculates the polarity score of a sentiment. For SL training, a binary feature represents this information: for each sentiment, the feature value is set to 1 when BoW calculates a positive score and to 0 when BoW calculates a negative score.

4.3.1.5 Score Greater Than Zero

This is a binary feature that represents the polarity score: its value is 1 if the polarity score is greater than zero and 0 if the polarity score is less than zero.


4.3.1.6 Discourse Features

Algorithm 1, Algorithm 2, Algorithm 3 and Algorithm 4 calculate the discourse information: first opinion polarity (FOPP), second opinion polarity (SOPP), segmentation word (SW), segmentation word part-of-speech (SWPOS), discourse relation (DR) and polarity relation (PR). This information constitutes the discourse feature set.

4.3.1.7 Correlation between discourse features

The correlation coefficient is a normalized measure that indicates how two variables are linearly related to each other. The correlation between two variables x and y is computed with the following formula:

cor(x, y) = σxy / (σx * σy)

where σxy is the covariance between the two variables x and y, σx is the standard deviation of x and σy is the standard deviation of y. Two variables are perfectly positively linearly related when the correlation is 1 and perfectly negatively related when the correlation is -1; a correlation near zero indicates a weak linear relationship between the variables. The correlation between all the discourse features was calculated with the cor function in R (Figure 5). The intersection of each row and column is filled with a scaled colour that represents how strongly the two discourse features are correlated. For example, SWPOS and DR are weakly positively correlated in comparison to the relation between PR and SOPP. However, some discourse features are highly correlated with others: the discourse relation (DR) strongly correlates with both the segmentation word (SW) and the first opinion polarity (FOPP). This information is useful for reducing the feature set and selecting the optimal features.
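The formula above is the standard Pearson correlation (computed in the thesis via R's cor function); a minimal Python sketch with made-up feature vectors:

```python
# Sketch of the correlation computation:
# cor(x, y) = cov(x, y) / (sd(x) * sd(y)).
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Perfectly linearly related features correlate at 1.0.
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```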


Figure 5: Discourse Features Correlation Matrix

4.3.1.8 Part of Speech (POS)

The research community has extensively used POS-tag features for classifying text documents. A dictionary was prepared that contains all vocabulary terms and their related POS tags. First, sentiments were converted into the related tags and then a sparse matrix was developed. The columns of the matrix represent all unique POS tags in the corpus, while each row represents a sentiment. When a sentiment contains a POS tag, the intersection of the sentiment's row and the tag's column has the value 1; otherwise, it is 0. The following example shows the conversion of a sentiment:

Yea larka acha ha

(This is a good boy)

POS: pp nn jj stopword


4.3.1.9 POS with Polarity

This feature merges the POS tag and the polarity of a word. The following example shows the conversion of a sentiment into the POS-with-polarity feature:

Yea larka acha ha

(He is a good boy)

POS: pp_neutral nn_neutral jj_positive stopword_neutral

Both the polarity and the POS tag are extracted from the dictionary. This feature was also converted into a sparse matrix, as was done for the POS and unigram features.

4.3.2 Feature Selection

This section lists the important features and the optimal number of features for model training.

4.3.2.1 Important Features

The recursive feature elimination (RFE) method, a wrapper method [100] around SVM, was used to identify the main key features (Figure 6). No unigram feature appears in the list, but the discourse and POS-with-polarity features are present.


Figure 6: Main Key Features

Recursive feature elimination (RFE) is a backward feature-selection algorithm [101] with cross-validation. The method adopts a greedy optimization approach to select the optimal subset of features: RFE repeatedly tunes/trains the model, calculates the model's performance, ranks the variables, chooses the best- or worst-performing feature and repeats the process without the selected feature. The process recursively eliminates features one by one and calculates the model's performance. Finally, the features are ranked based on the performance of the model when they are not used.

4.3.2.2 Optimal Number of Features

For each feature set, the model was trained starting with a minimal subset of features and gradually increasing the subset size. Different feature sets behaved differently as the number of features increased (Figure 7).


Table 7: Feature Set Performance with Increasing Number of Features

Feature Set Name                    Variables  Accuracy  Kappa     AccuracySD  KappaSD
Unigram                             4          0.576187  0.168577  0.045521    0.087168
Unigram                             8          0.628074  0.266118  0.042491    0.081299
Unigram                             16         0.620327  0.248045  0.051196    0.099605
Unigram                             2221       0.678343  0.361915  0.045444    0.089783
Pos                                 4          0.670339  0.341674  0.057344    0.112161
Pos                                 8          0.642644  0.287522  0.071317    0.141087
Pos                                 16         0.637708  0.275857  0.086918    0.172437
Pos                                 24         0.645464  0.291242  0.101336    0.201558
Discourse                           4          0.818371  0.63576   0.06912     0.137786
Discourse                           6          0.808371  0.616054  0.068249    0.136127
All                                 4          0.786124  0.571886  0.054351    0.107731
All                                 8          0.796188  0.59162   0.056046    0.112717
All                                 16         0.821124  0.641789  0.058867    0.117622
All                                 2299       0.813634  0.625174  0.05716     0.116246
Poswithpolarity                     4          0.727974  0.451412  0.047292    0.097422
Poswithpolarity                     8          0.75773   0.51102   0.06874     0.139507
Poswithpolarity                     16         0.737907  0.472169  0.074693    0.150227
Poswithpolarity                     48         0.725278  0.447239  0.066741    0.133821
Discourse_Polarity_PosWithPolarity  4          0.780624  0.560611  0.057253    0.113221
Discourse_Polarity_PosWithPolarity  8          0.808127  0.616135  0.054389    0.108511
Discourse_Polarity_PosWithPolarity  16         0.817944  0.635779  0.042313    0.084093
Discourse_Polarity_PosWithPolarity  78         0.805752  0.611131  0.043639    0.087492

We also calculated the Kappa [102] statistic, a metric that compares the observed accuracy with the expected accuracy of a random classifier (random chance). Because the Kappa statistic is normalized, its value never exceeds one, so the same statistic remains usable as the number of observations grows. The observed accuracy is simply the proportion of instances classified correctly across the entire confusion matrix.
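The relation between observed and expected accuracy can be sketched as the standard Cohen's kappa computation (this is not code from the thesis; the example confusion matrix is made up):

```python
# Kappa for a square confusion matrix:
# kappa = (observed accuracy - expected accuracy) / (1 - expected accuracy),
# where expected accuracy is what a random classifier with the same
# marginal totals would achieve.

def kappa(cm):
    """cm[i][j] = count of instances with true class i predicted as j."""
    total = sum(sum(row) for row in cm)
    po = sum(cm[i][i] for i in range(len(cm))) / total
    pe = sum(
        sum(cm[i]) * sum(row[i] for row in cm)  # row total * column total
        for i in range(len(cm))
    ) / total ** 2
    return (po - pe) / (1 - pe)

# Balanced classes, 90% accuracy -> chance accuracy 0.5 -> kappa 0.8
print(kappa([[45, 5], [5, 45]]))  # 0.8
```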

For the unigram features, all features, discourse features and all-except-unigram features, the classifier accuracy increases with the number of features; however, after a certain level, less improvement is observed when more features are added. In a few cases (POS, POS with polarity), performance even decreased as features were added.


Figure 7: Optimal Number of Features

For each feature set, Table 7 lists the number of features together with the corresponding accuracy, Kappa measure, accuracy standard deviation and Kappa standard deviation. The classifier's accuracy improved as the number of unigram features increased; the change was rapid at first, but beyond a certain level (16 features) the increase in accuracy was very slow (R1-R4).

Table 8: Most important features of each set

Feature Set                    Most Important Features
Discourse                      isScoreGreaterThanZero, secondOpinionPolarity, firstOpinionPolarity, segmentationWordPOS, NA
Pos                            negb, negf, ccs, cca, NA
pos_with_polarity              negbneutral, jjnegative, ccaneutral, jjpositive, negfneutral
discourse_pos_poswithPolarity  secondOpinionPolarity, isScoreGreaterThanZero, firstOpinionPolarity, jjnegative, negb
All                            secondOpinionPolarity, isScoreGreaterThanZero, jjnegative, firstOpinionPolarity, negb

The set of all features showed similar behaviour; in this case, a small subset of features already achieved high accuracy, and adding more features improved accuracy by only about two points (R11-R14). For both POS (R5-R8) and POS with polarity (R15-R18), accuracy decreased as the number of features increased, which implies that these subsets are negatively relevant [103] to each other. The discourse feature set shows a continuous increase in accuracy with the number of features (R9-R10), and a small number of discourse features achieved accuracy comparable to that of the full set of discourse features. Finally, the set of all features except unigrams performed better than the set of unigram features. Table 8 lists the most important features within each subset.


4.3.3 Model Training

Fernandez [104] showed that the Random Forest and SVM methods are the best classifiers for

classification problems. Different feature sets were examined with these two classifiers. The

code in R was implemented with the help of the Caret [105] package.


Chapter 5. Implementation

This chapter is divided into three sections: the first contains information about the datasets, the second provides particulars of the experimental setup and the third gives details of the experiments.

5.1 Datasets

The literature lacks publicly available datasets of sentiments written in Roman Urdu, so two separate datasets were prepared to evaluate both classifiers. The first dataset (D1) consists of 443 product reviews of cars and cosmetic products; the second dataset (D2) comprises 401 product reviews of electronic devices (Table 9). D1 and D2 include reviews collected from different social-media forums covering mobile phones, cars and beauty products. After collection, three reviewers independently marked the polarity (positive or negative) of each sentiment; the reviewers were computer-science students and active users of social media. A maximum-voting algorithm, out of the three votes, selected the final polarity of each sentiment.
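The maximum-voting step can be sketched as follows (illustrative Python; the label strings are assumptions):

```python
# Sketch of maximum voting: the final polarity is the label assigned by
# the majority of the three independent annotators.
from collections import Counter

def majority_label(votes):
    return Counter(votes).most_common(1)[0][0]

print(majority_label(["positive", "negative", "positive"]))  # positive
```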

Table 9: Corpus Detail

Dataset   Total Reviews   Average Length   Positive Opinions   Negative Opinions
D1        443             93               194                 249
D2        401             81               197                 204


In Roman Urdu, some words contain spaces; for example, “kay lay” (therefore) is a complete word but is separated by a space. In the pre-processing stage, the boundaries of these words are marked with XML syntax: in the dataset, such words are enclosed in an XML tag as <word> kay lay </word>. After dataset creation, a data dictionary was developed that consists of the vocabulary terms of the dataset. A program (Figure 8) was written that tokenizes each sentiment and allows a human annotator to set the polarity and part-of-speech (POS) tag; the program indicates whether a token (word) has already been added to the dictionary. To make the results reproducible, both the data dictionary and the datasets are publicly available [106].

Figure 8: Annotator Interface for Data Dictionary

5.2 Experimental Setup

BoW is a legacy type of classifier that calculates the polarity of a sentiment by summing the orientation of each adjective within it. The current version of BoW (Algorithm 2) also handles both forward and backward negation. The architecture of the rule-based algorithm is given in Figure 1 and the detailed algorithm is explained in Algorithm 5. The BoW and rule-based classifiers were implemented in C#; Visual Studio 2010 Express Edition was used to develop the software.

Figure 9: Architecture of Code

Figure 9 shows the architecture of the implemented code. The abstract BaseModelTwoClass class allows any type of model to be injected into the framework. Four different models were implemented to conclude the research. Figure 10 lists the classes that hold the processed and unprocessed sentiments.


Figure 10: Classes Related to Sentiments

Sentiment is the main class, which holds and processes the sentiment. The methods of this class allow the system to identify the adjectives, negations, segmentation words and other artefacts. SentiWord is the representation of a single word of a sentiment. ProcessedSentiment is the class that holds the post-processing information of the sentiment. Polarity and SentimentType are enumerations: Polarity enumerates Positive, Negative and Neutral, and SentimentType enumerates the available sentiment domains, such as sports, politics and talk shows.


Figure 11: Classes Holding Dictionary

Figure 11 shows the set of classes that handle the language-related functionality of sentiments. The MyDictionaryItem class represents a single entry in the dictionary; for each item it stores the part of speech (POS), which is of type PartOfSpechItem, the polarity, and the word (the item itself). The PartOfSpechItem class allows the system to identify the POS of a word; it exposes a property for every possible POS. This class is useful for the identification of adjectives, negations and segmentation words.


Figure 12: Helper Classes

Figure 12 shows the list of helper classes. FileLoader is a singleton class that manages the loading of the datasets. LogItem and LogOptions are used to record the messages and the history of processing. Util is a general-purpose class that provides different utility functions. Figure 13 shows the class diagram of the client-side interface of the system.


Figure 13: Sentiment Client

The sentiment client is a UI application that interacts with the user to load, process and annotate the sentiments. It allows the user to process sentiments in batch or one by one, and it also reports undefined dictionary items.

This software requires two files, a dataset file and a dictionary file, to start processing. Each line of the dataset file consists of the sentiment text and the user-defined polarity, separated by a # symbol. The dictionary file contains all unique words that exist in the data file; each line of the dictionary file defines the word, POS tag and polarity, all separated by the # symbol. The startup program reads the sentiments from the data file, loads the word information from the dictionary file and, finally, passes each sentiment to the classifier, which returns its estimated polarity. This process was repeated for each sentiment of the data file to calculate the overall performance of the classifier.
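The two line formats can be parsed in a few lines. A minimal Python sketch, assuming only the '#'-separated layouts described above (the actual loader is the C# singleton class):

```python
# Minimal parsers for the two '#'-separated input formats described in
# the text; the example values are taken from the candidate examples.
def parse_dataset_line(line):
    """'<sentiment text>#<polarity>' -> (text, polarity)"""
    text, polarity = line.rsplit("#", 1)
    return text.strip(), polarity.strip()

def parse_dictionary_line(line):
    """'<word>#<POS tag>#<polarity>' -> (word, pos, polarity)"""
    word, pos, polarity = (field.strip() for field in line.split("#"))
    return word, pos, polarity

text, pol = parse_dataset_line("suzuki aik bahtreen company ha#positive")
word, pos, wpol = parse_dictionary_line("bahtreen#adjective#positive")
```

Using `rsplit` on the dataset line keeps any stray '#' inside the sentiment text with the text rather than the polarity field.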


For the supervised learning experiments, code was implemented in the R language with the help of CARET [101] and RTextTools [98]. RStudio version 0.99 was used to write and test the code. Figure 14 lists the code that plots the most important features using svmRadial and the bootstrapping method.

Figure 14: Most Important Features

Figure 15 shows the R code that identifies the optimal number of features; it uses recursive feature elimination with 5 cross-validation folds. Figure 16 lists the code that trains the model, tests it, and returns its performance.
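The 5-fold partitioning underlying these experiments can be sketched in plain Python (the actual listings, written in R with caret, appear only as figures):

```python
# Pure-Python sketch of a 5-fold cross-validation split of the kind used
# in the feature-selection experiments; sample indices are partitioned
# into k contiguous, nearly equal folds.
def k_fold_indices(n_samples, k=5):
    folds, start = [], 0
    for i in range(k):
        # spread any remainder over the first folds
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

print(k_fold_indices(10, k=5))  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

Each fold serves once as the held-out test set while the remaining four folds are used for training, which is how the per-fold accuracies in Table 15 are obtained.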


Figure 15: Optimal Number of Features

Figure 16: Model Training with R


Experiments

Four sets of experiments were performed to measure the performance.

1. The first set of experiments tests the performance of BoW.

2. The second set of experiments evaluates the rule-based classifier.

3. The third set of experiments assesses the SVM with different feature sets.

4. The fourth set of experiments collected the data required to verify whether the results are statistically significant.

Case Study

This section shows the working of the proposed solution with an example.

Input Sentiment:

پنجاب گورنمنٹ کی میٹرو بس ایک مہنگا پروجیکٹ ہے لیکن ملازمین کیلۓ ایک سہولت فراہم کرتی ہے

(The Punjab government's Metro Bus is an expensive project, but it provides a facility for employees.)

Step1: Identification of Sub-opinion

Input: Sentiment

Process: The algorithm searched for a word of type Haruf-Ataf within the sentiment and divided the sentiment into two sub-opinions around that word.

Output:

– Segmentation word: لکین

– Sub-opinion1 (O1) : میٹرو بس ایک مہنگا پروجیکٹ ہے پنجاب گورنمنٹ کی

– Sub-opinion2 (O2): کیلے ایک سہولت فراہم کرتی ہےمال زمین

Step 2: Calculation of Orientation Score of each sub-opinion and polarity

Input: Both sub-opinion O1 and O2

Output:

– Score(O1): -1

– Score (O2): +1


– P(O1)= negative

– P(O2) = positive

Step 3: Calculate the Polarity Relation

Input: polarity of both sub-opinions

Process: This step calculates the polarity relation based on the polarity of each sub-opinion.

Output:

– Polarity Relation: OP

Step 4: Identification of Discourse Relation

Input: Segmentation Word

Process: This step extracted the discourse relation based on the segmentation word. There are five possible discourse relations between sub-opinions.

Output:

– Discourse Relation: Astadark

Step 5: Sentiment Polarity Assignment

Input: Polarity Relation and Discourse Relation

Process: The final step applied a set of rules to the discourse relation and the polarity relation to calculate the sentiment polarity.

Output:

– Selected Rule: Polarity(S) = P(O2) when PR=OP and DR=Astadark

– Sentiment Polarity: Positive
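The five steps above can be sketched end to end. This Python sketch (the thesis implementation is in C#) uses a one-entry segmentation-word table, a crude toy scorer, and only the single rule from the worked example; all three are illustrative excerpts, not the thesis's full sets:

```python
# Sketch of Steps 1-5 for a two-opinion sentiment, shown on the Roman
# Urdu candidate example so the toy scorer can stay ASCII.
SEGMENTATION_WORDS = {"laken": "Astadark"}   # word -> discourse relation

def classify(tokens, score_fn):
    # Step 1: split into two sub-opinions around the segmentation word
    idx = next(i for i, t in enumerate(tokens) if t in SEGMENTATION_WORDS)
    relation = SEGMENTATION_WORDS[tokens[idx]]
    o1, o2 = tokens[:idx], tokens[idx + 1:]
    # Step 2: orientation score and polarity of each sub-opinion
    p1 = "positive" if score_fn(o1) > 0 else "negative"
    p2 = "positive" if score_fn(o2) > 0 else "negative"
    # Step 3: polarity relation (OP = opposite, SP = same polarity)
    pr = "OP" if p1 != p2 else "SP"
    # Steps 4-5: rule lookup; only the worked example's rule is included
    if pr == "OP" and relation == "Astadark":
        return p2   # Polarity(S) = P(O2) when PR = OP and DR = Astadark
    return p1       # naive fallback for combinations not covered here

# crude toy scorer: +1 per positive adjective, -1 per negation word
score = lambda ws: sum({"acha": 1, "behtar": 1, "nahi": -1}.get(w, 0) for w in ws)
tokens = "wo sub acha hain laken pakistan ka laya behtar nahi haan".split()
print(classify(tokens, score))  # negative
```

The sketch reproduces the worked example's reasoning: the Astadark (contrast) relation hands the final polarity to the second sub-opinion.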


Candidate Example

This section contains the candidate examples and the output of the rule-based classifier.

Sentiment: wo sub acha hain laken pakistan ka laya behtar nahi haan

Processing Detail:

User Polarity: negative
BOW Polarity: neutral
SentiUnits: { acha, 1, False }, { behtar, 1, True, nahi }
Discourse Relation: Disagreement
Segmented Word: { laken, HaroofAstadark, 4 }
Opinion1: wo sub acha hain
Opinion2: pakistan ka laya behtar nahi haan
Opinion1 Polarity: { positive, 1 }
Opinion2 Polarity: { negative, -1 }
Polarity Relation: OP
Model Polarity: negative
Actual Polarity: negative

Sentiment: suzuki aik bahtreen company ha <word> es laya </word> ess suzuki ko junk cars banana band ker dena chahye.

Processing Detail:

User Polarity: positive
BOW Polarity: neutral


Processing Detail (WhatsApp example):

SentiUnits: { naya, 1, False }, { fazool, -1, False }, { bakwas, -1, False }, { zabardast, 1, False }
Discourse Relation: Disagreement
Segmented Word: { laken, HaroofAstadark, 8 }
Opinion1: whatsapp ka naya feature fazool aur bakwas hai
Opinion2: previous version zabardast ha
Opinion1 Polarity: { negative, -1 }
Opinion2 Polarity: { positive, 1 }
Polarity Relation: OP
Model Polarity: positive
Actual Polarity: positive

Sentiment: honda amaze ka exterior shandaar hai aur interior utna ha fazool

Processing Detail:

BOW Polarity: neutral
SentiUnits: { shandaar, 1, False }, { fazool, -1, False }
Discourse Relation: Continuing
Segmented Word: { aur, HaroofWasl, 5 }
Opinion1: honda amaze ka exterior shandaar hai
Opinion2: interior utna ha fazool
Opinion1 Polarity: { positive, 1 }
Opinion2 Polarity: { negative, -1 }
Polarity Relation: OP
Model Polarity: negative
Actual Polarity: negative


Chapter 6. Results and Discussion

Performance Metrics

Three performance metrics, accuracy, precision and recall, were selected to evaluate the performance of the proposed and legacy methods. For each classifier, the true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) were calculated. Then all three metrics were measured for the positive and negative classes of each dataset using the following formulas.

Let

Equation 3:

$$c \in \{\text{positive}, \text{negative}\} \quad \text{and} \quad i \in \{\text{DataSet}_1, \text{DataSet}_2\}$$

Equation 4:

$$\text{Precision}^i_c = \frac{TP^i_c}{TP^i_c + FP^i_c}$$

$$\text{Recall}^i_c = \frac{TP^i_c}{TP^i_c + FN^i_c}$$

$$\text{Accuracy}^i_c = \frac{TP^i_c + TN^i_c}{TP^i_c + FP^i_c + FN^i_c + TN^i_c}$$

Next, the average performance of the classifiers was measured separately for both D1 and D2.

Equation 5:

$$\text{Precision}^i = (\text{Precision}^i_{positive} + \text{Precision}^i_{negative})/2$$

$$\text{Recall}^i = (\text{Recall}^i_{positive} + \text{Recall}^i_{negative})/2$$

$$\text{Accuracy}^i = (\text{Accuracy}^i_{positive} + \text{Accuracy}^i_{negative})/2$$

Finally, the overall performance was calculated by taking the average over both D1 and D2 using the following formulas.


Equation 6:

$$\text{Precision} = (\text{Precision}^1 + \text{Precision}^2)/2$$

$$\text{Recall} = (\text{Recall}^1 + \text{Recall}^2)/2$$

$$\text{Accuracy} = (\text{Accuracy}^1 + \text{Accuracy}^2)/2$$

Performance of Bag-of-Word (BoW)

The BoW algorithm (Algorithm 3) was implemented in C#; it requires a dictionary to identify positive and negative terms within a sentiment. The results of the experiment were recorded as a confusion matrix (Table 10), which was used to calculate the three performance metrics, accuracy, precision and recall (Table 11).

Table 10: Confusion matrix of simple BoW (without discourse information)

Class     True Positives  True Negatives  False Positives  False Negatives
Positive  250             372             73               149
Negative  133             408             33               270

Table 11: Performance of simple BoW

Class          Precision  Recall  Accuracy
Positive       77.40      62.66   73.70
Negative       80.12      33.00   64.10
Macro Average  78.76      47.83   68.90

The first experiment measured the performance of the BoW algorithm (Table 11), which set the baseline for the other two experiments. The BoW model showed very low performance on all three metrics (precision, recall and accuracy): when the polarity of a sentiment is not in accordance with the number of adjectives, BoW fails.
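The figures in Table 11 can be reproduced from the Table 10 confusion matrix with Equations 4 and 5; a short Python check:

```python
# Equations 4-5 applied to the Table 10 confusion matrix; the counts
# below are taken directly from that table.
def metrics(tp, tn, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy

pos = metrics(250, 372, 73, 149)   # positive class, Table 10
neg = metrics(133, 408, 33, 270)   # negative class, Table 10
macro = [round(100 * (p + n) / 2, 2) for p, n in zip(pos, neg)]
print([round(100 * m, 2) for m in pos])  # [77.4, 62.66, 73.7], as in Table 11
print(macro)                             # [78.76, 47.83, 68.9]
```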

Proposed Rule-Based Classifier

This experiment determined the performance of the rule-based approach. The results of the trial are presented as a confusion matrix (Table 12); these findings are used in the calculation of accuracy, precision and recall (Table 13).

Table 12: Confusion matrix of Rule-Based Classifier

Class True Positives True Negatives False Positives False Negatives

Positive 314 391 54 85

Negative 275 402 39 128

Table 13: Performance of the proposed Rule-Based classifier

Class Precision Recall Accuracy

Positive 85.33 78.70 83.53

Negative 87.58 68.24 80.21

Macro Average 86.45 73.47 81.87

Table 14: Output of Rule Based Classifier that shows Discourse Relation and Polarity Relation

Sentiment Number  Discourse Relation  Polarity Relation  Actual Polarity  Assigned Polarity
1                 Astadark            OP                 negative         negative
2                 Astadark            OP                 positive         positive
3                 Jaza                SPN                positive         neutral
4                 Astadark            OP                 negative         negative
5                 Wasl                OPN                positive         positive
6                 Astashna            OPN                positive         positive
7                 Jaza                SPN                positive         neutral
8                 Elaat               SP                 positive         positive
9                 Astadark            SPN                negative         neutral

*The table lists the output for only the first nine sentiments.

The distribution of the polarity relation with respect to the discourse relation was also calculated and is plotted in Figure 17.

Figure 17: Distribution of Polarity Relation and Discourse Relation within the Dataset

After applying the rule-based classifier to each sentiment, the details of the processing were logged; a partial snapshot (for nine sentiments) is shown in Table 14. From this data, the performance of each rule was calculated, and the output is plotted as a Precision-Recall (PR) curve (Figure 18).



Figure 18: Precision-Recall curve of each rule (upper right position shows better performance than lower left position)

Rule 1c and Rule 1b performed better than the other rules, as they lie in the upper right corner of the curve. Rule 4b appeared toward the lower left corner of the curve, which suggests comparatively lower performance. Rules 1c, 2 and 3b give high priority to the second sub-opinion and showed better performance. However, there were cases in which the second sub-opinion was unable to determine the polarity of a sentiment; Rule 3a and Rule 4 handled such opinions. Rules 1a, 1b and 1c are all related to the Wasal discourse relation, but they showed different performance due to the polarity relation; the same is the case with Rules 4a and 4b. This observation leads to the conclusion that both the discourse relation and the polarity relation are necessary inputs for the rule-based algorithm.

The second set of experiments evaluated the proposed rule-based algorithm. These experiments showed significantly better results (Table 13) than the simple BoW algorithm. The model applies a set of rules to the discourse information to identify the most important sub-opinion of the sentiment.

To understand the performance of each discourse relation, a fourfold graph [107] was constructed (Figure 19). For each discourse relation, the graph tests the hypothesis that the association between the assigned polarity and the actual polarity is significant. The graph revealed three important observations. First, each relation showed balanced performance for both positive and negative classes, because the areas of the upper left quadrant (correct classification of the positive class) and the lower right quadrant (correct classification of the negative class) are relatively equal. Second, the diagonal area of each fourfold graph is greater than the off-diagonal area, which indicates high odds ratios of correct classification. Third, the quadrant circles do not overlap (odds ratio > 1): when the four quadrant circles do not overlap to form a complete circle, the relation indicates a significant association between the assigned and actual polarity.
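The odds ratio for a single panel follows directly from its four raw frequencies. A sketch using the Disagreement panel of Figure 19, where the diagonal cells are the correct classifications (the quadrant-to-cell mapping follows the figure caption):

```python
# Odds ratio of a 2x2 fourfold panel: (diagonal product) / (off-diagonal
# product). Frequencies are the Disagreement panel of Figure 19.
def odds_ratio(tp, fp, fn, tn):
    return (tp * tn) / (fp * fn)

print(odds_ratio(tp=66, fp=2, fn=4, tn=108))  # 891.0, far above 1
```

An odds ratio this far above 1 is what the non-overlapping quadrant circles depict visually: correct classification is overwhelmingly more likely than misclassification for this relation.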


Figure 19: Fourfold display for each discourse relation. The area of each shaded quadrant shows the frequencies, standardized to equate the margins for the assigned class and the actual class. The integer value in each quadrant represents the raw frequency; for example, the upper left quadrant shows the number of correctly classified positive sentiments. The diagonal (upper left and lower right) shows the correct classifications, and the off-diagonal (upper right and lower left) shows the incorrect classifications. When the four quadrant circles do not overlap to form a complete circle, the relation indicates a significant association between assigned and actual polarity.

The five panels of Figure 19 contain the following raw frequencies (quadrants read in the order described in the caption):

Relation           Assigned Pos,  Assigned Pos,  Assigned Neg,  Assigned Neg,
                   Actual Pos     Actual Neg     Actual Pos     Actual Neg
Action and Reason  11             2              4              11
Condition          23             1              10             33
Continuing         33             1              3              16
Disagreement       66             2              4              108
Exception          29             4              3              41


The rule-based algorithm exhibited equally good performance (the diagonal of the fourfold graph) for both positive and negative instances. The results showed that the classifier improved the sentiment analysis task for each type of relation. However, the misclassification (the off-diagonal area of the fourfold graph) varied from relation to relation; for example, the classifier misclassified a comparatively small proportion of sentiments with the Astadark relation and a high proportion of sentiments with the Elaat relation. This observation indicates that certain relations are more robust and less variant in their opinion patterns than others, and it leads to the conclusion that the discourse relation provides a useful clue for the classification of both positive and negative reviews.

Supervised Learning for Urdu Sentiment Classification

We performed a set of experiments to check whether the discourse information improves the machine learning task. Two types of machine learning algorithms (SVM and Random Forest) were applied to different feature sets to determine the optimal solution. We used 5-fold cross-validation and obtained a different accuracy for each fold; the reported accuracy is the one used to select the optimal model (the largest value), and the standard deviation reports the spread around the mean. The results are shown in Table 15.


Table 15: Performance of the Supervised Learning Algorithms

                                              Random Forest                        Support Vector Machine
#   Feature Set                    Features   Accuracy Precision Recall  F1       Accuracy Precision Recall  F1
1   Unigram                        2221       0.6883   0.7531    0.5894  0.6612   0.6534   0.6753    0.6329  0.6534
    Unigram (TF-IDF)                          0.68     0.6890    0.74    0.7135   0.66     0.65      0.76    0.7007
    Bigram                                    0.7136   0.6890    0.86    0.7650   0.70     0.695     0.85    0.7647
2   POS                            24         0.6559   0.6825    0.6232  0.6515   0.6708   0.6847    0.6715  0.6780
3   POS + Unigram                  2245       0.7182   0.7423    0.6957  0.7182   0.6933   0.7       0.7101  0.7050
4   POS with Polarity              48         0.7431   0.7476    0.7585  0.7530   0.7307   0.7037    0.8261  0.7600
5   FS1                            1          NA*      NA        NA      NA       NA       0.7042    0.8164  0.7561
6   Unigram + FS1                  2222       0.7357   0.7205    0.7971  0.7568   0.7382   0.7277    0.7874  0.7563
7   Discourse                      6          0.7431   0.8768    0.5845  0.7014   0.7382   0.8643    0.5845  0.6973
8   Discourse + FS1                7          0.798    0.8539    0.7343  0.7895   0.8204   0.814     0.8454  0.8294
9   Discourse + Unigram            2228       0.8279   0.867     0.7874  0.8252   0.8204   0.8392    0.8068  0.8226
10  FS2                            2          0.7232   0.8582    0.5556  0.6745   0.7232   0.8582    0.5556  0.6745
11  POS + Unigram + Discourse      2252       0.8279   0.852     0.8068  0.8287   0.813    0.8204    0.8164  0.8183
12  FS1 + FS2 + Polarity Relation  4          0.7905   0.8555    0.7150  0.7789   0.7905   0.8555    0.715   0.7789
13  ALL                            2300       0.8329   0.8535    0.8164  0.8345   0.8105   0.8104    0.8261  0.8181
14  FS1 + FS2 + FS3                5          0.818    0.849     0.7874  0.8170   0.8379   0.8447    0.8406  0.8426

FS1 = ScoreGreaterThanZero, FS2 = FirstOpinionPolarity + SecondOpinionPolarity, FS3 = NegativeAdjectives + Negations

*NA = Random Forest requires more than one variable.


The third set of experiments was performed to answer the question whether discourse information improves the classification task of a supervised learning algorithm. The traditional unigram and POS features were not sufficient to classify complex sentiments written in Urdu (Table 15, Rows 1 and 2). Relatively higher performance was obtained when these two features were combined (Table 15, Row 3), but it was still not comparable to the other combinations. The classifier performance improved when polarity information was added to the POS tags (Table 15, Row 4). The score feature is certainly important and outperforms the traditional unigram and POS features (Table 15, Row 5); it also improved the classification task when combined with traditional features (Table 15, Rows 6 and 8). Although the discourse features were few in number, they contained crucial information and performed better than the traditional features; they considerably improved performance when combined with the unigram features (Table 15, Row 9). When the discourse features and the score feature were used together, a better result was obtained than with their individual, independent use (Table 15, Row 8). A subset of the discourse features (Table 15, Row 10) showed roughly the same performance as the full set, which leads to the conclusion that the discourse features are highly correlated and redundant (Figure 4). Finally, an optimal subset of five features (Table 15, Row 14) was found, which showed the best performance compared with the other combinations.

Performance Comparison

For this study, the discourse information came in three forms: the sub-opinions within the sentiment, the discourse relation, and the polarity relation between the sub-opinions. The most frequent segmentation words were identified (Table 4) to segment a sentiment into two sub-opinions. The discourse relation depends on the word that joins the two sub-opinions within the sentiment, and the opinion relation (Table 2) depends on the polarity of each sub-opinion.

The discourse information significantly improved the capabilities of both the BoW algorithm and the supervised learning method to classify complex opinions written in the Urdu language (Figure 20). Both of the proposed methods performed better than the legacy BoW method, and they are also comparable to each other. The purpose of this study was three-fold: first, to identify the discourse information within sentiments written in Urdu; next, to propose a rule-based algorithm to classify complex opinions; and finally, to determine the optimal feature sets for supervised learning algorithms.

Figure 20 shows the comparison between all classifiers; the best supervised learning results (Table 15, Row 14) were selected for comparison with the other two methods.


Figure 20: Comparison between the simple BoW and the proposed methods

Significance Tests

Three experiments were performed to test the significance of the following observations.

a. The proposed rule-based classifier performs significantly better than the simple BoW model.

b. The performance of the supervised learning (SL) technique with discourse features is significantly greater than that of the model trained without discourse features.

c. In a significant number of opinions, people tend to conclude their opinion at the end of the sentiment.

6.6.1 Performance difference between BoW and Rule Based Algorithm

Values plotted in Figure 20:

Method            Precision  Recall  Accuracy  F1
Machine Learning  84.47      84.06   83.79     83.92
Rule-Based        86.45      73.47   81.87     77.44
BoW               78.76      47.83   68.90     56.46

Page 109: Muhammad Awais Hassan 2009-Phd-CS-01prr.hec.gov.pk/jspui/bitstream/123456789/11361/1/Muhammad Awai… · Awais Hassan and Muhammad Shoaib, International Arab Journal of Information

100

This experiment was performed to check whether the performance difference between the rule-based algorithm and the BoW algorithm is statistically significant and did not occur by chance. A subset of 401 sentiments was randomly extracted from the corpus, and each sentiment was classified by both the BoW and the rule-based classifier. The results were logged in Table 16. McNemar's test [108], [109] was applied to test the null hypothesis against the alternative hypothesis.

H0: both algorithms have the same error rate.

H1: the difference in error rate between the two algorithms is statistically significant.

McNemar's test uses the following chi-square statistic with 1 degree of freedom.

Equation 7:

$$\chi^2 = \frac{(n_{01} - n_{10})^2}{n_{01} + n_{10}}$$

where n01 is the number of sentiments misclassified by the proposed classifier but correctly classified by BoW, and n10 is the number of sentiments misclassified by BoW but correctly classified by the proposed classifier. This test checked whether the performance difference between BoW and the proposed rule-based algorithm is significant. The results (Table 16) show that there were 119 discordant sentiments (sentiments on which the rule-based classifier and BoW disagreed): 68.067% of them were misclassified by BoW but correctly classified by the rule-based method, and 31.933% were misclassified by the rule-based method but correctly classified by BoW.

Table 16: Significance test measuring the performance difference between BoW and the proposed rule-based classifier

                                                 Misclassified by BoW  Correctly classified by BoW
Misclassified by the proposed classifier         190                   38
Correctly classified by the proposed classifier  81                    92


McNemar's test: the two-tailed p-value is 0.0001 and chi-squared is 14.824 with 1 degree of freedom; reject the null hypothesis and accept the alternative hypothesis.

The p-value was calculated with McNemar's test with the continuity correction. The two-tailed p-value is 0.0001 and chi-squared is 14.824 with 1 degree of freedom. The small p-value (< 0.05) rejects the null hypothesis and supports the alternative: the performance difference between the two methods is statistically significant.
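The reported statistic can be checked directly from the discordant counts in Table 16; with the continuity correction, the statistic is (|n01 - n10| - 1)^2 / (n01 + n10):

```python
# McNemar's test with continuity correction, using the discordant counts
# from Table 16 (Equation 7 gives the uncorrected form).
n01 = 38   # misclassified by the rule-based classifier, correct under BoW
n10 = 81   # misclassified by BoW, correct under the rule-based classifier
chi2 = (abs(n01 - n10) - 1) ** 2 / (n01 + n10)
print(round(chi2, 3))  # 14.824, matching the reported statistic
```

Only the 119 discordant sentiments enter the statistic; the 190 + 92 sentiments on which the two classifiers agree carry no information about their difference in error rate.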

6.6.2 Performance Difference between Supervised Learning Techniques

Let M1 be the model trained with unigram features (Table 15, Row 1) and M14 the model trained with the selected features (Table 15, Row 14). A subset of sentiments was randomly selected, and these sentiments were classified with both M1 and M14. From the predicted output, McNemar's statistic was calculated to check whether the performance difference between M1 and M14 is significant. The results are shown in Table 17.

Table 17: Significance test measuring the performance difference between the model trained with unigram features and the model trained with selected features

                            Misclassified by M14  Correctly classified by M14
Misclassified by M1         43                    86
Correctly classified by M1  45                    227

McNemar's test: the two-tailed p-value is 0.0003 and chi-squared is 12.83 with 1 degree of freedom; reject the null hypothesis and accept the alternative hypothesis.

In the second test, it was verified whether the performance difference between the model trained with unigram features (Table 15, Row 1) and the model trained with the selected features (Table 15, Row 14) is statistically significant. The process started with the null hypothesis that the performance difference between the two models is not significant; the results (Table 17), with their small p-value, reject this null hypothesis.


6.6.3 Second opinion significantly leads the polarity of sentiment.

A statistical test was performed to check whether the second opinion significantly leads the polarity of the sentiment. From the corpus, sentiments with two opinions were selected. Three steps were applied before performing the significance test: 1) each sentiment was segmented; 2) the polarity of the first and second opinions was determined; and 3) two sets, S0 and S1, were prepared.

S0: the set of all sentiments in which the polarity of the first opinion is the same as the actual sentiment polarity.

S1: the set of all sentiments in which the polarity of the second opinion is the same as the actual sentiment polarity.

The common sentiments (those existing in both sets) were removed from S0 and S1. Let p0 and p1 be the number of elements in S0 and S1, respectively. The null hypothesis was defined as H0: p0 >= p1 and the alternative hypothesis as H1: p1 > p0. The following statistic was used to calculate Z and the p-value.

Equation 8:

$$p = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$$

$$SE = \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$Z = \frac{p_1 - p_2}{SE}$$

The results and calculations are logged in Table 7. The very low p-value rejects the null hypothesis that p0 >= p1.

Table 7: Significance Test

Variable / Detail                                                              Value
p0 = number of sentiments in which the first opinion determines the polarity   80
p1 = number of sentiments in which the second opinion determines the polarity  392
Significance level                                                             0.01
z-value                                                                        17.1
p                                                                              0

The third significance test was performed to determine whether, in a significant number of sentiments, the second sub-opinion leads the orientation of the sentiment. In 80 sentiments the first sub-opinion determined the polarity, and in 392 sentiments the second sub-opinion was the lead opinion. The Z-test (Table 7), with a significance level of 0.01, z-value = 17.1 and p = 0, leads to the conclusion that in a significant number of opinions people tend to conclude the sentiment at the end of the sentence.
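A sketch of the z computation under the reconstructed Equation 8, using the counts from Table 7. Treating both proportions over the same total n = p0 + p1 is an assumption made here, so the resulting z is illustrative and need not reproduce the reported 17.1 exactly:

```python
# Two-proportion z-test sketch of Equation 8 with the Table 7 counts.
from math import sqrt

n = 80 + 392                 # sentiments with a leading sub-opinion
p0, p1 = 80 / n, 392 / n     # first- vs second-opinion proportions
p_pool = (p0 + p1) / 2       # pooled proportion (equal n on both sides)
se = sqrt(p_pool * (1 - p_pool) * (1 / n + 1 / n))
z = (p1 - p0) / se
print(z > 2.33)  # True: significant at the 0.01 level
```

Whatever the exact z value, it lies far beyond the 0.01 critical value of 2.33, which is what supports the conclusion above.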

Performance of English Language Algorithm on Urdu

This experiment executed the algorithm of Mukherjee [33] on the Urdu corpus. In the first step, we translated all of the relational attributes into corresponding Urdu terms. Next, we annotated our dictionary items with the corresponding relations. For example, the word "but" is a relational attribute whose corresponding Urdu translation is "laken"; we annotated all variants of laken with the corresponding relation name, and did the same for all attribute terms. Finally, we implemented the same algorithm within our framework to classify the Urdu sentiments. The results are logged in Table 18; however, they are not very encouraging.


Table 18: Confusion matrix of the English Language Algorithm

Class      TP    TN    FP    FN
positive   294   233   213   104
negative   151   373    67   253

Table 19: Performance of the English Language Algorithm

Precision   Recall   Accuracy   F-measure
63.63       55.62    62.26      59.36
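The averaged figures in Table 19 can be recomputed from the per-class counts in Table 18. Macro-averaging over the two classes is an assumption about the convention used, but it reproduces the reported numbers exactly:

```python
def per_class(tp, tn, fp, fn):
    """Precision, recall, accuracy for one class of the confusion matrix."""
    return (tp / (tp + fp), tp / (tp + fn), (tp + tn) / (tp + tn + fp + fn))

pos = per_class(294, 233, 213, 104)   # positive row of Table 18
neg = per_class(151, 373, 67, 253)    # negative row of Table 18

# Macro-average the two classes, then derive F-measure from macro P and R.
P, R, A = (100 * (a + b) / 2 for a, b in zip(pos, neg))
F = 2 * P * R / (P + R)
print(round(P, 2), round(R, 2), round(A, 2), round(F, 2))
# → 63.63 55.62 62.26 59.36
```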


Chapter 7. Conclusions and Future Work

The proposed solution successfully extracted the discourse information, which consists of the discourse relation and the polarity relation. The first experiment measured the performance (Table 9) of the BoW algorithm (Algorithm 3), which set the baseline for the other two experiments. On all three metrics (precision, recall and accuracy) the algorithm showed very low performance. The second set of experiments evaluated the proposed rule-based algorithm; with the extracted discourse information, it showed significantly better results (Table 11) than the simple BoW algorithm.
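A lexicon-count BoW classifier of the kind used as the baseline can be sketched as follows. The tiny Roman Urdu lexicon is an illustrative assumption, not the thesis dictionary:

```python
# Hypothetical Roman Urdu polarity lexicon (illustration only).
LEXICON = {"acha": +1, "zabardast": +1, "bura": -1, "buri": -1, "bakwas": -1}

def bow_polarity(sentence):
    """Sum word orientations over the whole sentence; ties default to positive."""
    score = sum(LEXICON.get(w, 0) for w in sentence.split())
    return "positive" if score >= 0 else "negative"

# BoW weighs both sub-opinions equally, so a sentiment concluded after
# "laken" ("but") by a negative second sub-opinion is misclassified:
print(bow_polarity("khana acha tha laken service buri thi"))  # → positive
```

The positive and negative words cancel out, which is exactly the failure mode on complex sentiments that motivates the discourse-aware methods.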

The third experiment was performed to answer the question of whether discourse information improves the classification performance of a machine learning algorithm. The traditional unigram (binary), unigram with TF-IDF, bigram and POS features were not sufficient to classify complex sentiments written in Urdu (Table 13, Row 1 and Row 2). Instead of a large number of features, an optimal subset of five features (Table 13, Row 14) was found, which showed the best performance compared to the other combinations.

Two significance tests were performed to verify the results. The first test verified whether the performance difference between the BoW and the proposed rule-based algorithm is significant. The second test confirmed whether the performance difference between the model trained with unigram features (Table 13, m1) and the model trained with the selected features (Table 13, m14) is statistically significant.


The discourse information significantly improved the capabilities of both the rule-based algorithm and the machine learning method to classify complex opinions written in the Urdu language (Figure 11). Both proposed methods performed better than the legacy BoW method, and they are also comparable to each other. The purpose of this study was three-fold: first, to identify the discourse information within sentiments written in Urdu; next, to propose a rule-based algorithm to classify complex opinions; and finally, to determine the optimal feature sets for machine learning algorithms.

It is found that the segmentation of a complex sentiment into smaller discourse units reveals useful information about the structure and rhetoric of the sentiment. The results indicate that the discourse relations which give high priority to the second sub-opinion show better performance than those which give high priority to the first sub-opinion. This observation leads to the conclusion that a large number of sentiments depend on the second sub-opinion of the sentiment. The conclusion was also verified with a significance test.
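The segmentation-and-lead-opinion idea can be sketched as follows: split the sentiment at a connective, then let the second sub-opinion decide the polarity. The connective list and lexicon are illustrative assumptions, not the thesis resources:

```python
# Hypothetical resources for illustration only.
CONNECTIVES = {"laken", "lekin", "magar"}          # "but"-type connectives
LEXICON = {"acha": +1, "zabardast": +1, "buri": -1, "bakwas": -1}

def segment(tokens):
    """Split the sentiment into two sub-opinions at the first connective."""
    for i, t in enumerate(tokens):
        if t in CONNECTIVES:
            return tokens[:i], tokens[i + 1:]
    return tokens, []

def classify(sentence):
    first, second = segment(sentence.split())
    # Rule: for "but"-type relations the second sub-opinion leads.
    lead = second if second else first
    score = sum(LEXICON.get(w, 0) for w in lead)
    return "positive" if score >= 0 else "negative"

print(classify("khana acha tha laken service buri thi"))  # → negative
```

Scoring only the lead segment recovers the negative label that a whole-sentence BoW count misses.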

The manually extracted rules with discourse information significantly improved the performance over the Bag-of-Words based model. The results also reveal that discourse relations alone are not sufficient to classify the sentiment; polarity relations play a significant role along with the discourse relations. A small subset of discourse features was found which showed significantly better performance than the large set of legacy unigram features. It is also concluded that the different legacy feature sets exhibit better performance when combined with the proposed discourse features. It is found that increasing the number of features for model training does not increase the performance at the same rate in all cases; in some cases, the performance decreases as the number of features grows. Finally, a set of five features consisting of both discourse and legacy features was found which shows the best performance.
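The search for a small, high-performing subset can be sketched as greedy forward selection. The scoring function below is a stand-in for the cross-validated model evaluation used in the thesis, with hypothetical feature names and additive gains chosen so that noisy features hurt:

```python
# Greedy forward selection: repeatedly add the feature that most improves the
# score, and stop as soon as no remaining feature helps.
def forward_select(features, score):
    selected, best = [], score([])
    while True:
        rest = [f for f in features if f not in selected]
        if not rest:
            return selected, best
        cand = max(rest, key=lambda f: score(selected + [f]))
        s = score(selected + [cand])
        if s <= best:
            return selected, best
        selected, best = selected + [cand], s

# Stand-in scorer with additive, hypothetical per-feature gains: discourse
# features help, while redundant unigram/bigram features only add noise.
GAIN = {"discourse_rel": 4.0, "polarity_rel": 3.0, "lead_opinion": 2.0,
        "unigram_tfidf": 1.0, "pos_tags": 0.5,
        "unigram_raw": -0.5, "bigram": -1.0}

subset, score = forward_select(list(GAIN), lambda fs: sum(GAIN[f] for f in fs))
print(subset)  # the five helpful features, picked in descending order of gain
```

Under this toy scorer the search stops at a five-feature subset, mirroring the observation that adding further features degrades rather than improves the score.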


The rule-based model does not require any prior training, but its performance is relatively

weaker than that of supervised learning methods. Conversely, the supervised learning method

requires prior training and performs better than the proposed rule-based method. The supervised-learning technique is the best choice when applications have prior training data; however, for applications that lack labelled data, the proposed rule-based technique is the better choice because it outperforms the BoW method.

Statistical tests were performed to check whether the performance difference between the two methods is significant. The results confirm that the discourse information significantly improved the performance of both the BoW-based method and the supervised learning technique for Urdu sentiment classification.

The research contributed a corpus for Urdu sentiment analysis and a dictionary consisting of vocabulary terms, POS tags and word orientations, both of which have been made public. The current research targeted sentiments with two sub-opinions, which works well as long as opinions are short messages such as tweets, forum comments, or Facebook status posts. However, resources such as blogs, reviews, and TV talk shows contain multiple sub-opinions; the proposed technique needs to be extended and tested for such scenarios.

Brand monitoring, political situation assessment, and terrorist behaviour detection are common applications of social media mining. All of these applications are mature for western languages, especially English. Urdu is morphologically complex with diverse grammar, and the language requires more specialized applications to classify complex and ambiguous opinions. To further research in sentiment analysis of the Urdu language, the use of discourse information at the sub-opinion level is necessary, and the results of the current work strongly support this conclusion.


References

[1] J. Wiebe, “Learning subjective adjectives from corpora,” in Association for Advancement of Artificial Intelligence / Innovative Applications of Artificial Intelligence, 2000, pp. 735–740.

[2] V. Hatzivassiloglou and K. R. McKeown, “Towards the automatic identification of

adjectival scales: Clustering adjectives according to meaning,” in Proceedings of the 31st

annual meeting on Association for Computational Linguistics, 1993, pp. 172–182.

[3] T. Nasukawa and J. Yi, “Sentiment analysis: Capturing favorability using natural

language processing,” in Proceedings of the 2nd international conference on Knowledge

capture, 2003, pp. 70–77.

[4] S. Y. Hui and K. H. Yeung, “Challenges in the migration to 4G mobile systems,”

Communications Magazine, Institute of Electrical and Electronics Engineers, vol. 41, no.

12, pp. 54–59, 2003.

[5] B. Liu, “Sentiment analysis and opinion mining,” Synthesis Lectures on Human Language

Technologies, vol. 5, no. 1, pp. 1–167, 2012.

[6] R. G. Gordon and B. F. Grimes, Ethnologue: Languages of the world, vol. 15. Summer

Institute of Linguistics (SIL) international Dallas, TX, 2005.

[7] J. T. Platts, A grammar of the Hindustani or Urdu language. Sang-e-Meel Publications,

2002.

[8] A. Hardie, “Developing a tagset for automated part-of-speech tagging in Urdu,” in Corpus Linguistics 2003, 2003.

[9] Q.-A. Akram, A. Naseer, and S. Hussain, “Assas-Band, an affix-exception-list based Urdu stemmer,” in BCS Information Retrieval Specialist Group (IRSG) Symposium: Future Directions in Information Access, 2009, p. 23.

[10] K. Riaz, “Challenges in Urdu Stemming (A Progress Report),” in BCS Information

Retrieval Specialist Group (IRSG) Symposium: Future Directions in Information Access,

2007, p. 23.

[11] Z. Afraz, A. Muhammad, and A. Martinez-Enriquez, “Sentiment-Annotated Lexicon

Construction For An Urdu Text Based Sentiment Analyzer.,” Pakistan Journal of

Science, vol. 63, no. 4, 2011.

[12] A. Z. Syed, M. Aslam, and A. M. Martinez-Enriquez, “Sentiment analysis of urdu

language: handling phrase-level negation,” in Advances in Artificial Intelligence,

Springer, 2011, pp. 382–393.

[13] A. Z. Syed, M. Aslam, and A. M. Martinez-Enriquez, “Lexicon based sentiment analysis

of Urdu text using SentiUnits,” in Advances in Artificial Intelligence, Springer, 2010, pp.

32–43.

[14] M. Bilal, H. Israr, M. Shahid, and A. Khan, “Sentiment classification of Roman-Urdu opinions using Naïve Bayesian, Decision Tree and KNN classification techniques,” Journal of King Saud University-Computer and Information Sciences, 2015.

[15] S. Mukund and R. K. Srihari, “A vector space model for subjectivity classification in

Urdu aided by co-training,” in Proceedings of the 23rd International Conference on

Computational Linguistics: Posters, 2010, pp. 860–868.


[16] A. Irvine, J. Weese, and C. Callison-Burch, “Processing informal, romanized Pakistani

text messages,” in Proceedings of the Second Workshop on Language in Social Media,

2012, pp. 75–78.

[17] J. M. Wiebe, R. F. Bruce, and T. P. O’Hara, “Development and use of a gold-standard

data set for subjectivity classifications,” in Proceedings of the 37th annual meeting of the

Association for Computational Linguistics on Computational Linguistics, 1999, pp. 246–

253.

[18] T. Wilson, J. Wiebe, and R. Hwa, “Just how mad are you? Finding strong and weak opinion clauses,” in Association for Advancement of Artificial Intelligence / Innovative Applications of Artificial Intelligence, 2004, vol. 4, pp. 761–769.

[19] B. Liu, “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data,” Citado

nas, p. 8, 2006.

[20] N. Jindal and B. Liu, “Identifying comparative sentences in text documents,” in

Proceedings of the 29th annual international ACM Special Interest Group on Information

Retrieval (SIGIR) conference on Research and development in information retrieval,

2006, pp. 244–251.

[21] N. Jindal and B. Liu, “Mining comparative sentences and relations,” in Association for Advancement of Artificial Intelligence, 2006, vol. 22, pp. 1331–1336.

[22] E. Riloff, S. Patwardhan, and J. Wiebe, “Feature subsumption for opinion analysis,” in

Proceedings of the 2006 Conference on Empirical Methods in Natural Language

Processing, 2006, pp. 440–448.

[23] L. Zhang and B. Liu, “Identifying noun product features that imply opinions,” in

Proceedings of the 49th Annual Meeting of the Association for Computational

Linguistics: Human Language Technologies: short papers-Volume 2, 2011, pp. 575–580.

[24] A. Chaudhuri, Emotion and reason in consumer behavior. Routledge, 2012.

[25] W. Parrott, Emotions in social psychology: Essential readings. Psychology Press, 2001.

[26] S. Poria, E. Cambria, G. Winterstein, and G.-B. Huang, “Sentic patterns: Dependency-

based rules for concept-level sentiment analysis,” Knowledge-Based Systems, vol. 69, pp.

45–63, 2014.

[27] C. D. Paice, “An evaluation method for stemming algorithms,” in Proceedings of the 17th

annual international ACM Special Interest Group on Information Retrieval (SIGIR)

conference on Research and development in information retrieval, 1994, pp. 42–50.

[28] E. Cambria and A. Hussain, Sentic computing: Techniques, tools, and applications, vol. 2.

Springer Science & Business Media, 2012.

[29] A. Ortigosa, J. M. Martín, and R. M. Carro, “Sentiment analysis in Facebook and its application to e-learning,” Computers in Human Behavior, vol. 31, pp. 527–541, 2014.

[30] L. Di Caro and M. Grella, “Sentiment analysis via dependency parsing,” Computer

Standards & Interfaces, vol. 35, no. 5, pp. 442–453, 2013.

[31] J. M. Chenlo, A. Hogenboom, and D. E. Losada, “Rhetorical Structure Theory for

polarity estimation: An experimental study,” Data & Knowledge Engineering, vol. 94, pp.

135–147, 2014.

[32] W. C. Mann and S. A. Thompson, “Rhetorical structure theory: Towards a functional theory of text organization,” Text, vol. 8, no. 3, pp. 243–281, 1988.


[33] S. Mukherjee and P. Bhattacharyya, “Sentiment Analysis in Twitter with Lightweight Discourse Analysis,” in International Conference on Computational Linguistics (COLING), 2012, pp. 1847–1864.

[34] T. Sabbah, A. Selamat, M. Ashraf, and T. Herawan, “Effect of thesaurus size on schema

matching quality,” Knowledge-Based Systems, vol. 71, pp. 211–226, 2014.

[35] N. Asher, F. Benamara, and Y. Y. Mathieu, “Appraisal of opinion expressions in

discourse,” Lingvisticæ Investigationes, vol. 32, no. 2, pp. 279–292, 2009.

[36] N. Asher, F. Benamara, Y. Y. Mathieu, and others, “Distilling Opinion in Discourse: A

Preliminary Study.,” in International Conference on Computational Linguistics

(COLING), 2008, pp. 7–10.

[37] S. Inam, M. Shoaib, F. Majeed, and M. I. Sharjeel, “Ontology based query reformulation using rhetorical relations,” International Journal of Computer Science Issues (IJCSI), vol. 9, no. 4, 2012.

[38] M. Shoaib, A. Shah, and A. Vashishta, “A Dynamic Weight Assignment Approach for IR

Systems,” in Information and Communication Technologies, 2005. ICICT 2005. First

International Conference on, 2005, pp. 272–275.

[39] S. K. Khurshid and M. Shoaib, “Rio: rhetorical structure theory based indexing technique for image objects,” International Arab Journal of Information Technology, vol. 10, no. 5, pp. 511–518, 2013.

[40] L. Zhou, B. Li, W. Gao, Z. Wei, and K.-F. Wong, “Unsupervised discovery of discourse

relations for eliminating intra-sentence polarity ambiguities,” in Proceedings of the

Conference on Empirical Methods in Natural Language Processing, 2011, pp. 162–171.

[41] S. Somasundaran, J. Ruppenhofer, and J. Wiebe, “Discourse level opinion relations: An

annotation study,” in Proceedings of the 9th SIGdial Workshop on Discourse and

Dialogue, 2008, pp. 129–137.

[42] S. Somasundaran, G. Namata, L. Getoor, and J. Wiebe, “Opinion graphs for polarity and

discourse classification,” in Proceedings of the 2009 Workshop on Graph-based Methods

for Natural Language Processing, 2009, pp. 66–74.

[43] M. Bilgic, G. M. Namata, and L. Getoor, “Combining collective classification and link prediction,” in Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on, 2007, pp. 381–386.

[44] C. Zirn, M. Niepert, H. Stuckenschmidt, and M. Strube, “Fine-Grained Sentiment

Analysis with Structural Features.,” in International Joint Conference on Natural

Language Processing, 2011, pp. 336–344.

[45] M. Taboada, K. Voll, and J. Brooke, “Extracting sentiment as a function of discourse structure and topicality,” Simon Fraser University School of Computing Science Technical Report, 2008.

[46] L. Barbosa and J. Feng, “Robust sentiment detection on twitter from biased and noisy

data,” in Proceedings of the 23rd International Conference on Computational Linguistics:

Posters, 2010, pp. 36–44.

[47] A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classification using distant supervision,” CS224N Project Report, Stanford University, 2009.


[48] P. D. Turney, “Thumbs up or thumbs down?: semantic orientation applied to

unsupervised classification of reviews,” in Proceedings of the 40th annual meeting on

association for computational linguistics, 2002, pp. 417–424.

[49] E. C. Dragut, C. Yu, P. Sistla, and W. Meng, “Construction of a sentimental word

dictionary,” in Proceedings of the 19th ACM international conference on Information and

knowledge management, 2010, pp. 1761–1764.

[50] W. Peng and D. H. Park, “Generate adjective sentiment dictionary for social media

sentiment analysis using constrained nonnegative matrix factorization,” Urbana, vol. 51,

p. 61801, 2004.

[51] G. Xu, X. Meng, and H. Wang, “Build Chinese emotion lexicons using a graph-based

algorithm and multiple resources,” in Proceedings of the 23rd International Conference

on Computational Linguistics, 2010, pp. 1209–1217.

[52] X. Zhu and Z. Ghahramani, “Learning from labeled and unlabeled data with label

propagation,” Technical Report CMU-CALD-02-107, Carnegie Mellon University, 2002.

[53] X. Ding, B. Liu, and L. Zhang, “Entity discovery and assignment for opinion mining

applications,” in Proceedings of the 15th ACM SIGKDD international conference on

Knowledge discovery and data mining, 2009, pp. 1125–1134.

[54] M. Ganapathibhotla and B. Liu, “Mining opinions in comparative sentences,” in

Proceedings of the 22nd International Conference on Computational Linguistics-Volume

1, 2008, pp. 241–248.

[55] H. Kang, S. J. Yoo, and D. Han, “Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of restaurant reviews,” Expert Systems with Applications, vol. 39, no. 5, pp. 6000–6010, 2012.

[56] J. Ortigosa-Hernández, J. D. Rodríguez, L. Alzate, M. Lucania, I. Inza, and J. A. Lozano, “Approaching Sentiment Analysis by using semi-supervised learning of multi-dimensional classifiers,” Neurocomputing, vol. 92, pp. 98–115, 2012.

[57] X. Bai, “Predicting consumer sentiments from online text,” Decision Support Systems,

vol. 50, no. 4, pp. 732–742, 2011.

[58] C. C. Chen and Y.-D. Tseng, “Quality evaluation of product reviews using an information

quality framework,” Decision Support Systems, vol. 50, no. 4, pp. 755–768, 2011.

[59] M. Van De Camp and A. Van Den Bosch, “The socialist network,” Decision Support

Systems, vol. 53, no. 4, pp. 761–769, 2012.

[60] Y. Hu and W. Li, “Document sentiment classification by exploring description model of

topical terms,” Computer Speech & Language, vol. 25, no. 2, pp. 386–403, 2011.

[61] S. Raaijmakers, K. Truong, and T. Wilson, “Multimodal subjectivity analysis of

multiparty conversation,” in Proceedings of the Conference on Empirical Methods in

Natural Language Processing, 2008, pp. 466–474.

[62] T. Wilson and S. Raaijmakers, “Comparing word, character, and phoneme n-grams for

subjective utterance recognition.,” in International Speech Communication Association,

2008, pp. 1614–1617.

[63] R. E. Schapire and Y. Singer, “BoosTexter: A boosting-based system for text

categorization,” Machine learning, vol. 39, no. 2–3, pp. 135–168, 2000.

[64] D. Clarke, P. Lane, and P. Hender, “Developing robust models for favourability analysis,”

in Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and

Sentiment Analysis, 2011, pp. 44–52.


[65] M.-T. Martín-Valdivia, E. Martínez-Cámara, J.-M. Perea-Ortega, and L. A. Ureña-López, “Sentiment polarity detection in Spanish reviews combining supervised and unsupervised approaches,” Expert Systems with Applications, vol. 40, no. 10, pp. 3934–3942, 2013.

[66] H. Rui, Y. Liu, and A. Whinston, “Whose and what chatter matters? The effect of tweets

on movie sales,” Decision Support Systems, vol. 55, no. 4, pp. 863–870, 2013.

[67] E. Riloff, “Automatically constructing a dictionary for information extraction tasks,” in Association for Advancement of Artificial Intelligence, 1993, pp. 811–816.

[68] L. Lee, “Measures of distributional similarity,” in Proceedings of the 37th annual meeting

of the Association for Computational Linguistics on Computational Linguistics, 1999, pp.

25–32.

[69] P. Pantel, E. Crestan, A. Borkovsky, A.-M. Popescu, and V. Vyas, “Web-scale

distributional similarity and entity set expansion,” in Proceedings of the 2009 Conference

on Empirical Methods in Natural Language Processing: Volume 2-Volume 2, 2009, pp.

938–947.

[70] S. Li, C.-Y. Lin, Y.-I. Song, and Z. Li, “Comparable entity mining from comparative

questions,” Knowledge and Data Engineering, Institute of Electrical and Electronics

Engineers (IEEE) Transactions on, vol. 25, no. 7, pp. 1498–1509, 2013.

[71] B. Liu, W. S. Lee, P. S. Yu, and X. Li, “Partially supervised classification of text

documents,” in International Conference on Machine Learning, 2002, vol. 2, pp. 387–

394.

[72] P. Arora, A. Bakliwal, and V. Varma, “Hindi subjective lexicon generation using

WordNet graph traversal,” International Journal of Computational Linguistics and

Applications, vol. 3, no. 1, pp. 25–39, 2012.

[73] A. Joshi, A. Balamurali, and P. Bhattacharyya, “A fall-back strategy for sentiment

analysis in hindi: a case study,” Proceedings of the 8th International Conference on

Natural Language Processing (ICON), 2010.

[74] N. Mittal, B. Agarwal, G. Chouhan, P. Pareek, and N. Bania, “Discourse Based Sentiment

Analysis for Hindi Reviews,” in Pattern Recognition and Machine Intelligence, Springer,

2013, pp. 720–725.

[75] P. Pandey and S. Govilkar, “A Framework for Sentiment Analysis in Hindi using

HSWN,” International Journal of Computer Applications, vol. 119, no. 19, pp. 23–26,

2015.

[76] M. Saraee and A. Bagheri, “Feature selection methods in Persian sentiment analysis,” in

Natural Language Processing and Information Systems, Springer, 2013, pp. 303–308.

[77] M. Shams, A. Shakery, and H. Faili, “A non-parametric LDA-based induction method for

sentiment analysis,” in Artificial Intelligence and Signal Processing (AISP), 2012 16th

CSI International Symposium on, 2012, pp. 216–221.

[78] A. Selamat, I. M. I. Subroto, and C.-C. Ng, “Arabic script web page language

identification using hybrid-KNN method,” International Journal of Computational

Intelligence and Applications, vol. 8, no. 03, pp. 315–343, 2009.

[79] A. Selamat and C.-C. Ng, “Arabic script web page language identifications using decision

tree neural networks,” Pattern Recognition, vol. 44, no. 1, pp. 133–144, 2011.

[80] Z. Syed, M. Aslam, and A. Martinez-Enriquez, “Adjectival Phrases as the Sentiment

Carriers in the Urdu Text,” Journal of American Science, vol. 7, no. 3, pp. 644–652, 2011.


[81] M. Annett and G. Kondrak, “A comparison of sentiment analysis techniques: Polarizing

movie blogs,” in Advances in artificial intelligence, Springer, 2008, pp. 25–35.

[82] N. Durrani and S. Hussain, “Urdu word segmentation,” in Human Language

Technologies: The 2010 Annual Conference of the North American Chapter of the

Association for Computational Linguistics, 2010, pp. 528–536.

[83] T. Ahmed, “Roman to Urdu transliteration using wordlist,” in Proceedings of the

Conference on Language and Technology, 2009.

[84] M. K. Malik, T. Ahmed, S. Sulger, T. Bögel, A. Gulzar, G. Raza, S. Hussain, and M.

Butt, “Transliterating Urdu for a Broad-Coverage Urdu/Hindi LFG Grammar.,” in

Language Resources and Evaluation Conference (LREC), 2010.

[85] S. Hussain and M. Afzal, “Urdu computing standards: Urdu zabta takhti (uzt) 1.01,” in

Multi Topic Conference, 2001. Institute of Electrical and Electronics Engineers (IEEE)

INMIC 2001. Technology for the 21st Century. Proceedings. IEEE International, 2001,

pp. 223–228.

[86] K. R. Beesley and L. Karttunen, “Finite-state morphology: Xerox tools and techniques,”

Center for the Study of Language and Information, Stanford, 2003.

[87] D. Usman Afzal, N. I. Rao, and A. M. Sheri, “Adaptive Transliteration Based on Cross-Script Trie Generation: A Case of Roman-Urdu,” 2009.

[88] F. Iqbal, A. Latif, N. Kanwal, and T. Altaf, “Conversion of urdu nastaliq to roman urdu

using OCR,” in Interaction Sciences (ICIS), 2011 4th International Conference on, 2011,

pp. 19–22.

[89] T. Ahmed and A. Hautli, “Developing a Basic Lexical Resource for Urdu Using Hindi

WordNet,” Proceedings of Conference on Language and Technology, Islamabad,

Pakistan, 2010.

[90] S. Mukund and R. K. Srihari, “Analyzing Urdu social media for sentiments using transfer

learning with controlled translations,” in Proceedings of the Second Workshop on

Language in Social Media, 2012, pp. 1–8.

[91] I. Javed and H. Afzal, “Opinion Analysis of Bi-Lingual Event Data from Social

Networks,” 2013.

[92] M. Thelwall, “Heart and soul: Sentiment strength detection in the social web with

sentistrength,” Proceedings of the CyberEmotions, pp. 1–14, 2013.

[93] R. S. McGregor, The Oxford Hindi-English Dictionary. Oxford University Press, USA,

1993.

[94] S. Awais, “Opinion within Opinion: Segmentation Approach for Urdu Sentiment,” The International Arab Journal of Information Technology, in press, 2016.

[95] D. N. Card and W. W. Agresti, “Measuring software design complexity,” Journal of

Systems and Software, vol. 8, no. 3, pp. 185–197, 1988.

[96] T. J. McCabe, “A complexity measure,” Software Engineering, Institute of Electrical and

Electronics Engineers (IEEE) Transactions on, no. 4, pp. 308–320, 1976.

[97] M. Rafique, Urdu Qawaid-o-Insha Pardazi, vol. 2. Ferozsons.

[98] D. Meyer, K. Hornik, and I. Feinerer, “Text mining infrastructure in R,” Journal of

Statistical Software, vol. 25, no. 5, pp. 1–54, 2008.

[99] S. Janitza, C. Strobl, and A.-L. Boulesteix, “An AUC-based permutation variable

importance measure for random forests,” BioMed Central (BMC) bioinformatics, vol. 14,

no. 1, p. 119, 2013.


[100] G. H. John, R. Kohavi, K. Pfleger, and others, “Irrelevant features and the subset selection

problem,” in Machine Learning: Proceedings of the Eleventh International Conference,

1994, pp. 121–129.

[101] M. Kuhn, “Variable selection using the caret package,” URL: http://cran.cermin.lipi.go.id/web/packages/caret/vignettes/caretSelection.pdf, 2012.

[102] J. Cohen and others, “A coefficient of agreement for nominal scales,” Educational and

psychological measurement, vol. 20, no. 1, pp. 37–46, 1960.

[103] G. Trunk, “A problem of dimensionality: A simple example,” Pattern Analysis and

Machine Intelligence, Institute of Electrical and Electronics Engineers (IEEE)

Transactions on, no. 3, pp. 306–307, 1979.

[104] M. Fernández-Delgado, E. Cernadas, S. Barro, and D. Amorim, “Do we need hundreds of

classifiers to solve real world classification problems?,” The Journal of Machine Learning

Research, vol. 15, no. 1, pp. 3133–3181, 2014.

[105] M. Kuhn, “Building predictive models in R using the caret package,” Journal of

Statistical Software, vol. 28, no. 5, pp. 1–26, 2008.

[106] A. Hassan, “Urdu Sentiment Corpus,” GitHub, 2016. [Online]. Available: https://github.com/resonotech/sentimentanalysis.git. [Accessed: 18-Jan-2016].

[107] M. Friendly, “A fourfold display for 2 by 2 by k tables,” Technical Report 217,

Psychology Department, York University, 1994.

[108] B. S. Everitt, The analysis of contingency tables. Chemical Rubber Company (CRC)

Press, 1992.

[109] S. Siegel, “Nonparametric statistics for the behavioral sciences.,” 1956.


Appendixes

List of Figures

Figure 1: Architecture of the Proposed Solution
Figure 2: Partial View of Sparse Matrix of Unigram Features
Figure 3: Most Important Unigram Features
Figure 4: Discourse Features Correlation Matrix
Figure 5: Main Key Features
Figure 6: Optimal Number of Features
Figure 7: Annotator Interface for Data Dictionary
Figure 8: Distribution of Polarity Relation and Discourse Relation within the Dataset
Figure 9: Precision-Recall curve of each rule (upper right position shows better performance than lower left position)
Figure 10: Fourfold display for each discourse relation. The area of each shaded quadrant shows the frequencies, standardized to equate the margins for the assigned class and the actual class. The integer value in each quadrant represents the raw frequency; for example, the upper left quadrant shows the number of correctly classified positive sentiments. The diagonal (upper left and lower right) shows the correct classification cases, and the off-diagonal (upper right and lower left) shows the incorrect classifications. When the four quadrant circles do not overlap each other to make a complete circle, the relation indicates a significant association between assigned and actual polarity.
Figure 11: Comparison between the simple BoW and the proposed method


List of Tables

Table 1: List of haroof Atafs. ........................................................................................................ 33

Table 2: List of Stop Words .......................................................................................................... 34

Table 3: Possible Polarity Relations between Sub-Opinions ........................................................ 44

Table 4: List of Segmentation Words: The Huroof Ataf which divide the sentiment into two sub-

opinions and also identify the discourse relation .......................................................................... 48

Table 5: Leading Sub-Opinion of Sentiment for the Discourse Relation and the Corresponding

Polarity Relation ........................................................................................................................... 53

Table 6: List of most important unigram features with their POS and Polarity information ....... 65

Table 7: Feature Set Performance with Increasing Number of Features ...................................... 71

Table 8: Most important features of each set ................................................................................ 73

Table 9: Corpus Detail .................................................................................................................. 75

Table 10: Confusion matrix of simple BoW (without discourse information) ............................. 89

Table 11: Performance of simple BoW ........................................................................................ 89

Table 12: Confusion matrix of Rule-Based Classifier .................................................................. 90

Table 13: Performance of the proposed Rule-Based classifier ..................................................... 90

Table 14: Output of Rule Based Classifier that shows Discourse Relation and Polarity Relation90

Table 15: Performance of the Supervised Learning Algorithms .................................................. 96

Table 16: Significance test to measure the performance difference between BoW and the Proposed

Rule-Based Classifier.................................................................................................................. 100

Table 17: Significance test to measure the performance difference between models trained with

unigram features and models trained with selected features ........................................................ 101


List of Algorithms

Algorithm 1: Sentiment Segmentation.......................................................................................... 40

Algorithm 2: Sentiment Orientation Score ................................................................................... 42

Algorithm 3: Identification of Polarity Relation ...................................................................... 45

Algorithm 4: Discourse Relation Finder ....................................................................................... 52

Algorithm 5: Polarity Assignment to Sentiment ........................................................................... 62

List of FlowCharts

FlowChart 1: Sentiment Segmentation ......................................................................................... 41

FlowChart 2: Sentiment Orientation Score ................................................................................... 43

FlowChart 3: Polarity Relation Algorithm ................................................................................... 46

FlowChart 4: Discourse Relation Finder ...................................................................................... 51

FlowChart 5: Polarity Assignment to Sentiment .......................................................................... 63

List of Examples

Example 1: ( ہنے بہدارلڑکا اور اچھا کیا وہ ) ................................................................................................... 4

Example 2: ( سکے جتوا ںینہ چیم کو پاکستان وہ اگر دہیفا ایک کا اس مگر ہے الرونڈر فٹ اور اچھا کیا یدیآفر ) ................................... 5

Example 3: اید کر کمال نے اس ںیم ٹور یدبئ مرتبہ اس نیہےلک رہتا بناتا نشانہ کا دیتنق ایڈیم کو میٹ کرکٹ پاکستان اگرچہ) )........................ 5

Example 4: کا قیرف یبھائ , .................................................................................................................... 29

Example 5: تھے گئے اسکول دونوں یعل اور آصف ............................................................................................ 29

Example 6: ںیروکانہ پر تھا ایآ اظہر ........................................................................................................... 30

Example 7: ہے پڑا کام اور تو یابھ ............................................................................................................. 31

Example 8: کرو کام اور آو یجلد تم ............................................................................................................. 31

Example 9: یگئ بھج پھر یلگ اگ .............................................................................................................. 31


Example 10: یک سے اس کو ہم مانو نا اںی مانو تم ............................................................................................. 31

Example 11: ںییہ برابر اسبیل ھمارے آو نا اںی آو تم .......................................................................................... 31

Example 12: یمان نا کیا یریم نے اس مگر ایسمجھا بہت کو اس نے ںیم ..................................................................... 32

Example 13: ںینہ مانتا وہ پر ہ ٹھک تو بات ................................................................................................... 32

Example 14: تھے ںینہ تم مگر تھے ماجود لوگ سب ........................................................................................... 32

Example 15: ںیہ خوش سے خبر اس لوگ سب عالوہ تمہارے .................................................................................. 32

Example 16: ہوں مصروف ںیم ونکہیک یسکت آ ںینہ ںیم ....................................................................................... 33

Example 17: ہے برکت ںیم محنت کے ےیل کرواس محنت ...................................................................................... 33

Example 18: اور کین وہ ہے لڑکا بہادر ...................................................................................................... 34

Example 19: دو رکھ پر زیم کتاب ہی ........................................................................................................... 34

Example 20: ہے ںینہ کیٹھ یبھ یبٹر اور کرتا ںینہ کام کیٹھ مرہیک کا اس .................................................................... 34

Example 21: گے ھو ابیکام تو گے کرو محنت اگر ............................................................................................. 35

Example 22: یہوت نا حراب ٹائمنگ یٹریب یک اس اگر ہوتا اچھا بہت موبائل ہی ................................................................... 35

Example 23: جتواسکتا ںینہ چیم کو پاکستان مگر ہنے ئریپل اچھا کیا یدیآفر ................................................................... 38

Example 24: ہے ںینہ لڑکا اچھا کیا وہ ) ...................................................................................................... 11

Example 25: ( ہے خراب یبھ یٹریب اور کرتا ںینہ کام کیٹھ رہیکم کا اس ) ...................................................................... 47

Example 26: ( ہے پسند بہت ںیم وںیگاڑ یسار مجھے مگر یہےمہنگ یھونڈاسٹ ) .............................................................. 48

Example 27: ( کرتے ںینہ کام ٹھک ٹیس یباق سوا کے اس ہے بہتر ٹیس ہی ) ................................................................... 49

Example 28: (ہے یبھ ارایپ اور مہنگا ہےتو چرمظبوطیفرن ہی اگر) ............................................................................. 50

Example 29: ( ہے جھوٹا وہ ونکہیک ہے ںینہ انسان اچھا بابر ) ................................................................................ 50

Example 30: ( 54 ...................................................... ( ہنے اچھا بہت یبھ رہیکم کا اس اور ہے یاچھ بہت سمارٹنسس یک موبائل

Example 31: (ہے یرہ کر کام اچھا بہت یٹریب یک اس اور تھا دایخر فون ہی پہلے نےچارماہ ںیم) ................................................ 55

Example 32: ( ہے فضول یہ یاتن سے اندر نیلک ہے شاندار بہت ںیم کھنےید 55 ...........................................................(ھونڈا


Example 33: ( 56 ....................... ( ہے یکرت فراہم سہولت کیا لےیک نیزم مال نیلک ہے کٹیپروج مہنگا کیا بس ٹرویم یک گورنمنٹ پنجاب

Example 34: (samsung ںیہ ںینہ اچھے ماڈلز این نیلک تھا نیبہتر ماڈل پچھال کا ) .......................................................... 56

Example 35: (lemo ہے یت کھا آئل ادہیز بہت ہی مگر کارہے یلگژر کیا )................................................................... 57

Example 36: ( یلگ یک کمال بہت کر پہن مگر یتھ ںینہ تو پسند مجھے شرٹ یٹ ہی ) ............................................................ 57

Example 37: (sunsilk وہ کے ںیہ کاریب شمپپو یباق عالا ) ................................................................................... 58

Example 38: ( وہ کے آنے نظر اچھا ںیم کار اس) ہے ںینہ یکوالٹ اور یکو عالا ................................................................. 59

Example 39: ( ہے یمہنگ تو ےیاسل ہے یاچھ ڈیسپ یک یراریف ) ................................................................................ 60


List of Haroof Ataf

Words          Type      Words     Type      Words          Type    Words     Type
Aur            Wasal     Laken     Astadrak  tub            Jaza    Yaan Tu   Tardeed
Phr            Wasal     Per       Astadrak  aur            Jaza    na        Tardeed
Naaz           Wasal     Albata    Astadrak  ess laya       Jaza    Magar     Astadrak
Yaan           Wasal     Alla      Astashna  Kyn ka         Elaat   chonkay   Shart
KER            Wasal     Sawa      Astashna  Ess laya ka    Elaat   tu        Jaza
KA             Wasal     Alawa     Astashna  ess wastay ka  Elaat   so        Jaza
YEA            Wasal     Bagar     Astashna  taa ka         Elaat   KA        Tardeed
Yaan           Tardeed   Agar      Shart     choon ka       Elaat   Jubtak    Shart
Khawa          Tardeed   Jo        Shart     bana           Elaat   ki waja   Elaat
Chaho          Tardeed   Agarcha   Shart     bareen         Elaat   KA        Wasal
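The list above pairs each Haroof Ataf with a discourse-relation type. One way such a lookup could drive the thesis's sentiment segmentation into sub-opinions can be sketched as follows; the dictionary covers only a few single-word entries from the list, is an illustrative assumption rather than the thesis's actual data structure, and the test sentence paraphrases Example 23 in Roman Urdu:

```python
# Hypothetical lookup from a few Haroof Ataf to their discourse-relation
# type, taken from the list above (many entries are context-dependent;
# this subset keeps only unambiguous single-token words for illustration).
HAROOF_ATAF = {
    "aur": "Wasal",       # conjunction
    "laken": "Astadrak",  # contrast
    "magar": "Astadrak",  # contrast
    "agar": "Shart",      # condition
}

def segment(tokens):
    """Split a Roman Urdu token list into sub-opinions at each Haroof Ataf,
    recording the discourse relation introduced by the segmentation word."""
    segments, relations, current = [], [], []
    for tok in tokens:
        if tok.lower() in HAROOF_ATAF:
            segments.append(current)
            relations.append(HAROOF_ATAF[tok.lower()])
            current = []
        else:
            current.append(tok)
    segments.append(current)
    return segments, relations

subs, rels = segment("Afridi acha player hai magar match nahi jitwa sakta".split())
# Two sub-opinions linked by an Astadrak (contrast) relation.
```

Here "magar" splits the sentiment into a positive sub-opinion ("Afridi acha player hai") and a negative one ("match nahi jitwa sakta"), tagged with the Astadrak relation.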


INDEX

Confusion matrix, 81, 82

Precision-Recall (PR) curve, 83

Astadrak Relation, 53

Astashna Relation, 54

Backward Negations, 46

Complex Sentiments, 8

Correlation between discourse features, 70

Dictionary, 7, 45

Discourse Features, 70

Feature Identification, 67

Feature Selection, 72

Forward Negations, 46

Fourfold, 86

Haroof Astadrak, 34

Haroof Astashna, 35

Haroof ataf, 32

Haroof Elaat, 35

Haroof fajya, 33

Haroof jars, 32

Haroof Shart-u-Jaza, 35

Haroof tahsees, 32

Haroof tardeed, 34

Haroof Wasl, 33

Most important Features, 72

Optimal Number of Features, 73

Orientation, 45

Part of Speech (POS), 71

Polarity function, 45

Polarity Score, 70

POS with Polarity, 72

Rule 1a, Rule1b and Rule1c, 59

RULE 2, 61

Rule 3a, 63

Rule 3b, 63

Rule1a, 58

Rule1b, 58

Rule1c, 59

Rule2, 59

Rule3a, 59

Rule3b, 59

Rule4a, 59

Rule4b, 59

Rules 4a, 64

Score, 45

Segmentation Word, 40

Sentiment, 6, 45

sentiment polarity, 6

Shart-u-Jaza Relation, 54

Sub-opinions, 41

Supervised Learning Classifier, 87

Unigram, 68

Wasal Relation, 52