informatics tools, ai models and methods used for

10
1. Introduction Quality assurance is currently realized by means of a process approach based on the model of a quality management system [1]. It describes the interaction of the company and the customer during the process of product production and consumption. To correct the parameters of product quality in order to improve it for the customer, the models include feedback. For companies, one aspect of feedback during the process of quality management is information about the level of customer satisfaction, expressed in the form of customer reviews of the product quality. That is why customer satisfaction is the key information in quality management that influences decision-making. To collect data and to evaluate customer satisfaction, the International Quality Standard ISO 10004 recommends using the following methods: personal and phone interviews, discussion groups, mail surveys (postal questionnaires), online research and survey (questionnaire survey) [2]. However, these methods of collecting and analyzing customer opinions show a number of drawbacks. A general drawback of the recommended methods is the need for a large amount of manual work: preparing questions, creating a respondent database, mailing questionnaires and collecting results, conducting personal interviews, preparing a report based on the results. All this increases the research costs. Due to their discreteness these methods do not allow for the continuous monitoring of customer satisfaction. For this reason, the data analysis is limited to one time period and does not give an insight into the trends and dynamics of customer satisfaction. This also has a negative influence on the speed of managerial decision making, which depends on the arrival rate of up-to-date information about customer opinions. Existing scales of customer satisfaction and their subjectivity perception raise questions. Values of customer satisfaction expressed in the form of abstract satisfaction indices make it difficult to understand, compare and interpret the results. Methods of analysis of data collected through the recommended ISO 10004 procedures permit only the detection of linear dependencies. To increase the effectiveness of product quality management, we suggest approaching the research of customer satisfaction through the use of Informatics, as AI technologies. Applying Text Mining tools for analyzing customers’ reviews posted on the Internet is not novel. There are many studies concerning models and methods for data collection, sentiment analysis and information extraction. Recent studies show acceptable accuracy of methods for sentiment classification. Gräbner Studies in Informatics and Control, Vol. 24, No. 3, September 2015 http://www.sic.ici.ro 261 Informatics Tools, AI Models and Methods Used for Automatic Analysis of Customer Satisfaction George KOVÁCS 1 , Diana BOGDANOVA 2 , Nafissa YUSSUPOVA 2 , Maxim BOYKO 2 1 Computer and Automation Research Institute, Kende u. 13-17, Budapest, 1111, Hungary, [email protected] 2 Ufa State Aviation Technical University, K. Marx 12, Ufa, 450000, Russia, [email protected], [email protected], [email protected] Abstract: Customer satisfaction is getting more and more importance world-wide. Informatics tools and methods are used to research customer satisfaction based on a detailed analysis of consumer reviews. The examined reviews are written in natural languages and some Artificial Intelligence (AI) techniques such as Text Mining, Aspect Sentiment Analysis, Data Mining and Machine Learning are used for the study. As input for running the investigations, we use different internet resources in which the accumulated customer reviews are available. These are for example yelp.com, tripadviser.com and tophotels.ru, etc. To see and show the efficacy of the proposed approach, we have carried out experiments on hotel client satisfaction. The results have proven the effectiveness of the proposed approach to decision support in product quality management and support applying them instead of traditional methods of qualitative and quantitative research of customer satisfaction. Keywords: quality management; customer satisfaction research; decision support system; sentiment analysis.

Upload: others

Post on 21-May-2022

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Informatics Tools, AI Models and Methods Used for

1. Introduction

Quality assurance is currently realized bymeans of a process approach based on themodel of a quality management system [1]. Itdescribes the interaction of the company andthe customer during the process of productproduction and consumption. To correct theparameters of product quality in order toimprove it for the customer, the models includefeedback. For companies, one aspect offeedback during the process of qualitymanagement is information about the level ofcustomer satisfaction, expressed in the form ofcustomer reviews of the product quality. That iswhy customer satisfaction is the keyinformation in quality management thatinfluences decision-making.

To collect data and to evaluate customersatisfaction, the International Quality StandardISO 10004 recommends using the followingmethods: personal and phone interviews,discussion groups, mail surveys (postalquestionnaires), online research and survey(questionnaire survey) [2]. However, thesemethods of collecting and analyzing customeropinions show a number of drawbacks.

A general drawback of the recommendedmethods is the need for a large amount ofmanual work: preparing questions, creating arespondent database, mailing questionnaires and

collecting results, conducting personalinterviews, preparing a report based on theresults. All this increases the research costs. Dueto their discreteness these methods do not allowfor the continuous monitoring of customersatisfaction. For this reason, the data analysis islimited to one time period and does not give aninsight into the trends and dynamics of customersatisfaction. This also has a negative influenceon the speed of managerial decision making,which depends on the arrival rate of up-to-dateinformation about customer opinions.

Existing scales of customer satisfaction and theirsubjectivity perception raise questions. Values ofcustomer satisfaction expressed in the form ofabstract satisfaction indices make it difficult tounderstand, compare and interpret the results.Methods of analysis of data collected throughthe recommended ISO 10004 procedures permitonly the detection of linear dependencies.

To increase the effectiveness of product qualitymanagement, we suggest approaching theresearch of customer satisfaction through theuse of Informatics, as AI technologies.Applying Text Mining tools for analyzingcustomers’ reviews posted on the Internet is notnovel. There are many studies concerningmodels and methods for data collection,sentiment analysis and information extraction.Recent studies show acceptable accuracy ofmethods for sentiment classification. Gräbner

Studies in Informatics and Control, Vol. 24, No. 3, September 2015 http://www.sic.ici.ro 261

Informatics Tools, AI Models and Methods Used forAutomatic Analysis of Customer Satisfaction

George KOVÁCS1, Diana BOGDANOVA2, Nafissa YUSSUPOVA2, Maxim BOYKO2

1 Computer and Automation Research Institute, Kende u. 13-17, Budapest, 1111, Hungary, [email protected]

2 Ufa State Aviation Technical University, K. Marx 12, Ufa, 450000, Russia, [email protected], [email protected], [email protected]

Abstract: Customer satisfaction is getting more and more importance world-wide. Informatics tools and methods areused to research customer satisfaction based on a detailed analysis of consumer reviews. The examined reviews arewritten in natural languages and some Artificial Intelligence (AI) techniques such as Text Mining, Aspect SentimentAnalysis, Data Mining and Machine Learning are used for the study. As input for running the investigations, we usedifferent internet resources in which the accumulated customer reviews are available. These are for example yelp.com,tripadviser.com and tophotels.ru, etc. To see and show the efficacy of the proposed approach, we have carried outexperiments on hotel client satisfaction. The results have proven the effectiveness of the proposed approach to decisionsupport in product quality management and support applying them instead of traditional methods of qualitative andquantitative research of customer satisfaction.

Keywords: quality management; customer satisfaction research; decision support system; sentiment analysis.

Page 2: Informatics Tools, AI Models and Methods Used for

et al. [3] proposed a system that performs thesentiment classification of customer reviews onhotels. The precision values are 84% forpositive and 92% for negative reviews.Lexicon-based method [4] allowed the correctclassification of reviews with a probability ofabout 90%. These achievements makesentiment analysis applicable for an applicationon quality management and customersatisfaction research.

Jo and Oh [5] and Lu et al. [6] considered theproblems of automatically discoveringproducts’ aspects and sentiments estimation forthese aspects, which are evaluated in reviews.For solving these problems, they suggestedmethods based on Latent Dirichlet Allocation[7] and its modifications.

The main drawback of most social monitoringsystems and frameworks for automatic analysisof reviews is that they can provide entirely onlya quantitative survey of customer reviews, i.e.,they can provide measurement of the degree ofcustomer satisfaction with a product and itsaspects, sometimes with a model [9].Qualitative survey were usually onlyconducting the extraction of products’ aspects.However, estimation of the significance of eachproducts’ aspects for the customer is missed.The information about products’ aspects thatinfluence customers’ satisfaction and relativeimportance of products’ aspects for thecustomers is missing, as well as an insight intocustomer expectations and perceptions.

The most related work to this problem is [8]. Itis dedicated to the topic of aspect ranking,which aims to automatically identify importantaspects of product from online consumerreviews. Most proposals used a probabilisticmodel with a large number of parameters thatlead to low robustness of the model. Totalweighting values of aspects are calculated asthe average of the weighting values by eachreview. Finally, significance values of aspectsare estimated independently of sentiments ofopinions. In real life we can speak about bad“signal connection”, in a review, but we usuallyomit comments int he case of good “signalconnection”, as it should be caused by the

phone. In our paper, we estimate significancevalues of aspects in accordance with theirpositive and negative sentiments.

In this paper, for qualitative survey is used anovel approach based on transformation resultsof sentiment analysis and aspect-based

sentiment analysis, such as sentiment labels ofreviews and mentions about product’s aspectsin reviews, into boolean data. After that,boolean data is processed with a data miningtool – decision tree. Qualitative survey aims toidentify how the sentiment of reviews dependson the sentiment of different products’ aspects.In other words, how overall customersatisfaction with product depends on thecustomer satisfaction with a product’s aspects.Decision tree performs this aim and identifieslatent relations between the sentiment ofreviews and sentiment of a product’s aspects.Also using the decision tree allows to estimatethe significance of product’s aspects for thecustomers. The output of the qualitative surveycontains significant values of the product’saspects for customers, and identifies latentrelations between satisfaction with the productand satisfaction with each product’s aspect.These were produced as rules extracted by thedecision tree. The availability of bothquantitative and qualitative surveys allowsrealizing Intelligent Decision Support Systemfor Quality Management in accordance withquality standard ISO 10004.

2. Quality Management –Customer Satisfaction

Figure 1 represents the algorithm of thesuggested approach to quality managementbased on research into customer satisfactionusing Artificial Intelligence applications. Itconsists of four main stages. The first stageincludes the collection of reviews from Internetresources, data cleansing and loading data intothe database. The second stage comprises theprocessing and analysis of the collectedreviews. It includes marking reviews by theiremotional response, i.e., sentiment (forexample, negative and positive), identifyingproduct aspects, and evaluating the sentimentof the comments on the separate aspects.Following the stage of data processing utilizingvisualization tools, quantitative research iscarried out. A qualitative research of customersatisfaction is undertaken by means of buildingmodels based on decision trees, where thereview's sentiment serves as a dependentvariable, and sentiment comments on separateproduct aspects as independent variables.Managerial decision development and makingis carried out on the basis of these four stages.

http://www.sic.ici.ro Studies in Informatics and Control, Vol. 24, No. 3, September 2015262

Page 3: Informatics Tools, AI Models and Methods Used for

3. Applicable AI Means

3.1 Collecting important data

Nowadays there are a large number of Internetresources where users can leave their opinionsabout goods and services. The most popularexamples are tophotels.ru (635,000 reviews),yelp.com (53 million reviews), tripadvisor.com(travels, 130 million reviews). Similarresources continue to gain popularity. Theiradvantage as a source of information forsatisfaction evaluation lies in their purpose –the accumulation of customer reviews. Asopposed to social network services, the web

pages of review sources use XML thatdetermines the structure typical for a review.Such a structure includes separate blocks withthe name of a product or company and areview, and other blocks with additionalinformation. In the case of several inputsources, information/data should be managedby up-to-date tools, as e.g. cloud computing.Therefore, all reviews are clearly identified inrelation to the review object. It significantlysimplifies the process of data collection andexcludes the problem of key word ambiguity.One further advantage is that many of suchresources monitor the reviews and check theobjectivity of the authors.

There are two main types of collecting Internetdata on customer reviews: 1) by using API(application programming interface) and 2) byweb parsing. API is a set of ready-to-use tools –classes, procedures, functions – provided by theapplication (Internet resource) for use in anexternal software product. Unfortunately, onlyfew resources that accumulate reviews havetheir own API. In this case, to collect reviewswe can use the second method for datacollection – web parsing. Web parsing is aprocess of automated analysis and contentcollection from xml-pages of any Internetresource using special programs or script. Inthis paper is used the second method for datacollection – web data extraction. It is a processof automated content collection from HTML-pages of any Internet resource using specialprograms or script. Related work is presentedin [11].

3.2 Analysis of sentiments

After the data has been collected and cleaned,we can start their processing with the help ofText Mining tools. Sentiment Analysis is used toevaluate the author's product satisfaction.Sentiment stands for the emotional evaluation ofan author's opinion in respect to the object that isreferred to in the text. We can distinguish threemain approaches to Sentiment Analysis: 1)linguistic, 2) statistical, and 3) combined. Thelinguistic approach is based on using rules andsentiment vocabulary. It is quite time-consumingdue to the need of compiling sentimentvocabularies, patterns and making rules foridentifying sentiments. But the main drawbackof the approach is the impossibility of obtaininga quantitative evaluation of the sentiment. Thestatistical approach is based on the methods ofsupervised and non-supervised machine

Studies in Informatics and Control, Vol. 24, No. 3, September 2015 http://www.sic.ici.ro 263

Figure 1. The algorithm of AI based qualitymanagement of customer satisfaction.

Page 4: Informatics Tools, AI Models and Methods Used for

learning. The combined approach refers to acombined use of the first two approaches.

We use the methods of supervised machinelearning: Bayesian classification and SupportVector Machines. Software implementation issimple, and does not require generatinglinguistic analyzers or sentiment vocabularies.Text sentiment evaluation can be expressedquantitatively. To apply these methods, atraining sample was created. To describe anattribute space, vector representation of reviewtexts was used with the help of the bag-of-words model. Bit vectors - presence or absenceof the word in the review text, and frequencyvectors – the number of times that a given wordappears in the text of the review, served asattributes. Lemmatization, a procedure ofreducing all the words of the review to theirbasic forms, was also used. More details can befound in [12].

3.3 Aspects of sentiment analysisSentiment Analysis of reviews allows toevaluate general customer product or companysatisfaction. However, it does not make clearwhat exactly the author of the review likes andwhat not. To answer this question, it isnecessary to perform an Aspect-basedSentiment Analysis. An aspect meanscharacteristics, attributes, qualities, propertiesthat characterize the product, for example, aphone battery or delivery period, etc. However,one sentiment object can have a great numberof aspects. Furthermore, aspects in the text canbe expressed by synonyms (battery andaccumulator). In such cases it is useful tocombine aspects into aspect groups.

An Aspect-based Sentiment Analysis of areview is a more difficult task and consists oftwo stages – identifying aspects anddetermining the sentiment of the comment onthem. To complete the task of the Aspect-basedSentiment Analysis, a simple and effectivealgorithm has been developed (see Figure 2).

A frequency vocabulary (based on the corpus)that helps to compare the obtained frequencieswith word frequencies is used to identifyaspects. The nouns with maximum frequencydeviations are candidates for inclusion intoaspect groups. Division of the noun set intoaspect groups was carried out manually. Weshould note that if a sentence includes nounsfrom several aspect groups, then it will appearin each of them. The results of Sentiment

Analysis and Aspect Sentiment Analysis can berepresented in the form of text variables Obj=(Revi , Sent i , Neg i

1 , ... , Neg ij , Posi

1 , ... , Posij) ,

where Obj is a sentiment object or a product,Revi the text of the i review, Senti the sentimentof i review, Negi

j the negative sentences aboutthe j aspect in the i review, Posi

j the positivesentences about the j aspect in the i review, ithe review number, j the aspect group number.

3.4 A well established way:decision tree

The following section focuses on an algorithmof the processing of data obtained with help ofSentiment Analysis and Aspect-basedSentiment Analysis. The task of the developedalgorithm is the mining of data that can be usedfor decision support in product qualitymanagement. To realize this algorithm, we usethe Data Mining method, i.e. the decision tree,since this tool can be easily understood andresults can be clearly interpreted; it also canexplain situations by means of Boolean logic.

http://www.sic.ici.ro Studies in Informatics and Control, Vol. 24, No. 3, September 2015264

Figure 2. Aspect-based Sentiment Analysis.

Page 5: Informatics Tools, AI Models and Methods Used for

The algorithm of processing of data obtainedby means of Sentiment Analysis is presented onFigure 3.

Figure 3. Algorithm of data mining.

The algorithm we have described allows us tounderstand which sentiment comments onproduct aspects influence the whole textsentiment or, in other words, what productaspects influence customer satisfaction and inwhat way. Our decision tree model allows us toconsider the influence not only of separatesentiment comments on aspects but also of theirmutual presence (or absence) in the text oncustomer satisfaction. The decision tree modelalso enables us to detect the most significantproduct aspects that are essential for thecustomer. The logical constructions (calledrules) that we have obtained can be expressedboth in the form of Boolean functions in adisjunctive normal form and in natural language.

A decision tree model can help to predictsentiment in dependence on various inputtingaspect comments of different sentiments. Infact, it makes it possible to evaluateexperimentally customer satisfaction independence on satisfaction with differentproduct attributes. As the final result, predictionand analysis of the influence of differentinputting variants on customer satisfactionallows us to distribute the company’s budgeteffectively to maintain a high product quality.

As a measure of the customer satisfaction withproduct is used a ratio of positive reviews to thesum of positive and negative reviews. The score

of customer satisfaction CS by product iscalculated by (1):

(1)

where Zpos – number of positive reviews,Zneg – number of negative reviews.

As a measure of the customer satisfaction withproduct’s aspect groups is used a ratio ofpositive sentences with mentions of a product’saspect to the sum of positive and negativesentences with mentions of a product’s aspect.The score of customer satisfaction csj with jproduct’s aspect group is calculated by (2):

(2)

where – number of positive sentencescontaining mention about the j product’s aspectgroup, – number of negative commentscontaining mention about the j product’s aspectgroup. Unlike indices in [2] (which oftenrepresent the average subjective values obtainedwith using rating scales) proposed measuresshow the ratio of positive / negative reviews tototal number of reviews. It gives more clearerfor understanding of monitoring results.

Significance of aspects group shows how muchthe sentiment of a review depends on the aspectgroup in positive and negative sentences, i.e.,significance of product’s aspects for customers.Let the number of aspect groups is g/2, then thenumber of independent sentimental variables g.According to the methodology described in [13]the equation (3) for calculating the significanceof variable m is:

(3)

where kl – number of nodes that were split byattribute l, El,j – entropy of the parent node, splitby attribute l, El,j,i – subsite node for j, whichwas split by attribute l, Ql,j, Ql,j,i – number ofexamples in the corresponding nodes, ql,j –number of child nodes for j parent node.

4. Real Data Experiments

Effectiveness evaluation of the developedapproach was performed on the data obtainedfrom 635,824 reviews of hotels and resorts in

Studies in Informatics and Control, Vol. 24, No. 3, September 2015 http://www.sic.ici.ro 265

Page 6: Informatics Tools, AI Models and Methods Used for

Russian. The reviews were collected from apopular Internet resource for the period of2003-2013. The initial structure of the collecteddata consisted of the following fields: hotelname; country name; resort name; date of visit;opinion of the hotel; author evaluation of food;author evaluation of service; review number.The data was preliminarily processed andloaded into the database SQL Server 2012.

To classify segments, we used a binary scale(negative and positive) on the hypothesis thatthe absence of negative is positive. A trainingsample of positive and negative opinions wascreated using the collected data on the author’sevaluation of accommodation, food andservice. The Internet resource tophotels.ru usesa five-point grading scale. A review can have amaximum total of 15 points, a minimum of 3points. The training sample included 15,790negative reviews that had awarded 3 and 4points, and 15,790 positive reviews that hadawarded 15 points. We did not use authorevaluation for further data processing. Themarking of the remaining 604,244 reviews wascarried out using a trained classifier.

For the purpose of effectively creating asentiment classifier, we evaluated the accuracyof the classification of machine learningalgorithms and some peculiarities of theirstructure (Table 1). The criterion Accuracy(ratio of the number of correctly classifiedexamples to their total number) was used toassess classification accuracy. Accuracyevaluation was performed on two sets of data.The first set (Test No. 1) represented a trainingsample consisting of strong positive and strongnegative opinions. It was tested by crossvalidation by dividing the data into 10 parts.The second set (Test No. 2) included reviewscovering different points and was markedmanually (497 positive and 126 negative

reviews). It was used only for the accuracycontrol of the classifier that had been trained onthe first data set.

To assess the influence of the negative particles“not” and “no”, we used tagging; for example,the phrase “not good” was marked as“not_good”, and was regarded by the classifieras one word. This technique allowed theincrease of sentiment classification accuracy.

Table 1. Results of methods for sentimentclassification

Machine learningmethods

VectorTestNo. 1

Test No.2

SVM (linear kernel) Frequency 94.2% 83.1%

SVM (linear kernel) Binary 95.7% 84.1%

NB Binary 96.1% 83.7%

NB Frequency 97.6% 92.6%

NB (exceptional words) Frequency 97.7% 92.7%

Bagging NB Frequency 97.6% 92.8%

NB (tagging “not” and“no”)

Frequency98.1% 93.6%

For the marking of reviews and the SentimentAnalysis, we created a classifier on the basis ofthe NB method, with frequency vectors asattribute space, and with the use oflemmatization and tagging of the negativeparticles “not” and ‘no”.

Using the algorithm we had developed weextracted from all reviews the key words thatwere divided into seven basic aspect groups(see Figure 4): beach/swimming pool, food,entertainment, place, room, service, transport.The following step was extracting and markingsentences with words from aspect groups bysentiment. However, not all sentences withaspects have a clearly expressed sentiment;therefore, the sentences which do not show aclearly expressed sentiment were filtered out.

http://www.sic.ici.ro Studies in Informatics and Control, Vol. 24, No. 3, September 2015266

Figure 4. Aspect groups of the object “hotel”

Page 7: Informatics Tools, AI Models and Methods Used for

We will give an example of our qualitative andquantitative research for two 5 star hotels “A”(1,692 reviews) and “B” (1,300 reviews) locatedin the resort Sharm el-Sheikh (63,472 reviews)in Egypt. First, we will describe our quantitativeresearch of consumer satisfaction dynamics,then we will compare this with the averagesatisfaction in the whole resort, detect negativetrends in the different hotel aspects and identifyproblems in the quality of hotel services.

The dynamics of customer satisfaction isrepresented in Figure 5. Concerning Hotel “A”,there is a positive upward satisfaction trendbeginning in 2009; it reaches the average resortlevel in 2013. Concerning Hotel “B”, in 2012there was a sharp satisfaction decline and asimilarly sharp increase in 2013. We can alsonotice this trend in a monthly schedule (Figure6). Satisfaction decrease for Hotel “B” started inJune 2012 and stopped in October 2012. Then,customer satisfaction with Hotel “B” grew to alevel that was higher than the average resortlevel, being ahead of its competitor – Hotel “A”.

Figure 5. Yearly dynamics of customer satisfaction.

To find reasons for the Hotel “B” satisfactiondecrease, we will examine the diagrams inFigure 7. We can see that in 2012, Hotel “B” onaverage was second to Hotel “A” in suchaspects as “Room” (∆12%), “Place” (∆8%),“Service” (∆5%), “Beach/swimming pool”(∆3%) and “Entertainment” (∆3%). Besides, in2012, Hotel “B” had more registered cases offood poisoning as well as cases of theft inAugust 2012. We should also note that one ofthe reasons of client dissatisfaction with Hotel“B” as a place was the beginning of therenovation of the hotel building and the rooms.These measures, however, were rewarded in2013, when customer satisfaction with Hotel“A” aspects equaled the average resort level.

In 2013, customer satisfaction with Hotel “B”exceeded the average level in all aspects(Figure 8). Customer satisfaction with Hotel“A” dropped to lower than average values insuch aspects as “Service” (∆3%), “Food”

(∆3%), “Beach/swimming pool” (∆3%) and“Transport” (∆4%). The manager of Hotel “A”could be advised to direct efforts to increase thequality of all aspects, but would this be themost effective solution? Which aspects are themost significant for the customer and shouldconsequently be improved in the first place? Isit possible to offset the dissatisfaction with theservice, for example, by healthier food or ananimated evening performance and achieveclient satisfaction? A qualitative research of theSentiment Analysis results can give answers tothese questions.

Figure 6. Monthly dynamics of customersatisfaction.

Figure 7. Comparison of customer satisfaction byaspects in 2012.

Figure 8. Comparison of customer satisfaction byaspects in 2013

Decision trees were created using algorithmC4.5 and Data Mining tool – Deductor [13].

Studies in Informatics and Control, Vol. 24, No. 3, September 2015 http://www.sic.ici.ro 267

Page 8: Informatics Tools, AI Models and Methods Used for

The first step was creating a tree for the totalsample of the reviews on the given resort todetect general principles. Constructed decisiontree is presented on Figure 9. Extracted rulesthat have a confidence >80% are represented inTable 3. The second step is developing decisiontrees for the sample of Hotel “A” and Hotel“B” reviews to identify principles on the hotellevel. Aspect significance is represented inTable 2.

Table 2. Significance of product aspect groups

Aspect groupSentiment of

mention

Significance values

ResortHotel“A”

Hotel“B”

ServiceNegative 34.8% 60.2% -Positive 0.7% - -

FoodNegative 30.3% 27.2% 30.3%Positive 16% - -

EntertainmentNegative - - -Positive 8.5% 12.7% 12.4%

RoomNegative 4% - 57.3%Positive 2.1% - -

Beach/swimmingpool

Negative 0.2% - -Positive 2.5% - -

PlaceNegative - - -Positive 1% - -

TransportNegative - - -Positive - - -

Analyzing values of aspect significance (Table2), we can say that the main factors ofconsumer dissatisfaction are a low servicelevel, problems with food, and complaintsabout the hotel rooms. The most critical aspectfor Hotel “B” is “Room”. Without negativeopinions on the aspect “Room”, the reviewswould be positive with a probability of 95.5%(Rule No. 10, Table 3). That is why the

performed repair work facilitated a significantincrease of consumer satisfaction. The mostcritical aspect for Hotel “A” is “Service”,which corresponds with the findings for theresort as a whole.

The aspects which are significant both for theresort and for the two hotels and contributing tocustomer satisfaction are good food andamusing entertainment activities. Thecombination of these aspects cancounterbalance negative emotions from theservice or complaints concerning hotel roomsand leave the client with a favorable impressionof the time spent in the hotel (Rules No. 5, 7,11, Table 3). We should note that positiveopinions about service, beach/swimming poolor place do not have a powerful influence onsentiment. That means the consumer a prioriawaits a high-level service, well-kept place andbeach/swimming pool as a matter of course.

Our qualitative research enabled us to detectthe main ways for Hotel “A” to increasecustomer satisfaction (see Table 3). Theproblematic aspects identified in the course ofour quantitative research correspond to themost significant aspects detected during thequalitative research stage. A search foralternative aspects that can lead to customersatisfaction in the presence of negativeopinions about the significant aspects “Service”and “Food” was carried out. To accomplishthis, the rules (see Table 3) containing negativesentiment in problem aspects, but whicheventually lead to a positive review, werefiltered out by the decision tree.

http://www.sic.ici.ro Studies in Informatics and Control, Vol. 24, No. 3, September 2015268

Figure 9. Decision trees for hotels

Neg. – NegativePos. – Positive

Page 9: Informatics Tools, AI Models and Methods Used for

In order of preference, the manager of Hotel“A” should first of all make decisions onincreasing the service quality, and then onincreasing the quality of food andbeach/swimming pool maintenance. Transportproblems – concerning flights, early check-in,and baggage storage – are not significant andcan be solved within the frames of serviceimprovement. The process of service qualityincrease can take much time; organizingentertainment and animated programs togetherwith solving problems in connection withrestaurant service and beach/swimming poolmaintenance can serve as immediate measuresto increase client satisfaction. Specification ofmanagerial decisions can be performed on thebasis of the information on existing problemscontained in negative reviews. The extractedopinions on aspects can be used by hotelmanagers for improve specific service areas.

5. Conclusion

Poor quality of products and servicescontributes to a decrease of customersatisfaction. On the other hand, under theconditions of stiff competition, there are nobarriers for the consumer to change the supplierof goods and services. All these things cancause loss of clients and a decrease of acompany’s efficiency indexes.

Therefore, maintaining high-qualitystandards should be provided by effectivemanagerial decisions and based on opinionmining as a feedback.

The suggested conception of decision supportbased on the developed approach of text dataprocessing and analysis allows performingquantitative and qualitative surveys ofcustomer satisfaction using informationtechnology in the form of computer-aidedprocedures, and making effective managerialdecisions on product quality management.The present conception allows effectivereduction of labor intensity of customersatisfaction research that makes it availablefor use by a wide range of companies.

A prototype of IDSS was developed on thebasis of the suggested conception. Theperformed experiment has proved its efficacyfor solving real problems of qualitymanagement and consistency of the resultsobtained. IDSS enables companies to makedecisions on quality control based on analyticalprocessing of text data containing implicitinformation on client satisfaction.

Future research on the given topic can bedevoted to automatic annotating of text data,representing text amount of review in theform of a summary, and extracting useful andunique information.

REFERENCES1. ISO 9000:2008. The quality management

system. Fundamentals and vocabulary.

2. ISO10004:2010. Quality management.Customer satisfaction. Guidelines formonitoring and measuring.

Studies in Informatics and Control, Vol. 24, No. 3, September 2015 http://www.sic.ici.ro 269

Table 3. Rules extracted by using decision trees.

# Rules Support ConfidenceExtracted rules on resort reviews

1 37.2% 97.4%2 11% 86.2%3 10.6% 83.9%4 6.9% 92.3%

5 5.8% 88.4%

Extracted rules on Hotel “A” reviews6 62.9% 88.3%7 20.5% 74.1%

8 9.4% 86.2%

9 7.2% 65.6%

Extracted rules on Hotel “B” reviews

10 51.2% 95.5%

11 27.9% 81%

12 11.1% 84%

13 9.9% 55.8%

Page 10: Informatics Tools, AI Models and Methods Used for

3. GRÄBNER, D., M. ZANKER, G.FLIEDL, M. FUCHS, Classification ofCustomer Reviews based on SentimentAnalysis, Proceedings of the InternationalConference in Helsingborg, SpringerVienna, 2012, pp. 460-470.

4. TABOADA, M., J. BROOKE, M.TOFILOSKI, K. D. VOLL, M. STEDE,Lexicon-Based Methods for SentimentAnalysis, Computational Linguistics, June2011, vol. 37(2), pp. 267-307.

5. JO, Y., A. OH, Aspect and SentimentUnification Model for Online ReviewAnalysis, Proceedings of the fourth ACMinternational conference on Web search anddata mining (WSDM ‘11), ACM NewYork, Feb. 2011, pp. 815-824.

6. LU, B., M. OTT, C. CARDIE, B. TSOU,Multi-aspect Analysis with Topic Models,Proceedings of the 2011 IEEE 11thInternational Conference on Data MiningWorkshops (ICDMW ‘11), Dec. 2011,pp. 81-88.

7. BLEI, D. M., A. Y. NG, M. I. JORDAN,Latent Dirichlet allocation, Journal ofMachine Learning Research, Jan. 2003,vol. 3 (4–5), pp. 993-1022.

8. YU, J., Z.-J. ZHA, M. WANG, T.-S.CHUA, Aspect Ranking: IdentifyingImportant Product’s aspects fromOnline Consumer Reviews, Proceedingsof the 49th Annual Meeting of theAssociation for Computational Linguistics:Human Language Technologies (HLT ‘11),June 2011, pp. 1496-1505.

9. HORVÁTH, L., I. RUDAS, New Methodfor Intellectual Content Driven GenericProduct Model Generation 2014 IEEEInternational Conference on Systems, Manand Cybernetics (SMC). IEEE, 2014.pp. 1660-1665.

10. RUDAS, I., Cloud Technology-BasedEducation with Special Emphasis onUsing Virtual Environment Proceedingsof the 14th WSEAS Conf. 2014.01.29-2014.01.31. Cambridge. USA, p. 23.

11. THOMSEN, J., E. ERNST, C.BRABRAND, M. SCHWARTZBACH,WebSelF: A Web Scraping Framework,Proceedings of the 12th internationalconference on Web Engineering (ICWE2012), July 2012, pp. 347-361.

12. YUSSUPOVA, N., D. BOGDANOVA, M.BOYKO, Applying of Sentiment Analysisfor Texts in Russian Based on MachineLearning Approach, IMMM 2012,Venice, Italy, pp. 8-14.

13. EBERT, S., N. T. VU, H. SCHÜTZE,CIS-positive: Combining ConvolutionalNeural Networks and SVMs forSentiment Analysis in Twitter In:Proceedings of the 9th InternationalWorkshop on Semantic Evaluation.SemEval 2015.

14. TURNEY, P. D., Y. NEUMAN, D. ASSAF,Y. COHEN, Literal and MetaphoricalSense Identification through Concreteand Abstract Context. In: EMNLP., 2011,pp. 680-690.

15. DURRANI, N., H. SCHMID, A. FRASER,P. KOEHN, H. SCHÜTZE, The OperationSequence Model – Combining N-Gram-based and Phrase-based StatisticalMachine Translation. ComputationalLinguistics, vol. 41(2), 2015, pp. 157-186.

16. YUSSUPOVA, N., M. BOYKO, D.BOGDANOVA, A. HILBERT, A DecisionSupport Approach based on SentimentAnalysis Combined with Data Miningfor Customer Satisfaction Research, TheInternational Journal on Advances inIntelligent Systems is Published by IARIAvol 8, no 1&2, 2015.

http://www.sic.ici.ro Studies in Informatics and Control, Vol. 24, No. 3, September 2015270