    BUSINESS INTELLIGENCE JOURNAL, VOL. 15, NO. 2

    SENTIMENT ANALYSIS

    BI and Sentiment Analysis

    Mukund Deshpande and Avik Sarkar

    Overview

    Over the past two decades, there has been explosive growth

    in the volume of information and articles published on the

    Internet. With this enormous increase in online content

    came the challenge of quickly finding specific information.

    Google, AltaVista, MSN, Yahoo, and other search sites

    stepped in and developed novel technologies to efficiently

    search and harness the massive amount of Internet

    information. Some search engines indexed keywords; others

    used information hierarchies, arranging Web pages in a

    structured way for easy browsing and for quickly locating

    requested information. Text classification, also known as

    text categorization, and text-clustering-based techniques

    advanced, allowing Web pages to be automatically

    organized into relevant hierarchies.

    Web sites frequently discuss consumer products or

    services, from movies and restaurants to hotels and

    politics. These shared opinions, termed the "voice of the

    customer," have become highly valuable to businesses and

    organizations large and small. In fact, a recent study by

    Deloitte found that 82 percent of purchase decisions have

    been directly influenced by reviews. The rapid spread of

    information over the Internet and the heightened impact

    of the media have broken down physical and geographical

    boundaries and caused organizations to become

    increasingly cautious about their reputations.

    Businesses and market research firms have carried out

    traditional sentiment analysis (also referred to as opinion

    analysis or reputation analysis) for some time, but it

    requires significant resources (travel to a given location;

    staffing the survey process; offering survey respondents

    incentives; and collecting, aggregating, and analyzing

    results). Such analysis is cumbersome, time-consuming,

    and costly.

    Dr. Mukund Deshpande is senior architect at

    the business intelligence competency center of

    Persistent Systems. He has helped enterprises,

    e-commerce companies, and ISVs make better

    business decisions for the past 10 years by using

    machine learning and data mining techniques.

    Dr. Avik Sarkar is technical lead at the analytics

    competency center at Persistent Systems and

    has over nine years of experience using analytics,

    data mining, and statistical modeling techniques

    across different industry vertical markets.

    Automated sentiment analysis based on text mining

    techniques offers a simpler, more cost-effective solution

    by providing timely and focused analysis of huge,

    ever-increasing volumes of content. The concept of automated

    sentiment analysis is gaining prominence as companies

    seek to provide better products and services to capture

    market share and increase revenues, especially in a

    challenging global economy. Understanding market trends and

    buzz enables enterprises to better target their campaigns

    and determine the degree to which sentiment is positive,

    negative, or neutral for a given market segment.

    Text Mining

    Research and business communities are using text

    mining to harness large amounts of unstructured textual

    information and transform it into structured information.

    Text mining refers to a collection of techniques and

    algorithms from multiple domains, such as data mining,

    artificial intelligence, natural language processing (NLP),

    machine learning, statistics, linguistics, and computational

    linguistics. The objective of text mining is to put the

    already accumulated data to better use and enhance an

    organization's profitability. With a variety of customer

    trends and behavior and increasing competition in each

    market segment, the better the quality of the intelligence,

    the better the chances of increasing profitability.

    The major text mining techniques include:

    • Text clustering: The automated grouping of textual

      documents based on their similarity; for example,

      clustering documents in an enterprise to understand its

      broad areas of focus

    • Text classification or categorization: The automated

      assignment of documents into some specific topics

      or categories; for example, assigning topics such as

      politics, sports, or business to an incoming stream of

      news articles

    • Entity extraction: The automated tagging or extraction

      of entities from text; for example, extracting names of

      people, organizations, or locations

    • Document summarization: An automated technique

      for deriving a short summary of a longer text document

    Sentiment analysis applies these techniques to assign

    sentiment or opinion information to certain entities within

    text. Sentiment evaluation is another step in the process of

    converting unstructured content into structured content

    so that data can be tracked and analyzed to identify trends

    and patterns.

    Sentiment Analysis

    Sentiment analysis broadly refers to the identification and

    assessment of opinions, emotions, and evaluations, which,

    for the purposes of computation, might be defined as

    written expressions of subjective mental states.

    For example, consider this unstructured English sentence

    in the context of a digital camera review:

    Canon PowerShot A540 had good aperture

    combined with excellent resolution.

    Consider how sentiment analysis breaks down the

    information. First, the entities of interest are extracted from the

    sentence:

    Digital camera model: Canon PowerShot A540

    Camera dimensions or features: aperture, resolution

    Sentiments are further extracted and associated for each

    entity, as follows:

    Digital camera model = Canon PowerShot A540;

    Dimension = aperture, Sentiment = good (positive)

    Digital camera model = Canon PowerShot A540;

    Dimension = resolution, Sentiment = excellent (positive)

    Based on the individual sentence-level sentiments,

    aggregated and summarized sentiment about the digital

    camera is obtained and stored in the database for

    reporting purposes.

    The following sections delve into the technical details and

    algorithms used for this type of sentiment analysis.

    Sentiment Analysis Steps

    Suppose we are interested in deriving the sentiment or

    opinion of various digital cameras across dimensions such

    as price, usability, and features. Figure 1 illustrates the steps

    we will follow in this analysis.

    Step 1: Fetch, Crawl, and Cleanse

    Comments about digital cameras might be available on

    gadget review sites or in discussion forums about digital

    cameras, as well as in specialized blogs. Data from all of

    these sources needs to be collected to give a holistic view

    of all the ongoing discussions about digital cameras. Web

    crawlers (simple applications that grab the content of a

    Web page and store it on a local disk) fetch data from the

    targeted sites. The downloaded Web pages are in HTML

    format, so they need to be cleansed to retain only the

    textual content and remove the HTML tags used for

    rendering the page on the Web site.
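
    The cleansing step can be sketched with Python's standard html.parser module. This is a minimal illustration, not the authors' implementation: the class name TextExtractor, the helper cleanse, and the sample page are all hypothetical, and a production crawler would also need to handle navigation menus, advertisements, and malformed markup.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def cleanse(html):
    """Return the textual content of an HTML page with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical downloaded page, used only for illustration.
sample_page = (
    "<html><head><style>p {color: red;}</style></head>"
    "<body><h1>Camera review</h1>"
    "<p>The <b>resolution</b> is excellent.</p></body></html>"
)
cleaned = cleanse(sample_page)
print(cleaned)
```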

    Step 2: Text Classification

    The sites from which data is fetched might contain extra

    information and discussions about other electronic gadgets,

    but our current interest is limited to digital cameras. A text

    classifier determines whether the page or discussions on it

    are related to digital cameras; based on the decision of the

    classifier, the page is either retained for further analysis or

    discarded from the system.

    The text classifier is provided with a list of relevant (positive)

    and irrelevant (negative) words. This list consists of a base

    list of words supplied by the software provider, which is

    typically enhanced by the user (the enterprise) to make

    it relevant to the particular domain. A simple rule-based

    classifier determines the polarity of the page based on the

    proportion of positive or negative words it contains. You

    can train complex and robust classifiers by feeding them

    samples of positive and negative pages. These samples

    allow you to build probabilistic models based on

    machine-learning principles. Then, these models are applied to

    unknown pages to determine each page's relevance.
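
    A simple rule-based relevance classifier of the kind described above might look like the following sketch. The word lists, the function name is_relevant, and the 0.5 threshold are illustrative assumptions; in practice the base list would come from the software provider and be extended by the enterprise.

```python
import re

# Hypothetical word lists for the digital camera domain.
RELEVANT_WORDS = {"camera", "lens", "megapixel", "zoom", "aperture", "shutter"}
IRRELEVANT_WORDS = {"laptop", "printer", "phone", "router", "keyboard"}


def is_relevant(text, threshold=0.5):
    """Keep a page when relevant words outnumber irrelevant ones.

    The decision uses the proportion of relevant words among all
    list words found on the page; pages with no list words are discarded.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    relevant = sum(token in RELEVANT_WORDS for token in tokens)
    irrelevant = sum(token in IRRELEVANT_WORDS for token in tokens)
    matched = relevant + irrelevant
    if matched == 0:
        return False
    return relevant / matched > threshold


print(is_relevant("This camera has a great zoom lens"))
print(is_relevant("The printer and the laptop share a router"))
```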

    Commercial forums, blog aggregation services, and search

    engines (such as BoardReader and Moreover) have become

    popular recently, eliminating the need to build in-house

    text classifiers. You can use these services to specify

    keywords or a taxonomy of interest (in this case, digital

    camera models), and they will fetch the matching forums

    or blog articles.

    Step 3: Entity Extraction

    Entity extraction involves extracting the entities from the

    articles or discussions. In this example, the most important

    entity is the name or model of the digital camera; if the

    name is incorrectly extracted, the entire sentiment or

    opinion analysis becomes irrelevant. There are three major

    approaches for entity extraction:

    • Dictionary or taxonomy: A dictionary or taxonomy

      of available and known models of digital cameras is

      provided to the system. Whenever the system finds

      a name in the article, it tags it as a digital camera

      entity. This technique, though simple to set up, needs

      frequent updates on every subsequent model launch,

      so it's not robust.

    • Rules: A digital camera model name has a certain

      pattern, such as "Canon PowerShot A540." Therefore,

      a rule may be written to tag any alphanumeric token

      following the string "Canon PowerShot" as a digital

      camera model. Such techniques are more robust than

      the dictionary-based method, but if Canon decides to

      launch a new model, say the SuperShot, such rules must

      be updated manually.

    • Machine learning: This algorithm learns the extraction

      rules automatically based on a sample of articles with

      the entities properly tagged. The rules are learned by

      forming graphical and probabilistic models of the

      entities and the arrangement of other terms adjoining

      them. Popular machine learning models for entity

      extraction are based on hidden Markov models (HMM)

      and conditional random fields (CRF).

    [Figure 1. Sentiment analysis steps: Fetch/Crawl + Cleanse, Text Classification, Entity Extraction, Sentiment Extraction, Sentiment Summary, Reports/Charts]
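
    The dictionary and rule approaches can be combined in a few lines, as in the sketch below. The model list, the regular expression, and the function name extract_models are assumptions made for illustration; in particular, the rule here requires the token after "Canon PowerShot" to contain a digit so that ordinary words are not tagged.

```python
import re

# Dictionary approach: hypothetical list of known camera models.
KNOWN_MODELS = ["Canon PowerShot A540", "Kodak V570", "Nikon D200"]

# Rule approach: an alphanumeric token containing a digit that
# follows the string "Canon PowerShot" is tagged as a camera model.
POWERSHOT_RULE = re.compile(r"\bCanon PowerShot [A-Za-z]*\d+[A-Za-z0-9]*\b")


def extract_models(text):
    """Return the sorted camera model names found by either approach."""
    found = set()
    lowered = text.lower()
    for model in KNOWN_MODELS:
        if model.lower() in lowered:
            found.add(model)
    found.update(match.group(0) for match in POWERSHOT_RULE.finditer(text))
    return sorted(found)


print(extract_models("I compared the Canon PowerShot A460 with the Kodak V570."))
```

    The A460 is not in the dictionary but is still caught by the rule, which shows why rules tolerate new model launches better than a fixed list.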

    Step 4: Sentiment Extraction

    Sentiment extraction involves spotting sentiment words

    within a particular sentence. This is typically achieved

    using a dictionary of sentiment terms and their

    semantic orientations. There are obvious limitations to the

    dictionary-based approach. For example, the sentiment

    word "high" in the context of price might have a negative

    polarity, whereas "high" in the context of camera

    resolution will be of positive polarity. (Approaches to dealing

    with varying and domain-specific sentiment words and

    their semantic orientation are discussed in the next section.)

    Once an entity of interest (for example, the digital camera

    model or sentiment word) is identified, structured

    sentiment is extracted from the sentence in the form of {model

    name, score}, where score is the positive or negative polarity

    value of the identified sentiment word in the sentence. If

    some dimension (such as "price" or "resolution") is also

    found in the sentence, then the sentiment is extracted in

    the form of {model name, dimension, score}. We may also

    choose to report the source name or source ID to associate

    the extracted sentiment back to that source.

    The presence of negation words, such as "not," "no,"

    "didn't," and "never," requires special attention. These

    keywords lead to a transformation in the polarity value

    of the sentiment words and hence in their reported score.

    Natural-language techniques are used to detect the effect

    of the negation word on the adjoining sentiment word.

    If the negation effect is detected, then the polarity of the

    sentiment word is inverted.
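
    The sketch below illustrates dictionary-based sentiment extraction with polarity inversion. It is a simplification of what the text describes: instead of natural-language techniques, it assumes a negation word affects any sentiment word within the next two tokens. The scores, word lists, and the function name extract_sentiments are hypothetical.

```python
import re

# Hypothetical sentiment dictionary with polarity scores.
SENTIMENT_SCORES = {"good": 1, "excellent": 2, "poor": -1, "terrible": -2}
NEGATIONS = {"not", "no", "never", "didn't", "isn't"}
DIMENSIONS = {"price", "resolution", "aperture", "usability"}


def extract_sentiments(sentence, model_name, window=2):
    """Return (model, score) or (model, dimension, score) tuples.

    A sentiment word's polarity is inverted when a negation word
    appears within `window` tokens before it (a crude stand-in for
    real negation-scope detection).
    """
    tokens = re.findall(r"[a-z']+", sentence.lower())
    dimension = next((t for t in tokens if t in DIMENSIONS), None)
    results = []
    for i, token in enumerate(tokens):
        if token not in SENTIMENT_SCORES:
            continue
        score = SENTIMENT_SCORES[token]
        preceding = tokens[max(0, i - window):i]
        if any(word in NEGATIONS for word in preceding):
            score = -score
        if dimension is None:
            results.append((model_name, score))
        else:
            results.append((model_name, dimension, score))
    return results


print(extract_sentiments("The resolution is not good", "Canon PowerShot A540"))
```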

    The extracted sentiment data is now in a structured format

    that can be loaded into relational databases for further

    transformation and reporting.

    Step 5: Sentiment Summary

    The raw sentiments extracted in Step 4 come from

    individual sentences that are specific to certain entities.

    To make the data meaningful for reporting, it must

    be aggregated. One of the obvious aggregations in the

    context of digital cameras will be model-name-based

    aggregation; in this case, all of the positive, negative, or

    neutral entries in the database are grouped together. Again,

    model- and dimension-based sentiment aggregation would

    allow the discovery of detailed, dimension-wise sentiment

    distribution for every model. Based on the reporting needs,

    different levels of aggregation and summarization need to

    be carried out and stored in a database or data warehouse.
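
    As a rough illustration of loading raw sentiments into a relational store and aggregating them at two levels, the following sketch uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, and sample rows are invented for the example.

```python
import sqlite3

# Hypothetical raw sentiments in the form (model, dimension, score).
raw_sentiments = [
    ("Canon PowerShot A540", "resolution", 2),
    ("Canon PowerShot A540", "price", -1),
    ("Canon PowerShot A540", "resolution", 1),
    ("Kodak V570", "usability", 1),
]

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE raw_sentiment (model TEXT, dimension TEXT, score INTEGER)"
)
connection.executemany(
    "INSERT INTO raw_sentiment VALUES (?, ?, ?)", raw_sentiments
)

# Model-name-based aggregation: total score and mention count per model.
by_model = connection.execute(
    "SELECT model, SUM(score), COUNT(*) FROM raw_sentiment "
    "GROUP BY model ORDER BY model"
).fetchall()

# Model- and dimension-based aggregation for dimension-wise reporting.
by_model_and_dimension = connection.execute(
    "SELECT model, dimension, SUM(score) FROM raw_sentiment "
    "GROUP BY model, dimension ORDER BY model, dimension"
).fetchall()

print(by_model)
print(by_model_and_dimension)
```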

    Step 6: Reports/Charts

    Reports and charts can be generated directly from the

    database or data warehouse where the aggregated data is

    stored in a structured format. Such reporting falls under

    the purview of traditional BI and reporting, and is not

    related to the core sentiment analysis steps.

    The steps described above have been used to transform

    the unstructured textual data in blogs and forums to

    structured, quantifiable numeric sentiment data related to

    the entity of interest.

    Sentiment Analysis Challenges

    There are challenges in sentiment analysis, but fortunately

    some simple tactics can help you overcome them. The

    challenges discussed in this section are related to sentiment

    assignment, co-reference resolution, and assigning domain-

    specific polarity values to sentiment words.

    Sentiment Assignment

    Suppose a sentence mentions digital camera features such

    as resolution, usage, and megapixels; the sentence also

    mentions a sentiment word, say, "good." Should we relate

    all or only some of the features to the sentiment word?

    The issue becomes even more challenging when multiple

    sentiment words or model names are mentioned in the

    same sentence. Simple heuristics, such as assigning each

    sentiment word to the nearest occurring model name or

    feature, achieve limited but often acceptable accuracy.

    Deep NLP techniques may be used to identify the model

    names or features (nouns) that are related to the

    sentiment word (adjective or adverb) in the context of

    that sentence.
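
    The nearest-occurrence heuristic can be sketched as follows. The function name assign_to_nearest and the example sentence are hypothetical, and distance is measured simply as the number of tokens between the two words.

```python
def assign_to_nearest(tokens, features, sentiment_words):
    """Pair each sentiment word with the closest feature in the sentence."""
    feature_positions = [i for i, t in enumerate(tokens) if t in features]
    assignments = []
    for i, token in enumerate(tokens):
        if token in sentiment_words and feature_positions:
            nearest = min(feature_positions, key=lambda pos: abs(pos - i))
            assignments.append((tokens[nearest], token))
    return assignments


tokens = "the resolution is good but the price is high".split()
pairs = assign_to_nearest(tokens, {"resolution", "price"}, {"good", "high"})
print(pairs)
```

    The heuristic breaks down when a sentiment word sits closer to an unrelated feature than to the one it modifies, which is where parsing-based techniques help.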

    Reviews often include comparative comments about

    multiple digital camera models within single sentences. For

    example:

    Kodak V570 is better than the Canon PowerShot A460.

    Kodak V570 scores more points than Canon PowerShot A460 in terms of resolution.

    In comparing the Kodak V570 and Canon PowerShot A460, the latter wins in terms of resolution.

    Nikon D200 is good in terms of resolution, while Kodak V570 and Canon PowerShot A460 have better usability.

    Dealing with such comparative sentences requires building

    complex natural-language rules to understand the impact

    and span of every word. For example, the word "better"

    would signal a positive sentiment extraction for one camera

    model or feature and negative sentiment data for another.

    Co-reference Resolution

    Suppose a discussion about a digital camera mentions

    the model in the beginning of the article, but subsequent

    references use pronouns such as "it" or phrases such as "the

    camera." Referring to a proper noun by using a pronoun is

    called co-reference.

    Co-reference is a common feature of the English language.

    Ignoring sentences that use it will lead to a loss in data and

    incorrect reporting. Co-reference resolution, also referred to

    as anaphora resolution, is a vast area of research in the NLP

    and computational linguistics communities. It is achieved

    using rule-based methods or machine-learning-based

    techniques. Open source co-reference resolution systems

    such as GATE (General Architecture for Text Engineering)

    provide the accuracy required for sentiment analysis.

    Domain-Specific Polarity Values and Sentiment Words

    As discussed earlier, sentiment words have different

    interpretations in different contexts. For example, "long"

    in the context of movies might convey a negative

    sentiment, whereas in sports it would indicate positive polarity.

    Similarly, "unpredictable" might convey positive sentiment

    for movies, but would indicate negative polarity when used

    to describe digital cameras or mobile phones.

    This problem can be tackled by using a domain-specific

    sentiment word list. Such a list is created by analyzing all

    the adjectives, adverbs, and phrases in the domain-specific

    document collection. The analysis calculates the proximity

    of these words to generic positive words such as "good" and

    generic negative words such as "bad." Another calculation

    is called point-wise mutual information, which provides a

    measure of whether two terms are related and hence jointly

    occurring, rather than showing up together by chance.

    These calculations can be performed for the word across all

    documents to determine whether a word occurs more often

    in the positive sense than in the negative sense.
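
    Point-wise mutual information between two words a and b is commonly defined as PMI(a, b) = log2( P(a, b) / (P(a) P(b)) ), where the probabilities are estimated from document counts. The sketch below, under stated assumptions, scores a word by the difference between its PMI with "good" and its PMI with "bad"; the small smoothing constant, the function names, and the toy document collection are illustrative choices rather than anything prescribed by the article.

```python
import math


def pmi(word_a, word_b, documents, smoothing=0.01):
    """Point-wise mutual information estimated from document co-occurrence.

    `documents` is a list of token sets. The smoothing constant keeps the
    logarithm finite when the two words never occur together.
    """
    total = len(documents)
    count_a = sum(word_a in doc for doc in documents)
    count_b = sum(word_b in doc for doc in documents)
    count_both = sum(word_a in doc and word_b in doc for doc in documents)
    if count_a == 0 or count_b == 0:
        raise ValueError("both words must occur in the collection")
    p_both = (count_both + smoothing) / total
    return math.log2(p_both / ((count_a / total) * (count_b / total)))


def semantic_orientation(word, documents):
    """Positive when the word leans toward "good", negative toward "bad"."""
    return pmi(word, "good", documents) - pmi(word, "bad", documents)


# Toy domain collection, invented for illustration.
documents = [
    {"sharp", "good", "lens"},
    {"sharp", "good", "picture"},
    {"blurry", "bad", "picture"},
    {"blurry", "bad", "lens"},
    {"good", "battery"},
]

print(semantic_orientation("sharp", documents) > 0)
print(semantic_orientation("blurry", documents) < 0)
```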

    These techniques work well if a certain sentiment word has

    a fixed polarity interpretation within a certain domain.

    Now, suppose we have the sentiment word "high," which

    in the digital camera domain could indicate negative

    sentiment for price but positive sentiment for camera

    resolution. Such cases are a bit more difficult to handle

    and can often lead to errors in sentiment analysis. To tackle

    such scenarios, the system has to store some mapping of

    entity, the sentiment word, and its associated polarity; for

    example, {high, price, -ve} and {high, resolution, +ve}.

    Creating and verifying such mappings involves considerable

    manual work on top of automated techniques.
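
    One possible way to store such a mapping is a lookup keyed on the (sentiment word, dimension) pair, falling back to a generic polarity when no context-specific entry exists. The table contents and the function name polarity are assumptions for illustration.

```python
# Context-specific entries: (sentiment word, dimension) -> polarity.
CONTEXT_POLARITY = {
    ("high", "price"): -1,
    ("high", "resolution"): +1,
}

# Generic fallback for words whose polarity does not depend on context.
DEFAULT_POLARITY = {"good": +1, "bad": -1}


def polarity(sentiment_word, dimension=None):
    """Look up the context-specific polarity first, then the generic one."""
    key = (sentiment_word, dimension)
    if key in CONTEXT_POLARITY:
        return CONTEXT_POLARITY[key]
    return DEFAULT_POLARITY.get(sentiment_word, 0)


print(polarity("high", "price"))
print(polarity("high", "resolution"))
```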

    Examples

    Sentiment Analysis of Digital Camera Reviews

    There are many Web sites that contain reviews related to

    digital cameras. Suppose a consumer is looking to buy a

    particular digital camera and would like to get a complete

    understanding of the camera's different features, strengths,

    and weaknesses. She would then compare this information

    to other contemporary digital camera models of the same

    or competing brands. This would involve manual research

    across all related Web sites, which might require days

    or even months of research. Rather than doing this, the

    consumer is more likely to gather incomplete information

    by visiting just a few sites.

    Automated sentiment analysis and BI-based reporting can

    come to the rescue by providing a complete overview of

    the many discussions about digital camera models and

    their features.

    First, a list of available digital camera models is collected

    from the various companies' catalogs to create a

    comprehensive taxonomy of digital camera models. An initial list

    of digital camera features or dimensions is also collected

    from these catalogs. All online discussion pages are

    collected from the digital camera review Web sites.

    One important consideration during taxonomy creation is

    the grouping of synonymous entities. For example, "Canon

    PowerShot A540" may also be referred to as "PowerShot

    540" or "Canon A540." All of these should be grouped as a

    single entity. Again, the dimension "camera resolution" may

    be referred to as "resolution," "megapixel," or simply "MP";

    all should be aligned to the single entity "resolution."
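
    Grouping synonymous mentions can be as simple as an alias table that maps each variant to one canonical entity, as in this sketch. The table is built from the examples in the text; the function name canonical_entity and the normalization rule (lowercase, collapsed whitespace) are assumptions.

```python
# Alias table: normalized mention -> canonical entity name.
ALIASES = {
    "canon powershot a540": "Canon PowerShot A540",
    "powershot 540": "Canon PowerShot A540",
    "canon a540": "Canon PowerShot A540",
    "camera resolution": "resolution",
    "resolution": "resolution",
    "megapixel": "resolution",
    "mp": "resolution",
}


def canonical_entity(mention):
    """Map a mention to its canonical entity; unknown mentions pass through."""
    key = " ".join(mention.lower().split())
    return ALIASES.get(key, mention)


print(canonical_entity("PowerShot 540"))
print(canonical_entity("MP"))
```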

    The presence of the camera model name on a given page

    indicates that it should be considered for further analysis.

    The next challenge is to extract the entities of interest from

    the text, that is, the digital camera model names and

    features. A taxonomy-based method is used to extract those

    that are known. Machine-learning-based approaches can

    extract the others. Here, documents tagged with existing

    model names and features are provided as training data to the

    machine learning algorithm, which uses the data to

    learn the extraction rules. These rules are then used to

    extract entities from other incoming articles.

    Raw sentiment is extracted from the sentiment-bearing

    sentences using the approaches described above. A list

    of sentiment-bearing words, along with their polarity

    values, is provided as input. Based on the raw sentiments,

    sentiment aggregation is carried out on two dimensions:

    digital camera model and digital camera features. Further

    aggregation can be carried out for each Web site to identify

    any site-specific bias in the extracted sentiments. These

    aggregated values are then stored in the data warehouse for

    reporting purposes.

    Sentiment Analysis of Election Campaigns

    The 2008 U.S. presidential election saw a large

    number of online Web sites discussing the post-election

    policies and agendas of Democratic nominee Barack

    Obama and Republican John McCain. These discussions

    come from people who are very likely to be legitimate

    American voters (rather than, say, children or people

    residing outside the U.S.). Political parties such as Democrats

    and Republicans employ armies of people across the U.S.

    to survey people about their opinions on the policies of the

    presidential candidates. These surveys incur huge costs and

    delays in information collection and analysis.

    Automated BI and sentiment analysis can work magic here

    by continuously analyzing the comments posted on Web

    sites and providing prompt, sentiment-based reporting.

    For example, a popular presidential debate on television

    one evening will lead to comments on the Web. Sentiment

    analysis performed on the comments can be completed in

    real time, and the political parties can gauge the response

    to the debate and to the policy matters discussed. Smart

    technology use and intelligent data collection can provide

    in-depth, state-wide sentiment analysis of the comments.

    Such analysis would be extremely powerful in determining

    the future election campaigning strategy in each state.

    Considering the sensitivity and impact of the analysis,

    careful attention must be paid to generating the taxonomy,

    which consists of two main entities: the presidential

    nominees and the policies or issues discussed. The list of

    presidential nominees is finite, corresponding to the major political

    parties. Variations in the names, acronyms, or synonyms

    should also be carefully studied and collated.

    Generating the taxonomy of issues or policies is far more

    challenging. Each issue is defined in terms of keywords

    or phrases; some of these will appear in multiple issues or

    policies. Variations among keywords and phrases can be

    quite large, and capturing them requires considerable time

    and effort. Automated methods may be used for many of

    these steps, but manual verification and editing is required

    to remove discrepancies. Another challenge is determining

    the location of each person entering comments. This can

    be done by capturing their Internet protocol (IP) addresses,

    then associating them with physical and geographical

    locations. Comments from outside the country are ignored.

    Other comments are associated with states (or cities, as

    available). Finally, carefully selected, election-specific

    sentiment words are added to the taxonomy.

    Once the taxonomy is in place, the raw sentiments

    may be extracted from the comments. They are in two

    primary forms:

    {Presidential Nominee, Location, Sentiment}, which

    captures generic sentiment about the presidential

    candidate regardless of issue

    {Presidential Nominee, Issue or Policy, Location,

    Sentiment}, which captures the sentiment or opinion

    about the particular issue for the presidential candidate

    A single comment may lead to the extraction of more

    than one raw sentiment, as shown above. Next, the data is

    aggregated along dimensions such as presidential nominee,

    policy issue, or location. The aggregated results are stored

    in a warehouse for quick access and reporting.

    In the future, many Web sites will likely collect further

    details about the people making the comments, including

    age group, income, education, religion, race, ethnic origin,

    and number of family members. This would allow more

    detailed analysis and drill-down of the sentiment results,

    which would aid in advanced campaign management such

    as micro-targeting specific groups of voters.

    [Figure 2. Sample election campaign voter sentiment report: U.S. state map with a negative-to-positive sentiment legend for Obama and McCain]

    Other Applications of BI and Sentiment Analysis

    Additional applications of sentiment analysis and BI-based

    reporting include:

    • Online product reviews. These contributed to the

      development of sentiment analysis. Product reviews are

      analyzed to provide an overall idea about the features of

      the product along with its strengths and weaknesses.

    • Online movie reviews. These are available in

      abundance, which led to the discovery of a new

      domain of sentiment analysis that analyzes people's

      opinions about movies.

    • Company news. Analyzing news articles and

      discussions related to a company can provide

      detailed sentiment analysis about an organization's

      performance, along with criteria such as profit,

      customer satisfaction, and products.

    • Online videos. Sentiment analysis helps to capture

      opinions about both video quality and the

      events portrayed.

    • Hotels, vacation homes, holiday destinations,

      and restaurants. Sentiment analysis helps people

      make informed decisions about holiday plans or

      where to dine out.

    • Movie stars, popular sports figures, and television

      personalities. Sentiment analysis can capture

      the sentiments and opinions of large groups of

      people by analyzing discussions or articles related to

      such public figures.

    Existing Research in Sentiment Analysis

    Sentiment/opinion analysis is an emerging area of research

    in text mining. Early researchers rated movie reviews on

    a positive/negative scale by treating each review as a bag

    of words and applying machine-learning algorithms like

    Naïve Bayes. Successive research progressed to detecting

    sentence-level sentiment and hence reporting higher

    accuracy figures. In contrast to the research on movie

    reviews, experts from the finance domain analyzed the

    sentiment in published news articles to predict the price of

    a certain stock for the following day.
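
    To make the bag-of-words approach concrete, here is a minimal multinomial Naïve Bayes classifier with add-one smoothing, written from scratch. It is a sketch of the general technique rather than the method of any cited study; the class name and the four training reviews are invented, and real experiments would use thousands of labeled reviews.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over bag-of-words features."""

    def __init__(self):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()

    def train(self, labeled_reviews):
        for text, label in labeled_reviews:
            words = text.lower().split()
            self.class_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocabulary.update(words)

    def predict(self, text):
        words = text.lower().split()
        total_reviews = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total_reviews)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                likelihood = (self.word_counts[label][word] + 1) / (
                    total_words + len(self.vocabulary)
                )
                score += math.log(likelihood)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny invented training set, for illustration only.
classifier = NaiveBayesClassifier()
classifier.train([
    ("a wonderful moving film", "positive"),
    ("great acting and a wonderful plot", "positive"),
    ("a dull boring film", "negative"),
    ("boring plot and terrible acting", "negative"),
])

print(classifier.predict("wonderful acting"))
print(classifier.predict("boring film"))
```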

    Experts also discovered new techniques for using Web

    search to determine the semantic orientation of words,

    which is at the core of quantifying the sentiment expressed

    in a sentence. See the bibliography at the end of this article

    for additional studies and reports.

    Final Thoughts

    In closing, we would like to spotlight two observations that

    highlight the growing need for sentiment analysis:

    With the explosion of Web 2.0 platforms such as

    blogs, discussion forums, peer-to-peer networks,

    and various other types of social media, all of

    which continue to proliferate across the Internet

    at lightning speed, consumers have at their

    disposal a soapbox of unprecedented reach and

    power by which to share their brand experiences

    and opinions, positive or negative, regarding

    any product or service. As major companies are

    increasingly coming to realize, these consumer

    voices can wield enormous influence in

    shaping the opinions of other consumers and,

    ultimately, their brand loyalties, their purchase

    decisions, and their own brand advocacy.

    Companies can respond to the consumer insights they

    generate through social media monitoring and

    analysis by modifying their marketing messages,

    brand positioning, product development, and

    other activities accordingly.

    Jeff Zabin and Alex Jefferies [2008]. Social Media

    Monitoring and Analysis: Generating Consumer Insights from

    Online Conversation, Aberdeen Group Benchmark Report.

    Marketers have always needed to monitor media

    for information related to their brands, whether

    it's for public relations activities, fraud violations,

    or competitive intelligence. But fragmenting

    media and changing consumer behavior have

    crippled traditional monitoring methods.

    Technorati estimates that 75,000 new blogs are

    created daily, along with 1.2 million new posts

    each day, many discussing consumer opinions on

    products and services. Tactics [of the traditional

    sort] such as clipping services, field agents, and ad

    hoc research simply can't keep pace.

    Peter Kim [2006]. The Forrester Wave: Brand Monitoring,

    Q3 2006, white paper, Forrester Wave.

    Bibliography

    Baeza-Yates, Ricardo, and B. Ribeiro-Neto [1999].

    Modern Information Retrieval. Addison-Wesley

    Longman Publishing Company.

    Cunningham, Hamish, Diana Maynard, Kalina Bontcheva, and Valentin Tablan [2002]. GATE: A

    Framework and Graphical Development Environment

    for Robust NLP Tools and Applications. Proceedings of

    the 40th Anniversary Meeting of the Association for

    Computational Linguistics (ACL'02). Philadelphia, PA.

    Das, Sanjiv Ranjan, and Mike Y. Chen [2001]. Yahoo!

    for Amazon: Sentiment Parsing from Small Talk on

    the Web. Proceedings of the 8th Asia Pacific Finance

    Association Annual Conference.

    Esuli, Andrea, and Fabrizio Sebastiani [2006].

    SentiWordNet: A Publicly Available Lexical Resource

    for Opinion Mining. Proceedings of LREC-06, 5th

    Conference on Language Resources and Evaluation,

    Genova, Italy, pp. 417–422.

    Hurst, Matthew, and Kamal Nigam [2004]. Retrieving

    Topical Sentiments from Online Document

    Collections. Document Recognition and Retrieval XI,

    pp. 27–34.

    Lafferty, John, Andrew McCallum, and Fernando Pereira

    [2001]. Conditional Random Fields: Probabilistic

    Models for Segmenting and Labeling Sequence

    Data. Proceedings of the Eighteenth International

    Conference on Machine Learning, pp. 282–289.

    Nigam, Kamal, and Matthew Hurst [2004]. Towards a

    Robust Metric of Opinion. AAAI Spring Symposium on

    Exploring Attitude and Affect in Text.

    Pang, Bo, and Lillian Lee [2005]. Seeing Stars: Exploiting

    Class Relationships for Sentiment Categorization with

    Respect to Rating Scales. Proceedings of the ACL, pp. 115–124.

    Pang, Bo, and Lillian Lee [2004]. A Sentimental Education: Sentiment

    Analysis Using Subjectivity Summarization Based on

    Minimum Cuts. Proceedings of the ACL,

    pp. 271–278.

    Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan [2002]. Thumbs

    up? Sentiment Classification Using Machine Learning

    Techniques. Proceedings of the ACL-02 Conference on

    Empirical Methods in Natural Language Processing, Vol. 10, pp. 79–86.

    Rabiner, Lawrence R. [1989]. A Tutorial on Hidden

    Markov Models and Selected Applications in Speech

    Recognition. Proceedings of the IEEE, Vol. 77, No. 2,

    pp. 257–286.

    Sebastiani, Fabrizio [2002]. Machine Learning in

    Automated Text Categorization. ACM Computing

    Surveys, Vol. 34, No. 1, pp. 1–47.

    Turney, Peter D. [2002]. Thumbs Up or Thumbs Down?

    Semantic Orientation Applied to Unsupervised

    Classification of Reviews. Proceedings of the 40th

    Annual Meeting of the Association for Computational

    Linguistics, pp. 417–424. Philadelphia, PA.

    Turney, Peter D., and Michael L. Littman [2003]. Measuring

    praise and criticism: Inference of semantic orientation

    from association. ACM Transactions on Information

    Systems, Vol. 21, No. 4, pp. 315–346.

    AUTHOR INSTRUCTIONS

    The Business Intelligence Journal is a quarterly journal that

    focuses on all aspects of data warehousing and business

    intelligence. It serves the needs of researchers and practitioners

    in this important field by publishing surveys of

    current practices, opinion pieces, conceptual frameworks,

    case studies that describe innovative practices or provide

    important insights, tutorials, technology discussions, and

    annotated bibliographies. The Journal publishes educational articles that do not market, advertise, or promote

    one particular product or company.

    Editorial Topics for 2010

    Journal authors are encouraged to submit articles of

    interest to business intelligence and data warehousing

    professionals, including the following timely topics:

    Agile business intelligence

    Project management and planning

    Architecture and deployment

    Data design and integration

    Data management and infrastructure

    Data analysis and delivery

    Analytic applications

    Selling and justifying the data warehouse

    Editorial Acceptance

    All articles are reviewed by the Journal's editors before

    they are accepted for publication.

    The publisher will copyedit the final manuscript to

    conform to its standards of grammar, style, format,

    and length.

    Articles must not have been published previously

    without the knowledge of the publisher. Submission

    of a manuscript implies the author's assurance that

    the same work has not been, will not be, and is not

    currently submitted elsewhere.

    Authors will be required to sign a release form before

    the article is published; this agreement is available

    upon request (contact [email protected]).

    The Journal will not publish articles that market, advertise, or promote one particular product

    or company.

    Submissions

    tdwi.org/journalsubmissions

    Materials should be submitted to:

    Jennifer Agee, Managing Editor

    E-mail: [email protected]

    Upcoming Submissions Deadlines

    Volume 15, Number 4

    Submission Deadline: September 3, 2010

    Distribution Date: December 2010

    Volume 16, Number 1

    Submission Deadline: December 17, 2010

    Distribution Date: March 2011

    Editorial Calendar and Instructions for Authors