The performance of sentiment features of MD&As for financial misstatement prediction: A comparison of deep learning and “bag-of-words” approach By Ting(Sophia) Sun, Yue Liu, and Miklos A. Vasarhelyi

Page 1: The performance of sentiment features of MD&As for ...raw.rutgers.edu/docs/wcars/41wcars2/Sun.pdf · By Ting(Sophia) Sun, Yue Liu, and Miklos A. Vasarhelyi. Deep learning Deep learning

The performance of sentiment features of MD&As for financial misstatement prediction: A comparison of deep learning and “bag-of-words” approach

By Ting(Sophia) Sun, Yue Liu, and Miklos A. Vasarhelyi

Deep learning

Deep learning loosely mimics how the human brain processes information, so that a machine can learn to interpret data in a more human-like way.

“The general idea of deep learning is to use neural networks to build multiple layers of abstraction to solve a complex semantic problem.”

-- Aaron Chavez, formerly chief scientist at Alchemy API

Biological Neurons

[Figure: a biological neuron, labeled with dendrites, soma, axon, terminal branches of axon, and electrical impulse]

Deep neural network

[Figure: layers of abstraction in a deep neural network applied to images: pixels, then edges, then object parts (combinations of edges), then object models]
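The idea of "multiple layers of abstraction" can be illustrated with a very small feed-forward network in which each layer transforms the output of the layer before it. This is only a sketch: the layer sizes, weights, and the sigmoid activation are hypothetical choices made for illustration, not the network behind any tool mentioned in the slides.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def forward(inputs, layers):
    """Pass the input through each layer in turn and keep every activation."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(dense(activations[-1], weights, biases))
    return activations


# Hypothetical hand-picked weights: 4 inputs -> 3 units -> 2 units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, 0.2, -0.1], [0.6, 0.1, -0.5, 0.3]],
     [0.0, 0.1, -0.1]),
    ([[0.7, -0.4, 0.2], [-0.6, 0.5, 0.9]], [0.05, -0.05]),
    ([[1.2, -0.8]], [0.0]),
]

activations = forward([0.9, 0.1, 0.4, 0.7], layers)
for depth, values in enumerate(activations):
    print(f"layer {depth}: {len(values)} values")
```

In a trained network the weights are learned from data, and the intermediate activations play the role of the increasingly abstract features shown in the figure.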

Research questions

(1) Do sentiment features add information for financial misreporting prediction?

(2) If they do, are they effective only for fraud prediction or for misstatement including both fraud and error?

(3) How effective is the model using deep learning based sentiment features compared with the model using sentiment features obtained by the bag of words approach?

Sentiment analysis approaches

| | Deep learning approach | Bag of words approach |
|---|---|---|
| Description of the technique | Emerging technique employing deep hierarchical neural network and trained with a large amount of text files | Prevalent technique using various word lists (dictionary), with each one representing a particular sentiment feature |
| Rationale | “understand” the meaning of a text file | count the frequency of the words originated from a specific dictionary |
| Output sentiment feature | Sentiment scores | sentiment scores (positive score - negative score) |
| Is there prior literature in accounting and auditing domain | No | Yes |
| Tool | Alchemy language API | Loughran and McDonald (2011a) |
| Is it a finance-specific tool | No | Yes |
| Required text document | HTML/text document and webpage | HTML/text document |
| Does it need data preprocessing | No | Yes |
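The bag of words rationale (count words from a dictionary, then take positive score minus negative score) can be sketched in a few lines. The word lists below are tiny hypothetical stand-ins, not the Loughran and McDonald lists, and using raw counts without normalizing by document length is an assumption.

```python
import re

# Hypothetical mini word lists for illustration only; real finance-specific
# dictionaries are far larger.
POSITIVE_WORDS = {"improve", "improved", "gain", "strong", "favorable"}
NEGATIVE_WORDS = {"loss", "decline", "impairment", "adverse", "weak"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text):
    """Return (positive - negative, positive count, negative count)."""
    tokens = tokenize(text)
    positive = sum(1 for t in tokens if t in POSITIVE_WORDS)
    negative = sum(1 for t in tokens if t in NEGATIVE_WORDS)
    return positive - negative, positive, negative


sample = ("Revenue improved on strong demand, offset by an impairment loss "
          "and adverse currency effects.")
score, positive, negative = sentiment_score(sample)
print(score, positive, negative)  # -1 2 3
```

The tokenizing step is one example of the data preprocessing that the table lists as required for this approach.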

Sentiment features and misstatements

• We analyzed 31,466 MD&As of 10-K filings for fiscal years 2006 to 2015 using the deep learning and “bag of words” approaches separately.

• With the deep learning approach, we obtained Sentiment_DL and Joy

• With the bag of words approach, we obtained Sentiment_TM

• Misstatement samples:

• Restatements caused by financial misreporting for the fiscal years in our MD&A sample

• Misstatement = 1 if there is a restatement as disclosed by Audit Analytics, and 0 otherwise

• 321 out of 31,466 observations are identified as misstatements (severe data imbalance issue)
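To make the imbalance concrete, the two counts reported above are enough to compute the share of misstatements and the accuracy of a trivial model that never predicts a misstatement. The variable names are illustrative.

```python
TOTAL_OBSERVATIONS = 31466  # MD&As in the sample (from the slide)
MISSTATEMENTS = 321         # observations with Misstatement = 1 (from the slide)

non_misstatements = TOTAL_OBSERVATIONS - MISSTATEMENTS

# Share of the positive (misstatement) class.
positive_rate = MISSTATEMENTS / TOTAL_OBSERVATIONS

# Accuracy of a model that labels every observation as "no misstatement".
baseline_accuracy = non_misstatements / TOTAL_OBSERVATIONS

# Number of clean observations for each misstatement.
negatives_per_positive = non_misstatements / MISSTATEMENTS

print(f"misstatement rate:      {positive_rate:.2%}")      # 1.02%
print(f"all-negative accuracy:  {baseline_accuracy:.2%}")  # 98.98%
print(f"negatives per positive: {negatives_per_positive:.1f}")
```

Because predicting "no misstatement" everywhere is already about 99% accurate, accuracy alone says little here, which is why measures such as AUC and the false positive rate matter for this sample.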

Classification models: CHAID (Chi-squared Automatic Interaction Detection) algorithm
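CHAID grows a decision tree by choosing, at each node, the predictor most strongly associated with the target according to a chi-square test. The sketch below shows only that core step with a Pearson chi-square statistic. It is a simplification: full CHAID also merges similar categories and compares adjusted p-values rather than raw statistics, and the counts and predictor names here are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def best_split(candidates):
    """Return the predictor whose table has the largest chi-square statistic.

    Comparing raw statistics is only fair when all tables have the same
    shape, as they do in this example.
    """
    return max(candidates, key=lambda name: chi_square(candidates[name]))


# Hypothetical counts: rows are predictor categories (low, high),
# columns are [non-misstatement, misstatement].
candidates = {
    "sentiment_bin": [[480, 20], [440, 60]],
    "size_bin": [[465, 35], [455, 45]],
}

for name, table in candidates.items():
    print(name, round(chi_square(table), 2))
print("split on:", best_split(candidates))
```

With these made-up counts the sentiment predictor separates the two classes more sharply, so it would be chosen for the split.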

Results: Top 10 important predictors

Prediction results for testing data

• Answers to RQs:

(1) Do sentiment features add information for financial misreporting prediction?

• Yes

(2) If they do, are they effective only for fraud prediction or for misstatement including both fraud and error?

• Fraud prediction

(3) How effective is the model using deep learning based sentiment features compared with the model using sentiment features obtained by the bag of words approach?

• Improvement of effectiveness in terms of accuracy, AUC, false positive rates,

• Conclusions:

• Considering its effectiveness and efficiency, deep learning based textual analysis is a promising technique for audit analytics

• Its predictive performance is expected to improve if a finance-specific deep learning model is developed.

• Future work:

• Increase the sample of financial misstatements (currently the entire sample comes from the Audit Analytics database) and use AAERs (Accounting and Auditing Enforcement Releases)

• Decrease false positives

• Increase overall accuracy

• Use deep learning as the main classification model
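The evaluation measures named in the answer to RQ (3), accuracy, AUC, and false positive rate, can be computed from true labels and model scores as sketched below. The labels, scores, and the 0.5 decision threshold are hypothetical and are not results from the study; AUC is computed here as the probability that a randomly chosen misstatement receives a higher score than a randomly chosen non-misstatement.

```python
def confusion_counts(labels, predictions):
    """Return (tp, tn, fp, fn) for binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(labels, predictions):
    tp, tn, fp, fn = confusion_counts(labels, predictions)
    return (tp + tn) / len(labels)


def false_positive_rate(labels, predictions):
    tp, tn, fp, fn = confusion_counts(labels, predictions)
    return fp / (fp + tn)


def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly,
    with ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test data: 1 = misstatement, 0 = no misstatement.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.5, 0.9]
predictions = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(labels, predictions))                       # 0.75
print(round(false_positive_rate(labels, predictions), 3))  # 0.333
print(round(auc(labels, scores), 3))                       # 0.833
```

Lowering false positives, one of the future work items, corresponds to reducing the second number without giving up too many detected misstatements.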