workshop on internet and bigdata finance (wibf 14)
DESCRIPTION
Workshop on Internet and BigData Finance (WIBF 14). A Big Data Analytical Framework for Portfolio Optimization - Dhanya Jothimani , Ravi Shankar, Surendra S Yadav Department of Management Studies Indian Institute of Technology Delhi India. Flow of Presentation. Introduction - PowerPoint PPT PresentationTRANSCRIPT
Workshop on Internet and BigData Finance (WIBF 14)
A Big Data Analytical Framework for Portfolio
Optimization -Dhanya Jothimani, Ravi Shankar, Surendra S
YadavDepartment of Management StudiesIndian Institute of Technology Delhi
IndiaApril 19, 2023
Flow of PresentationIntroduction
Scope of Work
Gaps Identified
Objective
Proposed Framework
Expected Outcomes
Conclusion
ReferencesApril 19, 2023
2
WIBF 14
Introduction Big Data: Few hundred Gigabytes, Terabytes
or Zettabytes! (Qin et al., 2012)
Dimensions: Volume, Velocity, Variety ,
Variability, Complexity, Low value density (Singh, 2012)
Use of big data technologies in capital firms (Verma & Mani, 2012)
Next Big Challenge for capital markets: To
handle the velocity of data being produced (Verma & Mani, 2012)
April 19, 20233
WIBF 14
IntroductionTypes of Capital Market Data
◦ Reference Data
◦ Fundamental Data (corporate financials,
etc)
◦ News (Earning report, Economic News,
etc)
◦ Social Media (Market Sentiment, etc) (Rauchman & Nazaruk,
2013)April 19, 2023
4
WIBF 14
Introduction Portfolio: Collection of assets Portfolio Optimization: Process of making
investment decisions on holding a set of financial assets to meet various criteria
Criteria: ◦ Maximizing Return
◦ Minimizing Risk (Markowitz, 1952)
Major Steps: ◦ Asset selection
◦ Asset Weighting
◦ Asset ManagementApril 19, 2023
5
WIBF 14
Scope of the work
Assets range from stocks, bonds
to real estate
Scope of the framework is limited
to Stock Analysis
April 19, 20236
WIBF 14
Previous WorksS.No
Theme Authors (Year)
1. Prediction of stock prices using Social Media/News
Xu (2012), Bollen at al (2011), Schumaker and Chen (2009), Quah (2008)
2. Portfolio Optimization using data mining approaches
Shen et al (2012), Zuo and Kita (2012), Gemela (2001), Karpio et al (2013) , Li and Cheng (2010)
3. Portfolio Optimization using heuristics
Gupta et al (2012), Bakkar et al (2010)
4. Use of Quantitative data for Portfolio optimization
Chen (2008), Edirisinghe and Zhang (2008), Dia (2009), Ismail et al. (2012), Patari et al (2012)
April 19, 20237
WIBF 14
Gaps IdentifiedTo the knowledge of the authors:
◦No framework handles both structured
and unstructured data for portfolio
optimization
◦Generally, quantitative data is
considered for fundamental analysis
April 19, 20238
WIBF 14
ObjectiveTo propose a framework that
considers both stock price data and current affairs data (i.e news articles, market sentiments) to help investors to make an informed investment decision
April 19, 20239
WIBF 14
Proposed Framework 5- stage Process
◦Stage 1: Selection of Stocks
◦Stage 2: Validation of Stocks
◦Stage 3: Stock Clustering
◦Stage 4: Stock Ranking
◦Stage 5: Optimization
April 19, 202310
WIBF 14
April 19, 202311
WIBF 14
Fig 1: Proposed Framework
Stage 1: Selection of Stocks Input: Listed firms in a particular stock
exchange as DMU
Output: Efficient firms
Methodology: DEA, a non-parametric
linear programming (Cooper et al., 2004)
◦ Input parameters: Total assets, Total equity,
Cost of sales and Operating expenses
◦ Output parameters: Net sales and Net income
(Chen, 2008; Chen & Chen, 2011)
April 19, 202312
WIBF 14
Stage 2: Validation of StocksInput: Tweets and News articles of
efficient firms
Output: Validated firms
Methodology: Sentiment analysis using
Distributed text mining
Software framework: Hadoop
MapReduce
◦ Properties: Ease-of-use, scalability and failover
properties (Dittrich & Ruiz, 2012)April 19, 2023
13
WIBF 14
Effect of Management on Stock Price
April 19, 202314
WIBF 14
Source: http://www.firstbiz.com/data/graphic-how-infosys-stock-moved-since-nr-narayana-murthys-return-86318.html
April 19, 202315
WIBF 14
Stage 2: Validation of Stocks
Figure 2: Hadoop Framework for Distributed Text Mining
Stage 3: Stock ClusteringInput: Validated firms
Output: Clusters of firms
Methodology:
◦ Correlation co-efficient of returns of stocks
◦ Clustering algorithm: Louvian or k-means
clustering (Blondel et al., 2008)
◦ Number and Quality of clusters: Maximize similarity within a cluster Minimize similarity between clusters
April 19, 202316
WIBF 14
Stage 4: Stock Ranking Selection of appropriate stocks from each
cluster
Input: Stocks in each cluster
Output: Stocks being ranked in each cluster
Methodology: ANN (Ware, 2005)
◦ Input parameters: GDP growth rate, Interest rate
◦ Output parameters: Future ROI (Koochakzadeh,
2013)
April 19, 202317
WIBF 14
Stage 5: OptimizationHow much to invest in each stock?!
Input: Ranked stocks
Methodology: Output: Top 3 portfolio
suggestions
◦ Markowitz Mean-Variance model (Markowitz,
1952)
◦ Optimization heuristics
April 19, 202318
WIBF 14
Expected OutcomesStage 1: Asset selection
Stage 2: Consideration of Qualitative
aspect of firms
Stage 3: Diversification of risk
Stage 4: Aids in the selection of
appropriate stocks
Stage 5: Top 3 portfolio suggestions to
the investorsApril 19, 2023
19
WIBF 14
Conclusion Informed investment decisions
Considers both structured and unstructured data
Flexibility to investors for selecting appropriate
stocks
Applicability to any stock data
Limitation:
◦ Considers only stock analysis
◦ Asset management is not considered
Future Work: Implementation of the framework
April 19, 202320
WIBF 14
References E. Patari, T. Leivo, and S. Honkapuro, "Enhancement of equity portfolio
performance using data envelopment analysis", European Journal of Operational Research, vol. 220, no. 3, pp. 786–797, 2012
H.-H. Chen, “Stock selection using data envelopment analysis.,” Industrial Management and Data Systems, vol. 108, no. 9, pp. 1255–1268, 2008.
H. Markowitz, “Portfolio selection,” The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.
J. Bollen, H. Mao, and X Zeng, "Twitter mood predicts the stock market.", Journal of Computational Science, vol. 2, no. 1, pp. 1-8, 2011
J. Dittrich and J.-A. Quian S -Ruiz, “Efficient big data processing in hadoop mapreduce,” eProc. VLDB Endow., vol. 5, pp. 2014–2015, Aug. 2012.
J. Gemela, "Financial analysis using bayesian networks", Applied Stochastic Models in Business and Industry, vol. 17, no. 1, pp. 57-67, 2001
K. Karpio, P. Lukasiewicz, A. Orlowski, and T. Zabkowski, "Mining associations on the Warsaw stock exchange." Proceedings of the 6th Polish Symposium of Physics in Economy and Social Sciences (FENS2012), vol. 123, no. 3, pp. 553–559, 2013
L. Bakker, W. Hare, H. Khosravi, and B. Ramadanovic, "A social network model of investment behaviour in the stock market.", Physica A: Statistical Mechanics and its Applications, vol. 389, no. 6, pp. 1223 – 1229, 2010
April 19, 202321
WIBF 14
References M. Dia, "A portfolio selection methodology based on data envelopment
analysis.", INFOR, vol. 47, no. 1, pp. 51–57, 2009 M. Ismail, N. Salamudin, N. Rahman, and B. Kamaruddin, "DEA portfolio
selection in Malaysian stock market.", 2012 International Conference on Innovation Management and Technology Research (ICIMTR), pp. 739–743, 2012
M. Rauchman and A. Nazaruk, “Big Data in Capital Markets” Available online at: http://www.sigmod.org/2013/keynote1_slides.pdf, Accessed on: May 30, 2014
N. C. P. Edirisinghe and X. Zhang, "Portfolio selection under DEA-based relative financial strength indicators: case of US industries", Journal of the Operational Research Society, vol. 59, no. 6, pp. 842–856, 2007
N. Koochakzadeh, A heuristic stock portfolio optimization approach based on data mining techniques. PhD thesis, Department of Computer Science, University of Calgary, March 2013.
N. S. Sachchidanand Singh, “Big data analytics,” in International Conference on Communication, Information & Computing Technology (ICCICT), pp. 1-4, 2012.
P. Gupta, G. Mittal, and M. Mehlawat, "A multicriteria optimization model of portfolio rebalancing with transaction costs in fuzzy environment." Memetic Computing, pages 1–14, 2012
R. P. Schumakerand H. Chen, "Textual analysis of stock market prediction using breaking financial news: The AZFin text system.", ACM Transactions Information Systems, vol. 27, no. 2, pp. 1–19, 2009
April 19, 202322
WIBF 14
References R. Verma and S. R. Mani, “Use of big data technologies in capital
markets.” Available online at: http://www.infosys.com/industries/financialservices/whitepapers/Documents/big-data-analytics.pdf, 2012, Accessed on: April 2, 2014.
S. Shen, H. Jiang, and T. Zhang, "tock market forecasting using machine learning algorithms. Available online at: http://cs229.stanford.edu/proj2012/ShenJiangZhang-StockMarketForecastingusingMachineLearningAlgorithms.pdf., 2012
S. T. Li and Y. c. Cheng, "A stochastic HMM-based forecasting model for fuzzy time series.", IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 40, no. 5, pp. 1255–1266, 2010
S. Y. Xu, "Stock price forecasting using information from yahoo finance and google trend."Available online at:https://www.econ.berkeley.edu/sites/default/files/Selene, 2012
T. S. Quah, "DJIA stock selection assisted by neural network", Expert Systems with Applications, vol. 35, no. 12, pp. 50 – 58, 2008
T. Ware, “Adaptive statistical evaluation tools for equity ranking models.” Submitted to: Canadian Industrial Problem Solving Workshops (Calgary, Canada, May 15-19, 2005), 2005.
V. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” Journal of Statistical Mechanics: Theory and Experiment, p. 10008, 2008.
April 19, 202323
WIBF 14
References W. Cooper, L. Seiford, and J. Zhu, “Data envelopment analysis:
History, models and interpretations,” in Handbook on Data Envelopment Analysis (W. W. Cooper, L. M. Seiford, and J. Zhu, eds.), vol. 71 of International Series in Operations Research & Management Science, pp. 1–39, Springer US, 2004.
X. Qin, H. Wang, F. Li, B. Zhou, Y. Cao, C. Li, H. Chen, X. Zhou, X. Du, and S. Wang, “Beyond simple integration of RDBMS and MapReduce – Paving the way toward a unified system for big data analytics: Vision and progress,” in Proceedings of the 2012 Second International Conference on Cloud and Green Computing, CGC ’12, (Washington, DC, USA), pp. 716–725, IEEE Computer Society, 2012.
Y. S. Chen and B. S. Chen, “Applying DEA, MPI and Grey model to explore the operation performance of the Taiwanese wafer fabrication industry,” Technological forecasting and social change, vol. 78, no. 3, pp. 536–546, 2011.
Y. Zuo and E. Kita, "Stock price forecast using bayesian network", Expert Systems with Applications, vol. 39, no. 8, pp. 6729–6737, 2012
April 19, 202324
WIBF 14
April 19, 202325
WIBF 14
Thank You