Post on 22-Dec-2015
Data Mining for Management and E-commerce
By Johnny Lee
Department of Accounting and Information Systems
University of Utah
Agenda
1. Microeconomic view of Data Mining
2. A Survey of recommendation systems in E-commerce
3. Turning Data Mining into a management science tool
A Microeconomic View of Data Mining
• Kleinberg et al. 1998
• Research questions: What is the economic utility of data mining? How can we determine whether a DM result is interesting?
A Microeconomic View of Data Mining
• “Interesting pattern”
– Confidence and support
  • (High balance → High income)
– Information content
  • ?
– Unexpectedness
  • (Super Bowl result → stock price)
– Actionability
  • $, $, $ ...
A Microeconomic View of Data Mining
• Value of data mining
– Computing power and data make optimization over un-aggregated data possible
– Study of the intricate ways (correlations and clusters in the data) that affect the enterprise’s optimal DECISION
A Microeconomic View of Data Mining
Value of DM
Firm: $\max_{x \in D} f(x)$
$f(x) = \sum_{i \in C} f_i(x)$
$f_i(x) = g(x, y_i)$
$y_i$ = customer data
$\max_{x \in D} \sum_{i \in C} g(x, y_i)$
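A small sketch may help show why the form of g matters. If g is linear in the customer data y_i, optimizing against the average customer picks the same decision as optimizing over every individual record, so un-aggregated data adds nothing. If g is non-linear in y_i, the two can disagree. The decisions, customer values, and both payoff functions below are hypothetical, chosen only for illustration.

```python
# Hypothetical toy data: each y is one customer's "taste" value,
# each x is a candidate decision available to the firm.
customers = [1.0, 2.0, 9.0]
decisions = [1.0, 2.0, 4.0, 9.0]

def g_linear(x, y):
    # Payoff that is linear in y: only the sum (or mean) of y matters.
    return x * y - x * x

def g_nonlinear(x, y):
    # Payoff that is non-linear in y: better the closer x is to y.
    return -abs(x - y)

def best_decision_full(g, xs, ys):
    # Maximize sum_i g(x, y_i) over un-aggregated customer data.
    return max(xs, key=lambda x: sum(g(x, y) for y in ys))

def best_decision_aggregate(g, xs, ys):
    # Maximize g(x, mean(y)), i.e. use only aggregate data.
    y_bar = sum(ys) / len(ys)
    return max(xs, key=lambda x: g(x, y_bar))

for g in (g_linear, g_nonlinear):
    print(g.__name__,
          best_decision_full(g, decisions, customers),
          best_decision_aggregate(g, decisions, customers))
```

With the linear payoff both approaches choose the same decision; with the non-linear payoff the aggregate view chooses a different one, which is the case where mining individual records has value.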
Example one
If (demand for beer) is not related to (demand for diapers), then no DM is needed
If (demand of beer +demand of diaper)=(supply of beer-demand of beer)+*(supply of diaper- demand of diaper)+
then DM is needed
Example 2
Phone rates and users
• Without data mining: experimenting with arbitrary clusters
• With data mining: optimize the profit by best matching customers and strategies
$\max_{(X_1, X_2) \in D^2} \sum_{i \in C} \max\{c_i \cdot X_1,\ c_i \cdot X_2\}$
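Reading the objective of Example 2 as "pick two strategies from D and let each customer be served by whichever of the two pays more", a brute-force sketch could look like the following. The customer vectors, the candidate plans, and the function names are hypothetical, and the payoff of customer i under plan X is assumed to be the dot product of c_i and X.

```python
from itertools import combinations

def dot(c, x):
    # Payoff of a customer with value vector c under plan x.
    return sum(ci * xi for ci, xi in zip(c, x))

def best_two_plans(customers, candidates):
    """Try every pair (X1, X2) of candidate plans and return the pair that
    maximizes sum_i max(c_i . X1, c_i . X2), together with that total."""
    def total(pair):
        x1, x2 = pair
        return sum(max(dot(c, x1), dot(c, x2)) for c in customers)
    # Distinct pairs suffice: offering the same plan twice never beats
    # adding a second, different plan.
    best = max(combinations(candidates, 2), key=total)
    return best, total(best)

# Hypothetical data: c_i = (value per daytime minute, value per evening minute)
customers = [(3, 1), (4, 0), (0, 5), (1, 4)]
# Hypothetical candidate plans: (daytime minutes, evening minutes)
candidates = [(2, 0), (1, 1), (0, 2)]

print(best_two_plans(customers, candidates))
```

Exhaustive search is only workable for a handful of candidates; it is used here to make the objective concrete, not as a practical algorithm.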
Example 3
• Beer and diapers again
• Mining to decide how to jointly promote items.
• Mining data in rows or columns
• Goal oriented
What is the goal? Generated revenue
• Conflict in action space, what to do?
Contribution
• Automatic pattern filtering system based on economic value
• Rules for manual pattern filtering system
• Rules for determining the trigger point of data mining
A survey of recommendation systems in electronic commerce
• Wei et al. 2001
• Research question: What are the types of e-commerce recommendation systems and how do they work?
E-commerce recommendation Systems
• Suggest items that are of interest to users based on something.
• Something:
– Customer characteristics (demographics)
– Features of items
– User preferences: rating/purchasing history
Framework for Recommendation
[Diagram: features of items, user preferences, and user demographics feed into the recommendation system, which outputs recommendations]
Types of Recommendation
• Prediction of customers' preferences
– Personalized and non-personalized
• Top-N recommended items for customers
– Personalized and non-personalized
• Top-M users who are most likely to purchase an item
Classification of Recommendation Systems
• Popularity-based: best sellers
• Content-based: items with similar features
• Collaborative filtering: users with similar tastes
• Association-based: related items
• Demographic-based: user’s age, gender, ...
• Reputation-based: Represent individual
• Hybrid
Procedures of Content-based
1. Feature extraction and Selection
2. Representation of the item pool by the selected features
3. User profile learning
4. Recommendation
User Profile Learning
• $p_{im}$ = preference score of user i on item m
• $w_j$ = coefficient associated with feature j
• $f_{mj}$ = the value of the j-th feature for item m
• $b$ = bias
$p_{im} = \sum_{j=1}^{k} w_j f_{mj} + b$
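As a minimal sketch of how such a linear profile would be applied once learned, the score is a weighted sum of an item's feature values plus the bias, and items can then be ranked by that score. How the weights are learned is not specified here, so the weights, bias, feature vectors, and function names below are hypothetical.

```python
def preference_score(weights, features, bias=0.0):
    """Linear profile score: sum over features of weight * value, plus bias."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features)) + bias

def top_n(weights, bias, items, n):
    """Rank items (name -> feature vector) by predicted score, best first."""
    ranked = sorted(
        items,
        key=lambda name: preference_score(weights, items[name], bias),
        reverse=True,
    )
    return ranked[:n]

# Hypothetical user profile over k = 3 item features.
weights = [2.0, -1.0, 0.5]
bias = 1.0
# Hypothetical item pool described by the same 3 features.
items = {"A": [1, 0, 0], "B": [0, 1, 1], "C": [1, 0, 1]}

print(top_n(weights, bias, items, 2))
```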
Collaborative Filtering
• Recommend items based on opinions of other similar users
1. Dimension reduction by trimming preference matrix
2. Neighborhood formation for most similar user(s)
3. Recommendation generation
Neighborhood Formation
• Pearson correlation coefficient
• Constrained Pearson correlation coefficient
• Spearman rank correlation coefficient
• Cosine similarity
• Mean-square
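Two of the listed measures, the Pearson correlation coefficient and cosine similarity, can be sketched as follows. The sketch assumes each user's ratings are stored as a dictionary from item to score and that both measures are computed over co-rated items only, which is one common convention rather than the only one. The function names are hypothetical.

```python
from math import sqrt

def co_rated(a, b):
    # Items rated by both users.
    return sorted(set(a) & set(b))

def pearson(a, b):
    """Pearson correlation of two users over their co-rated items."""
    items = co_rated(a, b)
    if len(items) < 2:
        return 0.0
    mean_a = sum(a[i] for i in items) / len(items)
    mean_b = sum(b[i] for i in items) / len(items)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in items)
    den = (sqrt(sum((a[i] - mean_a) ** 2 for i in items))
           * sqrt(sum((b[i] - mean_b) ** 2 for i in items)))
    return num / den if den else 0.0

def cosine(a, b):
    """Cosine of the angle between two users' co-rated rating vectors."""
    items = co_rated(a, b)
    num = sum(a[i] * b[i] for i in items)
    den = (sqrt(sum(a[i] ** 2 for i in items))
           * sqrt(sum(b[i] ** 2 for i in items)))
    return num / den if den else 0.0

# Hypothetical ratings.
ann = {"x": 1, "y": 2, "z": 3}
bob = {"x": 2, "y": 4, "z": 6}
cat = {"x": 3, "y": 2, "z": 1}

print(pearson(ann, bob), pearson(ann, cat), cosine(ann, bob))
```

Pearson removes each user's mean before comparing, so it treats a generous rater and a strict rater with the same ordering as similar; cosine compares the raw rating vectors.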
Neighborhood Selection
• Weight threshold
• Center-based best-k neighbors
• Aggregate-based best-k neighbors
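The first two selection schemes can be sketched once similarities to the target user are known. The sketch assumes a dictionary from user to similarity score, uses a strict "greater than" comparison for the threshold, and reads center-based best-k as taking the k users closest to the target; the aggregate-based variant is not shown. All names and values are hypothetical.

```python
def weight_threshold(similarities, threshold):
    """Keep every user whose similarity to the target exceeds the threshold."""
    return sorted(u for u, s in similarities.items() if s > threshold)

def best_k(similarities, k):
    """Keep the k users most similar to the target user."""
    ranked = sorted(similarities, key=lambda u: similarities[u], reverse=True)
    return ranked[:k]

# Hypothetical similarities between the target user and four other users.
similarities = {"ann": 0.9, "bob": 0.2, "cat": 0.7, "dan": -0.4}

print(weight_threshold(similarities, 0.5))
print(best_k(similarities, 3))
```

A threshold gives neighborhoods of varying size (possibly empty), while best-k fixes the size but may admit weakly similar users.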
Association-based
• Item-correlation for individual users
1. Similarity computing
2. Recommendation generation
• Association rules
– Guns and ammunition
– Cigarette and lighter
– Paper plate and soda
Theory: Complementary goods?
No theory: Co-occurrence?
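Co-occurrence in association rules is usually quantified with support and confidence. A minimal sketch, assuming transactions are stored as sets of item names (the baskets and function names below are hypothetical):

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= set(b)) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing the antecedent that also contain the consequent."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0
    return support(baskets, set(antecedent) | set(consequent)) / base

# Hypothetical transactions.
baskets = [
    {"cigarettes", "lighter"},
    {"cigarettes", "lighter", "soda"},
    {"cigarettes"},
    {"paper plates", "soda"},
]

print(support(baskets, {"cigarettes", "lighter"}))
print(confidence(baskets, {"cigarettes"}, {"lighter"}))
print(confidence(baskets, {"lighter"}, {"cigarettes"}))
```

Both numbers measure co-occurrence only; they say nothing about whether the goods are complements in the economic sense.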
Association-based
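$sim(i, j) = \dfrac{\sum_{u \in U} (p_{ui} - \bar{p}_i)(p_{uj} - \bar{p}_j)}{\sqrt{\sum_{u \in U} (p_{ui} - \bar{p}_u)^2}\ \sqrt{\sum_{u \in U} (p_{uj} - \bar{p}_u)^2}}$
$p_{ui}$ = preference score of user u on item i
$\bar{p}_i$ = average preference score of the i-th item over the set of co-rating users U
$\bar{p}_u$ = average of the u-th user’s preference scores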
Demographics-based
• Items that customers with similar demographic characteristics have bought
– Teens marketing
1. Data transformation: counting-based, expected-value-based (expected # of items), statistics-based
2. Category Preference model learning
3. Recommendation generation
Demographics-based
Methods:
1. Counting-based (frequency threshold)
2. Expected-value-based method
3. Statistics-based method
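One possible reading of the counting-based method is sketched below: count purchases per item category within each demographic group and keep the categories whose share of the group's purchases reaches a frequency threshold. This is an assumption about how the method works, not a description taken from the survey; the data, the threshold, and the function name are hypothetical.

```python
from collections import Counter, defaultdict

def category_preferences(purchases, min_share):
    """purchases: list of (demographic_group, item_category) pairs.
    Returns group -> sorted categories whose share of that group's
    purchases is at least min_share."""
    counts = defaultdict(Counter)
    for group, category in purchases:
        counts[group][category] += 1
    prefs = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        prefs[group] = sorted(
            c for c, n in counter.items() if n / total >= min_share
        )
    return prefs

# Hypothetical purchase log.
purchases = [
    ("teen", "games"), ("teen", "games"), ("teen", "music"), ("teen", "books"),
    ("adult", "books"), ("adult", "books"), ("adult", "music"),
]

print(category_preferences(purchases, 0.5))
```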
Comparison of recommendation approaches
Approach | Input info | Types of recommendation | Degree of personalization
Popularity-based | User preferences | Top-N | Non-personalized
Content-based | Features of items and individual user preferences | Prediction, top-N and top-M users | Personalized
Collaborative filtering | User preferences | Prediction, top-N recommendation | Personalized
Association-based | User preferences | Prediction, top-N recommendation | Personalized
Demographics-based | User demographics & preferences, features of items | Prediction, top-N & top-M | Personalized
Reputation-based | User preferences & reputation matrix | Top-N & possible prediction | Personalized
Contribution
• Provides a systematic way for practitioners to choose among e-commerce recommendation systems
• Lays out the existing approaches
Turning Data Mining into a Management Science Tool: New Algorithms and Empirical Results
• Cooper & Giuffrida 2000
• Research question: How can we improve the performance of the PromoCast (or another market) forecast system by adding local adjustment parameters?
Terminology
• SKU: Stock keeping unit
• KDS: knowledge discovery using SQL
• Management science: ??????????????
KDS
[Flowchart: Start of the rule generation phase. Inputs: sales records (error from the sales forecast) and location data. Steps: bottom-up rule generation, entropy-based rules ranking, rule filtering. Decision: do entropy and confidence satisfy the level decided? (Yes / No). Outcome: corrective action.]
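To make the entropy-based ranking step concrete, one plausible sketch is to compute the Shannon entropy of the outcome classes among the records each rule covers and to rank rules with lower entropy (more consistent outcomes) first. Whether KDS ranks rules exactly this way is an assumption; the rule names, counts, and function names are hypothetical.

```python
from math import log2

def entropy(counts):
    """Shannon entropy, in bits, of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def rank_rules(rules):
    """rules: name -> class counts among the records the rule covers.
    Returns rule names ordered from lowest to highest entropy."""
    return sorted(rules, key=lambda name: entropy(rules[name]))

# Hypothetical rules with counts of records per forecast-error class.
rules = {"r1": [5, 5], "r2": [9, 1], "r3": [10, 0]}

print(rank_rules(rules))
```

A rule whose covered records all fall in one error class has zero entropy, which makes it a natural candidate for a single corrective adjustment.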
Corrective Action
U_12 = 0
U_4_11 = 58
U_3 = 221
U_2 = 1149
U_1 = 3583
Ok = 1115
O_1 = 7
O_2 = 1
O_3 = 0
O_4_11 = 0
O_12 = 0
KDS
• Bottom-up: start from the input database
• No Memory-Bound processing
• Minimal data preprocessing
• Separates the learning phase from the action phase
• Evaluation: for 10117 cases 8.9% ($?)
KDS
• Is this research? Is this a case study?
• Is this a management research?
• Why should I know about it as a researcher/manager/engineer?