CIS 600: Master's Project
Online Trading and Data Mining-Based Marketing of IT Books
Supervisor : Dr. Haiping Xu
Student : Tsung-Ta Tu
Student ID : 999-20-1529
Outline
1. Introduction and Motivation
2. Data Mining Technology
3. System Architecture & Demo
4. Analyze and Discuss The Result
5. Conclusion
6. Future work
Introduction and Motivation
In the Internet era, each e-commerce website maintains a large database of customer transactions, where each transaction consists of the set of items purchased by a customer in one visit.
The data in this database is treasure, not garbage: analyzing it can answer several business questions.
Introduction and Motivation (2)
Questions:
(1) How to keep in touch with a growing number of customers?
(2) What are the customers' characteristics, needs, and purchasing patterns?
(3) How to design attractive product bundles that give customers more convenient shopping options?
Data Mining Techniques
(1) Association Rules
(2) Classification
(3) Clustering
(4) Neural Network
(5) Generalization
Association Rules
An association rule expresses an association relationship among a set of objects in a database, such as “occur together” or “one implies the other”.
Intuitively, a rule X => Y means that transactions in the database that contain X tend to also contain Y.
Association Rules (2)
The basic process of association rule analysis involves three important concerns:
(1) Choosing the right set of items
(2) Generating rules by deciphering the counts in the co-occurrence matrix
(3) Overcoming the practical limits imposed by thousands or tens of thousands of items appearing in combinations large enough to be interesting
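The co-occurrence counts mentioned in concern (2) can be illustrated with a minimal Python sketch. The transactions and the function name `cooccurrence_counts` are hypothetical and are not taken from the project; the sketch only counts item pairs, which is the simplest case of a co-occurrence matrix.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each is the set of items bought in one visit.
transactions = [
    {"java", "sql", "uml"},
    {"java", "sql"},
    {"java", "uml"},
    {"sql"},
]


def cooccurrence_counts(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for items in transactions:
        # Sort first so every pair is stored under one canonical ordering.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


pair_counts = cooccurrence_counts(transactions)
for pair, count in sorted(pair_counts.items()):
    print(pair, count)
```

The number of pairs grows quadratically with the number of distinct items, and larger combinations grow faster still, which is the practical limit that concern (3) refers to.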
An Example
An example of an association rule is: “75% of transactions that contain diapers also contain beer; 37.5% of all transactions contain both of these items.” Here 75% is the confidence of the rule, and 37.5% is its support.
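The two measures in this example can be reproduced with a short Python sketch. The eight transactions below are invented so that the numbers match the slide (three of eight contain both items, four of eight contain diapers); the function names `support` and `confidence` are illustrative only.

```python
def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions, lhs, rhs):
    """Fraction of transactions containing `lhs` that also contain `rhs`."""
    both = support(transactions, set(lhs) | set(rhs))
    return both / support(transactions, lhs)


# Hypothetical data chosen to match the diapers/beer figures above.
transactions = [
    {"diapers", "beer"},
    {"diapers", "beer"},
    {"diapers", "beer"},
    {"diapers"},
    {"beer"},
    {"milk"},
    {"milk"},
    {"bread"},
]

print(support(transactions, {"diapers", "beer"}))       # 0.375
print(confidence(transactions, {"diapers"}, {"beer"}))  # 0.75
```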
Jason Manager of IT Book
System Architecture and Skills
Ⅰ. System Architecture (3-Tier):
(1) Server Side: Oracle 9.0.2 Database + Windows XP
(2) Application Side: Tomcat 5.0.18 + Windows XP
(3) Client Side: IE 6.0 + Windows XP
Ⅱ. Skills:
(1) UML
(2) HTML, JavaScript
(3) Java Programming Language (J2SDK)
(5) JSP, Java Servlet
(6) JDBC, Java Bean
(8) Oracle SQL, PL/SQL (Trigger, Procedure, Function)
(9) Oracle Database Management
Use Case Diagram (Customers)
[Figure: use case diagram for the Customers actor, with <<extend>> and <<include>> relationships among the use cases.]
Use cases: Search Books, Check Top10 Books, Create Customer Profile, Update Customer Profile, Payment, View Book Information, View Customer Profile, Place order for book, View Order History
Use Case Diagram (Manager)
[Figure: use case diagram for the Manager actor, with <<extend>> relationships among the use cases.]
Use cases: Add Book, Update Book Information, Remove Book, Add Package for on Sale, Update Package Information, Remove Package, Check Books Information, Analyze Association Rules of Books, Check on Sale List
Class Diagram
[Figure: class diagram. Associations carry the multiplicities 1, 0..*, and 1..*; the diagram also carries the label FIFO.]
Entity classes and attributes:
Management: man_id, man_name, man_psw, man_state
Province: prov_id, prov_name, prov_cdate, prov_state
On Sale List: osl_id, osl_name, osl_cost, osl_disc, osl_price, osl_sd, osl_ed, osl_summary, osl_tsq, osl_sq, osl_state
On Sale List Relation Item: osi_id, osl_id, bk_id, bk_isbn
Publisher: pub_id, pub_name, pub_cont, pub_boss, pub_addr, pub_tel, pub_fax
Book_Classification: bc_id, bc_sname, bc_name, bc_state
Book Information: bk_id, bk_isbn, bk_name, bk_cls, bk_aut, bk_pd, bk_vr, bk_cost, bk_price, bk_disc, bk_tsq, bk_sq, bk_summary, bk_photo, bk_state
Customer: cus_id, cus_nf, cus_nm, cus_nl, cus_psw, cus_tel, cus_email, cus_stree, cus_city, cus_zip
Customer Order: co_id, co_date, co_total, co_state, co_cno, co_cdate
Customer Order Item: coi_id, bk_isbn, coi_cut, coi_sal
Credit Card: cc_id, cc_name, cc_state
Database classes (each with check_, add_, and remove_ operations):
Province DB, Customer DB, On Sale List DB, Publisher DB, Book Information DB, Customer Order DB
Display System
[Screenshots: Jason Manager of IT Book]
Screens shown: Connect to Jason, Select Book Information, Search Book Information, Book Information, Login, My Profile, Place Order, Shopping Cart, Order Information
Manager
Screens shown: Select Classification, Select Book, Profit Association Rule, Promotion
Analyze and Discuss The Result
Association rules help us find associations among the items in transactions, but depending on them too heavily ignores other factors that influence customer behavior.
For example, the classification and the sales quantity of an item are also important factors to consider.
Analyze and Discuss The Result (2)
Is the most confident rule the best rule? There is a problem: a rule can be worse than simply guessing that A appears in the transaction.
For example, if A occurs in 45 percent of the transactions but the rule predicts A with only 33 percent confidence, the rule does worse than random guessing.
Improvement
Improvement tells how much better a rule is at predicting the result than just assuming the result in the first place. It is given by the following formula:
Improvement = [P(A^B) / P(A)] / P(B)
Improvement (2)
When improvement is greater than 1, the rule is better at predicting the result than random chance.
When it is less than 1, the rule is worse than random chance.
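As a quick check of this threshold, the following Python sketch applies the formula to the earlier case of a rule with 33 percent confidence whose result occurs in 45 percent of the transactions. The function names are illustrative and not part of the project.

```python
def improvement(confidence, p_result):
    """Ratio of a rule's confidence to the base rate of the predicted result."""
    return confidence / p_result


def improvement_from_probabilities(p_a_and_b, p_a, p_b):
    """Improvement of the rule A => B, i.e. [P(A^B) / P(A)] / P(B)."""
    return improvement(p_a_and_b / p_a, p_b)


# The rule from the earlier slide: 33% confidence, result occurs in 45%.
ratio = improvement(0.33, 0.45)
print(round(ratio, 3))  # 0.733
```

The ratio is below 1, which matches the observation that this rule does worse than random guessing.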
The Profit Association Rules
Profit association rules consider not only the basic concept of association rules but also other influencing factors.
The three major portions of a profit association rule are:
(1) Frequency
(2) Quantity
(3) Auxiliary
Each estimate is given a weight to calculate the final value.
Frequency Portion
(1) Support : P(A^B)
(2) Confidence : P(A^B) / P (A)
(3) Improvement : [P(A^B) / P (A)] / P(B)
Quantity Portion
(1) B’s share of the sales quantity of its classification = Q(B) / Q(CB)
(2) A’s share of the sales quantity of its classification = Q(A) / Q(CA)
(3) Comparative quantity = Q(B) / Q(A)
Auxiliary Portion
A and B have the same author
A and B are in the same classification
Whether A is in the top 10 list or not
Whether B is in the top 10 list or not
Etc.
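The slides do not state the weights or the exact way the three portions are combined, so the following Python sketch is only one possible reading. It assumes each portion has already been reduced to a single score between 0 and 1 and that the final value is a weighted sum. The names `PortionScores`, `auxiliary_score`, and `profit_score`, and the weights 0.5, 0.3, and 0.2, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PortionScores:
    """One score per portion, each assumed to lie in [0, 1]."""
    frequency: float  # e.g. derived from support, confidence, improvement
    quantity: float   # e.g. derived from the Q(B)/Q(CB)-style ratios
    auxiliary: float  # e.g. derived from the yes/no auxiliary checks


def auxiliary_score(flags):
    """Assumed reduction: the fraction of auxiliary checks that hold."""
    flags = list(flags)
    return sum(1 for f in flags if f) / len(flags)


def profit_score(scores, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three portions; the weights are placeholders."""
    w_freq, w_qty, w_aux = weights
    return (w_freq * scores.frequency
            + w_qty * scores.quantity
            + w_aux * scores.auxiliary)


# Hypothetical rule: same author (yes), same classification (no),
# A in top 10 (yes), B in top 10 (no).
aux = auxiliary_score([True, False, True, False])
value = profit_score(PortionScores(frequency=0.8, quantity=0.5, auxiliary=aux))
print(value)
```

Keeping the weights as a parameter makes it straightforward to tune them later.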
Case Study (1)
Case Study (2)
Case Study (3)
Conclusion
Profit association rules provide an evaluation value that helps the marketing manager make business decisions, including:
(1) Catalog design
(2) What to put on sale
(3) How to design coupons
(4) Cross-marketing
Future work
Optimize the weight factors of the Profit Association Rule.
Integrate this system into a CRM system (Data Warehouse, Data Mining, Call Center).
Use AI technology to make Jason Manager more like a human being.
Refine the domain knowledge that drives business intelligence (BI).
Thank you