evolving dynamic web pages using web mining

Evolving dynamic web pages using web mining Kartik Menon Smart Engineering Systems Laboratory Engineering Management Department University of Missouri-Rolla


Post on 31-Dec-2015



TRANSCRIPT

Page 1: Evolving dynamic web pages using web mining

Evolving dynamic web pages using web mining

Kartik Menon
Smart Engineering Systems Laboratory
Engineering Management Department
University of Missouri-Rolla

Page 2: Evolving dynamic web pages using web mining

Overview
• Goal
• Web Mining
• General Principle behind web mining
• Web Data
• Web Access Pattern Clustering
• Evolving web pages using cluster information
• Clustering Techniques
• Fuzzy C means
• Experimental Set-up
• Results
• Conclusion and Future work
• Questions

Page 3: Evolving dynamic web pages using web mining

Goal

Cluster similar web access traversal patterns, train the system to understand the needs and demands of the different users accessing the website, and use this information to evolve web pages.

Page 4: Evolving dynamic web pages using web mining

Web Mining

• Web mining: learning about the different users accessing a web page
• The needs and requirements of the user
• Web access traversal patterns
• Links which are more popular than others
• For example, www.yahoo.com
  » Emails
  » Search engine
  » News
  » Greeting cards

Page 5: Evolving dynamic web pages using web mining

General Principle behind web mining

• Gather web data from web server logs
• Cluster web traversal patterns
• Evolve web pages

Page 6: Evolving dynamic web pages using web mining

Web Data

• What information is important for mining
  – Links traversed (URLs requested)
  – Documents downloaded
  – Time spent on the web page compared with the total time spent
  – Web traffic
  – GET or POST messages
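As a minimal sketch of how the fields listed above might be pulled out of a server log: the slides do not say which log format the campus server used, so the Common Log Format is assumed here, and `parse_log_line` and the sample line are hypothetical names and data made up for illustration.

```python
import re
from datetime import datetime

# Assumption: entries follow the Common Log Format
# (host ident user [time] "request" status size).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_log_line(line):
    """Return a dict with ip, time, method, url, status and size, or None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A "-" size means no body was returned.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical log line, not taken from the experiment.
sample = '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /students HTTP/1.0" 200 2326'
record = parse_log_line(sample)
print(record["ip"], record["method"], record["url"])
```

The request method covers the GET or POST item, the URL covers the links traversed, and differences between consecutive timestamps for the same visitor could serve as a rough estimate of time spent on a page.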

Page 7: Evolving dynamic web pages using web mining

Web Access Pattern Clustering

• Find users with similar web access patterns
• Grouping and separating users
• Concise representation of a system's behavior
• Generalize about user needs and interests

Page 8: Evolving dynamic web pages using web mining

Evolving Web Pages using cluster information

• The cluster information can be used
  – To know about users
  – To modify the web page
  – For web personalization
  – For evolving web pages

Page 9: Evolving dynamic web pages using web mining

Clustering Techniques

• Neural Nets
  – Kohonen’s Self Organizing Maps (SOMs)
• Statistical
  – K-Means
• Fuzzy Logic
  – Fuzzy C Means
  – Fuzzy ISODATA

Page 10: Evolving dynamic web pages using web mining

Fuzzy C Means

• A data clustering technique in which each data point belongs to a cluster to a degree specified by a membership function

• Let
  – X be a set of n data sample vectors
  – U be a partition of X into c parts
  – V be the set of cluster centers
  – d^2 be an inner-product-induced norm
  – u_ik be the grade of membership of x_k in cluster i, between 0 and 1
  – m be a parameter that increases or decreases the fuzziness

Page 11: Evolving dynamic web pages using web mining

Fuzzy C Means (contd)

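$$J_m(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^m \, d^2(x_k, v_i)$$

$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{d_{ik}}{d_{jk}} \right)^{\frac{2}{m-1}} \right]^{-1}$$

$$d_{ik}^2 = d^2(x_k, v_i) = \lVert x_k - v_i \rVert^2$$

$$v_i = \frac{\sum_{k=1}^{n} (u_{ik})^m \, x_k}{\sum_{k=1}^{n} (u_{ik})^m}$$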
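The fuzzy c-means updates can be sketched in plain Python as an alternation between the membership update (each u_ik is the inverse of a sum of distance ratios raised to 2/(m-1)) and the center update (each v_i is a mean of the samples weighted by u_ik^m). This is an illustration under assumptions, not the code used in the experiment: the function names, the stopping rule, the default m = 2 and the sample points are all hypothetical.

```python
import random


def squared_distance(x, v):
    """Squared Euclidean distance d^2(x, v) between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def update_memberships(points, centers, m):
    """u[i][k] = 1 / sum_j (d_ik / d_jk) ** (2 / (m - 1))."""
    exponent = 1.0 / (m - 1.0)  # applied to squared distances
    u = [[0.0] * len(points) for _ in centers]
    for k, x in enumerate(points):
        d2 = [squared_distance(x, v) for v in centers]
        zero = [i for i, d in enumerate(d2) if d == 0.0]
        if zero:
            # The point sits exactly on one or more centers: split full membership.
            for i in zero:
                u[i][k] = 1.0 / len(zero)
            continue
        for i in range(len(centers)):
            u[i][k] = 1.0 / sum((d2[i] / d2[j]) ** exponent
                                for j in range(len(centers)))
    return u


def update_centers(points, u, m):
    """v_i = sum_k u_ik^m x_k / sum_k u_ik^m."""
    dim = len(points[0])
    centers = []
    for row in u:
        weights = [w ** m for w in row]
        total = sum(weights)
        centers.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def objective(points, centers, u, m):
    """J_m(U, V) = sum_k sum_i u_ik^m d^2(x_k, v_i)."""
    return sum((u[i][k] ** m) * squared_distance(x, centers[i])
               for k, x in enumerate(points)
               for i in range(len(centers)))


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-12,
                  initial_centers=None, seed=0):
    """Alternate the two updates until the centers stop moving (assumed rule)."""
    if initial_centers is None:
        centers = random.Random(seed).sample(points, c)
    else:
        centers = list(initial_centers)
    for _ in range(max_iter):
        u = update_memberships(points, centers, m)
        new_centers = update_centers(points, u, m)
        shift = max(squared_distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, update_memberships(points, centers, m)


# Hypothetical one-dimensional data with two obvious groups.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
centers, u = fuzzy_c_means(points, c=2, initial_centers=[(0.0,), (5.0,)])
print(centers, objective(points, centers, u, 2.0))
```

Larger values of m flatten the memberships toward 1/c, while values close to 1 make the partition nearly crisp.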

Page 12: Evolving dynamic web pages using web mining

Experimental Set-up

• Target the website http://campus.umr.edu.
• Mine the web log files for web data.
• The main problem is to convert the web sites accessed into numeric values.
• Identify all the URLs that can be reached from this web page.
• Number these URLs from 1 to N, where N is the number of URLs that can be accessed.
• Assign a fuzzy weight w(j) to each URL that can be accessed.
• Define a Boolean variable s(j), which is set to 1 if the jth URL is accessed by the user; otherwise s(j) is set to null.

Page 13: Evolving dynamic web pages using web mining

Experimental Set-up (contd.)

• Define the data point x as the number computed from s(j) and w(j) over all the sites accessed by the user in that particular user session.

• Apply fuzzy c-means by calculating the Euclidean distance between each data sample and each cluster center, dij = |xj - ci|, where xj is the data point and ci is the center of cluster i.
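A rough sketch of the encoding and distance steps follows. The slide does not preserve the exact expression for x, so the sum of s(j)·w(j) used here is an assumption, as is using 0 in place of "null" for s(j). The weights are a subset of those on the site map slide, the session is the second row of the access table, and the two cluster centers are hypothetical values chosen only to show the distance calculation.

```python
# URLs numbered 1..N (list order) with their fuzzy weights w(j) from the site map.
URLS = ["http://campus.umr.edu", "/students", "/registrar",
        "/registrar/star", "/departments"]
WEIGHTS = [0.0, 0.1, 0.11, 0.111, 0.13]


def indicator(session_urls):
    """s(j) = 1 if the j-th URL was accessed, else 0 (assumed stand-in for null)."""
    visited = set(session_urls)
    return [1 if url in visited else 0 for url in URLS]


def encode_session(session_urls):
    """Assumed aggregation: x = sum over j of s(j) * w(j)."""
    s = indicator(session_urls)
    return sum(sj * wj for sj, wj in zip(s, WEIGHTS))


def distances(x, centers):
    """d_ij = |x_j - c_i| for a scalar data point x_j and scalar centers c_i."""
    return [abs(x - c) for c in centers]


session = ["http://campus.umr.edu", "/students", "/registrar", "/registrar/star"]
x = encode_session(session)
print(x, distances(x, [0.3, 0.5]))  # centers 0.3 and 0.5 are hypothetical
```

Because each session collapses to a single number under this assumption, the Euclidean distance reduces to an absolute difference, which matches the form of dij given in the slide.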

Page 14: Evolving dynamic web pages using web mining

http://campus.umr.edu (0)
  /parents (0.3)
  /community (0.5)
  /faculty (0.4)
  /staff (0.2)
  /students (0.1)
    /registrar (0.11)
      /registrar/star (0.111)
      /registrar/courseinfo (0.112)
    www.umr.edu/~career (0.120)
      /fairs (0.121)
      /jobtrack/* (0.122)
    /departments (0.13)
      /academic.html#art_science (0.131)
      /academic.html#engineering (0.132)

Page 15: Evolving dynamic web pages using web mining

IP Address       URLs accessed by the user
131.151.9.999    http://campus.umr.edu, /students, /departments, /departments/academic.html#arts_science
181.147.7.970    http://campus.umr.edu, /students, /registrar, /registrar/star
181.147.7.972    http://campus.umr.edu, /students, http://web.umr.edu/~career, /jobtrak/*
181.148.7.979    http://campus.umr.edu, /students, http://web.umr.edu/~career, /fairs

Page 16: Evolving dynamic web pages using web mining

Results: for 2 and 3 clusters

Page 17: Evolving dynamic web pages using web mining

Results: for 2 and 3 clusters (contd.)

Page 18: Evolving dynamic web pages using web mining

Web Page Evolution

• Use the clustered information as an input to modify the web page, so that users with similar access patterns get the same web page, different from the one served to other users

• Adjust the placement of links

• Remove certain links (if possible)
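One way the link adjustments above might look in code is sketched here, assuming the sessions have already been assigned to clusters. The slides do not describe how the pages were actually modified, so `rank_links`, `removal_candidates` and the three sessions are hypothetical; only the URLs come from the site map.

```python
from collections import Counter


def visit_counts(cluster_sessions):
    """Count how many sessions in one cluster visited each URL."""
    return Counter(url for session in cluster_sessions for url in set(session))


def rank_links(cluster_sessions, page_links):
    """Order the links on a page by popularity within the cluster."""
    counts = visit_counts(cluster_sessions)
    # sorted() is stable, so equally popular links keep their original order.
    return sorted(page_links, key=lambda url: -counts[url])


def removal_candidates(cluster_sessions, page_links):
    """Links that no session in the cluster visited."""
    counts = visit_counts(cluster_sessions)
    return [url for url in page_links if counts[url] == 0]


# Hypothetical sessions for one cluster of student-like visitors.
cluster_sessions = [
    ["/students", "/registrar"],
    ["/students", "/registrar", "/registrar/star"],
    ["/students", "/departments"],
]
page_links = ["/parents", "/community", "/faculty", "/staff", "/students"]

print(rank_links(cluster_sessions, page_links))
print(removal_candidates(cluster_sessions, page_links))
```

For this cluster the /students link moves to the top, and the four unvisited links become candidates for removal from that cluster's version of the page.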

Page 19: Evolving dynamic web pages using web mining

Conclusions

• Fuzzy c-means is an easy way of clustering similar web access patterns for different user sessions.

• The use of Euclidean distance was very helpful in learning more about these web access patterns.

• The experiment provided results and plots that were easy to interpret.

• We observe that fuzzy c-means provided stable results for the different data sets we took.

Page 20: Evolving dynamic web pages using web mining

Future Work

• Use other clustering algorithms and compare the results

• Develop self-evolving web sites: sites that improve themselves by learning from user access patterns

• Use the results obtained with the fuzzy clustering algorithms to make recommendations to the webmaster of http://campus.umr.edu

• Increase the popularity of the web page by tailoring it more closely to the needs of the users accessing it

Page 21: Evolving dynamic web pages using web mining

Questions?