csa4080: adaptive hypertext systems ii


CSA4080: Topic 4 © 2004- Chris Staff

CSA4080:Adaptive Hypertext Systems II

Dr. Christopher Staff

Department of Computer Science & AI

University of Malta

Topic 4: User Modelling



Aims and Objectives

• Short history of User Modelling
• In CSA3080 we covered some of the different approaches to user modelling...
– Empirical Quantitative vs. Analytical Cognitive
• ... and Rich’s taxonomies
– Canonical vs. Individual
– Explicit vs. Implicit
– Long-term vs. Short-term



Aims and Objectives

• In this lecture, we’ll cover the implementation approaches to user modelling...
– Attribute-Value Pairs
– Naïve Bayesian
• ... and types of user model...
– Overlay, Differential, Perturbation...

• ... and capturing user behaviour



History of User Modelling

• UM and its history are linked to the history of user-adaptive systems

• User-adaptive systems are distinguished by the way in which the UM updates its model of the user, the domain in which it is used, and the way the interface is caused to change




• For instance, UM + ratings = stereotype/probabilistic recommender system

• UM + hypertext + adaptation rules = AHS
• UM + user goals + pedagogy + adaptation rules = ITS
• UM representation, and how it learns about its users, tends to depend on the domain




• Focusing on generic user modelling
• Has its roots in dialogue systems and philosophy
– Need to model the participants to disambiguate referents, model the participants’ beliefs, etc.

• Early systems (pre-mid-1985) had user modelling functionality embedded within other system functionality (e.g., Rich; Allen, Cohen & Perrault)




• From 1985, user modelling functionality was performed in a separate module, but not to provide user modelling services to arbitrary systems

• So one branch of user modelling focuses on user modelling shell systems

2001-UMUAI-kobsa (UM history).pdf




• Although UM has its roots in dialog systems and philosophy, more progress has been made in non-natural language systems and interfaces (PontusJ.pdf)

• GUMS (General User Modeling System), 1986, was the first to separate UM functionality from the application




• GUMS
– Adaptive system developers can define stereotype hierarchies
– Prolog facts describe stereotype membership requirements
– Rules for reasoning about them




• At runtime:
– GUMS collects new facts about users using the application system
– Verifies consistency
– Informs application of inconsistencies
– Answers application queries about assumptions about the user




• Kobsa, 1990, coins “User Modeling Shell System”

• UMT (Brajnik & Tasso, 1994):
– Truth maintenance system
– Uses stereotypes
– Can retract assumptions made about users




• BGP-MS (Kobsa & Pohl, 1995)
– Beliefs, Goals, and Plans - Maintenance System
– Stereotypes, but stored and managed using first-order predicate logic and terminological logic
– Can be used as multi-user, multi-application network server




• Doppelgänger (Orwant, 1995)
– Info about user provided via multi-modal user interface
– User model that can be inspected and edited by user




• TAGUS (Paiva & Self, 1995)
– Also has diagnostic subsystem and library of misconceptions
– Predicts user behaviour and self-diagnoses unexpected behaviour
• um (Kay, 1995)
– Uses attribute-value pairs to represent user
– Stores evidence for its assumptions




• From 1998 and with the popularisation of the Web, web personalisation grew in the areas of targeted advertising, product recommendations, personalised news, portals, adaptive hypertext systems, etc.



What might we store in a UM?

• Personal characteristics
• General interests and preferences
• Proficiencies
• Non-cognitive abilities
• Current goals and plans
• Specific beliefs and knowledge
• Behavioural regularities
• Psychological states
• Context of the interaction
• Interaction history

PontusJ.pdf, ijcai01-tutorial-jameson.pdf



From where might we get input?

• Self-reports on personal characteristics
• Self-reports on proficiencies and interests
• Evaluations of specific objects
• Responses to test items
• Naturally occurring actions
• Low-level measures of psychological states
• Low-level measures of context
• Vision and gaze tracking



Techniques for constructing UMs

• Attribute-Value Pairs

• Machine learning techniques & Bayesian (probabilistic)

• Logic-based (e.g. inference techniques or algorithms)

• Stereotype-based

• Inference rules

kules.pdf



Attribute-Value Pairs

• e.g., ah2002AHA.pdf
• The representation of the user and of the domain are inextricably linked
• What we want to do is capture the “degree” to which a user “knows” or is “interested” in some concept

• We can then use simple or complex rules to update the UM and to adapt the interface




• Particularly useful for showing (simple) dependencies between concepts
– Complex ones harder to update
• Can use IF-THEN-ELSE rules to trigger events
– Such as updating a user model
– Modifying the contents of a document (AHA!, MetaDoc)
– Changing the visibility or viability of links
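
A minimal sketch of the idea above, assuming the user model is a dictionary mapping each concept to a knowledge value between 0.0 and 1.0. The concept names, the `gain`, `spill` and `threshold` values, and the helpers `visit_page` and `link_visible` are all hypothetical; they are not taken from AHA! or MetaDoc.

```python
# Hypothetical attribute-value user model: concept -> degree of knowledge (0.0 to 1.0).
user_model = {"hypertext": 0.0, "adaptive_navigation": 0.0}

# Hypothetical simple dependency: learning a concept also contributes
# a little to the concepts that build on it.
dependents = {"hypertext": ["adaptive_navigation"]}


def visit_page(um, concept, gain=0.5, spill=0.1):
    """IF the user visits a page about `concept` THEN raise its value,
    and raise each dependent concept by a smaller amount."""
    um[concept] = min(1.0, um[concept] + gain)
    for dep in dependents.get(concept, []):
        um[dep] = min(1.0, um[dep] + spill)


def link_visible(um, prerequisite, threshold=0.5):
    """IF the prerequisite is known well enough THEN show the link ELSE hide it."""
    if um[prerequisite] >= threshold:
        return True
    else:
        return False


before = link_visible(user_model, "hypertext")
visit_page(user_model, "hypertext")
after = link_visible(user_model, "hypertext")
print(before, after, user_model)
```

A single level of dependency is easy to keep up to date this way; longer dependency chains would need the update to be propagated repeatedly, which is where the approach becomes harder to maintain.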



Overview of AHA!

• Adaptive Hypertext for All!
• Each time the user visits a page, a set of rules determines how the user model is updated
• Inclusion rules determine the fragments in the current page that will be displayed to the user (adaptive presentation)

• Requirement rules change link colours to indicate the desirability of each link (adaptive navigation)
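
The three kinds of rule can be sketched roughly as follows. This is not AHA!'s actual rule syntax: the 0-100 knowledge scale, the function names and the colours "blue" and "grey" are assumptions made purely for illustration.

```python
# Hypothetical user model: concept -> knowledge value (0-100).
user_model = {"intro": 0, "links": 0}


def on_page_visit(um, concept):
    """Update rule: visiting the page for `concept` marks it as known."""
    um[concept] = 100


def render_page(um, fragments):
    """Inclusion rules: keep each fragment whose condition holds for this user."""
    return [text for condition, text in fragments if condition(um)]


def link_colour(um, required_concepts, threshold=50):
    """Requirement rule: a link is desirable once all its prerequisites are known."""
    ready = all(um[c] >= threshold for c in required_concepts)
    return "blue" if ready else "grey"  # colour choice is an assumption


fragments = [
    (lambda um: True, "Links connect pages."),
    (lambda um: um["intro"] < 50, "Reminder: read the introduction first."),
]

first_view = render_page(user_model, fragments)
colour_before = link_colour(user_model, ["intro"])
on_page_visit(user_model, "intro")
second_view = render_page(user_model, fragments)
colour_after = link_colour(user_model, ["intro"])
print(first_view, colour_before)
print(second_view, colour_after)
```

The same page therefore shows different fragments, and the same link a different colour, before and after the user model is updated.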




• From where do the attributes come?
– Need to be meaningful in the domain (domain modelling)
– Can be concepts (conceptual modelling)
– Can be terms that occur in documents (IR)




• What do values represent?
– Degrees of interest, knowledge, familiarity, ...
– Skill level, proficiency, competence
– Facts (usually as strings, rather than numerical values)
– Truth or falsehood (boolean)



Simple Bayesian Classifier

• Rather than pre-determining which concepts, etc., to model, let features be selected based on observation

• SBCs are also used in machine learning approaches to user modeling
– Instead of working with predetermined sets of models, learn interests of current user

ProbUserModel.pdf



Simple Bayesian Classifier

• Let’s say we want to determine if a document is likely to be interesting to a user

• We need some prior examples of interesting and non-interesting documents

• Automatically select document features
– Usually terms of high frequency
• Assign boolean values to terms in vectors
– To indicate presence in or absence from document
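
The feature selection and vector construction steps might look like the sketch below. It assumes whitespace tokenisation and no stop-word removal, both simplifications; `select_features` and `to_boolean_vector` are hypothetical helper names.

```python
from collections import Counter


def select_features(documents, k):
    """Pick the k most frequent terms across all example documents."""
    counts = Counter(term for doc in documents for term in doc.lower().split())
    return [term for term, _ in counts.most_common(k)]


def to_boolean_vector(document, features):
    """1 if the feature term occurs in the document, else 0."""
    present = set(document.lower().split())
    return [1 if term in present else 0 for term in features]


examples = [
    "adaptive hypertext adapts links",
    "adaptive systems model the user",
    "football results and football news",
]
features = select_features(examples, k=2)
vectors = [to_boolean_vector(doc, features) for doc in examples]
print(features)
print(vectors)
```

In practice very common function words would dominate a pure frequency ranking, so a real system would normally filter them out first.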




• Now, for an arbitrary document, we want to determine the probability that the document is interesting to the user

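$$P(\mathit{class}_j \mid \mathit{word}_1 \,\&\, \mathit{word}_2 \,\&\, \dots \,\&\, \mathit{word}_k)$$

• Assuming term independence, the probability that an example belongs to $\mathit{class}_j$ is proportional to

$$P(\mathit{class}_j) \prod_{i=1}^{k} P(\mathit{word}_i \mid \mathit{class}_j)$$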



Syskill & Webert

• Learns simple Bayesian classifier from user interaction

• User identifies his/her topic of interest

• As user browses, rates web pages as “hot” or “cold”

• S & W learns user’s interests to mark up links, and to construct search engine query

webb-umuai-2001.pdf, ProbUserModel.pdf



Syskill & Webert

• Text is converted to feature vectors (term vectors) for SBC

• Terms used are those identified as being “most informative” words in current set of pages
– based on the expected ability to classify document if the word is absent from doc
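
One way to score how informative a word is, sketched here on the assumption that the measure is expected information gain: the drop in class entropy once we know whether the word is present or absent. The page data and the function names are made up for illustration.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    result = 0.0
    for label in set(labels):
        p = labels.count(label) / total
        result -= p * math.log2(p)
    return result


def information_gain(word, pages):
    """Expected reduction in class entropy from knowing whether `word` is present.

    `pages` is a list of (set_of_words, label) pairs.
    """
    all_labels = [label for _, label in pages]
    with_word = [label for words, label in pages if word in words]
    without_word = [label for words, label in pages if word not in words]
    n = len(pages)
    remainder = (len(with_word) / n) * entropy(with_word) \
        + (len(without_word) / n) * entropy(without_word)
    return entropy(all_labels) - remainder


pages = [
    ({"jazz", "concert"}, "hot"),
    ({"jazz", "review"}, "hot"),
    ({"tax", "review"}, "cold"),
    ({"tax", "concert"}, "cold"),
]
gains = {w: information_gain(w, pages) for w in ["jazz", "review"]}
print(gains)
```

Here "jazz" separates hot from cold pages perfectly and so scores highest, while "review" appears equally in both classes and carries no information.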



Simple Bayesian Classifier

• Of course, the term independence assumption is unrealistic, but SBC still works well

• Algorithm is fast, so can be used to update user model in real time

• Can be modified to support ranking according to degree of probability, rather than boolean




• Needs to be “trained”, usually using small data sets

• Works by multiplying probability estimates to obtain joint probabilities
– If any is zero, results will be zero...
– Can use small constant (0.001) instead (estimation bias) ...
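
A small sketch of the zero-probability problem and the small-constant workaround, assuming boolean word features and a tiny hand-made training set. The names `train`, `score` and `floor` are hypothetical.

```python
def train(examples, features):
    """Estimate P(class) and P(word present | class) from boolean examples.

    `examples` is a list of (set_of_words, label) pairs.
    """
    priors, likelihoods = {}, {}
    labels = {label for _, label in examples}
    for label in labels:
        docs = [words for words, lab in examples if lab == label]
        priors[label] = len(docs) / len(examples)
        likelihoods[label] = {
            f: sum(1 for words in docs if f in words) / len(docs) for f in features
        }
    return priors, likelihoods


def score(words, label, priors, likelihoods, floor=0.001):
    """P(class) times the product of per-feature probabilities.

    Each feature contributes P(present | class) if the word occurs in the
    document and 1 - P(present | class) otherwise. Zero estimates are
    replaced by the small constant `floor`.
    """
    result = priors[label]
    for feature, p_present in likelihoods[label].items():
        p = p_present if feature in words else 1.0 - p_present
        result *= p if p > 0 else floor
    return result


examples = [
    ({"jazz", "concert"}, "hot"),
    ({"jazz"}, "hot"),
    ({"tax"}, "cold"),
    ({"tax", "concert"}, "cold"),
]
priors, likelihoods = train(examples, ["jazz", "tax", "concert"])
new_page = {"jazz", "concert"}
hot = score(new_page, "hot", priors, likelihoods)
cold = score(new_page, "cold", priors, likelihoods)
cold_unsmoothed = score(new_page, "cold", priors, likelihoods, floor=0.0)
print(hot, cold, cold_unsmoothed)
```

Without the floor the "cold" score collapses to exactly zero because "jazz" never occurred in a cold page; with it, the score stays small but non-zero, so pages can still be ranked by degree of probability.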



Personal WebWatcher

• Predicting interesting hyperlinks from the set of documents visited by a user

• Followed links are positive examples of user interests

• Ignored links are negative examples of user interests

• Use descriptions of hyperlinks as “shortened documents” rather than full docs

pwwTR.pdf



Personal WebWatcher

• Also uses a simple Bayesian classifier to recommend interesting links

$$P(c \mid doc) = \frac{P(c) \prod_{w} P(w \mid c)^{TF(w, doc)}}{\sum_{i} P(c_i) \prod_{w} P(w \mid c_i)^{TF(w, doc)}}$$

$$P(w \mid c) = \frac{1 + TF(w, c)}{\#\mathit{words} + \sum_{i} TF(w_i, c)}$$

– where TF(w, c) is term frequency of term w in document of class c (e.g., interesting/non-interesting), and TF(w, doc) is frequency of term w in document doc
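
The two formulas can be evaluated as in the sketch below, which works in log space to avoid underflow. It assumes that "# words" means the size of the vocabulary and that P(c) is estimated as the fraction of training documents in class c; the training data is invented.

```python
import math
from collections import Counter


def train(docs_by_class):
    """docs_by_class maps a class name to a list of token lists."""
    vocabulary = {w for docs in docs_by_class.values() for doc in docs for w in doc}
    total_docs = sum(len(docs) for docs in docs_by_class.values())
    model = {}
    for c, docs in docs_by_class.items():
        tf = Counter(w for doc in docs for w in doc)   # TF(w, c)
        denom = len(vocabulary) + sum(tf.values())     # #words + sum_i TF(w_i, c)
        model[c] = {
            "prior": len(docs) / total_docs,           # P(c), an assumed estimate
            "p_word": {w: (1 + tf[w]) / denom for w in vocabulary},
        }
    return model


def posterior(model, doc):
    """P(c | doc) for every class; words outside the vocabulary are skipped."""
    tf_doc = Counter(doc)                              # TF(w, doc)
    log_scores = {}
    for c, params in model.items():
        log_scores[c] = math.log(params["prior"]) + sum(
            count * math.log(params["p_word"][w])
            for w, count in tf_doc.items() if w in params["p_word"]
        )
    # Normalise: subtract the maximum before exponentiating to avoid underflow.
    top = max(log_scores.values())
    unnormalised = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnormalised.values())
    return {c: v / total for c, v in unnormalised.items()}


training = {
    "interesting": [["adaptive", "hypertext"], ["user", "model"]],
    "not_interesting": [["football", "news"]],
}
model = train(training)
result = posterior(model, ["adaptive", "user", "model"])
print(result)
```

Because of the "1 +" term in the numerator, a word never seen in a class still gets a small non-zero probability, so no single unseen word can force the whole product to zero.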



Personal WebWatcher

• “Training” set is the set of documents that the user has seen, together with those the user could have seen but ignored

• Uses short description of document, rather than document vector itself



Logic-based

• Does a UM only contain facts about a user’s knowledge?

• Can we also represent assumptions, and assumptions about beliefs?

• Assumptions are contextualised, and represented using modal logic (AT:ac, or assumption type:assumption content)

pohl1999-logic-based.pdf



Logic-based

• We can also partition assumptions about the user



Logic-based

• Advantage is that beliefs, assumptions, facts are already in logical representation

• Makes it easier to draw conclusions about the user from the stored knowledge



Stereotype-based

• Originally proposed by Rich in 1979

• Captures default information about groups of users

• Tends not to be used anymore

1993-aui-kobsa.pdf



Stereotype-based

• Kobsa points out that developer of stereotypes needs to fulfill three tasks
– Identify user subgroups
– Identify key characteristics of typical user in subgroup
  • So that new user may be automatically classified
– Represent hierarchically ordered stereotypes
  • Fine-grained vs. coarse-grained
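
The three tasks can be illustrated with a toy hierarchy. Everything in this sketch (the stereotype names, the trigger attribute `knows_programming`, and the default values) is hypothetical.

```python
# Hypothetical stereotype hierarchy: each stereotype has a parent, membership
# triggers (key characteristics), and default assumptions about its members.
STEREOTYPES = {
    "any_user": {"parent": None, "triggers": {}, "defaults": {"detail": "medium"}},
    "programmer": {
        "parent": "any_user",
        "triggers": {"knows_programming": True},
        "defaults": {"detail": "high", "show_code": True},
    },
    "novice": {
        "parent": "any_user",
        "triggers": {"knows_programming": False},
        "defaults": {"detail": "low"},
    },
}


def classify(observed):
    """Return the first stereotype whose triggers all match the observed facts."""
    for name, stereotype in STEREOTYPES.items():
        triggers = stereotype["triggers"]
        if triggers and all(observed.get(k) == v for k, v in triggers.items()):
            return name
    return "any_user"  # coarse-grained fallback


def assumptions(name):
    """Collect defaults from the root down, so finer stereotypes override coarser ones."""
    chain = []
    while name is not None:
        chain.append(name)
        name = STEREOTYPES[name]["parent"]
    merged = {}
    for ancestor in reversed(chain):
        merged.update(STEREOTYPES[ancestor]["defaults"])
    return merged


stereotype = classify({"knows_programming": True})
profile = assumptions(stereotype)
print(stereotype, profile)
```

A user about whom nothing has been observed falls back to the coarse-grained root stereotype and receives only its defaults.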



Inference rules

• e.g., C-Tutor, avanti.pdf
• May use production rules to make inferences about user
• Also, to update system about changes in user state or user knowledge
• Note that Pohl points out that all user models (that learn about the user) must infer assumptions about the user (pohl1999-logic-based.pdf)
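
A minimal forward-chaining sketch of production rules over facts about the user. The rules and fact names are invented for illustration and are not taken from C-Tutor or AVANTI.

```python
# Each hypothetical rule: (set of conditions that must all hold, fact to conclude).
RULES = [
    ({"requested_glossary:css"}, "unfamiliar:css"),
    ({"unfamiliar:css", "visited:css_tutorial"}, "learning:css"),
    ({"learning:css"}, "recommend:css_exercises"),
]


def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


observed = {"requested_glossary:css", "visited:css_tutorial"}
inferred = forward_chain(observed, RULES) - observed
print(sorted(inferred))
```

Each inferred fact is an assumption about the user rather than an observation, which is why systems such as UMT also provide a way to retract assumptions later.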



Types of User Models

• User Models have their roots in philosophy and learning

• Student models assumed to be some subset of the knowledge about the domain to be learnt

• Consequently, the types of user model have been heavily influenced by this



Student Models

• Student Models are used, e.g., in Intelligent Tutoring Systems (ITSs)

• In ITS we know user goals, and may be able to identify user plans

• The domain/expert’s knowledge must be well understood

• Assumption that user wants to acquire expert’s knowledge

• Plan means moving from user’s current state to state that user wants to achieve



Student Models

• If we assume that expert’s knowledge is transferable to student, then student’s knowledge includes some of the expert’s knowledge

• Overlay, differential, perturbation models (from neena_albi_honours.pdf p25-)



Overlay models

• SCHOLAR (Carbonell, 1970)

• Simplest of the student models

• Student knowledge (K) is a subset of expert’s

• Assumes that K missing from student model is not known by the student

• But what if student has incorrectly learnt K?



Overlay models

• Good when subject matter can be represented as prerequisite hierarchy

• K remaining to be acquired by student is exactly difference between expert K and student K

• Cannot represent/infer student misconceptions
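
An overlay model reduces to set operations over the expert's concepts. The sketch below assumes a small, made-up prerequisite hierarchy; the concept names and helper functions are hypothetical.

```python
# Hypothetical expert knowledge with prerequisites: concept -> prerequisite concepts.
EXPERT = {
    "variables": [],
    "loops": ["variables"],
    "functions": ["variables"],
    "recursion": ["functions"],
}

student_knows = {"variables"}  # always a subset of the expert's concepts


def remaining(expert, known):
    """Knowledge still to be acquired: expert K minus student K."""
    return set(expert) - known


def ready_to_learn(expert, known):
    """Unknown concepts whose prerequisites are all already known."""
    return {c for c in remaining(expert, known) if set(expert[c]) <= known}


to_learn = remaining(EXPERT, student_knows)
next_up = ready_to_learn(EXPERT, student_knows)
print(sorted(to_learn))
print(sorted(next_up))
```

Note that there is nowhere in this structure to record a concept the student has learnt incorrectly: a concept is either in the student's set or it is not.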



Differential models

• WEST (Burton & Brown, 1989)
• Compares student/expert performance in execution of current task
• Divides K into K the student should know (because it has already been presented) and K the student cannot be expected to know (yet)



Differential Models

• Still assumes that student’s K is subset of expert’s

• But can differentiate between K that has been presented but not understood and K that has not yet been presented
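
The distinction can be sketched by tracking what has been presented alongside what the student has demonstrated. The concept names and the `differential` helper are hypothetical.

```python
expert_knowledge = {"variables", "loops", "functions", "recursion"}
presented = {"variables", "loops", "functions"}   # already taught
demonstrated = {"variables", "functions"}         # shown in the student's performance


def differential(expert, presented, demonstrated):
    """Split the expert knowledge the student has not demonstrated into two parts."""
    should_know = (presented & expert) - demonstrated   # presented but not understood
    not_expected = expert - presented                   # not yet presented
    return should_know, not_expected


should_know, not_expected = differential(expert_knowledge, presented, demonstrated)
print(should_know, not_expected)
```

Only the first set signals a problem worth addressing; the second simply records what has not been taught yet.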



Perturbation models

• LMS (Sleeman & Smith, 1981)
• Combines overlay model with representation of faulty knowledge
– Bug library

• Attempts to understand why student failed to complete task correctly

• Permits student model to contain K not present in expert’s K
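
A sketch of a perturbation model as an overlay plus a bug library. The arithmetic rules and misconceptions here are invented examples and are not drawn from LMS.

```python
expert_rules = {"carry_when_sum_exceeds_9", "align_columns"}

# Hypothetical bug library: faulty rule -> description of the misconception.
BUG_LIBRARY = {
    "never_carry": "ignores the carry digit when a column sum exceeds 9",
    "add_left_to_right": "processes columns from the left",
}


def diagnose(observed_rules):
    """Split observed student behaviour into correct, missing and buggy knowledge."""
    correct = observed_rules & expert_rules
    missing = expert_rules - observed_rules
    bugs = {r: BUG_LIBRARY[r] for r in observed_rules if r in BUG_LIBRARY}
    return correct, missing, bugs


correct, missing, bugs = diagnose({"align_columns", "never_carry"})
print(correct, missing, bugs)
```

The `bugs` entry is knowledge that appears in the student model but not in the expert's, which is exactly what an overlay or differential model cannot express.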



Student modelling

• See neena_albi_honours.pdf for more examples of student models...

• We’ll look at ITS in more detail towards the end of the lecture series



Making Assumptions about the user

• Browsing behaviour
– What does a user’s browsing behaviour tell us about the user?




• Searle (1969)... when a speech act is performed certain presuppositions must have been valid for the speaker to perform the speech act correctly (from 1995-UMUAI-kobsa.pdf, 1995-COOP95-kobsa.pdf)




• If the user requests an explanation, a graphic, an example or a glossary definition for a hotword, then he is assumed to be unfamiliar with this hotword.

1996-kobsa.pdf




• If the user unselects an explanation, a graphic, an example or a glossary definition for a hotword, then he is assumed to be familiar with this hotword.

1996-kobsa.pdf




• If the user requests additional details for a hotword, then he is assumed to be familiar with this hotword.

1996-kobsa.pdf
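
The three hotword rules quoted from 1996-kobsa.pdf can be written as a simple lookup from user action to assumed familiarity. The action names and the `update_assumptions` helper are hypothetical, and letting later events override earlier ones is a design assumption.

```python
# The three rules as a lookup table: user action -> assumed familiarity.
ASSUMPTION_RULES = {
    "request_explanation": "unfamiliar",
    "request_graphic": "unfamiliar",
    "request_example": "unfamiliar",
    "request_glossary": "unfamiliar",
    "unselect_explanation": "familiar",
    "unselect_graphic": "familiar",
    "unselect_example": "familiar",
    "unselect_glossary": "familiar",
    "request_details": "familiar",
}


def update_assumptions(user_model, events):
    """Apply each (action, hotword) event in order; later events override earlier ones."""
    for action, hotword in events:
        if action in ASSUMPTION_RULES:
            user_model[hotword] = ASSUMPTION_RULES[action]
    return user_model


events = [
    ("request_glossary", "stereotype"),
    ("request_details", "overlay model"),
    ("unselect_glossary", "stereotype"),
]
um = update_assumptions({}, events)
print(um)
```

Actions that match no rule, such as simply scrolling, leave the user model unchanged.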



User Actions in Hypertext

• Actions that can be performed in hypertext
– Follow link
– Don’t follow link
– Print
– Bookmark
– Go to bookmark
– Backup
– Go to URL
– ...



Understanding Browsing Behaviour

• What might each of these actions mean?

• Can we relate them to Kobsa’s assumptions?
– Do we need link analysis first?



Identifying Browsing Behaviour

• Lost in Hyperspace (otter2000.pdf)

• Homing in on information

• Needing more help/information

• Being un/familiar with topic/web space

• Interested in topic

• Uninterested in topic

• Changing topic



Identifying Browsing Behaviour

• Search browsing

• General Purpose Browsing

• The serendipitous user

catledge95.pdf



Understanding Browsing Behaviour

• How can understanding browsing behaviour help us create better adaptive hypertext systems?
– Less intrusive
– Just-in-Time support
– Don’t give more info when it is not required/wanted
– Efficient use of resources



Conclusions

• The ability to model the user allows reasoning about the user to tailor an interaction to the user’s needs and requirements...

• ... especially when the user is unable to describe what it is they need

• Tightly bound to domain/expert knowledge



Conclusions

• Significant efforts to decouple the user model from the application

• May be too expensive to accurately model all domains, and in any case, goal of many adaptive systems is not to help user become expert, but to provide timely assistance at the right level of detail
