
Weighted Markov Chains and Graphic State Nodes for Information Retrieval

G. Benoit College of Communications & Information Studies, University of Kentucky, Lexington, KY 40506. Email: [email protected]

Decision-making in uncertain environments, such as data mining, involves a computer user navigating through multiple steps, from initial submission of a query through evaluating retrieval results, determining degrees of acceptability of the results, and advancing to a terminal state of evaluating whether the interaction is successful or not. This paper describes iterative information seeking (IS) as a Markov process during which users advance through states of "nodes". Nodes are graphic objects on a computer screen that represent both the state of the system and the group of users' or an individual user's degree of confidence in an individual node. After examining nodes to establish a confidence level, the system records the decision as weights affecting the probability of the transition paths between nodes. By training the system in this way, the model incorporates into the underlying Markov process users' decisions as a means to reduce uncertainty. The Markov chain becomes a weighted one whereby the IS makes justified suggestions.

Introduction

Human-computer interaction is a complex issue, and a universal approach applicable to all types of human-machine situations and all computational activities is not feasible. Some types of interaction, such as data mining and information storage and retrieval (ISR) systems, require a certain degree of sustained interaction between an end-user and the computer system in order to yield interpretable results. Displaying retrieved data, especially when the retrieval set is very large or when relations among the data are unknown or contradictory, raises the question of what role the interface should play in presenting data to reduce uncertainty. What types of system interaction for guiding interpretation of the data sets are acceptable? How might users manage potentially complex displays that conflict with their own problem-solving heuristics? A common solution is to build systems modeled on human-human interaction or to program a system that largely directs the interaction, but this approach may not be the most appropriate. A few interactive, decision-oriented computer-human interaction properties can be shaped into a model for interface behaviors, and imposed on existing systems, which gives the end-user a reasonable amount of interaction for individual decision-making and interpretation that supports his/her heuristics.

The User and the Information System

The User

This paper considers the uncertain decision making and behaviors of users of information systems as a form of randomness. The premise is that users benefit from interacting with information systems which

- suggest paths of investigation, but do not dictate to the user which path should be taken,
- demonstrate the consequences or likely outcomes of selecting a path,
- indicate degrees of confidence in a selection,
- show multiple "states" - points in time where decisions were made; and
- permit the user to move through these states and examine outcomes without fear of losing benefits achieved up to a given point in the interaction.

To develop the theory, this paper considers interaction between information storage and retrieval (ISR) systems and the end-user as a form of communication that does not mimic human-human interaction. Instead it incorporates a set of behaviors that humans use in discourse (Ervin-Tripp, 1974; Lievrouw & Finn, 2000) and models them into types of human-machine discourse as interactive dialogue boxes which yield degrees of support for system-supplied behaviors.

The Information System

In uncertain situations, users may take any set of actions, but the system provides interaction opportunities for only certain kinds of behavior and only at certain times. How much guidance should the system provide, and how much and what kind of interactivity can be programmed to be both computationally efficient and effective for the individual user? Typically, in response to one user input, the query, the ISR system retrieves sets of documents, or terms, and ranks the results according to some similarity measure of supposed relevance (Baeza-Yates & Ribeiro-Neto, 1999).


This approach, however, does not help in uncertain situations: users can lose results, follow hyperlink trails that lead to useless data, and cannot with any evidence predict the outcome of the interaction. The alternative ISR proposed here would address these concerns by

- representing multiple time-dependent states on a single screen display,
- suggesting graphically links between states,
- suggesting graphically degrees of support for a series of linked states,
- permitting interaction between end-user and system so user input influences chains of states ("state nodes"),
- building on the successful searches of previous users, and
- permitting progress and regress through time states to encourage investigation of multiple paths without loss of beneficial results.

Figure 1 demonstrates the concept.

Figure 1: Information retrieval presented as a series of states, represented graphically as nodes with degrees of association, based on strength of similarity of the underlying document or term sets.

This article addresses these issues by outlining five themes that underlie the model:

- Consider uncertainty and user interaction,
- Formalize user uncertainty and interdocument similarity in information seeking as Markov processes,
- Formalize user decision making as a heuristic, and
- Integrate the above into an interactive information system that combines system-provided guidance, interactive tools for user input, and a schema for presenting the results in an interactive interface.

Uncertainty and User Interaction

On-line Systems

In on-line systems, process control tasks are divided by agent role: the system and the user. From the system agent role, the interaction is a progression from an initial state, through any number of subsequent states, to a terminal one where the information needed to continue (the retrieval set) is determined by the system's programming. "A predominant model", write Langlotz and Shortliffe (1984), "for expert consultation systems is one in which a computer program simulates the decision making process of an expert. The expert system typically collects data from the user and renders a solution." The solution, however, does not necessarily meet the individual user's needs, nor is it necessarily interpretable by the user or fully acceptable without further interaction. The traditional computer-human communicative event is a single cause/effect pair, initiated when the user performs some action (such as inputting a query) or when the system replies with a modal dialog box. Interaction of this simple sort between system and user may be informational (such as "file done"), and the user acknowledges receipt of the message. Modal dialogues are processes started by the system to suspend the interaction until the user replies, such as dialog boxes that solicit "Yes / No / Cancel".

The performance by system agents is based on the result of pattern matching: a comparison to stored data returns a set of items that satisfy a logical condition, a binary yes/no. Some search and retrieval engines, such as those that service the Internet, rank retrieval results by assumed relevancy to the query. It is a kind of fuzzy-logical match between input and stored data. The user navigates through a simple or ordered index list of retrieval results. The user progresses from a single index (the retrieval result screen) through various permutations, or states. Additionally, in retrieval situations where there is a system-determined relevancy rank, the user has limited ways to investigate the rationale for the ranking. Where the ranking is interpreted by the user as marginally useful or contradictory, subsequent events essentially become reduced to equal probability despite the system relevancy ranking.

A difficulty in ISR system design is how to interpret user input in light of the user's current aims. As a dynamic participant, the user himself helps to structure the interaction and his informed input shapes the creation of future states (Jackson and Lefrere, 1998). It seems reasonable to view this type of user transaction as a Markov process (Asmussen, 1987), that is, a series of random events, the probability of an occurrence of each event depending on the immediately preceding outcome. Since each user interaction begins at a known point and progresses through a series of states to a termination, an individual user's and then a group of users' behavior is a Markov chain: the process $U_t = u(X_t^1, X_t^2, \ldots, X_t^n)$, where $X$ is a selected node and $U_t$ a series of user transactions.

Users

The process of matching and the decisions of association are not obvious to users. Their level of consternation or happiness with the system is pegged to how well the user senses that the retrieved data sets potentially fulfill their requirements. The sense of uncertainty cannot be resolved until the user invests in evaluating the utility of the proposed information resources.


This is usually done separately from the information system's interaction. To advance to more interactive systems that aid end-users' cognition and minimize affective and behavior-controlling processes, without overloading the user with system-specific task functions, another form of human-computer interaction is required.

From the user's perspective, this alternative interaction may be cast as a form of rhetoric: an argumentation between forms of evidence. The decision-making heuristic of users in real-world situations includes the binary yes/no, or confirmation/negation, pair, but it also includes complementary streams of investigation that lead to degrees of support. In human-human interaction, where there is uncertainty, speakers turn to complementary streams of investigation, tangential to the main discourse theme. Speakers rely on these brief forays to clarify some speech act or their own thoughts before returning to the main theme to make a decision. In the case below, the subjects ranked, in descending order of importance, the supervisor's opinion, co-workers' methods of problem resolution, and printed standard operating procedures as the sources of evidence they use when deciding how to proceed in handling uncertainty.

For end-users of IR systems, the fear of losing potentially useful results in long linear or non-linear navigation is a real concern (Scull, 1999). The fear of losing the benefits of the interaction affects the user's decisions about future actions. In extended or complicated searches, this kind of navigation and analysis of the process becomes a serious detriment to successful application of the computer as a tool in information retrieval.

Users develop from their own personal experiences a heuristic, that is, a rule of thumb that allows assigning a value for a variable that otherwise would be uncertain. Their personal heuristic affects their interaction as a user agent. User uncertainty increases when system agency runs counter to the user's individual needs or when the user cannot evaluate results in light of the rationale. In a novel human-computer interaction situation or in familiar situations with unfamiliar information objects, the user's reliance on a given heuristic may actually confuse the present interaction instead of serving as a justifiable guide.

The system described in this paper addresses these issues by modeling interaction as a progress of states, shared by both the system and the user. The interface provides a graphic representation (called "state nodes") of the potentially useful documents (or clusters of documents) and tangential evidence, which form a base of related concepts. These suggest how a user might advance through related themes. In this way, it reflects the history of the progress from the initial state to the present one, reminiscent of human-human interaction. Moreover, the model incorporates special interaction behaviors to provide the user with supplemental information and guidance about how to proceed to other states.


This consequence-oriented information-seeking process is expressed as potential future states with levels of evidentiary support (subjective probability), demonstrated on the screen as a network of nodes. The nodes themselves represent the navigation history of the interaction [the paths chosen by the user] and the degree of support for selecting a node, and, based on the cumulative probability of the path as it develops, the system will propose future states. At any point the user can return to an earlier state, save that state, and investigate other streams as equal to the saved one or as tangential ones. Users may also introduce their own nodes. This dynamic, subjective progress through states can be described as a Markov process.

IS as a Markov Process

Related work

In any information system, the two agents are the system model and the user. From the system perspective, information retrieval is based on any number of random inputs from a user, which initiate and adjust the retrieval set membership, e.g., as in a relevance feedback system. In a relevance-feedback system or any probabilistic model, the initial retrieval set can be considered the first of n states, which impact or determine the probability of document membership in the next iteration or state. User choices of what documents or clusters of resources to select are random, and without other restrictions each document cluster has an equal chance of selection. Information retrieval, then, may be cast as an ergodic Markov process (Anderson, 1991; Asmussen, 1987; Breiman, 1969; Dartmouth, 2002; Grechenig and Tscheligi, 1993; Jackson and Lefrere, 1998).

There has been some interest by researchers in applying Markov models to information-technology processes, where calculations are based on incomplete information or for decision support. Zhang (2001), for example, applies Markov models to artificial intelligence tasks in uncertain environments. Paterman (1990), Cassandra (1998), and Rajgopal and Mazumdar (2002) evaluate statistical inference in systems. Few researchers focus specifically on information retrieval: Chen and Cooper (2002) studied stochastic modeling of use patterns. Miller, Leek and Schwartz (1999) and Danilowicz and Balinski (2001) focused on standard vector/tf-idf information retrieval. Other work emphasizes analyzing the language of documents for retrieval, such as Szepesvari and Littman (1996). Kaelbling and Littman (1994) combine Markov models and concept-oriented linguistic models of information (such as latent semantic indexing), and Lafferty and Zhai (2001) create a framework for document and query models, incorporating user preferences, among other features. Researchers generally acknowledge the potential of Markov chains to clarify semantic issues and to respond to user feedback: "If query terms have multiple senses, a mixture of these senses may be present in the expanded model. For semantic smoothing, a more content-dependent model that takes into account the relationship between query terms may be desirable. One way to accomplish this is through a pseudo-feedback mechanism ... In this way the expanded language model may be more 'semantically coherent,' capturing the topic implicit in the set of documents rather than representing words related to the query terms $q_i$ in general" (Lafferty and Zhai, 2001, p. 115).


If IR is viewed as a Markov process, then semantic-level or concept-level relationships may form association states.

For users of an IR system, the uncertainty they feel influences their successful interaction and sense of satisfaction. Providing users with the same type of statistical background data (Berger and Lafferty, 1999) and confidence levels may help them establish a picture of reality that they can rely on more fully than occurs with hierarchical lists. An interactive, graphically-oriented IR session may provide the type of "pseudo-feedback mechanism" suggested above and contribute to the overall interpretability of the retrieval session and relevancy judgments. As a dynamic participant, the user helps to structure the interaction and his/her informed input shapes the creation of future states (Jackson and Lefrere, 1998).

This section describes the model of the system and then of end-users as a Markov process. The retrieval system based on the model is then submitted to a group of end-users to determine whether their interactions were more satisfying, whether confidence levels increased, and whether their relevancy ranking supports the transition probabilities matrix.

Model description

System

Although the purpose of this paper is to consider the impact of weighted Markov chains and graphic state nodes on end-users' confidence, the model must first be defined. Associated with each input by a user is a specific path or sequence of responses that form the information-seeking event. The model here assumes that event control is passed between nodes (the document or document clusters returned in response to a query) according to a Markov chain. The probability $p_{ij}$ that control transfers from node $i$ to another node $j$ is independent of how $i$ was entered. In response to a query, there will be $n$ such nodes, where node 1 (defined below also as state 0, $s_0$) represents the initial state. The goal of information retrieval is to minimize the number of iterations with a system before the user is satisfied, and to avoid a failure state. Therefore, the model requires terminal nodes, $S$ (satisfaction) and $F$ (failure), with probabilities $p_{iS}$ and $p_{iF}$ of being entered from state $i$, where $p_{iS} + \sum_{j=1}^{n} p_{ij} = 1$. No IR system is assured of complete relevancy (or convergence) in its initial state; each node contains irrelevant documents, and node $i$ has a relevance factor $r_i$, that is, $P(\text{non-relevant documents when user selects node } i) = 1 - r_i$. "The Markov chain thus has $n + 2$ states and a transition matrix $Q$, where $q_{ij} = r_i p_{ij}$ for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n, S$; $q_{iF} = 1 - r_i$ for $i = 1, 2, \ldots, n$; and $q_{FF} = q_{SS} = 1$" (Rajgopal and Mazumdar, 2002, p. 360).
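To make the quoted construction concrete, the sketch below is an illustration under assumed class, method, and variable names, not code from the paper's system. It builds the $(n+2)$-state matrix $Q$ from a base node-to-node matrix $P = [p_{ij}]$ and relevance factors $r_i$, treating whatever row mass $P$ does not assign to other nodes as the one-step probability of reaching $S$, consistent with $p_{iS} + \sum_j p_{ij} = 1$.

```java
/**
 * Illustrative sketch (assumed names, not the paper's code): builds the (n + 2)-state
 * transition matrix Q from a base matrix P and relevance factors r, following
 * q_ij = r_i * p_ij, q_iF = 1 - r_i, q_SS = q_FF = 1.
 * States 0..n-1 are document nodes, state n is S (satisfaction), state n+1 is F (failure).
 */
public final class AbsorbingChain {

    public static double[][] buildQ(double[][] p, double[] r) {
        int n = p.length;
        double[][] q = new double[n + 2][n + 2];
        int s = n, f = n + 1;                       // indices of the absorbing states
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                q[i][j] = r[i] * p[i][j];           // q_ij = r_i * p_ij
            }
            q[i][s] = r[i] * remainderToS(p[i]);    // assumed convention: p_iS = 1 - sum_j p_ij
            q[i][f] = 1.0 - r[i];                   // q_iF = 1 - r_i
        }
        q[s][s] = 1.0;                              // S and F are absorbing
        q[f][f] = 1.0;
        return q;
    }

    // Leftover row mass of P, interpreted as the one-step probability of reaching S.
    private static double remainderToS(double[] row) {
        double sum = 0.0;
        for (double v : row) sum += v;
        return Math.max(0.0, 1.0 - sum);
    }

    public static void main(String[] args) {
        double[][] p = { {0.0, 0.7, 0.2}, {0.5, 0.0, 0.4}, {0.3, 0.3, 0.2} };  // placeholder values
        double[] r  = { 0.9, 0.8, 0.6 };
        for (double[] row : buildQ(p, r)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

With this construction each row of $Q$ sums to 1, since $r_i(\sum_j p_{ij} + p_{iS}) + (1 - r_i) = 1$.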

Users

Since each user interaction begins at a known point and progresses through a series of states to a termination, an individual user's and then a group of users' behavior is a Markov chain: the process $U_t = u(X_t^1, X_t^2, \ldots, X_t^n)$, where $X$ is a selected node and $U_t$ a series of user transactions. In the standard treatment of basic Markov processes, the number of possible states is limited, or finite. From the initial state node (time $t = 0$), users advance with uncertainty to subsequent states. From the initial node $\varepsilon_0$ (or from any node $\varepsilon_i$), uncertain users may advance in a random fashion ($t = 1, 2, \ldots$). Thus the random variable $\xi(t)$ represents the state of the user's progress at a given time $t$. The progress can be indicated as consecutive transitions from node to node:

$$\xi(0) \rightarrow \xi(1) \rightarrow \xi(2) \rightarrow \cdots$$

When a user of an IR system initiates a session ($t = 0$), the system is in its initial probability state ($\varepsilon_0$). At any time during the information-seeking event, the user is in a state, represented as a node in the interaction. Users provide input to advance from the present state ($\varepsilon_i$) that an IR system can use to help guide the interaction to the next best state. The probability that the system goes into another state $\varepsilon_j$ based on the user's input is given as

$$p_{ij} = P\{\xi(n) = \varepsilon_j \mid \xi(n-1) = \varepsilon_i\}, \qquad i, j = 1, 2, \ldots,$$

regardless of $n$. A "typical" computer user of the system may want only to interact to a certain point before abandoning the activity if there is no sense of success.

The system states (not the process of transition) constitute a Markov chain. Let $p_j(n) = P\{\xi(n) = \varepsilon_j\}$, that is, the probability that the system will be in state $\varepsilon_j$ after $n$ steps. Given the states $\varepsilon_k$, $k = 1, 2, \ldots$, the system must be in some $\varepsilon_k$ after $n - 1$ steps: $\{\xi(n-1) = \varepsilon_k\}$, $k = 1, 2, \ldots$. The fact that the user has advanced from one state to another establishes a link of nodes; the selected nodes form a set of states. Given also that some event in the mutually exclusive set will occur if the user offers input, the probability of the next step equals the sum, over the previously selected states, of the probability of that state times the probability of the given next step:

$$P\{\xi(n) = \varepsilon_j\} = \sum_k P\{\xi(n) = \varepsilon_j \mid \xi(n-1) = \varepsilon_k\}\, P\{\xi(n-1) = \varepsilon_k\}.$$

The user's uncertainty at the start of the interaction is $p_j(0) = p_j^0$; for subsequent steps this is rewritten as

$$p_j(n) = \sum_k p_k(n-1)\, p_{kj}, \qquad n = 1, 2, \ldots$$

When the system is in state node $i$ at time $t = 0$, the initial probability is $p_i^0 = 1$ and $p_k^0 = 0$ if $k \neq i$.
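As a worked illustration of the recursion $p_j(n) = \sum_k p_k(n-1)\,p_{kj}$, the following sketch (assumed class and method names, not from the paper) propagates an initial distribution concentrated on a single start node through $n$ steps of a transition matrix.

```java
/** Sketch: propagate the user's state distribution n steps through a transition matrix. */
public final class StateDistribution {

    /** Returns p(n), where p(0) puts all mass on startNode and p_j(n) = sum_k p_k(n-1) * p[k][j]. */
    public static double[] afterNSteps(double[][] p, int startNode, int n) {
        int states = p.length;
        double[] dist = new double[states];
        dist[startNode] = 1.0;                        // p_i^0 = 1, p_k^0 = 0 for k != i
        for (int step = 0; step < n; step++) {
            double[] next = new double[states];
            for (int k = 0; k < states; k++) {
                if (dist[k] == 0.0) continue;
                for (int j = 0; j < states; j++) {
                    next[j] += dist[k] * p[k][j];     // p_j(n) = sum_k p_k(n-1) p_kj
                }
            }
            dist = next;
        }
        return dist;
    }

    public static void main(String[] args) {
        double[][] p = { {0.1, 0.6, 0.3}, {0.2, 0.2, 0.6}, {0.0, 0.0, 1.0} };  // placeholder values
        System.out.println(java.util.Arrays.toString(afterNSteps(p, 0, 4)));
    }
}
```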


Subsequent states have the probability $p_{ij}(n) = P(\xi(n) = \varepsilon_j \mid \xi(0) = \varepsilon_i)$, $i, j = 1, 2, \ldots$. Theoretically, both users and retrieved documents can be represented by Markov chains. Here the emphasis is on the document clusters. A Q-matrix (Asmussen, 1987) of the probability of the nodes can be constructed that reflects the system's relevancy ranking (and transition probabilities to associated nodes) and the end-user's decision state. [In this initial state it is assumed that users can move to any node with equal probability.]

This paper follows Danilowicz and Balinski's (2001, pp. 625-6) use of tf-idf as transition probabilities proportional to initial similarities between documents, and populates the Q-matrix. The transition probabilities are based on a document set $D$ consisting of documents (or clusters) $\{d_1, d_2, \ldots, d_n\}$:

$$Q = \begin{bmatrix}
0 & 0.65 & 0.35 & 0 & 0.15 & 1-r_1 \\
0 & 0 & 0 & r_2 & 0 & 1-r_2 \\
0 & 0 & 0 & r_3 & 0 & 1-r_3 \\
0.3r_4 & 0.2r_4 & 0.4r_4 & 0.1r_4 & 0 & 1-r_4 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

The transition probability $p(d_i \mid d_j)$ is based on the interdocument similarity $s(d_i, d_j)$; $r$ is the relevancy. As they describe it: "$p(d_i \mid d_j) = s(d_i, d_j)/c$, where $c = \text{const}$ for $i, j = 1, 2, \ldots, n$. As described above, $P = [p_{ij}]$ is the transition matrix, so $p_{ij} = p(d_i \mid d_j)$, and $p = [p_1, p_2, \ldots, p_n]$ is the vector of probabilities":

$$p_{ij} = \frac{s(d_i, d_j)}{\max_j s(d_i, d_j)}$$
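A minimal sketch of this normalization step follows; the class and method names are assumptions, and both variants quoted above are shown: division by the row maximum, and division by a constant $c$.

```java
/** Sketch: turn interdocument similarities s(d_i, d_j) into transition probabilities. */
public final class SimilarityTransitions {

    /** p_ij = s(d_i, d_j) / max_j s(d_i, d_j), the row-maximum normalization quoted above. */
    public static double[][] byRowMax(double[][] s) {
        int n = s.length;
        double[][] p = new double[n][n];
        for (int i = 0; i < n; i++) {
            double max = 0.0;
            for (int j = 0; j < n; j++) max = Math.max(max, s[i][j]);
            for (int j = 0; j < n; j++) p[i][j] = (max > 0.0) ? s[i][j] / max : 0.0;
        }
        return p;
    }

    /** Alternative: p_ij = s(d_i, d_j) / c with a single constant c, as in the quoted description. */
    public static double[][] byConstant(double[][] s, double c) {
        int n = s.length;
        double[][] p = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) p[i][j] = s[i][j] / c;
        }
        return p;
    }
}
```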

In terms of relevancy ranking, overall performance might be computed via the fundamental matrix $(I_n - Q_A)^{-1}$, where $I_n$ is the identity matrix of order $n$ and $Q_A$ is the $(n \times n)$ submatrix of $Q$ (the trained or user-weighted preferred path).

The $p_{ij}$ are assumed to be known (based on tf-idf) and are based on the probability of encountering a specific input (user choice based on investigation of the underlying statistical rank and evidence [the documents themselves]); the choices (the subsequent feedback) of the end-user provide the specific path of nodes through which control is transferred for that particular input (the random variable). Thus, given the user input, the probabilities $p_{ij}$ must be recalculated. It is reasonable to assume that $p_{ij}$ could be estimated for a given session (based on the query), but we cannot assume the relevancy values ($r_i$) will be known for the individual nodes.
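One standard way to carry out such a computation (a sketch under assumed names; the exact formula intended here is not specified beyond the terms above) is to solve $(I_n - Q_A)\,x = q_S$, where $q_S$ is the column of one-step probabilities into $S$; $x_i$ is then the probability of eventually reaching the satisfaction state from node $i$.

```java
/**
 * Sketch (an assumed computation, not the paper's code): from the node-to-node submatrix Q_A
 * and the one-step probabilities of reaching S, solve (I - Q_A) x = q_S for the probability
 * of eventually ending in the satisfaction state S from each node.
 */
public final class AbsorptionAnalysis {

    /** Solves (I - qA) x = toS by Gaussian elimination with partial pivoting. */
    public static double[] successProbabilities(double[][] qA, double[] toS) {
        int n = qA.length;
        double[][] m = new double[n][n + 1];              // augmented matrix [I - Q_A | q_S]
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) m[i][j] = (i == j ? 1.0 : 0.0) - qA[i][j];
            m[i][n] = toS[i];
        }
        for (int col = 0; col < n; col++) {               // forward elimination
            int pivot = col;
            for (int row = col + 1; row < n; row++) {
                if (Math.abs(m[row][col]) > Math.abs(m[pivot][col])) pivot = row;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            for (int row = col + 1; row < n; row++) {
                double factor = m[row][col] / m[col][col];
                for (int j = col; j <= n; j++) m[row][j] -= factor * m[col][j];
            }
        }
        double[] x = new double[n];                       // back substitution
        for (int i = n - 1; i >= 0; i--) {
            double sum = m[i][n];
            for (int j = i + 1; j < n; j++) sum -= m[i][j] * x[j];
            x[i] = sum / m[i][i];
        }
        return x;
    }
}
```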

The transition probabilities of the above example would look like Figure 2 when plotted (after Rajgopal and Mazumdar, 2002, p. 360):

Figure 2. Transition probabilities as nodes.

Dynamic, Evidence-Based Heuristics

The conceptual model is a forward-chaining, induction system. Each state provides some declarative knowledge about the state, such as historicity (past, present, and future states), a statistically-based confidence level that reflects the relationships between states, and a degree of evidence for the state.

The graphic representation gives the user a multidimensional block of potential behaviors. From a given state, the user is provided information about the future possible states. Theoretically, from this evidence users will advance with more confidence from state to state for several reasons:

- Because the decision to proceed is reasoned by the user with more supporting evidence, the user works with the system in navigating uncertainty instead of the system controlling the interaction.
- Additionally, the system provides a means for the user to investigate further possible states before committing to the action, and through the interface users can regress from a given point to a past state without loss of benefits.
- Finally, user regress and progress can be performed non-linearly. Users can jump to different states or different streams as evidence and their own knowledge suggest new ideas.

[A fuller description of the user input model, screens, and other data will be available at http://slis-it.uky.edu/Markovmodel.htm]

Scanning and Event Detection

As the evidence is calculated for each node, the system essentially is scanning the set of possible states and establishing a strategy for interaction. The strategy is based on which events are selected and their probability strengths as they pertain to the current interaction between a user and system.


Events may be preset triggers, say when the cost of calculating a node reaches some critical level, or otherwise as determined in a real-world implementation.

Based on this strategy of scanning for events, the system itself may offer future paths based on the user's selections and calculated probabilities. These offers are recommendations for action by the user, represented as potential states, along with probability heuristics for the user. From this evidence, users may interpret for themselves the cost of accepting the system recommendations, or they may pursue their own choices.

The combination of automatic system recommendations and user selection gives a richness of dimension to the interaction that rule-based, closed, or linear systems do not. Manual control of events, along with degrees of evidence and a limited set of actions, theoretically minimizes some of the user's affective concerns and randomness in the system.

Each node ($\varepsilon_i$) carries with it both the base probability as part of the Markov process and a subjective probability, the aggregate of the behaviors of previous users of the system. The choice of previous users is cast as a weight ($w$) for a given node, ostensibly for end-users seeking the same type of records. From an initial set of probabilities described above, after a user has input a query and the retrieval set has been created, the weights of nodes are added to the base probability $p_{ij}$:

$$P(n) = [\,p_{ij} + w_{ij}(n)\,] = \begin{bmatrix}
p_{11} + w_{11}(n) & p_{12} + w_{12}(n) & \cdots \\
p_{21} + w_{21}(n) & p_{22} + w_{22}(n) & \cdots \\
\vdots & \vdots & \ddots
\end{bmatrix}$$
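A minimal sketch of this update follows (assumed names; whether and how the weighted matrix is renormalized is not specified, so the renormalization shown is an assumption).

```java
/** Sketch (assumed names): fold user-supplied weights into the base transition probabilities. */
public final class WeightedUpdate {

    /**
     * Returns [p_ij + w_ij(n)]; if renormalize is true, each row is rescaled to sum to 1
     * (an assumption, so that the weighted matrix stays stochastic).
     */
    public static double[][] applyWeights(double[][] p, double[][] w, boolean renormalize) {
        int n = p.length;
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++) {
            double rowSum = 0.0;
            for (int j = 0; j < n; j++) {
                out[i][j] = Math.max(0.0, p[i][j] + w[i][j]);  // p_ij + w_ij(n), clamped at zero
                rowSum += out[i][j];
            }
            if (renormalize && rowSum > 0.0) {
                for (int j = 0; j < n; j++) out[i][j] /= rowSum;
            }
        }
        return out;
    }
}
```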

The model of interaction theorized here becomes a probability-based heuristic with supporting evidence where choices are made explicitly by the end-user. This approach has several benefits. One is the inferential power of the model. Users are able to infer the progress of states as their own development in the information retrieval and in interpretation of results. Interaction with the machine moves closer to an interpretive, cognitive event, focused on the data or information need of the user. Depending on the underlying knowledge representation, the graphic (screen-level) representation of states with supporting evidence lets the user infer the likelihood of success both in using the system and in evaluation of resources for the present need.

As the system maintains evidence about its own performance and user input, it provides its own evidence to be integrated into the calculation of node strength. From that the system derives dynamically the state of potential state nodes.

The system, then, learns from its own behavior. In a real-world application, the user may opt to create a personal profile (such as language level, types of resources) to tailor the system’s performance.


Example System

A real-world information retrieval system has been built on this principle. Using Java2, the application has been trained to perform a search on a single query term and paint on screen the average paths taken by users in the training phase. A test collection of 1000 records, with keywords derived from the records, was stored in an inverted file. The application solicits a search query term. If the term is found, additional searching is done to create initial associations and probabilities.

A convenience sample of twenty staff members of the UC Berkeley Budget and Finance department, all with more than 10 years of experience in fund accounting and all highly experienced computer users (80% female), were asked to participate. All were asked which search engine display they preferred: the current method of unranked document lists with hyperlinks, or an alternative graphic hygraph. (The hygraph is a graphic display of documents as interactive nodes with edges between them representing the strength of the transition probability.) The participants were asked to confirm that the original distribution of concept clusters represents the concepts associated with the query. The users were asked to alter the probabilities through the interface. In response to user input (probability weighting), the interface is redrawn, with the edges and cluster positions reflecting the recalculated probabilities.

In this test example, the user selected a node and then specifically requested a tangential investigation into that node. The system retrieves the source of the underlying document and presents it in the dialogue box in State 2, as evidence for the record (Figure 3).

The dialogue box is a secondary frame displayed on the screen. In this frame are two sections: the upper section is a text area that displays the abstract, as a form of evidence for the system's recommendation. The lower section is a series of choices for the user which map directly to the five interactive models described above. In this example, the user reads the evidence for the system's claim in the dialogue box and makes his decision. The dialogue box's programming converts the user's qualitative response into a quantity that is integrated into the q-matrix as a weight.
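One way such a conversion might be expressed is sketched below; the response categories and weight values are illustrative assumptions, not the mapping used in the actual application.

```java
/** Sketch: map a user's qualitative judgment in the dialogue box to a numeric weight. */
public enum Assessment {
    STRONGLY_SUPPORTS(0.20),   // illustrative values, not taken from the paper
    SUPPORTS(0.10),
    NEUTRAL(0.00),
    WEAKENS(-0.10),
    STRONGLY_WEAKENS(-0.20);

    private final double weight;

    Assessment(double weight) { this.weight = weight; }

    /** Weight w_ij(n) to be added to the q-matrix entry for the node the user just assessed. */
    public double weight() { return weight; }
}
```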

The addition of the user input may force a recalculation of node states, as indicated in Figure 3.


State 1: user submits the query (q). State 2: user investigates support evidence for R. State 3: user decision in state 2 creates the probabilities of the next state nodes.

Figure 3: a) Query (q) and a retrieval set (R). b) In state 2, the user initiates a dialog box to examine the evidence for the proposed node. The bottom of the dialog box has buttons to record the user's assessment of the task and stores the conclusion state as a weight. c) Based on the additional weight, the q-matrix probabilities are recalculated, and nodes with acceptable probability levels are shown on screen as graphic nodes.

The user interaction continues as nodes are calculated and presented. The system tracks user decisions based on the document or term properties that underlie the node. These decisions are included in creating the weight factors for the Markov process. Subsequent users inherit the projected probabilities for queries when the query terms are similar to previously successful searches.

Table 1 presents a q-matrix whose elements represent document sets of a retrieval group:

$$P = \begin{bmatrix}
p_{11} & p_{12} & p_{13} + W(n) & p_{14} + W(n) & \cdots \\
p_{21} + W(n) & p_{22} + W(n) & p_{23} & p_{24} & \cdots \\
p_{31} & p_{32} & p_{33} + W(n) & \cdots & \\
\vdots & & & & \ddots
\end{bmatrix}$$

Table 1: q-matrix of probabilities. The matrix is weighted based on the user's input.

Figure 4 suggests how the graphic nodes, the user input, could be superimposed on the matrix.

Figure 4: Imposition of graphic nodes on members of the q-matrix.

In the application interface, user control is performed via scroll bars, pop-up lists, and text fields. Evidence for nodes is presented in two ways: one is a dialog box (as in Figure 8); the other is a text area. The text field indicates the node identifier (node number and term).


The text area presents data about the relationship between the search term and the underlying source document: its probability factor from the matrix, author name, document name, keywords, publication date, and abstract. Figure 5 demonstrates part of the interface; for clarity, most of the retrieval set and interface controls have been removed from the image.

Figure 5a: Interface exploiting weighted nodes.

Conclusions

This project suggests that Markov processes can be integrated into a dynamic information retrieval system where both system and user contribute to the retrieval behavior. The use of graphic nodes represents a means for end-user interaction to investigate data sources for determining document utility, and to minimize user anxiety about exploring unknown data by creating a graphic trail for progress and regress as the user sees fit, without loss of retrieval gains. End-user decisions are incorporated as weights to update the probability matrix in real time, which over time makes this system appropriate for data mining and extracting novel relationships among documents.

Users

Users were surveyed for their preferences in interface design (original list or interactive nodes), end-user satisfaction with the system, concurrence with the determination of associated clusters, overall sense of confidence using the system, and whether they would recommend the approach to others in problem-solving. The results of the non-parametric binomial test (p > 0.5) indicate that the majority of the population represented by the sample prefer the graphic interface, concur with the clusters, would recommend the interface, are more confident in the system probabilities (represented by the graphic edges between nodes) as useful, and prefer to view the source documents. Results for whether the users agree with the original distribution (before weighting) are inconclusive; even though the majority of the participants preferred the new display, a bigger sample will be required to make a conclusive inference for the population.

System

Markov chains are used to predict convergence (P = 1) in n transitions. In this model, user choices are used to weight probabilities to create $p_{ij} = 1$ in n transitions. Ultimately, by integrating and adapting to user inputs, it may be possible in the future to train the information retrieval system to achieve $p_S = 1$ in one transition. However, given that end-user confidence in the system is increased by being able to see both the source documents and the transition probability, it may be useful to develop a graphic user interface approach that lets users advance through states regardless of how they entered the state (which may give them a greater understanding of how their information need is associated with other concepts), and which suggests the strongest probabilities of utility within a given use context (such as the example here of fund accounting).

Prediction of user transitions in this type of IR system may reveal specifics of end-user decision making and when failure is likely. Integration of context-specific evidence as a weighting factor, then, can be used to respond to potential failure by having the system indicate through the interface, with explicit degrees of certainty, what the user should consider in response to his/her information need.

Further Research

The study described in this paper is not yet a wholly operational, probabilistic-type information storage and retrieval system. The purpose was to investigate whether a Markov process could be used to guide end-users and whether weighting could be integrated into state nodes. Additionally, the research is based on a test bed of only 1000 documents, and the system was tested by a small sample (20) of domain-specific end-users. A revised operational version is being developed to address scalability issues. In this model the term-document matrix is replaced by an object-oriented approach. The class-relationship definition offers greater granularity and more precise semantic expression. Combining the class-relationship approach with the weighted Markov process suggests that missing data in the object definition could be compensated for (Asmussen, 2000). Additionally, the distribution of state nodes, which at the moment is linearly assigned by the program to emphasize the time-dependent transition states from query to conclusion of the search, could be replaced by a probabilistic distribution of the sum of classes (Conniffe & Spencer, 2000). Future tests can compare predicted system performance ($R_S$) to a trained system.

Acknowledgments

I would like to thank Prof. Kert Viele, assistant professor of statistics, and Helena Truszczynska, Univ. of Kentucky, and the reviewers for their comments.

References

Anderson, W. J. (1991). Continuous-Time Markov Chains. New York: Springer-Verlag.

Asmussen, S. (1987). Applied Probability and Queues. New York: Wiley.

Asmussen, S. (2000, June). Matrix-analytic models and their analysis. Scandinavian Journal of Statistics, 27(2), 193-226.

Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern Information Retrieval. Reading, MA: Addison-Wesley.

Berger, A., & Lafferty, J. (1999). Information retrieval as statistical translation. In Proceedings of the 1999 ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 222-229).

Breiman, L. (1969). Probability and Stochastic Processes: With a View Toward Applications. New York: Houghton Mifflin.

Chen, H.-M., & Cooper, M. D. (2002). Stochastic modeling of usage patterns in a web-based information system. Journal of the American Society for Information Science and Technology, 53(7), 536-548.

Conniffe, D., & Spencer, J. E. (2000). Approximating the distribution of the maximum partial sum of normal deviates. Journal of Statistical Planning and Inference, 88, 19-27.

Daniłowicz, C., & Baliński, J. (2001). Document ranking based upon Markov chains. Information Processing & Management, 37, 623-637.

Dartmouth College, Chance Project, Dept. of Mathematics. (2002). Markov chains. In Probability (chp. 11). Online. [Available] http://www.Dartmouth.edu/-chance/teaching-aids/books-articles/probability-book/Chapter11.pdf

Douglas, W. (1990, Winter). Uncertainty, information-seeking, and liking during initial interaction. Western Journal of Speech Communication, 54(1), 66-82.

Ervin-Tripp, S. M. (1974). Sociolinguistics. In B. C. Blount (Ed.), Language, Culture & Society (pp. 268-334). Cambridge: Winthrop.

Grechenig, T., & Tscheligi, M. (Eds.). (1993). HCI, Vienna Conference, VCHCI '93: Fin de siècle. LNCS 733. New York: Springer.

Jackson, P., & Lefrere, P. (1998). On the application of rule-based techniques to the design of advice-giving systems. In T. Andreasen, H. Christiansen & H. L. Larsen (Eds.), Flexible Query Answering Systems (pp. 177-200). New York: Springer-Verlag.

Kaelbling, L. P., & Littman, M. L. (1994, Sept.). A bibliography of work related to reinforcement learning. Technical Report CS-94-39. Providence: Brown Univ., Dept. of Computer Science.

Lafferty, J., & Zhai, C. (2001). Document language models, query models, and risk minimization for information retrieval. SIGIR'01. ACM, pp. 111-119.

Langlotz, C. P., & Shortliffe, E. H. (1984). Heuristic programming. In M. J. Coombs (Ed.), Developments in Expert Systems (pp. 77-94). London: Academic.

Lievrouw, L. A., & Finn, T. A. (1990). Identifying the common dimensions of communications: The communication systems model. In B. D. Ruben & L. A. Lievrouw (Eds.), Mediation, Information and Communication: Information and Behavior, vol. 3. New Brunswick, NJ: Transaction.

Liu, Y., et al. (1998). Using stem rules to refine document retrieval queries. In T. Andreasen, H. Christiansen & H. L. Larsen (Eds.), Flexible Query Answering Systems (pp. 248-259). New York: Springer.

Miller, D. R. H., Leek, T., & Schwartz, R. M. (1999). A hidden Markov model information retrieval system. In Proceedings of SIGIR (pp. 214-221). Berkeley, CA.

Rajgopal, J., & Mazumdar, M. (2002). Modular operational test plans for inferences on software reliability based on a Markov model. IEEE Transactions on Software Engineering, 28(4), 358-363.

Scull, C. A. (1999). Computer anxiety at a graduate computer center: Computer factors, support, and situational pressures. Computers in Human Behavior, 15, 213-226.

Szepesvári, C., & Littman, M. L. (1996, Nov.). Generalized Markov decision processes: Dynamic-programming and reinforcement-learning algorithms. Technical Report CS-96-11. Providence: Brown Univ., Dept. of Computer Science. On-line [Available] http://www.cs.duke.edu/-mlittman/docs/grndp.abs

Wroblewski, E., McCandless, T. P., & Hill, W. C. (1991). Detente: Practical support for practical action. In ACM CHI '91 - Reaching through Technology: Human Factors in Computing Systems. ACM Press.