[ieee 2008 ieee aerospace conference - big sky, mt, usa (2008.03.1-2008.03.8)] 2008 ieee aerospace...
Contextualized Search and Faceted Browsing of Heterogeneous ISS Mission Operations Data
Christopher D. Knight
National Aeronautics and Space Administration
Code TI, Mail Stop 269-4
Moffett Field, CA 94035-1000
650-604-3471
Christopher.D.Knight@nasa.gov
Jane T. Malin
NASA Johnson Space Center
2101 NASA Parkway
Houston, TX 77058
jane.t.malin@nasa.gov
Abstract: This paper details a full-text search interface developed to help mission operators quickly search across multiple databases simultaneously and refine search results using a faceted user interface. By leveraging the contextualized NETMARK [1] eXtensible DataBase (XDB), this search interface automatically infers categories for facets and allows users to select one or more categories for each facet, narrowing the results to only those that meet all of the selected facet categories.
TABLE OF CONTENTS
1. INTRODUCTION
2. APPROACH
3. IMPLEMENTATION DETAILS
4. RESULTS
4. CONCLUSION
5. BIOGRAPHIES
6. REFERENCES
1. INTRODUCTION
The immense success of commercial search systems such as those deployed at Google, Amazon, eBay, and other top-tier search and commerce sites illustrates the demand from end users for convenient and expedient access to the relevant information or products they seek. Naive and sophisticated users of search systems alike demand a simple, easy-to-use search mechanism in which they can type words, identifiers, or phrases of interest and have the results presented with the least amount of effort. These users expect relevant results and the ability to iteratively fine-tune their results, usually by adding more terms to the search phrase.
Search Context
Generalized search systems for searching multiple Internet or intranet sites are generally limited to controlling the "context" of a search through fragments of the URL address space, such as the host name and path of the pages indexed. This capability is particularly powerful for users familiar with a subset of web sites that contain the information most relevant to them and most likely to address the questions asked through search. Additional "document contexts" are often available, depending on the types of systems indexed. For example, map systems can be searched on geographical or data set contexts, and message boards can be searched on author, date, or other relevant metadata.
1-4244-1488-1/08/$25.00 © 2008 IEEE. U.S. Government work not protected by U.S. copyright.
IEEEAC paper #1091, Version 2, Updated 2007:10:20
Search systems that have no a priori knowledge of the structure and metadata of the data they index are unable to retrieve results based on more refined contexts. For example, searching the Internet for "Christopher D. Knight" will return a large number of results, but if the user were interested in papers on which Mr. Knight was the author, and could limit the search to the "Authors" section of documents, the results would narrow significantly to a more manageable and relevant set. Leveraging additional contexts, such as site (say, IEEE sites), would target the results further.
Incremental Refinement of Search
While sophisticated users may already have preconceived contexts for their searches, most users find complex grammars or user interfaces to be a barrier to use of the search system. Instead, this paper proposes an incremental refinement approach wherein the user performs a simple keyword search across all contexts and, post search, refines the results to narrow them to a smaller relevant set.
Incremental refinement has additional benefits: the user may want to explore the larger space of results and may find results that are relevant but not necessarily in the contexts the user expected. Or, the user may not know the context(s) until they perform the initial search. Sophisticated, experienced users, if given the choice to narrow the search before exploring, may miss relevant results. Using the previous example, if a sophisticated user were able to limit the search to the Authors section of IEEE papers, that user would not be aware of the instances where Mr. Knight appears in the references section of the papers, or that Mr. Knight has a patent with the USPTO on a technology relevant to the user's interests.
Multifaceted Search as a Form of Incremental Search
Many e-commerce sites provide not only keyword search of their product database with incremental (and increasingly complex) search criteria, but also post-search refinement of the results based on multiple "facets" [2]. Facets are classifications for indexing and information retrieval that break out of the strict hierarchy of a single taxonomy to represent multiple category views of objects. Multi-faceted search presents users with categories in multiple dimensions to choose from for searching or refining a search [3].
Usually these facets leverage one or more taxonomies (also used for browsing for products) developed and maintained by the commerce site. They are often associated with thesauruses for text-based indexing. Even though facets are most often defined manually, there are several ways to generate facet structures dynamically, including text analysis, data mining, document structure analysis, and other techniques. There is considerable research in automatically generating these taxonomies given a set of input data. The Flamenco project [4] has addressed both automatic generation of facet categories from a corpus of input data and user interfaces for search with integrated faceted navigation and browsing.
Figure 1 - Standard Faceted UI Layout
[Figure 1 panels: Banner and navigation; Facet List; Search results]
The norm for displaying facets on e-commerce sites, as illustrated in the layout shown in Figure 1, is to show an ordered list of facets (with the facet containing the most matches first) in a tight column on the left side of the page and the top matches on the right. Each facet is listed with the number of results, out of the total, that belong to that facet.
For example, on Amazon.com, searching for "Batman" returns a set of results in the main section of the window, and the left-hand column lists "Books (14,759), Toys & Games (1,588)..." Clicking on one of these categories narrows the set of results to only those in the selected category. For example, clicking on Books will focus on only the books matching the search term Batman. Other e-commerce sites add facets such as price ranges, years of publication, or seller location, where the user can narrow the results based on categories within these additional facets.
Mission Operations and Search
For International Space Station (ISS) mission operations, software problem tracking, change tracking, and related documents are stored in the PVCS problem report database system. PVCS stands for the vendor name for the ISS Change Paper and Version Control System. The PVCS system is used to research ISS flight software change requests and change documents. Other sets of documents, including procedures, handbooks, and the like, are constantly maintained by operators when they are not on console or in training, and this process of updating requires constant awareness of the state of systems, including referencing systems such as PVCS.
Figure 2 - Current PVCS Search User Interface
As illustrated in Figure 2, the existing PVCS web-based search interface is time-consuming to use and restrictive. Users seeking all records regarding a particular ISS subsystem or component must perform a large number of search operations to answer a simple question such as, "What problems have been reported and resolved regarding GPS software?" Searching each set of PVCS records requires multiple user operations in a complex query interface. This search interface requires selecting one database, such as SCR, and one or at most two fields, such as "title" or "owner". All records are presented as HTML tables for viewing, with references to related records provided in the HTML tables as links.
NETMARK Search System
Developed by a team in the Intelligent Systems Division of the NASA Ames Research Center, the NETMARK system provides an environment for data integration across multiple, disparate, and heterogeneous repositories. NETMARK combines the capability to ingest a variety of data in unstructured, semi-structured, and highly structured forms with the ability to search the NETMARK index and retrieve information atoms via a rich yet straightforward query language. Data is integrated in the NETMARK eXtensible DataBase (NETMARK XDB) with no a priori knowledge of the structure and with no NETMARK configuration required. It is common, however, to massage the data being loaded into NETMARK to increase the precision of retrievals. For example, unstructured text, Word documents, HTML documents, and XML documents may be loaded into NETMARK XDB directly. However, semantic information about the relationship between sections of text is often not explicit in less-structured formats such as HTML (e.g., in HTML tables). As part of the NETMARK deployment, Ames developed a web crawling system that automatically retrieves updated PVCS records via the web interface, normalizes the table-heavy HTML into an XML structure, and loads the XML into the NETMARK database.
Data is loaded into NETMARK XDB via the Web Distributed Authoring and Versioning (WebDAV [5]) protocol, a file-sharing protocol built on top of the Hypertext Transfer Protocol (HTTP). As files are "transferred" to the NETMARK XDB system, they are automatically decomposed and indexed with a full-text index system and are immediately available for querying and document fragment retrieval by clients and users.
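Because WebDAV builds on HTTP, storing a file on a WebDAV server amounts to an HTTP PUT request. The following is a minimal sketch of preparing such a request with the Python standard library. The server URL, path, and record content are hypothetical placeholders, not values from the NETMARK deployment, and the request is only constructed, not sent.

```python
import urllib.request


def make_put_request(url: str, xml_bytes: bytes) -> urllib.request.Request:
    """Prepare (but do not send) an HTTP PUT request carrying an XML document."""
    return urllib.request.Request(
        url,
        data=xml_bytes,
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )


# Hypothetical URL and record body, for illustration only.
req = make_put_request(
    "http://xdb.example/dav/pvcs/scr-example.xml",
    b"<record><title>example</title></record>",
)
print(req.get_method(), req.full_url)
# Sending the request would be: urllib.request.urlopen(req)
```

Authentication, collection creation, and error handling are omitted, since the paper does not describe how the deployed system handles them.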
Interaction with the NETMARK schema-less database system is performed by executing simple or compound searches using a specific grammar of "context + content" requests. For example, searching for the content "Knight" under a context with "Authors" will produce a result set containing this and other papers where the name Knight appears in the Authors section of those documents. Searches in NETMARK XDB can be as simple as a "content-only" search (which returns any context with the terms sought in the content section) or compound, with multiple contexts and contents.
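As an illustration of a single "context + content" request, the sketch below builds a request URL in the form shown in Table 1 (`http://xdb/xdbq/Authors/Knight`). The helper name `build_query` is hypothetical, and the path layout is inferred from that one example only. Content-only and compound queries are not shown in the paper, so they are left out here rather than guessed.

```python
from urllib.parse import quote

BASE = "http://xdb/xdbq"  # base of the example request in Table 1


def build_query(context: str, content: str, base: str = BASE) -> str:
    """Build a context + content request URL (layout assumed from Table 1)."""
    # Percent-encode each path segment so spaces and slashes cannot break the path.
    return f"{base}/{quote(context, safe='')}/{quote(content, safe='')}"


print(build_query("Authors", "Knight"))  # http://xdb/xdbq/Authors/Knight
```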
[Figure 3 labels: Information Stores in Netmark; Query Result]
Figure 3 - Context-Based Retrieval of relevant text fragments and documents
Figure 3, above, illustrates the unique nature of the NETMARK XDB, in that text fragments are returned by the query in a combined query result document. Re-composition of documents (e.g., taking all of the abstracts from a set of documents and producing a summary document for further distribution) is a straightforward and powerful capability provided by NETMARK and has been leveraged in a variety of applications.
Table 1. Example NETMARK Query + Response
Request:
http://xdb/xdbq/Authors/Knight
Response:
<xml>
<resultset>
...<result><uri>/xdb/doc1.xml</uri>...<Authors>...Knight...
...<result><uri>/xdb/doc2.xml</uri>
NETMARK queries not only return the records that match the search criteria but also return the context and all contents under that context that match. For example, the query in Table 1 returns the complete "Authors" section wherein the name Knight appears, not just the Uniform Resource Identifier (URI, often a Uniform Resource Locator [URL] but not necessarily [6]) of the record or a text snippet. The resulting XML is a structured set of data; for example, it is common to return an XML sub-tree in the body of each result. That result structure allows for further analysis and formatting for presentation.
Often a client user interface is provided that interacts with NETMARK XDB by using XSLT [7] (server- or client-side), Asynchronous JavaScript and XML (AJAX [8]), or any other client capable of making HTTP requests and processing XML responses, such as Microsoft Excel.
Mission Operations Integrated Search
A team of developers at NASA Ames Research Center has been working to address PVCS search needs, with the goal of finding relevant documents quickly and easily with a minimum of user interactions.
The NETMARK XDB system was leveraged to index a subset of the PVCS records of particular interest to mission operators and provide unified search. The records extracted from PVCS are Software Change Requests (SCR), Station Program Notes (SPN), Change Notes (CN), and Schedule Issue/Change Forms (SIF).
The application of the NETMARK model to ISS mission operations has been an ongoing project for many years, and a variety of user interface prototypes have been developed. These previous interfaces required the user to select a context for their search, similar to the PVCS system, but were more powerful in that the search operation was performed across multiple record types simultaneously. For example, the previous system allowed users to select the "title" or "description" context using a pull-down form element and provide search terms in a text box. Documents containing the search terms in the specified context were returned. However, even sophisticated users were confused by the user interface when it asked them both to provide a search string and to pre-select a context (in the case of data replicated from PVCS, a record field name). Also of concern to the developers: while it would have been possible to allow the user to select "all contexts", habit and familiarity with the PVCS search system would lead users to select the contexts they felt were most likely to return results, and records of interest might be overlooked.
2. APPROACH
The first approach to addressing ease of use was to provide a simple user interface with a single text input field and perform a "content-only" search of the database with the words and phrases provided. Repeated requests from users to make the interface "as easy to use as Google" reinforced the need to simplify and provide the bare minimum required to perform the work. This simplified user interface produces a set of results in a paginated view for the client to browse through and to review individual results in detail. Links to the data record in the authoritative system of record are provided for each search result (akin to the links in common search service results such as Google), but an expandable display of results is also available for immediate viewing.
In developing this simplified user interface, it became apparent that the contexts of the resulting contents were available at no cost. By using the NETMARK XDB system, all searches produce the associated contexts for the contents found. The contexts of all of the results are compiled and presented to the end user in a faceted user interface for further refinement. Each result is shown as one line of text, but a small triangle button expands one record's display, showing all of the fields recorded in NETMARK, for quick review. A link to the record in PVCS is also provided so that the system of record is consulted, and so that users may navigate to related records in a manner familiar to regular users of PVCS.
[Figure 4: screenshot of the search results page. Legible elements include the number of results, facet lists for record type and status, related searches, and a paginated results table with Type, ID, Title, and Status columns.]
Figure 4 - NX Search Facet UI
Users of this system are presented with a simple search form wherein they select the type(s) of records they wish to search and a free-text field for providing terms and phrases of interest. Screenshots of the user interface pages are shown in Figures 4 and 5.
Figure 4 illustrates the layout of the complete user interface, with the title, search field, and data set selection checkboxes at the top of the page, the faceted category selection interface on the left, and the results in the main section of the page.
Figure 5 - NX Search Facet UI (Detail)
Figure 5 provides an illustration of the faceted user interface, including the three main facets currently generated: the record fields wherein the search terms were found, the record types wherein the terms were found, and the record status, as extracted from the records. Each facet category is shown with its name (lengthy names are shortened with ellipses; rolling over the shortened name shows a "tool tip" with the full name) and the number of records contained in that category. Each facet category is a clickable element; clicking it toggles the category on or off. A record is shown in the main section of the display only when it belongs to at least one category that is currently "on" for each facet. For example, for an SCR with "plasma" in the title and description, clicking the "title" category off will not remove that SCR, while clicking the SCR category off under type, or turning both title and description off under field, will.
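The visibility rule described above (at least one "on" category per facet) can be sketched as a set intersection per facet. This is a minimal illustration in Python, not the authors' AJAX code. The record shape, the facet names `field`, `type`, and `status`, and the helper `is_visible` are assumptions made for the example.

```python
from typing import Dict, Set

# Hypothetical record shape: facet name -> categories the record belongs to.
Record = Dict[str, Set[str]]


def is_visible(record: Record, enabled: Dict[str, Set[str]]) -> bool:
    """True when the record has at least one enabled category in every facet."""
    for facet, on_categories in enabled.items():
        if not (record.get(facet, set()) & on_categories):
            return False
    return True


# The SCR from the text: "plasma" found in both title and description.
scr = {"field": {"title", "description"}, "type": {"SCR"}, "status": {"CLOSED"}}
all_on = {"field": {"title", "description"}, "type": {"SCR", "SPN"}, "status": {"CLOSED"}}

title_off = {**all_on, "field": {"description"}}
scr_type_off = {**all_on, "type": {"SPN"}}
both_fields_off = {**all_on, "field": set()}

print(is_visible(scr, title_off))        # True: still matches "description"
print(is_visible(scr, scr_type_off))     # False: no enabled type matches
print(is_visible(scr, both_fields_off))  # False: no enabled field matches
```

Within a facet the categories combine as a union, and across facets they combine as an intersection, which is why switching off a single field category does not hide the record.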
3. IMPLEMENTATION DETAILS
The NETMARK search system includes a sophisticated system used to "crawl" PVCS on a regular basis and to convert records into a normalized XML form for insertion (via the WebDAV protocol) into the NETMARK XDB. Records are organized in NETMARK by category and state (PVCS fields) for increased control over the search "scope" and to provide additional search facets.
[Figure 6 labels: Netmark Query (AJAX HTTP GET); XML Results [Context + Content]]
Figure 6 - NETMARK XDB Implementation Architecture
Figure 6 illustrates this architecture for crawling, transforming, storing, and retrieving records.
Clicking the search button in the web user interface triggers an AJAX process that sends a query request to the server and presents the results in a paginated list. The AJAX code also collects and compiles the contexts provided in the results, as well as other metadata provided by the server, and presents them to the end user in a faceted user interface for further refinement. Further refinement is performed on the client, as the entire result set is delivered to the client in response to the content-only query submitted.
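The step of compiling contexts into facet categories can be sketched as follows. The deployed client is AJAX code running in the browser; Python is used here only for illustration. The response layout is an assumption extrapolated from the elided example in Table 1: each `<result>` is taken to hold a `<uri>` child plus one child element per matching context, named after that context. The sample URIs, the element names, and the helper `compile_field_facet` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, well-formed response; the real schema is only partly shown in Table 1.
SAMPLE = """
<resultset>
  <result><uri>/xdb/scr1.xml</uri><title>plasma contactor</title><description>plasma fault</description></result>
  <result><uri>/xdb/spn7.xml</uri><description>plasma note</description></result>
</resultset>
"""


def compile_field_facet(xml_text):
    """Map each context name to the set of result URIs in which it matched."""
    facet = defaultdict(set)
    root = ET.fromstring(xml_text.strip())
    for result in root.iter("result"):
        uri = result.findtext("uri")
        for child in result:
            if child.tag != "uri":  # every other child is treated as a context
                facet[child.tag].add(uri)
    return dict(facet)


facet = compile_field_facet(SAMPLE)
for name in sorted(facet):
    print(f"{name} ({len(facet[name])})")
# description (2)
# title (1)
```

Keeping the set of URIs per category, rather than only a count, lets later refinement run entirely on the data already held by the client.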
4. RESULTS
Feedback from end users has been extremely positive, as the interface reduces the number of user interactions required to search the PVCS records and the user interface is (generally, see below) intuitive and straightforward. As pagination, facet selection, and other user interactions are purely client-side, the interface performs well, even with a large number (thousands) of results. The AJAX implementation supports a wide variety of web browsers and generally presents the interface in a uniform way, but there remain minor differences in the ways elements are rendered.
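Client-side pagination over a result set that is already in memory reduces to slicing a list, which is one reason it stays responsive with thousands of results. The sketch below assumes a page size of 20 and a helper named `paginate`; neither is taken from the paper.

```python
from math import ceil


def paginate(results, page, page_size=20):
    """Return (items on the 1-based page, total pages) for an in-memory result list."""
    total_pages = max(1, ceil(len(results) / page_size))
    page = min(max(page, 1), total_pages)  # clamp out-of-range page numbers
    start = (page - 1) * page_size
    return results[start:start + page_size], total_pages


results = [f"record-{i}" for i in range(44)]  # illustrative result set
items, pages = paginate(results, page=3)
print(len(items), pages)  # 4 3
```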
This faceted user interface is novel in the following ways:
* Facet categories are automatically computed from the search results and do not require a pre-generated taxonomy of facets. This allows for integration of additional data sets at no cost: no taxonomy generation is required when new data sets are added to the database and search interface, as categories are computed based on the results of the query. Further normalization of computed facets may reasonably be performed to ensure that semantically equivalent categories are combined.
* Individual results may belong to more than one category under a given facet (e.g., if the user searched for "Knight" and it appeared in both the Author and Bio sections of a document, that document would belong to both categories). As facets and their categories are non-hierarchical, and as records may belong to more than one category, the UI allows users to select multiple facet categories, and they are presented with all records that belong to any of those categories. Other faceted UIs rely on facet categories being hierarchical and only allow users to select an individual category under each facet.
* Interactions with the faceted UI are performed purely client-side for high performance. The authors are unaware of any other faceted user interfaces that operate purely on the client and that provide the immediate feedback and high performance inherent in this implementation.
The user interface has presented some challenges, and open questions remain, including:
* How to handle user selections: when all categories for a facet are selected and the user selects one category, should all other categories be de-selected, or only the one selected?
* What numerical value to display next to each facet category: the current implementation shows the total number of records in that facet category, but this value may be confusing when one facet has multiple categories turned off.
* How to handle very long facet category lists: currently the lists are unordered, but even when they are ordered, users may seek the categories with the fewest results, or with the most results.
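The second and third questions can be made concrete with a small sketch that computes two candidate numbers for a category: the total count (which the paper says is currently displayed) and a count restricted to records still visible under the current selections. It also shows one possible ordering. The records, facet names, and helper functions are hypothetical, and the sketch does not claim either number is the right one to show.

```python
from collections import Counter

# Hypothetical records: facet name -> categories the record belongs to.
records = [
    {"type": {"SCR"}, "status": {"CLOSED"}},
    {"type": {"SCR"}, "status": {"NO ACTION REQUIRED"}},
    {"type": {"SPN"}, "status": {"CLOSED"}},
]


def visible(record, enabled):
    """True when the record has at least one enabled category in every facet."""
    return all(record.get(facet, set()) & on for facet, on in enabled.items())


def category_counts(records, facet, enabled=None):
    """Count records per category; with `enabled`, count only visible records."""
    counts = Counter()
    for record in records:
        if enabled is None or visible(record, enabled):
            counts.update(record.get(facet, set()))
    return counts


# The SPN type has been switched off; both status categories remain on.
enabled = {"type": {"SCR"}, "status": {"CLOSED", "NO ACTION REQUIRED"}}
total = category_counts(records, "status")
narrowed = category_counts(records, "status", enabled)
print(total["CLOSED"], narrowed["CLOSED"])  # 2 1

# One possible ordering: most results first, ties broken by name.
ordered = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
print(ordered)  # [('CLOSED', 2), ('NO ACTION REQUIRED', 1)]
```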
4. CONCLUSION
The NETMARK XDB Faceted Search user interface will continue to be improved based upon feedback provided by end users. The techniques and layout will be leveraged for other applications at NASA and other agencies. The approach is applicable to systems that leverage the XDB server technology and the semi-structured, contextualized nature of the query results it produces. Further developments include evaluating how best to combine less-structured documents with more structured data sets such as the PVCS records. They also include leveraging ontologies and taxonomies to organize and unify the display in a more logical and user-friendly fashion. The technique may be easily applied to any search system that uses generated metadata to provide views of the information to be retrieved.
5. BIOGRAPHIES
Christopher D. Knight is a computer engineer at the NASA Ames Research Center. He is one of the key developers focused on asynchronous collaborative technologies and architectures for rapid application development tools for enabling collaboration. He received a bachelor's degree in computer science from Cal Poly, San Luis Obispo in 1996.
Jane T. Malin is Senior Technical Assistant in the Intelligent Systems Branch, Automation, Robotics and Simulation Division, Engineering Directorate, NASA Johnson Space Center, where she has led intelligent systems research and development since 1984. She has led research on model-based hazard analysis and simulation for engineering intelligent systems and on information extraction from text for modeling, using ontology-based semantic text analysis. Her 1973 Ph.D. in Experimental Psychology is from the University of Michigan, with a dissertation on mathematical modeling of information processing in problem solving.
6. REFERENCES
[1] David Maluf, P. Tran, NETMARK: A Schema-Less Extension for Relational Databases for Managing Semi-structured Data Dynamically, ISMIS, 2003.
[2] A.S. Pollitt, The key role of classification and indexing in view-based searching. Technical report, University of Huddersfield, UK, 1998.
[3] Eetu Makela, Eero Hyvonen and Teemu Sidoroff, View-Based User Interfaces for Information Retrieval on the Semantic Web. Proceedings of the ISWC-2005 Workshop End User Semantic Web Interaction, Nov, 2005.
[4] E. Stoica, M. Hearst, and M. Richardson, Automating Creation of Hierarchical Faceted Metadata Structures, in the proceedings of NAACL-HLT, Rochester NY, April 2007.
[5] Web Distributed Authoring and Versioning (WebDAV)Standards, /w
[6] Uniform Resource Identifier Generic Syntax, IETF RFC3986, h
[7] XML Stylesheet Language for Transformation (XSLT)Standards, a
[8] Asynchronous JavaScript and XML (AJAX)Description,hik