information visualization visualizing explicit and...

Article

Visualizing explicit and implicit relationsof complex information spaces

Marian Dork, Sheelagh Carpendale and Carey Williamson

AbstractIn this work, we describe how EdgeMaps provide a new method for integrating the visualization of explicit andimplicit data relations. Explicit relations are specific connections between entities already present in a givendata set, while implicit relations are derived from multidimensional data based on similarity measures. Manydata sets include both types of relations, which are often difficult to represent together in information visu-alizations. Node-link diagrams typically focus on explicit data connections while not incorporating implicitsimilarities between entities. Multidimensional scaling considers similarities between items; however, expli-cit links between nodes are not displayed. In contrast, EdgeMaps visualize both explicit and implicit relationsby combining graph drawing and spatialization techniques. We have applied this technique to three casestudies (philosophers, painters, and musicians) and explored how integrated visualizations of explicit andimplicit relations reveal novel patterns and relationships.

Keywordsinformation visualization, explicit and implicit relationships, graph drawing, dimensionality reduction, aes-thetics, design, www

Introduction

An important goal of information visualization is toreveal different types of relationships within abstractdata. Through interaction, the viewer is able to findand understand connections between bits of informa-tion. Relations can be explicitly present in a data set aslinks that specifically connect information items orimplicitly by inferring relations based on similarity ofattributes. For both types of relationships – explicit andimplicit – several visualization techniques have beenproposed and refined over recent years. Two of themost popular techniques for visualizing relations arenode-link diagrams (NLD) and multidimensional scal-ing (MDS). On the one hand, NLD techniques are usu-ally applied to explicit relations or connections that arevisualized as edges between nodes representing, forexample, online communities, computer networks, orlinked web pages. On the other hand, MDS is typicallyused for implicit relations between documents or othertypes of multidimensional data. MDS spatializes attrib-ute similarities between items by placing similar itemsin close proximity to each other and less similar itemsfurther apart.

Although NLD and MDS techniques are widelystudied and increasingly deployed, they both have sig-nificant limitations with regard to readability and

interpretability of the resulting visualization. Layoutalgorithms for NLD visualizations are typically opti-mized for reducing edge crossings, with the side effectthat node positions are not utilized as meaningful visualvariables. Furthermore, as the number of edgesincreases, it becomes hard to distinguish directionality,if present, and identify high-degree nodes. The resultingvisualization of an MDS algorithm, on the other hand,lacks a guiding structure to put elements into contextwith each other. It is often difficult to understand themeaning of the positional proximity of elements. Weargue that the limitations of both techniques could beattributed to the fact that they are constrained to eitherexplicit or implicit relations, yet many data sets featureboth types of relationships.

In this paper, we explore how both explicit andimplicit relations can be visualized as EdgeMaps, inte-grated views that combine graph drawing and

Department of Computer Science, University of Calgary, Calgary,Canada.

Corresponding author:Marian Dork, Department of Computer Science, 2500 UniversityDrive NW, Calgary AB T2N1N4, CanadaEmail: [email protected]

Information Visualization

11(1) 5–21

� Authors 2011

Reprints and permissions:

sagepub.co.uk/journalsPermissions.nav

DOI: 10.1177/1473871611425872

wmr.sagepub.com

spatialization. EdgeMaps integrate NLD and MDStechniques utilizing both visual linkage and proximityfor the representation of complex – explicit and implicit– relations between items. The intent behind thisapproach is to make effective use of visual variablesthat have been underutilized in NLD and MDStechniques.

As case studies for this paper, we have chosen datasets of philosophers, painters, and musicians from theFreebase data community. While there are many bio-graphical records associated with these prominent per-sonalities of philosophy, art, and music, a particularlyinteresting aspect is the existence of influence connec-tions between people, which are a type of explicitrelations. On the other hand, birth dates, interests,movements, and genres are attributes that indicateimplicit relations between philosophers, painters, and

musicians. We chose these dimensions because theyprovide a compelling case for the visualization of expli-cit and implicit relationships and allow us to explorecomplex data relationships. Visualizing influencesbetween musicians or philosophers as edges may indi-cate who had more impact, yet it is not possible withthese links alone to see the extent of the impact. Byencoding meaningful data relations into both positionand edges, it becomes possible, for example, to explorethe influence of musicians across genres or of philoso-phers over time (see Figures 1 and 2).

The remainder of the paper is structured as follows.First, we provide an overview of prior work, afterwhich we explain our design goals and the data setswe use as case studies. We then introduce the visualrepresentations provided with EdgeMaps (‘Visualizingexplicit and implicit relations’ section) and describe the

Figure 1. Visualizing relations among musicians. The influence of The Beatles is visualized in the similarity map.

6 Information Visualization 11(1)

web-based interface design (‘Creating a web-based visu-alization interface’ section). Using the case studies, weillustrate new ways for exploring complex relations(‘Revealing complex relationships’ section). We thendiscuss the limitations and open questions of thiswork (‘Discussion’ section) and conclude the paper.

Related work

As visualizing relationships is at the heart of informa-tion visualization, our work builds upon many previouscontributions in the field, with particular regard to theuse of visual variables, graph drawing methods, andcasual visualization.

While not part of his visual information-seekingmantra (‘Overview first, zoom and filter, then details-on-demand’), Shneiderman1 notes the challenge ofbeing able to explore relationships between informationitems. He stresses the importance of interaction forrelating data entries; however, equally if not moreimportant are the appropriate visual representationsof different types of relations. To think about repre-senting relationships visually, it is worth consideringthe visual variables that are at our disposal. InSemiology of Graphics, Bertin2 distinguishes betweeneight visual variables: size, value, texture, colour, ori-entation, shape, and the two dimensions for the posi-tion on the plane. MDS renderings use planar positionas the primary visual variable, while NLDs typicallyrearrange position in order to minimize edge crossings.Stone3 makes the case that colour can make visualiza-tions more effective and beautiful when used well. Sheshows how colour can be used for labelling and quan-tifying data. It would be interesting to explore the use

of colour for conveying similarity between items as adegree of association in Bertin’s terms.

There has been extensive research on drawing andinteracting with NLDs,4 often aiming at reducingedge crossings, which is one of several geometricaland graph-theoretical metrics for graph aesthetics.5

Recent additions to this research include EdgeLens, atechnique for interactively exploring overlappingedges,6 and EdgeBundles, a method for combiningedges with similar paths.7 Another problem of largegraphs is occlusion, especially when arrowheads ofdirected edges impair the perception of the actualnodes. A study of directed graphs examined a rangeof visual cues for directionality and their effect ondetermining direct and two-step connections.8 Whilethese contributions can significantly improve the read-ability of large NLDs, we argue that contextual attrib-utes of graph elements need to be more acknowledged.This perspective is supported in earlier work on com-puter network visualization, where edge and nodeattributes (e.g. flow, capacity, utilization) of regionaland international Internet links were regarded to bemore important than the network topology.9 As partof a social network visualization, it was shown how thevisual representation of number of friends, gender, andcommunity structure enriches the NLD and allows forinteractive filtering.10

While conventional NLD techniques focus almostentirely on explicit relations, MDS can be seen as acomplementary approach focusing on proximity as avisual representation of implicit relations or similarity.MDS has been used for document visualizations withthe goal of visually conveying ‘thematic patterns andrelationships’ of text collections.11 While the idea

Figure 2. Visualizing influence relations between philosophers; Friedrich Nietzsche is selected in the timeline view.

Dork et al. 7

of spatializing document collections based on their sim-ilarities or differences is promising, the resulting galax-ies and themescapes still appear abstract and difficult tointerpret. An approach to making MDS more interac-tive focused on steering the algorithm, but did not lookat using interactivity to make the MDS view moremeaningful and accessible.12

Besides linkage and similarity, another importantrelation is based on the temporal dimension and refer-ences between temporally structured items. Consideringthat time is generally seen as a linear dimension, thechallenge is to visualize relations between items thatare mapped onto a linear axis. Arc diagrams showcross-references along a linear axis by adding visualsemicircles to linear views of documents, music pieces,and DNA sequences.13 In work that explored the use ofarcs to visualize email threads, it was shown how thecombination of arcs displayed above and below themain axis improved readability.14

Several contributions to information visualizationhave looked at enriching and combining techniquesfor visualizing different types of relations. For example,a visual document hierarchy was accompanied witharcs representing cross-references between different sec-tions.15 Furthermore, NLDs were made more readableby assigning nodes into multiple regions and allowingfor interactive edge filtering.16 A related approachplaces graph nodes on a multivariate grid allowing fornew forms of exploration.17 Another technique pro-vides a three-dimensional juxtaposition of differentvisualizations on panes linking corresponding nodesusing edges between the panes.18 All these techniquesunderline that there is a need for integrating explicitand implicit relations and enabling their interactiveexploration; however, there is a tendency to emphasizeone over another.

A related area within information visualizationfocuses on social, casual, and artistic aspects of visual-ization. For example, Many Eyes is an online commu-nity that allows web users to upload their data sets,create and customize visualizations, and discuss thesewith other community members.19 Many Eyes is oneexample among many web-based visualization projectsthat make visualizations more readily available withinthe web browser, which has become a major trend invisualization research and practice.20 The developmentthat visualization is becoming part of everyday activi-ties, moving beyond professional, specialized domains,is also described as casual information visualization,implying novel design challenges and audiences.21 Oneexample of this development is the increasing use ofvisualization as a medium for artistic expression,22

which also aims at creating beauty and triggering curi-osity besides supporting data analysis and sensemaking. Against the backdrop of casual information

practices with growing information spaces, consider-ations of aesthetics and curiosity also enter thedomain of information seeking.23

Design goals

The motivation behind this work is the multitude ofdata sets that features both explicit and implicit rela-tions and preliminary research on complementing thevisualization of one type of relation with aspects ofanother. For this work, we understand explicit relationsas data relations that specifically connect data entitiesand are already present in the data set. As implicit rela-tions we see data relations that are not defined in thedata set and need to be inferred based on similaritiesbetween data items.

Explicit linkage between items has been the mainstayof graph-drawing research. On the other hand, thereare numerous implicit similarities based on variousparameters and dimensions that can be used for dataspatialization. With this work, we are exploring thespace of integrating the visualization of both explicitand implicit data relations in order to reveal previouslyunseen patterns. In particular, we aim at supportingpeople in viewing complex data sets and exploring rela-tionships between information entries in a pleasing andengaging way. This translates into the following designgoals:

Integrate multiple relationships. Explicit and implicitrelations should be visualized as linkage andlayout to mutually support each other.

Show invisible data patterns. By integrating explicit andimplicit data connections, the visualization shouldprovide novel, interesting shapes and relations thatwere not visible before.

Support serendipitous exploration. The visualizationsshould allow viewers to find unexpected insightsand easily follow their interests. The interactivitynecessary should be readily accessible withoutrequiring training.

Display additional information. The interface shouldprovide detail-on-demand operations allowing theviewer to learn more about particular data entriesand to go back to the data source.

Provide aesthetic visuals. The colours, shapes, and tran-sitions used by the visualization should satisfy bothutility and visual appeal, making the interactionpleasant and able to evoke curiosity.

Data sets and dimensions

To illustrate the notion of complex information spacesand explore their visual representation, we chose


personalities from philosophy, visual art, and popularmusic as exemplary data sets. We collected the datafrom the Freebase website using their web-basedAPI.24 Freebase offers structured information aboutmany entities. In the case of philosophers, painters,and musicians, it provides data about birth dates, influ-ence connections, and a range of domain-specific key-words such as philosophical interests, artisticmovements, and musical genres. Besides programmaticmethods of accessing the data, Freebase also publishestabular views of individual data entries on their website(see Figure 3). As the group of musicians has onlysparse influence connectivity on Freebase, we comple-ment our data set by influence information from theAllMusic website.25

We focus on philosophers, painters, and musicianswho influenced at least one other member of their cor-responding group. The selection is also based on theavailability of keywords corresponding to the selectedsimilarity criteria that are most meaningful for thegiven group of people. For philosophers, we includetheir philosophical interests (e.g. ethics, logic, nature)and professions (e.g. writer, physicist, teacher). Thekeywords of painters include art forms (e.g. painting,mural, fresco) and artistic movements (e.g. renaissance,impressionism, surrealism). For musicians, we includedtheir genres (e.g. jazz, punk, electronic).

Based on the requirements of keyword and influ-ence information, the resulting data sets comprised196 philosophers, 226 painters, and 200 musicians.

For each person, we kept the name, birth date (formusical groups, year of formation), description, animage, and the keywords. Furthermore, we storedthe directed influence links between people, whichwe considered as explicit relations, and the keywords,which formed the basis for computing similaritiesamong people as implicit relations. The dates couldalso be regarded as another type of implicit relationas they implicitly link people of similar time periodsor epochs. For storing these records, we used aMySQL database that was easily accessible fromthe server-side component of our system, which waswritten in PHP.

Using these attributes, several interesting dimensionscan be inferred. For example, we can calculate thedegree of influence as a measure of significance byusing the sum of outgoing influence connections. Inother words, the more people a particular person hasinspired and influenced, the more significant this phi-losopher, painter, or musician is. Based on birth dates,people can also be grouped into similar epochs.Likewise, using domain-dependent keywords, philoso-phers, painters, and musicians can be grouped by theirsimilarity. When combining time periods and keywordswith the influence connections, one can look at theimpact of, for example, musicians across epochs andinterests. For example, considering all the musicianswho were influenced by The Beatles, their impactextended over a wide range of musical genres (seeFigure 1). As Friedrich Nietzsche was considered influ-ential by many subsequent philosophers, his impact isone of great temporal extent (see Figure 2). By juxta-posing and integrating both explicit and implicit rela-tions, it is possible to reveal these relations, which arenot easily accessible just by looking at data tables orreference pages.

Visualizing explicit and implicit relations

EdgeMaps provide a visualization that representsboth explicit and implicit relations by integratingspatialization and graph-drawing techniques. Explicitrelations are encoded as curved edges and implicit rela-tions as node position. Other visual variables are usedto double encode these data relations and introduceadditional information such as directionality and dis-tinctness. In the following, we describe our design con-siderations and decisions concerning visualrepresentation.

Implicit relations as layout

To represent the implicit relations, we designed twogeneral layouts: a similarity map and a timeline, asillustrated in Figure 4. While both layouts represent

Figure 3. One example data set for visualizing implicit andexplicit relations is philosopher data from the Freebasecommunity. Here is the entry about the French philosopherSimone de Beauvoir: description, birth date and profession(top), interests and influences (bottom).

Dork et al. 9

people as nodes and influences as edges, as we willdescribe in more detail later, the layouts differ in theway the positions on the plane are utilized. The simi-larity map represents the topical nearness amongpeople based on their domain-specific keywords suchas interests, movements, and genres. The timeline usesbirth dates as an ordering criterion to arrange philoso-phers, painters, and musicians along a temporal axis.While neither of these layouts is new by itself, theyprovide new context for graph visualizations.

Similarity map. The node positions for the similar-ity map were generated using an MDS algorithm26 onthe basis of the keywords. To do this, the keyword datawere transformed into a vector model that the MDSalgorithm used as a basis to map relative distances inthe high-dimensional vector space to distance in thetwo-dimensional plane. The result was a pair of x,ycoordinates for each philosopher, with each coordinate

value between �1 and 1. Depending on the size of thewindow, these coordinates were then scaled to theactual display resolution. However, the aspect ratio ofthe MDS output was not modified since the proximityof the resulting plane configuration was generated onthe basis of similarity. Stretching the layout would con-found the representation of similarities, which aremapped to positions on the plane using theirEuclidian distances. As there may be people with iden-tical keywords, the MDS algorithm can return itemswith the same positions, posing an occlusion problem.Considering that MDS is, after all, an approximation,the similarity map places overlapping nodes slightlyapart so that they are still close but not occludingeach other. The positions in the similarity map areused to generate unique colours for all circles, as dis-cussed in more detail later. The resulting layout isshown in Figure 5.

Timeline. The timeline maps birth year to positionon the plane. Initial trials with a linear scaling were notsatisfying because of dense clusters of people in sometime periods and very sparse or empty periods duringother times. We decided to neglect exact temporal inter-vals and focused on temporal sequences of peopleinstead. This still allows for relative temporal compar-isons of ‘earlier’ and ‘later’ people and, at the sametime, accommodate all people’s nodes along the timeaxis. The result of this ‘stringing’ of nodes along theaxis has the effect that certain periods take up muchmore display space than others. A time legend dis-played on top of the time view and a subtle grid repre-senting decades and centuries indicate this temporalfolding. As the data for the philosophers are especiallysparse between about 300 B.C. and 1200 A.D., there is aclustering of grid lines in this particular time area. Theresulting timeline is shown in Figure 6.

Explicit relations as curved edges

The layouts for topical and temporal similarities repre-sent only the implicit relations between data items. Inorder to represent explicit relations, that is influencesbetween philosophers, painters, and musicians, directededges are drawn between the nodes. However, if alledges were to be shown for all nodes at the sametime, there would be far too many edges to be readable,let alone interpretable. Instead, by activating only onenode at a time, it is possible to read individual edgesand differentiate between two types of influence:

Incoming influence. The person is inspired or affected byother people and builds upon their work.

Outgoing influence. The person has inspired or affectedother people who built upon their work.

Figure 5. Similarity map of philosophers withoutselection.

Similarity map Timeline

Figure 4. Layouts for visualizing explicit and implicitrelations based on similarity (left) and birth dates (right).


A recent study on representing directed edgessuggests that simple visual cues better help detectingconnections between two nodes than curves or arrow-heads.8 However, our aim was to reveal the extent ofmany connections of one node with all the remainingnodes and discern incoming and outgoing influences.Incoming influence can be seen as the basis of a per-son’s work, the outgoing influence is the impact theyhad on other people. To visually differentiate betweenthese edge types, we used curvature, directionality,shape, opacity, and colour (see Figure 7).

Curvature. Incoming edges are curved downwardsand outgoing edges are curved upwards. The idea isthat incoming influence stands for the foundationupon which a philosopher, painter, or musician buildstheir work. For the outgoing influence, the edge iscurved upward, as a visual depiction of reach beyondprevious work. Taken together, both types of edgesform a wave-like shape when nodes are arranged in asequential order. The extent of curvature of an edgedepends on its distance between the connected nodes.This way, the wider the reach of an influence, the moresalient the edge appears. To reduce the visual footprintof edges in the similarity map, the curvature is slightlysmaller than in the timeline view.

Directionality. Incoming and outgoing edges differalso with regard to how directionality is represented.As we assume that only one person is activated at atime, it is most likely that this person’s node will havemultiple incoming and outgoing edges, while all otherconnected nodes will mostly have only one associatededge or in some cases two edges (if the influence is

mutual). This indicates that to most clearly representedge directionality, there is much less clutter aroundthe linked nodes than around the activated node. Weexploited this fact by situating the visual representa-tion of directionality at the linked, non-active nodes:incoming edges have arrow-like cuttings at theirsource nodes, whereas the outgoing edges comingfrom the active node have arrows at the destinationnodes. This ensures that the edge endings at the activeperson’s node are simple and thin, allowing for manydiscernible edges.

Shape. An additional way to differentiate the edgetypes is by their shape. Incoming edges are drawnthicker than the outgoing edges. While there is noinherent reason why incoming edges should be thickerthan outgoing edges, giving the different types of edgesthese two distinct shapes allows for easy distinctionwhen multiple edges are displayed at the same time.

Opacity. To balance out the different visual weightsresulting from thick and thin edges, the opacity of theincoming edges is decreased. Using opacity instead ofadjusting brightness for the incoming edges alsoreduces the occlusion of other elements due to incomingedges.

Colour. As the nodes have different colours, thecolour of an edge corresponds with the colour of thenode – where the influence is coming from. The out-going influence edges take on the colour of the selectedperson, and the incoming edges share the colour withthe originating node. The mapping of nodes’ positionsto their colour is discussed in the following section.

Redundancy through size and colour

EdgeMaps visualize data sets primarily using positionto represent implicit relations (time and similarity)and edges to represent explicit relations (influence).As secondary visual variables we include node sizeand colour to accentuate significance and similarity ofdate items.

We understand the significance of a person as therelative degree of their outgoing influence. The

Figure 6. Timeline of philosophers without selection.

Figure 7. By selecting one node at a time, we candistinguish between two types of edges: incoming, B isinfluenced by A (left), and outgoing, B influences C (right).

Dork et al. 11

rationale is that a philosopher, painter, or musicianwho had more influence on their peers and successorsis arguably more significant than those who had lessimpact. Outgoing influence can be viewed by selectingthe corresponding node and viewing the outgoingedges. We decided to additionally expose the signifi-cance of a node without any interaction by encodingthe significance of a person as their circle’s area. Weexperimented with different types of glyphs; however,we decided that simple circles with varying sizes wouldallow for sufficient data encoding without adding sig-nificant perceptual complexity.

Furthermore, we decided to use colour to doubleencode similarity based on people’s keywords. Sincethe output of the MDS algorithm provides a spatializa-tion of this relationship, we used it in combination withthe HSV (hue, saturation, value) colour space as thebasis for the colour calculation (see Figure 8). Wedecided to map broad interest regions to hue and thedistance to the centre to saturation while keepingthe brightness constant. The idea behind this encodingis that the further out a person is located, the moredistinct they are from all other people. When translat-ing this notion of distinctness to a colour space, itwould seem intuitive that the more distinct itemswould be more saturated and the more common itemswould be less saturated. However, keeping the valueconstant means that either the nodes in the centrewould be too bright or the nodes on the peripherytoo dark. Therefore, we took the distance to centreinto account for the brightness value resulting in

well-visible, grey-like nodes in the centre and colourfulnodes towards the outside.

Based on this approach, nodes receive distinct col-ours, which are also used for their outgoing influenceedges. As shown, for example, in Figure 9, bottom, theactive philosopher’s outgoing edges have the colour ofthis node and the incoming edges have the colours oftheir respective philosophers. This also helps to distin-guish edge types and associate edges with correspond-ing source nodes. Furthermore, as colours and sizes areconsistently used between layouts, the viewer maybetter recognize previously visited nodes.

Yarn balls vs. fireworks and waves

After having discussed the visual representation ofimplicit and explicit relations individually, we willnow discuss how layout, edges, and additional encod-ings come together in novel formations.

While exploring several design options, we examinedthe possibility of activating multiple nodes at the sametime. The result is an example of the ‘yarn ball’ effectfor complex NLDs, conveying neither overview norstructure (see Figure 9, top). In contrast, displayingonly the edges associated with one individual nodeallows for interesting visualization and interaction pos-sibilities (see Figure 9, middle and bottom). As dis-cussed before, in this way it becomes possible tobetter represent edge types and directions.Furthermore, considering that the node size reflectsthe number of outgoing edges, differing node sizesmay be a more effective way to convey the general over-view of a data set. Instead of displaying many edgesthat make readability more and more difficult withincreasing numbers of nodes and edges, it appears tobe more effective to use colour, position, and size fornodes to provide context and overview.

An interesting structure emerges when selecting anode in the similarity map with incoming and outgoingedges rendered in patterns of fireworks (see Figure 9,middle and bottom). Relatively less influential people(e.g. Lacan) resulted in more modest fireworks than themost influential people, such as Kant. Besides thenumber of edges, the spatial extent of edges conveysthe topical scope of a person’s incoming and outgoinginfluences. In the temporal layout, the influence edgeslead to distinct wave-like patterns (see Figure 10). Theresulting edge layout is particularly interesting as thewaveform can also be seen as an engaging representa-tion of the dynamic evolution of philosophical ideas,artistic movements, and musical genres.

In the similarity map, the spatial extent of edgesstands for the topical scope, and in the timeline theextent of edges represents the temporal scope of influ-ence. It becomes possible to distinguish influences

Figure 8. Colours are derived from MDS positions usingthe distance from centre for saturation and the angle forhue.


within close temporal or topical proximity from influ-ences that span multiple time periods and similarityregions. There is a qualitative difference in influencingpeople of the same time with similar characteristicsthan in influencing people distributed over a largertime interval across a range of interests, movements,or genres. For example, the influence between twojazz singers is different from the influence that a bluesmusician has on a punk musician. In EdgeMaps, thesedifferences are revealed by the position, distribution,

and length of edges as well as the colours and positionsof associated nodes.

Creating a web-based visualizationinterface

In addition to the representation of multiple types ofdata relations, it is important to consider how the inter-action and visual design of the interface can be realizedto support the exploration of complex relationships. Inthe following, we describe how we designed a web-based interface for EdgeMaps aimed to support estab-lished information-visualization principles and inte-grate the tool into the web-browsing experience(demonstration online: http://mariandoerk.de/edge-maps/demo).

Interacting with the visualization

The interface provides a range of interaction techniquesfor manipulating the view, data, selections, and details.The primary way of interacting with the EdgeMapsvisualization is selecting a node representing a personand revealing the corresponding influence connectionscoming towards this node and going out to other nodes.The viewer can then change the selection by eitherselecting another node (possibly one that is associatedwith the current one) or by deselecting this node byclicking on the background or clicking again on thisnode.

A text search allows the viewer to search throughnames, descriptions, and keywords. When a searchterm is entered, the visualization shows only the labelsof the nodes representing data items that match thesearch query; the opacity of the remaining nodes isdimmed and their labels are not shown. This way it ispossible to quickly see where, for example, painters of acertain genre are situated, making the underlying simi-larity layout easier to interpret (see Figure 11).

Additional controls are provided in the top right ofthe interface as drop-down menus to change the layoutand data set of the visualization. Changing the data setinitiates a simple blending transition from one visuali-zation to the next. Switching between layouts triggers atransition between the similarity map and the timeline,gradually moving the nodes into their new locations. Asthe colours are based on similarity, it is possible torediscover nodes from the similarity map in the timelineview. The transition between the views is animated toallow the viewer to follow nodes between the views andestablish an understanding of temporal and topicalinterrelations. For example, by comparing how thesetransitions unfold, it is possible to visually examinewhether keyword-based similarities correspond withtime periods people were active in.

Figure 9. Displaying influence edges in the similarity mapfor both five philosophers (top) and one philosopher at atime (middle and bottom). With a single activated philoso-pher, influence edges arguably resemble the shape offireworks. Less influential philosophers have sparser edgepatterns (middle) than the more influential ones (bottom).

Dork et al. 13

To learn more about a person, a detailed view withadditional information about the selected node is dis-played in the lower right portion of the window (seeFigure 12). A short description of the person is

displayed and accompanied with a visual depiction(e.g. photo, painting, sculpture) of the person.Selecting the thumbnail opens the correspondingentry on Freebase, allowing the viewer to go to thedata source and learn more about a given person.The border of this detail display uses the colour ofthe corresponding person’s node in the EdgeMapsvisualization.

Visual information design

As an attempt to reduce visual clutter, only the name ofthe person that is currently selected and the peopleassociated with the selected person are displayed. Thenodes for the remaining people are dimmed and do nothave a label. This allows the viewer to focus on thecurrent selection, but it also alleviates a label-occlusionproblem that still occurs when many edges of asso-ciated nodes are displayed (see Figure 9, right). It ispossible to hover with the mouse over any node tomake occluded or hidden labels visible. To ensure aes-thetic proportions between circles and labels, the fontsize is set relative to the size of the circle. Furthermore,only the surname of the person is used, which is typi-cally unique and sufficiently known.

Figure 10. In the timeline view, influence edges result in a wave-like form indicating the propagation of philosophicalideas over many time periods. The more significant a philosopher is, the larger is the resulting wave (below).

Figure 11. The text search allows the viewer to explore,for example, the distribution of painters of portraits.


Legends are displayed to summarize the differentmappings that are used. For example, for the size ofcircles and types of edges, a legend for both layouts isalways displayed in the lower left of the screen (seeFigure 12, lower left). The two circle sizes displayedcorrespond to the smallest and largest nodes in thevisualization, giving a sense of the extent of significancebetween people. In the timeline view, years indicate thetemporal distribution along the time axis. The legendsare drawn in shades of light grey to avoid distractionfrom the information visualization.

Web integration and native graphics

One goal behind realizing EdgeMaps as a web-basedsystem was to align the interactivity with existing usepatterns supported by web browsers. As shown inFigure 9, our prototype supports the browser’s history,and bookmarking and sharing features. The backbutton in web browsers provides a common way ofbacktracking on the Web, allowing the surfer toreturn to a previous page; it serves a similar functionin the EdgeMaps tool – the viewer can use it to returnto a previous state of the visualization. The state of thevisualization, that is the type of layout, data set, selec-tion, and search query, is encoded into the address aspart of the fragment identifier of the URL. In this way,the interaction steps can be added to the browser’s his-tory and a given state can be bookmarked or shared asa link via email, instant messaging, or on social

networks. We also adjusted the title to reflect the cur-rent state of the visualization to make the history,bookmarks, and link titles easy to understand. Forexample, when dragging the address into a rich-textdocument, modern browsers and word processors auto-matically turn it into a hyperlink with the correspond-ing title (see Figure 12, bottom left).

There are several options for drawing interactivevisualizations in the browser. On the one hand thereare browser plug-ins, such as Flash and Java, that areused traditionally to provide advanced graphics andinteractivity. On the other hand, new web standards –in particular HTML5 – are currently emerging thatstrive for native support of interactive graphics in thebrowser. We are interested in exploring the browser’snative, standards-based graphics capabilities and in notrelying on proprietary technologies. This approach alsohas the benefit that the tool would be viewable onmobile devices that do not support browser plugins(mainly Apple’s iOS devices). The two main optionsfor browser-native graphics are the Canvas elementfor bitmap drawing and SVG for scalable vector gra-phics. As native browser graphics are currently goingthrough rapid developments, we considered bothapproaches. To make a quick performance comparisonbetween Canvas and SVG, we drew 1000 random cir-cles with differing opacities, positions, and sizes. In thisexperiment, the Canvas element was about twice as fastin terms of rendering than the SVG. While the outputof both approaches looks identical, browser-based SVG

Figure 12. The web-based interface of EdgeMaps ties in with common browser functions and use patterns on the Web.

Dork et al. 15

allows for straightforward event handling, as it retainsits own database of drawn elements. In contrast, theCanvas element requires custom-picking logic for theobjects to be interacted with, which would lead to asignificant performance overhead. Based on these con-siderations, we chose SVG in combination with thevector graphics JavaScript library Raphael.27

Revealing complex relationships

By visualizing explicit and implicit relations, EdgeMapsallow the viewer to explore data patterns that are other-wise difficult to see. In the following, we illustrate someof the new kinds of exploration that can be pursued andthe types of insight that can be gained.

Pivotal data exploration

Limiting selections to one node at a time has interest-ing implications on the interaction with the

visualization. Activating one person draws the incom-ing and outgoing edges, highlights the respectivenodes, and displays their names. As the linked nodesare emphasized, the viewer is more likely to follow thedisplayed edges and activate one of the linked people.In a sense, exploring the visualization along influenceedges can be seen to be like pivoting through the dataset around one person at a time. Each step triggers anovel visualization, sharing the positioning of theformer state yet providing a different influence net-work centred around the current node. Considerselecting, for example, the node of the philosopherBeauvoir, which is one of the smaller circles in theperiphery of the visualization (see Figure 13, top).After having selected her node, it is possible tofollow one of the philosophers that was influencedby her. In this case, one could select Deleuze(bottom right) and afterwards a philosopher with alarger and more saturated node, in this caseNietzsche (bottom left). The path of exploration

Figure 13. Exploratory pivoting along philosophers’ influences from Beauvoir over Deleuze to Nietzsche.


depends somewhat on serendipity, intuition, and inter-est, all of which are affected by the overall visualiza-tion and the viewer’s inclinations.

Qualitative differences in influence

The influence edges reaching across the view generatevisual patterns that can help develop a rich understand-ing of a person’s influence. For example, the similaritymap of the musical artists’ data allows us to see thereach of a musician’s influence. Taking a closer lookat the influence network of The Beatles shows thattheir influence spans most regions of the similaritymap except those inhabited mostly by jazz, country,and rap singers at the bottom of the visualization (seeFigure 1). In contrast, the influence network of thecountry singer Hank Williams appears much more

constrained to his immediate neighbours (see Figure14, top left). Furthermore, it is possible to comparethe relative location of nodes that are connected byincoming and outgoing influence edges. For instance,John Coltrane appears to be influenced by musicians ofsimilar genres, while his influence reached relatively farand wide across the similarity map (see Figure 14, topright). The influence networks of Jimi Hendrix and TheRamones are in each case located between the musi-cians they got influenced by and the singers andbands they themselves influenced (see Figure 14,bottom left and right). Taking into account that JimiHendrix is considered as a pioneer of the electric guitarand The Ramones are seen by many as the first punk-rock band, it makes intuitive sense that their incomingand outgoing influences are disparate groups of musi-cians on the similarity map.

Figure 14. EdgeMaps reveal different patterns of how musicians influence each other. There is a difference betweenreaching members of similar genres (top left) and reaching further across the similarity map (top right). Influential andinnovative musicians tend to show a spatial split between incoming and outgoing influences (bottom left and right).

Dork et al. 17

Time–similarity comparisons

By switching between time and similarity layouts, itappears as if time periods and genres of musicianshave a relatively high correlation. This hunch is furthersupported considering the colour distribution of nodesalong the time axis (see Figure 15, top). The nodesappearing in the bottom left of the similarity map inshades of purple are mostly located on the left side ofthe timeline, that is in the first half of the last century.On the flip side, the nodes appearing in tones of greenin the top right of the similarity map are locatedtowards the right of the timeline. Examining the distri-bution of nodes highlighted for the search terms ‘jazz’and ‘punk’ further supports this observation (seeFigure 15, middle and bottom). The painters’ timeline

does not indicate such a strong correlation betweensimilarity map locations with time periods; however,searching for ‘impression’ and ‘abstract’ indicates rela-tively discrete intervals along the time axis (see Figure16). The more heterogeneous colour distribution islikely owing to the inclusion of art forms in the simi-larity keywords that are fairly independent from artisticmovements.

Discussion

The design and implementation of EdgeMaps consti-tute a first step towards integrating explicit and implicitdata relations. While the idea of integrating graphdrawing and spatialization techniques is promising,we discuss several limitations and challenges that

Figure 16. In the painters’ timeline view, there is a slightly higher concentration of blue and purple tones on the right.Searching for terms indicating artistic movements confirms the temporal development of impressionist and abstract art.

Figure 15. The gradual colour changes in the time layout of musicians suggests a correlation between genres and timeperiods. This observation can be examined more closely by highlighting nodes matching search terms of genres.


suggest opportunities for more research on the visuali-zation of complex data sets featuring diverse relations.

Edge congestion

We have argued that displaying edges for only onenode, that is a philosopher, painter, or musician, at atime allows for novel exploration methods and avoidssome of the edge congestion problems of larger NLDs.However, once a node has a wide and dense influencenetwork, similar edge-congestion challenges occur. Itwould be interesting to examine how lens and bundletechniques6,7 could be combined with visualizationsthat use a layout representing implicit relations withedges for explicit relations.

Overlapping nodes

With growing information spaces, the challenge is notonly edge congestion, but the overall perceptual scal-ability of visualizations with more and more overlap-ping nodes. In the timeline view, it would be possible toposition the nodes vertically, for example, ordered bysignificance, that is, size. The interest map currentlyplaces smaller nodes around larger nodes with thesame position. In order to support large numbers ofoverlapping nodes, it is possible to indicate spatialtogetherness using bubble sets.28

Aggregating nodes

Another promising approach is to introduce aggrega-tion that can simplify the graph visualization and pro-vide higher level perspectives.17,29 Considering the timeand similarity layouts of EdgeMaps, nodes could beaggregated along a temporal or topical proximity.The resulting influence networks of clusters wouldallow for higher level exploration of mutual influencesof groups of philosophers, painters, or musicians. Forexample, the viewer could explore how painters of theRenaissance influenced the impressionists, or how jazzand blues singers affected electronic and new wavemusicians.

Data set

The data sets used to illustrate the potential ofEdgeMaps are limited both in size and also the culturalscope that is centred around North America andEurope. This bias probably reflects the demographicsof the Freebase community. It would be interesting touse the visualization as a starting point to add missingconnections, scrutinize the keywords, and add dataentries that are beyond the cultural horizon. Besidesvisualizing historic personalities, we are interested in

examining other data sets such as citation networks,migration patterns, and international trade.

Spatialization

While we used an existing MDS algorithm,26 it wouldbe beneficial to explore its parameters and, for example,consider planes with dynamic rectangular shapesbesides squares as it is realistic to rely on windows offixed aspect ratios. The difficulty of interpreting themeaning of position and proximity is alleviated withthe display of influence edges, yet further explorationis necessary to find other techniques that make theoutput of MDS algorithms more accessible and mean-ingful. One of the ideas that arose during this work is tolabel regions in the MDS plane based on representativekeywords that are more common among nodes that arepositioned closer to each other. Besides the aim ofmaking the MDS layout more comprehensible, itcould be useful to create a flexible MDS algorithm,allowing the viewer to change how items are positioned.The great challenge for this would be to make thisalgorithm interactive.

Folding time

The timeline features an irregular scaling that is due toremoving spacing between nodes in favour of largerinformation density. While the textual time legend indi-cates this temporal folding, the added grid lines visuallyexpose the scaling. Ideally the viewer would be able toadjust time foldings according to their interests andexploration steps. Currently, a person is representedas a single point on the timeline, which makes it diffi-cult to represent their lifetime duration or longevity ofactive influence. It would be interesting to represent theactual lifespans and active years of philosophers, pain-ters, and musicians.

Layouts

Supplementing the graph visualization with meaningfulnode positions led to the creation of novel visual repre-sentations that expose temporal and topical qualities ofdata connections. Besides time and keywords, manymore dimensions can be utilized as implicit relationsfor underlying layouts. For example, positions in ahierarchy or geographic space could be used for nodeplacement. It is also feasible to combine two types ofsimilarities for the spatial dimensions of the plane. Forexample, a one-dimensional MDS projection could bejuxtaposed with the temporal distribution of nodes. Inaddition to enhancing the understanding of influenceconnections, such a hybrid layout could help theviewer to compare time relations with similarities

Dork et al. 19

based on interests, movements, or genres. As artisticmovements can be viewed as correlations betweentime and style, a corresponding layout could be veryhelpful.

Insight and experience

Having applied the EdgeMaps visualization to threedata sets allowed us to closely examine the insightsand experiences that can be supported. However, wehave only anecdotal evidence for the potential of visua-lizing explicit and implicit relations this way. It wouldbe beneficial to study how domain experts wouldexplore the data sets using EdgeMaps. Particularly,we would be interested in the role of pivoting aroundindividual nodes to gain a bottom-up understanding ofa complex data set. It may be interesting to comparehow domain experts experience the visualization of agiven data set with viewers who are unfamiliar with thedata.

Conclusion

As information spaces grow in size and complexity,there is a growing need for visual representations thathelp us make better sense of diverse data relations andpatterns. With this work, we have approached this chal-lenge by making the following contributions:

. A novel information visualization technique thatcombines implicit relations (topical and temporalsimilarity) with explicit relations (influences).

. A detailed discussion of the design considerationsconcerning visual representations and interactiontechniques of the EdgeMaps visualization.

. The implementation of a web-based system incor-porating basic use patterns on the Web such asbrowser’s history, bookmarks, and link sharing.

. An application to three case studies exploring dataabout personalities from philosophy, art, and musicexamining the types of novel insights and uses.

The EdgeMaps technique represents implicitrelations as the underlying layouts for node-link dia-grams in which nodes stand for people and edges standfor the explicit influence connections between them.By constraining the selection to one node at a time,it is possible to visually distinguish between incomingand outgoing edges. Novel visual patterns resemblingthe aesthetics of fireworks and waves not only lookmore appealing, they also allow for a closer examina-tion of influence differences. In contrast to the notor-ious yarn-ball effect of dense graph visualizations, wehave suggested that an interactive pivotal explorationalong edges between nodes may better allow viewers to

grasp network structure than complicated overviews.While much more work is necessary to help us betterexplore and understand complex information spaces,we see EdgeMaps as a promising step towards thiseffort.

Acknowledgements

Funding was provided by SMART Technologies, NSERC,iCORE/AITF, and NECTAR.

References

1. Shneiderman B. The eyes have it: A task by data type taxonomy

for information visualizations. Proc IEEE Symp Vis Lang 1996;

336–343.

2. Bertin J. Semiology of Graphics: Diagrams, Networks, Maps.

Madison, WI: University of Wisconsin Press, 1983.

3. Stone MC. A field guide to digital color. In: Color in Information

Display. Boca Raton, FL: AK Peters, 2003, pp. 277–300.

4. Herman I, Melancon G and Scott Marshall M. Graph visualiza-

tion and navigation in information visualization: a survey. IEEE

Trans Vis Comput Graph 2000; 6(1): 24–43.

5. Purchase HC. Metrics for graph drawing aesthetics. J Vis Lang

Comput 2002; 13(5): 501–516.

6. Wong N, Carpendale S and Greenberg S. EdgeLens: An interac-

tive method for managing edge congestion in graphs. InfoVis

2003: Symposium on Information Visualization. IEEE

Computer Society 2003; 51–58.

7. Holten D. Hierarchical edge bundles: visualization of adjacency

relations in hierarchical data. IEEE Trans Vis Comput Graph

2006; 12(5): 741–748.

8. Holten D, Jarke J and van Wijk. A user study on visualizing

directed edges in graphs. CHI ’09: Proceedings of the SIGCHI

Conference on Human Factors in Computing Systems, ACM,

2009: pp. 2299–2308.

9. Becker RA, Eickt SG and Wilks. Visualizing network data. IEEE


10. Heer J and Boyd D. Vizster: Visualizing online social networks.

InfoVis 2005: Symposium on Information Visualization, IEEE


11. Wise JA, Thomas JJ, Pennock K, et al. Visualizing the non-

visual: Spatial analysis and interaction with information from

text documents. InfoVis 1995: Symposium on Information

Visualization, IEEE Computer Society 1995; 51–58.

12. Williams M and Munzner T. Steerable, progressive multidimen-

sional scaling. InfoVis 2004: Symposium on Information

Visualization, IEEE Computer Society 2004; 57–64.

13. Wattenberg M. Arc diagrams: Visualizing structure in strings.

InfoVis 2002: Symposium on Information Visualization, IEEE


14. Kerr B. Thread arcs: An email thread visualization. InfoVis 2003:

Symposium on Information Visualization, IEEE Computer Society

2003; 211–218.

15. Neumann P, Schlechtweg S and Carpendale S. ArcTrees:

Visualizing relations in hierarchical data. Eurographics – IEEE

VGTC Symposium on Visualization 2005; 53–60.

16. Shneiderman B and Aris A. Network visualization by semantic

substrates. IEEE Trans Vis Comput Graph 2006; 12(5):

733–740.

17. Wattenberg M. Visual exploration of multivariate graphs. CHI

’06: Proceedings of the SIGCHI Conference on Human Factors in

Computing Systems, ACM, 2006; pp. 811–819.


18. Collins C and Carpendale S. VisLink: revealing relationships

amongst visualizations. IEEE Trans Vis Comput Graph 2007;

13(6): 1192–1199.

19. Viegas FB, Wattenberg M, van Ham F, Kriss J and McKeon M.

Many Eyes: a site for visualization at internet scale. IEEE Trans

Vis Comput Graph 2007; 13(6): 1121–1128.

20. Heer J, Viegas FB, Wattenberg M and Agrawala M. Point, talk,

publish: visualization and the web. In: Liere R, Adriaansen T and

Zudilova-Seinstra E (eds) Trends in Interactive Visualization.

Springer, 2009, pp. 269–283.

21. Pousman Z, Stasko JT and Mateas M. Casual information visu-

alization: depictions of data in everyday life. IEEE Trans Vis

Comput Graph 2007; 13(6): 1145–1152.

22. Viegas FB and Wattenberg M. Artistic data visualization:

beyond visual analytics. In: Online Communities and Social

Computing, Vol. 4564 of Lecture Notes in Computer Science.

Springer, 2007, pp. 182–191.

23. Dork M, Carpendale S and Williamson C. The information fla-

neur: a fresh look at information seeking. CHI ’11: Proceedings

of the SIGCHI Conference on Human Factors in Computing

Systems, ACM, 2011.

24. Freebase, http://www.freebase.com

25. Allmusic, http://www.allmusic.com

26. de Leeuw J and Mair P. Multidimensional scaling using major-

ization: sMACOF in R. J Stat Software 2009; 31(3): 1–30.

27. Raphael, http://raphaeljs.com

28. Collins C, Penn G and Carpendale S. Bubble sets: revealing set

relations with isocontours over existing visualizations. IEEE


29. van Ham F, Jarke J and van Wijk. Interactive visualization of

small world graphs. IEEE Symposium on Information

Visualization, 2004, 2004; 199–206.

Dork et al. 21

information visualization visualizing explicit and...

Documents