finding ostriches in the courtroom

Post on 27-Jun-2015

Category:

Technology

DESCRIPTION

Introductory slides on my research projects related to content analysis and linguistic visualization, from Transparent Text 2009.

TRANSCRIPT

Finding Ostriches in the Courtroom: Enabling Insight with Linguistic Visualization

Christopher Collins

University of Toronto (to Dec 2009)

University of Ontario Institute of Technology (Jan 2010-)

Target Audience

Real-time Single Document

Discrete Corpus

Continuous Corpus

Linguistic

NLP

CL

General Public

Domain Experts

Language Researchers

Problem Areas

Humans have reached their cognitive capacity.

Information is overwhelming because of the naïve manner in which it is delivered.

External Cognition

• External cognition is the interaction between internal and external representations when performing cognitive tasks.

• Computational offloading is the extent to which external representations can reduce the amount of cognitive effort needed to solve a problem.

Yvonne Rogers, New Theoretical Approaches for Human-Computer Interaction, 2004.

Document Visualization

Collins, C.; Carpendale, S.; Penn, G.

DocuBurst: Visualizing Document Content using Language Structure.

Proceedings of Eurographics/IEEE VGTC Symposium on Visualization, June, 2009.

Mihalcea and Tarau, 2004

Many Eyes Tag Cloud

DocuBurst

absolute,noun,10

chair,noun,2

moment,noun,11

game,noun,30

reality,noun,3

take,verb,13

represent,verb,17

...

games → game

taken → take

game IS activity

chair IS furniture

WordNet
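The fragments above (word counts, lemmatization such as games → game, and IS-A links such as game IS activity) suggest a pipeline that can be sketched roughly as follows. This is a minimal illustration, not the DocuBurst implementation: the `LEMMAS` and `HYPERNYM` tables are hypothetical toy stand-ins for a real lemmatizer and for WordNet, and the sketch ignores word senses and part-of-speech tags.

```python
from collections import Counter

# Hypothetical toy lookups (assumptions for illustration only).
# A real system would use a lemmatizer and WordNet's hypernym links.
LEMMAS = {"games": "game", "taken": "take"}
HYPERNYM = {"game": "activity", "chair": "furniture"}


def lemma_counts(tokens):
    """Count tokens after mapping each surface form to its lemma."""
    return Counter(LEMMAS.get(token, token) for token in tokens)


def roll_up(counts):
    """Add each lemma's count to every ancestor on its IS-A chain."""
    totals = Counter(counts)
    for word, n in counts.items():
        seen = {word}  # guards against accidental cycles in the toy map
        parent = HYPERNYM.get(word)
        while parent is not None and parent not in seen:
            totals[parent] += n
            seen.add(parent)
            parent = HYPERNYM.get(parent)
    return totals


tokens = ["games", "game", "chair", "taken", "take", "games"]
counts = lemma_counts(tokens)
totals = roll_up(counts)
```

Here `counts` holds per-lemma frequencies and `totals` additionally credits each hypernym with the occurrences of the words beneath it, which is the kind of aggregate a hierarchy-based view could size its nodes by.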

U.S. Presidential Debates

Collins, C.; Viégas, F.; Wattenberg, M.

Parallel Tag Clouds to Explore and Analyze Faceted Text Corpora.

To appear in Proc. IEEE Symposium on Visual Analytics Science & Technology (VAST), 2009.

Corpus Visualization

• Beyond similarity and clustering

– How do we discern differences within and between document collections?

Our Data: U.S. Federal Court Decisions

Data from public.resource.org

Visualization Design

• Size = significance of difference (G² score)

• Order = alphabetic

• Edges = word occurring in multiple columns
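The slides name the G² score but do not give its formula. As a rough sketch, assuming the standard log-likelihood-ratio (G-test) statistic on a 2×2 table of word versus non-word counts in one column versus the rest of the corpus, it could be computed like this. The function name, parameter names, and example counts are hypothetical.

```python
import math


def g2_score(word_in_target, target_total, word_in_rest, rest_total):
    """G^2 (log-likelihood ratio) for one word: target column vs. the rest.

    Assumes the standard G-test form 2 * sum(O * ln(O / E)) over a 2x2 table.
    """
    observed = [
        [word_in_target, target_total - word_in_target],
        [word_in_rest, rest_total - word_in_rest],
    ]
    grand_total = target_total + rest_total
    row_sums = [target_total, rest_total]
    col_sums = [
        word_in_target + word_in_rest,
        grand_total - word_in_target - word_in_rest,
    ]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / grand_total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


# Made-up counts: a word used 50 times in 1000 tokens of one column
# and 5 times in 1000 tokens elsewhere, versus a nearly balanced word.
skewed = g2_score(50, 1000, 5, 1000)
balanced = g2_score(12, 1000, 10, 1000)
```

A larger score means stronger evidence that the word's rate differs between the column and the rest. The score itself does not say which side over-uses the word, so a display built on it would also need to compare the two rates.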

Patent Invention

Ostriches in the 7th Circuit

Highfalutin Judge Selya

furculum

impuissant

immurement

Open APIs for data

NYT, Twitter, Google

Toolkits and APIs for Visualization

Processing, Raphaël, Flare, Flash

Open APIs for NLP?

- Summarization

- Keyword extraction

- Sentiment analysis

Bridging the Linguistic Divide

Visualization Augments Reading

www.christophercollins.ca
