Visualization of Large Multivariate Data Sets using Parallel Coordinates

Post on 20-Jun-2015


DESCRIPTION

Presentation of my tool, which visualizes multivariate data sets using parallel coordinates.

TRANSCRIPT

Visualization of Large Multivariate Data Sets using Parallel Coordinates

Ing. Ľuboš Takáč, PhD student

Faculty of Management Science and Informatics, University of Žilina

Presentation overview

• Visualization

• Parallel Coordinates (PC)

• Large Multivariate Data Sets (LMDS)

• Problem of visualizing LMDS

• Solutions

• Developed Software Tool

• Further research and application

Visualization

• One of the best approaches for presenting data from a computer to a human

• Advantages
  • Global view of the data (all in one picture)
  • Significant features are highlighted (and vice versa)
  • Fast understanding of the data by a human

• Purpose
  • Understand raw data
  • See significant characteristics or anomalies, which can then be examined further to gain additional information about the raw data

Parallel Coordinates (parallel axes)

• Invented by Maurice d’Ocagne, 1885
• Popularized by Alfred Inselberg, 1959

• Easy construction (like an ordinary graph)
• Data are represented by polylines

• Suitable for visualizing multivariate data (more than 3 dimensions)

• Dependencies between variables are visible
• Distribution per variable

Parallel Coordinates - Construction

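The main difference between an ordinary graph of a function and parallel coordinates is the position of the axes: in parallel coordinates the axes are drawn parallel to each other instead of perpendicular.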
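The construction can be sketched in a few lines of Python. This is only an illustrative sketch, not the code of the presented tool: the function name `to_polyline`, the canvas size, and the per-axis min/max normalization are assumptions made for the example.

```python
def to_polyline(record, mins, maxs, width=800.0, height=400.0):
    """Map one multivariate record to the vertices of its polyline.

    Axis i is a vertical line at x = i * spacing; the record's i-th value
    is placed on that axis according to the axis minimum and maximum.
    """
    n = len(record)
    if n < 2:
        raise ValueError("need at least two dimensions")
    spacing = width / (n - 1)  # horizontal distance between neighbouring axes
    points = []
    for i, value in enumerate(record):
        span = maxs[i] - mins[i]
        # Normalize to [0, 1]; a constant column is drawn at mid-height.
        t = (value - mins[i]) / span if span else 0.5
        x = i * spacing
        y = height * (1.0 - t)  # screen y grows downwards
        points.append((x, y))
    return points
```

Connecting the returned points in order gives the polyline of one record; repeating this for every record gives the whole picture.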

Parallel Coordinates – Examples

Examples of 2D functions visualized in parallel coordinates with the developed software tool.

Large Multivariate Data Sets

• Collection of data, usually presented in tabular form

Multivariate data set of movies with 7 dimensions.

Problem of visualizing LMDS

• Overlapping records
  • With simple painting, later records are painted over earlier ones, so you lose some information

• Overlapping identical records
  • With simple painting, you cannot see the difference between two overlapping identical records and hundreds of them

• Too many records to visualize => one big blur
  • Imagine a resolution of 1024x768 and ten thousand records uniformly distributed over the axes (a height of 768 px means about 13 records per pixel)
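The "13 records per pixel" figure is a simple division. A minimal check, with a helper name (`records_per_pixel`) chosen only for this illustration:

```python
def records_per_pixel(record_count, axis_height_px):
    """Average number of records that share one pixel row of an axis,
    assuming the records are uniformly distributed over the axis."""
    return record_count / axis_height_px


print(round(records_per_pixel(10_000, 768)))  # prints 13
```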

Problem of visualizing LMDS

Problem of overlapping painted records.

Possible Solutions

• Preprocessing data before visualization

• Paint data in a more sophisticated way, using alpha compositing

Alpha compositing

• Computer graphics painting method which uses an alpha channel to define each color (alpha channel – transparency of the color)

• If you paint an object with a non-opaque color, the resulting color also depends on the background

http://en.wikipedia.org/wiki/Alpha_compositing
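A minimal sketch of why this helps with overlapping records, using the standard "over" operator on a single grey channel. The function names and the chosen alpha values are assumptions for the illustration; the slides do not say which blending formula or alpha the tool uses.

```python
def over(src, alpha, dst):
    """Composite grey level `src` with opacity `alpha` over background `dst`."""
    return src * alpha + dst * (1.0 - alpha)


def paint_overlapping(count, alpha, ink=0.0, background=1.0):
    """Grey level of a pixel after `count` identical records are painted on it.

    0.0 is black ink, 1.0 is a white background.
    """
    color = background
    for _ in range(count):
        color = over(ink, alpha, color)
    return color


# Opaque ink: 1 record and 100 records give the same pixel value.
print(paint_overlapping(1, 1.0), paint_overlapping(100, 1.0))
# Semi-transparent ink: more overlapping records give a darker pixel.
print(paint_overlapping(2, 0.05), paint_overlapping(100, 0.05))
```

With opaque ink the pixel carries no information about how many records cross it; with a small alpha the darkness grows with the number of overlapping records.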

Implemented solution

The same randomly generated multivariate data set visualized with an opaque color (upper image) and with the alpha compositing technique (right image).

Developed software tool

• Based on the principles mentioned above
• Interactive analysis of LMDS
• Interactive set operations (selection, difference, intersection …)
• High-quality, antialiased image
• Data import from a text file
• Record count is limited to hundreds of thousands at a resolution of 1920x1080
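The set operations listed above can be illustrated on sets of record indices. This is a hedged sketch only: the `select` helper, the range-based selection, and the sample rows are made up for the example and do not describe how the tool is implemented.

```python
def select(records, axis, low, high):
    """Indices of records whose value on `axis` lies in [low, high]."""
    return {i for i, r in enumerate(records) if low <= r[axis] <= high}


# Hypothetical (year, rating) rows, invented purely for illustration.
records = [(1990, 7.1), (2001, 8.4), (2005, 5.9), (2010, 8.8)]

recent = select(records, 0, 2000, 2015)      # selection on the first axis
well_rated = select(records, 1, 8.0, 10.0)   # selection on the second axis

both = recent & well_rated         # intersection
recent_only = recent - well_rated  # difference
print(sorted(both), sorted(recent_only))
```

Keeping selections as index sets makes it cheap to combine them and then repaint only the chosen records.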

Developed software tool

Demonstration of the developed software tool. The visualized data sets come from IMDb (Internet Movie Database).

Further research and application

• The tool can help decision makers and data analysts gain additional information and make better decisions.

• Medical data
• Scholar data

Thank you for your attention!

lubos.takac@gmail.com
