visualization of large multivariate data sets using parallel coordinates
Post on 20-Jun-2015
Visualization of Large Multivariate Data Sets using Parallel Coordinates
Ing. Ľuboš Takáč
PhD student
Faculty of Management Science and Informatics
University of Žilina
Presentation overview
• Visualization
• Parallel Coordinates (PC)
• Large Multivariate Data Sets (LMDS)
• Problem of visualization LMDS
• Solutions
• Developed Software Tool
• Further research and application
Visualization
• One of the best approaches for presenting data from a computer to a human
• Advantages
  • Global view of data (all in one picture)
  • Significant features highlighted (and vice versa)
  • Fast understanding of data by a human
• Purpose
  • Understand raw data
  • See significant characteristics or anomalies that can be examined further to gain additional information about the raw data
Parallel Coordinates (parallel axes)
• Invented by Maurice d’Ocagne, 1885
• Popularized by Alfred Inselberg, 1959
• Easy construction (like an ordinary graph)
• Data are represented by polylines
• Suitable for visualizing multivariate data (more than 3 dimensions)
• You can see dependencies between variables
• Distribution per variable
Parallel Coordinates - Construction
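The main difference between an ordinary graph of a function and parallel coordinates is the position of the axes: in parallel coordinates the axes are drawn parallel to each other instead of perpendicular.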
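The construction can be sketched in a few lines of Python. This is only an illustration, not the code of the presented tool: the function name `record_to_polyline`, the canvas size, and the per-axis `mins`/`maxs` arguments are assumptions made for the example. Each variable gets its own vertical axis, the axes are spaced evenly from left to right, and one record becomes one polyline.

```python
def record_to_polyline(record, mins, maxs, width=800, height=400):
    """Map one multivariate record to polyline vertices (hypothetical helper).

    record     -- one value per variable (one per axis)
    mins, maxs -- value range shown on each axis
    Returns a list of (x, y) pixel positions, one per axis.
    """
    n_axes = len(record)
    if n_axes < 2:
        raise ValueError("parallel coordinates need at least two axes")
    step = width / (n_axes - 1)  # horizontal distance between neighbouring axes
    points = []
    for i, (value, lo, hi) in enumerate(zip(record, mins, maxs)):
        # Normalize the value to 0..1 along its own axis.
        t = 0.5 if hi == lo else (value - lo) / (hi - lo)
        # Screen y grows downward, so the axis maximum is at y = 0.
        points.append((i * step, height - t * height))
    return points
```

Connecting the returned points with straight segments gives the polyline for that record.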
Parallel Coordinates – Examples
Examples of some 2D functions visualized in parallel coordinates with the developed software tool.
Large Multivariate Data Sets
• Collection of data usually presented in tabular form
Multivariate data set of movies with 7 dimensions.
Problem of visualization LMDS
• Overlapping records
  • with naive painting, records are drawn over each other and you lose some information
• Overlapping identical records
  • with naive painting, you cannot see the difference between two and hundreds of identical overlapping records
• Too many records to visualize => one big blur
  • imagine a resolution of 1024x768 and ten thousand records uniformly distributed over the axes (a height of 768 px means about 13 records per pixel)
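The "about 13 records per pixel" figure is a simple ratio, and it can be checked with a short sketch. The function name `records_per_pixel` is an assumption for illustration, and the calculation assumes the records are spread uniformly along an axis, as in the slide's example.

```python
def records_per_pixel(record_count, axis_height_px):
    """Average number of records sharing one pixel row of an axis,
    assuming a uniform distribution of values along the axis."""
    return record_count / axis_height_px


# Ten thousand records on an axis that is 768 px high.
density = records_per_pixel(10_000, 768)
print(f"{density:.2f} records per pixel")
```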
Problem of visualization LMDS
Problem of overlapping painted records.
Possible Solutions
• Preprocessing data before visualization
• Paint data in a more sophisticated way using alpha compositing
Alpha compositing
• Computer graphics painting method that uses an alpha channel for each color (alpha channel: transparency of the color)
• If you paint an object with a non-opaque color, the resulting color also depends on the background
http://en.wikipedia.org/wiki/Alpha_compositing
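A minimal sketch of the standard "over" blend onto an opaque background shows why the result depends on the background. The function name `over` and the 0..1 color range are assumptions for the example; the slides do not say which blending formula the tool actually uses.

```python
def over(src_rgb, src_alpha, dst_rgb):
    """Blend a source color with the given alpha over an opaque background.

    Colors are (r, g, b) tuples with components in 0..1.
    src_alpha = 1.0 fully replaces the background, 0.0 leaves it untouched.
    """
    return tuple(
        src_alpha * s + (1.0 - src_alpha) * d
        for s, d in zip(src_rgb, dst_rgb)
    )


# Painting black at 50 % opacity over white gives mid gray.
white = (1.0, 1.0, 1.0)
black = (0.0, 0.0, 0.0)
gray = over(black, 0.5, white)
```

Painting the same semi-transparent color repeatedly makes the pixel progressively darker, which is what lets overlapping records remain distinguishable.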
Implemented solution
Visualization of the same randomly generated multivariate data sets with an opaque color (upper image) and with the alpha compositing technique (right image).
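The difference between the two images can be reproduced with a rough sketch. It is a simplification under stated assumptions, not the tool's renderer: `paint_records` is a hypothetical name, record values are assumed to be already normalized to 0..1 per axis, the canvas is grayscale, and only one pixel per column is painted for each record (no antialiasing, and steep segments are not filled in).

```python
def paint_records(records, width, height, alpha):
    """Paint records as black polylines on a white grayscale canvas.

    records -- sequences of values normalized to 0..1, one value per axis
    alpha   -- opacity of the paint; 1.0 is the naive opaque case
    Returns canvas[y][x] with 1.0 = white and 0.0 = black.
    """
    canvas = [[1.0] * width for _ in range(height)]
    n_axes = len(records[0])
    seg = (width - 1) / (n_axes - 1)  # pixels between neighbouring axes
    for rec in records:
        for x in range(width):
            pos = x / seg
            i = min(int(pos), n_axes - 2)  # index of the axis to the left
            t = pos - i                    # 0..1 position between the two axes
            value = rec[i] * (1 - t) + rec[i + 1] * t
            y = round((1.0 - value) * (height - 1))
            # "Over" blend of black paint onto the current pixel.
            canvas[y][x] = (1 - alpha) * canvas[y][x]
    return canvas


# One hundred identical records plus a single different one.
data = [(0.5, 0.5, 0.5)] * 100 + [(0.0, 0.0, 0.0)]
opaque = paint_records(data, width=5, height=101, alpha=1.0)
blended = paint_records(data, width=5, height=101, alpha=0.1)
```

With `alpha=1.0` the pixel row of the hundred identical records and the row of the single record end up equally black, while with `alpha=0.1` the heavily overlapped row is much darker than the single one.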
Developed software tool
• Based on the principles mentioned above
• Interactive analysis of LMDS
• Interactive set operations (selection, difference, intersection …)
• High-quality, antialiased image
• Data import from a text file
• Record count is limited to hundreds of thousands at a resolution of 1920x1080
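The set operations listed above could be modeled as operations on sets of record indices. The sketch below is an assumption about how such a feature might work, not a description of the tool's internals; the `select` helper and the tiny movie table with (year, rating) columns are made up for illustration only.

```python
def select(records, axis, low, high):
    """Return the indices of records whose value on `axis` lies in [low, high]."""
    return {i for i, rec in enumerate(records) if low <= rec[axis] <= high}


# Hypothetical records: (year, rating). The values are illustrative only.
movies = [
    (1994, 9.2),
    (2001, 6.1),
    (2008, 8.9),
    (2010, 5.4),
]

recent = select(movies, axis=0, low=2000, high=2015)      # selection on the year axis
well_rated = select(movies, axis=1, low=8.0, high=10.0)   # selection on the rating axis

recent_and_good = recent & well_rated   # intersection
recent_not_good = recent - well_rated   # difference
```

Keeping selections as index sets means any combination of them can be highlighted without copying the underlying records.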
Developed software tool
Demonstration of the developed software tool. The visualized data sets come from IMDb (Internet Movie Database).
Further research and application
• The tool can help decision makers and data analysts gain additional information and make better decisions.
• Medical data
• Scholar data
Thank you for your attention!
lubos.takac@gmail.com