exploring the multivariate structure of missing values ... · exploring the multivariate structure...
TRANSCRIPT
Exploring the multivariate structure of missing valuesusing the R package VIM
Matthias Templ1,2, Andreas Alfons1, Peter Filzmoser1
1 Department of Statistics and Probability Theory, Vienna University of Technology2 Department of Methodology, Statistics Austria
Rennes, July 8, 2009
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 1 / 18
Content
1 Motivation
2 Visualization of missing values
3 Conclusions
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 2 / 18
Motivation
Missing values
Real data sets often contain missing values:
X =
x11 . . . . . . x1p
... NA...
NA... NA
...xn1 . . . . . . xnp
,
with n observations, p variables, and some missing values. (NA)
Examples: nonresponse in surveys, element concentration belowdetection limit in chemical analyses.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 3 / 18
Motivation
Comments on missing values
Most statistical methods can only be applied to complete data.
In order to select an appropriate imputation method (especially formodel-based imputation), it is necessary to know the multivariatestructure of the missing values beforehand.
Visualizing missing values may not only help to detect the missingvalue mechanisms, but also to gain insight into the quality andvarious other aspects of the data.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 4 / 18
Motivation
Missing value mechanisms
Three important cases (e.g., Little and Rubin 2002):
MCAR (Missing Completely At Random):
P(Xmiss |X) = P(Xmiss)
MAR (Missing At Random):
P(Xmiss |X) = P(Xmiss |Xobs)
MNAR (Missing Not At Random):
P(Xmiss |X) = P(Xmiss |Xobs ,Xmiss)
where X = (Xobs ,Xmiss) denotes the complete data, and Xobs and Xmiss
are the observed and missing parts, respectively.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 5 / 18
Visualization of missing values
Visualization of missing values
Famous books and almost all articles about missing values do notaddress vizualization.
Visualization tools for missing values are rarely or not at allimplemented in SAS, SPSS, STATA or even R.
Through linking, missing values can be highlighted in GGobi (Cookand Swayne 2007) and Mondrian (Theus 2002).
MANET (Unwin et al. 1996, Theus et al. 1997) is quite powerful, butonly available for older Apple systems with PowerPC architecture andMac OS.
Visualization tools for missing values need to be available for the Rcommunity so that visualization of missing valuess, imputation andanalysis can all be done from within R, without the need of additionalsoftware.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 6 / 18
Visualization of missing values
Histogram and spinogram
20 30 40 50 60 70 80
010
020
030
040
050
060
070
0
age
mis
sing
/obs
erve
d in
py0
10n
0.0
0.2
0.4
0.6
0.8
1.0
age
mis
sing
/obs
erve
d in
py0
10n
15 30 40 50 65
Figure: Austrian EU-SILC data from 2004 with missings generated in variable age.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 7 / 18
Visualization of missing values
Marginplot
● ●● ●●
●
●
15
00
3.0 3.5 4.0 4.5 5.0
3.0
3.5
4.0
4.5
pek_n
py13
0n
Figure: Austrian EU-SILC data from 2004.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 8 / 18
Visualization of missing values
Scatterplot matrix
pek_n
20 30 40 50 60 70 80
02
46
810
12
2030
4050
6070
80
age
0 2 4 6 8 10 12
py010n
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Figure: Austrian EU-SILC data from 2004.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 9 / 18
Visualization of missing values
Matrixplot
P00
1000
r007
000
py01
0n
py03
5n
py05
0n
py09
0n
py10
0n
pek_
n
bund
esld
age
010
0020
0030
0040
00
Inde
x
Figure: Austrian EU-SILC data from 2004.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 10 / 18
Visualization of missing values
Parallel coordinate plotse
x
pek_
g
P03
3000
P02
9000
P01
4000
P00
1000 ag
e
bund
esld
Figure: Austrian EU-SILC data from 2004
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 11 / 18
Visualization of missing values
Parallel boxplots
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●
●●
●●●
●●●
●
●
●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●● ●
●
●
●
●
●
●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●
●●●●●●
●●
●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●
●
●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●
4626 4132 4530 96 4479 4612 14 4622 4 4592 34 4560 66 4618 8 4611 15 4625 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
obs.
in p
y010
nm
iss.
in p
y010
n
obs.
in p
y035
nm
iss.
in p
y035
n
obs.
in p
y050
nm
iss.
in p
y050
n
obs.
in p
y070
nm
iss.
in p
y070
n
obs.
in p
y080
nm
iss.
in p
y080
n
obs.
in p
y090
nm
iss.
in p
y090
n
obs.
in p
y100
nm
iss.
in p
y100
n
obs.
in p
y110
nm
iss.
in p
y110
n
obs.
in p
y130
nm
iss.
in p
y130
n
obs.
in p
y140
nm
iss.
in p
y140
n
01
23
45
pek_
n
Figure: Austrian EU-SILC data from 2004.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 12 / 18
Conclusions
General Statements
The detection of missing value mechanisms is quite complex whenusing models or tests.
Statistical methods frequently lead to only vague statements aboutthe missing value mechanisms.
Non-robust methods lead to erroneous statements about missingvalue mechanisms for data containing outliers.
Visualization tools are easier to handle and more powerful, butflexible, easy-to-use visualization software is required.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 13 / 18
Conclusions
The R package VIM
The R package VIM (Templ and Filzmoser 2008, Templ and Alfons 2009). . .
has all previously shown plots implemented, along with some more.
is a tool for explorative data analysis of data with missing values.
makes it possible to analyze the multivariate structure of missingvalues.
comes with a graphical user interface (GUI).
contains interactive features.
allows producing high-quality graphics for publications.
is available on CRAN(http://cran.r-project.org/package=VIM).
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 14 / 18
Conclusions
Graphical user interface of the R Package VIM
Figure: VIM GUI
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 15 / 18
Conclusions
Acknowledgments
This work was partly funded by the European Union (represented by theEuropean Commission) within the 7th framework programme for research(Theme 8, Socio-Economic Sciences and Humanities, Project AMELI(Advanced Methodology for European Laeken Indicators), GrantAgreement No. 217322). Visit http://ameli.surveystatistics.netfor more information.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 16 / 18
Conclusions
References I
D. Cook and D.F. Swayne. Interactive and Dynamic Graphics for DataAnalysis: With R and GGobi. Springer, New York, 2007. ISBN978-0-387-71761-6.
R.J.A. Little and D.B. Rubin. Statistical Analysis with Missing Data.Wiley, New York, 2nd edition, 2002. ISBN 0-471-18386-5.
M. Templ and A. Alfons. VIM: Visualization and Imputation of MissingValues, 2009. URL http://cran.r-project.org/package=VIM. Rpackage version 1.3.
M. Templ and P. Filzmoser. Visualization of missing values using theR-package VIM. Research Report CS-2008-1, Department of Statisticsand Probability Theory, Vienna University of Technology, 2008. URLhttp://www.statistik.tuwien.ac.at/forschung/CS/CS-2008-1complete.pdf.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 17 / 18
Conclusions
References II
M. Theus. Interactive data visualization using mondrian. Journal ofStatistical Software, 7(11): 1–9, 2002. URLhttp://www.jstatsoft.org/v07/i11.
M. Theus, H. Hofmann, B. Siegl, and A. Unwin. MANET - Extensions tointeractive statistical graphics for missing values. In In New Techniquesand Technologies for Statistics II, pages 247–259. IOS Press, 1997.
A. Unwin, G. Hawkins, H. Hofmann, and B. Siegl. Interactive graphics fordata sets with missing values: MANET. Journal of Computational andGraphical Statistics, 5(2): 113–122, 1996.
Templ, Alfons, Filzmoser (TUW) Exploring the structure of missing values Rennes, July 8, 2009 18 / 18