Quality Metrics for Assessing the Impact of Editing and Imputation on Economic Data
Broderick E. Oliver
and
Katherine Jenny Thompson
Office of Statistical Methods and Research for Economic Programs
Outline
• Motivation for the study
• Quality Metrics (Formulas)
• Quality Metrics (Actual Results)
• Future Research
Motivation
The Economic Directorate conducted a series of studies to evaluate the editing efficiency of selected surveys and censuses.
1. What value is added from subjecting the same record to multiple editing phases?
2. What is the impact of editing and imputation on the final data?
Development of Quality Metrics
• Assess overall changes to “reported” data at the:
  – Micro level
  – Macro level
• Examine:
  – the size of change to reported data
  – the source of change to reported data
• Determine which changes had the greatest impact on final tabulations
Key Terms
• Critical Item
• Reported Data
• Final Data
• Data Flag
Metric 1
• Item Level (Critical Items)
• Percentage of records with reported values whose value was changed by editing/imputation:

$$\text{METRIC 1} = \frac{\sum_{i=1}^{n} y_i}{n} \times 100$$

• Where: $y_i = 1$ if reported value $\neq$ final value, $0$ otherwise; and $n$ = number of records
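The Metric 1 calculation can be sketched in a few lines of Python. This is only an illustration: the function name `metric_1` and the sample values are hypothetical, and the sketch assumes the reported and final values arrive as two aligned sequences covering only records that have a reported value.

```python
from typing import Sequence


def metric_1(reported: Sequence[float], final: Sequence[float]) -> float:
    """Percentage of records whose reported value differs from the final value."""
    if len(reported) != len(final):
        raise ValueError("reported and final must have the same length")
    n = len(reported)
    if n == 0:
        raise ValueError("at least one record is required")
    # y_i = 1 when the reported value differs from the final value, else 0.
    y = [1 if r != f else 0 for r, f in zip(reported, final)]
    return 100.0 * sum(y) / n


# Hypothetical data: 5 records with reported values, 2 of them changed.
reported_sales = [100.0, 250.0, 4000.0, 75.0, 60.0]
final_sales = [100.0, 250.0, 4.0, 75.0, 66.0]
print(metric_1(reported_sales, final_sales))  # 40.0
```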
Metric 2
• Item Level (Critical Items)
• The percentage of changes to the records with reported values that is attributable to analyst correction versus machine correction:

$$\text{METRIC 2A} = \frac{\sum_{i=1}^{n} a_i}{n} \times 100 \qquad \text{METRIC 2M} = \frac{\sum_{i=1}^{n} m_i}{n} \times 100$$

Where:
$a_i = 1$ if reported value $\neq$ final value and the source is analyst correction, $0$ otherwise
$m_i = 1$ if reported value $\neq$ final value and the source is machine correction, $0$ otherwise
$n$ = number of records
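A minimal Python sketch of Metric 2 follows. The flag codes `"A"` and `"M"` and the function name `metric_2` are hypothetical stand-ins for whatever data flags a given processing system records. Here n is simply the number of records passed in; restricting the input to changed records, as in the example, appears to correspond to the "Percent of Total" column in the result tables, though that reading is an assumption.

```python
from typing import Sequence, Tuple

ANALYST = "A"  # hypothetical flag code for an analyst correction
MACHINE = "M"  # hypothetical flag code for a machine correction


def metric_2(
    reported: Sequence[float],
    final: Sequence[float],
    source: Sequence[str],
) -> Tuple[float, float]:
    """Return (METRIC 2A, METRIC 2M) as percentages of the n records supplied."""
    if not (len(reported) == len(final) == len(source)):
        raise ValueError("all inputs must have the same length")
    n = len(reported)
    if n == 0:
        raise ValueError("at least one record is required")
    # a_i = 1 when the value changed and the source is an analyst correction.
    a = sum(1 for r, f, s in zip(reported, final, source) if r != f and s == ANALYST)
    # m_i = 1 when the value changed and the source is a machine correction.
    m = sum(1 for r, f, s in zip(reported, final, source) if r != f and s == MACHINE)
    return 100.0 * a / n, 100.0 * m / n


# Hypothetical data: 4 changed records, 3 analyst corrections and 1 machine correction.
reported = [100.0, 250.0, 4000.0, 75.0]
final = [110.0, 25.0, 4.0, 80.0]
source = ["A", "A", "M", "A"]
metric_2a, metric_2m = metric_2(reported, final, source)
print(metric_2a, metric_2m)  # 75.0 25.0
```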
Metric 3
• Item Level (Critical Items)
• The source of change of the reported data
• The size of change of the reported data
• The impact of the changes on the final tabulations
Metric 3: Tabular Format (Item Level)

| Source of Change (1) | Change Category (2) | No. of Records (3) | Tabulated (Weighted) Reported (4) | Tabulated (Weighted) Edited (5) | Percent Difference (6) | Sum of the Absolute Difference (7) | Average Absolute Difference (8) |
|---|---|---|---|---|---|---|---|
| Analyst Correction | 1.0 < R/E < 1.1 | n | x | y | (y-x)*100/x | z | z/n |
| | 1.1 ≤ R/E < 9 | | | | | | |
| | 9 ≤ R/E < 90 | | | | | | |
| | 90 ≤ R/E < 900 | | | | | | |
| | R/E ≥ 900 | | | | | | |
| No Change | R/E = 1 | | | | | | |
| Totals | | Total 3 | Total 4 | Total 5 | Percent Difference | | |
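The table layout above can be sketched as a small aggregation in Python. Several details are assumptions rather than anything stated on the slide: R/E is read as the ratio of the reported to the edited value, the interval boundaries are taken as inclusive on the lower end, the absolute difference in column (7) is weighted, and ratios below 1 (which the template does not show) fall into a hypothetical "Other" bucket. The `Record` type and the source labels are illustrative only.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Record(NamedTuple):
    source: str      # hypothetical source-of-change flag
    weight: float    # sampling weight
    reported: float  # reported value R
    edited: float    # final (edited) value E


def change_category(reported: float, edited: float) -> str:
    """Assign a record to a change category based on the ratio R/E."""
    if reported == edited:
        return "R/E = 1"
    if edited == 0:
        return "Other"  # ratio undefined; not covered by the template
    ratio = reported / edited
    if 1.0 < ratio < 1.1:
        return "1.0 < R/E < 1.1"
    if 1.1 <= ratio < 9:
        return "1.1 <= R/E < 9"
    if 9 <= ratio < 90:
        return "9 <= R/E < 90"
    if 90 <= ratio < 900:
        return "90 <= R/E < 900"
    if ratio >= 900:
        return "R/E >= 900"
    return "Other"  # ratios below 1; assumption, not shown in the template


def metric_3(records: Iterable[Record]) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Build one row per (source of change, change category)."""
    rows: Dict[Tuple[str, str], Dict[str, float]] = defaultdict(
        lambda: {"n": 0, "x": 0.0, "y": 0.0, "z": 0.0}
    )
    for rec in records:
        row = rows[(rec.source, change_category(rec.reported, rec.edited))]
        row["n"] += 1                                            # column (3)
        row["x"] += rec.weight * rec.reported                    # column (4)
        row["y"] += rec.weight * rec.edited                      # column (5)
        row["z"] += rec.weight * abs(rec.edited - rec.reported)  # column (7)
    for row in rows.values():
        # Column (6): (y - x) * 100 / x, left undefined when x is zero.
        row["pct_diff"] = (row["y"] - row["x"]) * 100 / row["x"] if row["x"] else float("nan")
        row["avg_abs_diff"] = row["z"] / row["n"]                # column (8)
    return dict(rows)


# Hypothetical records, including one reported at 1,000 times the edited value.
records = [
    Record("Analyst Correction", 2.0, 5000.0, 5.0),
    Record("Analyst Correction", 1.0, 105.0, 100.0),
    Record("No Change", 1.0, 50.0, 50.0),
]
table = metric_3(records)
for key, row in table.items():
    print(key, row)
```

In this sketch a single record in the "R/E >= 900" category dominates the weighted totals, which is the kind of effect the table is designed to expose.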
Metrics Applied to:
• Annual Wholesale Trade Survey (AWTS)
• Annual Survey of Manufactures (ASM)
Annual Wholesale Trade Survey (AWTS)
• Sample Survey
• Approximately 8,000 wholesale businesses
• Critical Items:
  – Sales
  – Total Purchases
  – Total Inventories
• Processed in the Standard Economic Processing System (StEPS)
AWTS Editing/Imputation
• StEPS Automatic Processing Flow
  – Simple Imputation Module: Data “clean up”
  – Edit Module: Identifies “suspicious” values
  – General Imputation Module: Replaces “suspicious” values
• Item Flagging
  – Can identify four distinct sources of change:
    • Analyst Correction
    • Analyst Impute
    • Machine Correction
    • No Change
• “Cycling” between analyst and machine corrections
Annual Survey of Manufactures (ASM)
• Sample Survey
• 55,000 establishments
• Critical Items:
  – Cost of Materials
  – Employment
  – Annual Payroll
  – Receipts
• Processed in the Economic Census System
  – Plain Vanilla Editing Module
ASM Editing/Imputation
• ASM Automatic Processing Flow
  – Pre-editing Module: Data filling and clean up
  – Plain Vanilla Edit Modules
    • Ratio (editing/imputation)
    • Balancing (editing/imputation)
• Item Flagging
  – Can identify three sources of change:
    • Analyst correction/impute (cannot distinguish)
    • Machine impute
    • No change
• “Cycling” between analyst and machine
Illustration of Metric 1: AWTS
– Relatively few of the reported values for each critical item changed.
– Changes to these records had a great impact on final tabulations.

| Critical Item | No. of Records with Reported Values | No. of Records Changed | Percent | Reported Amount (Weighted, in Millions) | Edited Amount (Weighted, in Millions) | Percent Difference |
|---|---|---|---|---|---|---|
| Sales | 4,819 | 238 | 4.9% | $41,156,147 | $2,156,439 | -93.9% |
| Purchases | 4,628 | 403 | 8.7% | $4,486,157 | $1,953,140 | -56.5% |
| Inventories | 4,334 | 326 | 7.5% | $21,392,659 | $256,920 | -98.8% |
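
As a worked check, assuming the Percent column is Metric 1 and the Percent Difference column follows the (y-x)*100/x form used in the Metric 3 template, the Sales and Purchases rows give:

```latex
\[
\text{Percent changed (Sales)} = \frac{238}{4{,}819} \times 100 \approx 4.9\%
\]
\[
\text{Percent difference (Purchases)} = \frac{1{,}953{,}140 - 4{,}486{,}157}{4{,}486{,}157} \times 100 \approx -56.5\%
\]
```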
Illustration of Metric 1: ASM
– Relatively few of the reported values for each critical item changed.
– Except for employment, changes to these records had a “small” impact on final tabulations

| Critical Item | No. of Records with Reported Values | No. of Records Changed | Percent | Reported Amount (Weighted, in Millions) | Edited Amount (Weighted, in Millions) | Percent Difference |
|---|---|---|---|---|---|---|
| Cost of Materials | 35,908 | 3,520 | 9.8% | $2,157 | $1,936 | -10.2% |
| Employment | 31,603 | 4,032 | 12.8% | 12 | 6 | -44.6% |
| Annual Payroll | 30,756 | 454 | 1.5% | $293 | $291 | -0.4% |
| Receipts | 38,074 | 4,157 | 10.9% | $4,320 | $3,647 | -15.6% |
Illustration of Metric 2: AWTS

| Critical Item | Source of Change | No. Records Changed | Percent of Total | Average Absolute Difference Between Reported and Edited Amount (In Millions) | Ratio of AC to AI and AC to MI |
|---|---|---|---|---|---|
| Sales | AC | 221 | 92.9% | $175,545 | ----- |
| | AI | 15 | 6.3% | $653 | 269/1 |
| | MI | 2 | 0.8% | $55 | 3150/1 |
| Purchases | AC | 363 | 90.1% | $7,404 | ----- |
| | AI | 37 | 9.2% | $289 | 26/1 |
| | MI | 3 | 0.7% | $79 | 94/1 |
| Inventories | AC | 285 | 87.4% | $74,196 | ----- |
| | AI | 15 | 4.6% | $73 | 1011/1 |
| | MI | 26 | 8.0% | $39 | 1914/1 |

AC = Analyst Correction; AI = Analyst Impute; MI = Machine Impute
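
The ratio column appears to compare the average absolute difference for analyst corrections with that for each impute source. The short sketch below checks this reading against the Purchases row; the interpretation is an assumption, and the variable names are illustrative. Some other rows agree only approximately with this calculation, so the published ratios were presumably computed from unrounded values.

```python
# Average absolute differences for Purchases, as shown in the AWTS table.
avg_abs_diff = {"AC": 7404, "AI": 289, "MI": 79}

# Assumed reading: ratio = average absolute difference for AC divided by
# the average absolute difference for the impute source.
ratio_ac_to_ai = avg_abs_diff["AC"] / avg_abs_diff["AI"]
ratio_ac_to_mi = avg_abs_diff["AC"] / avg_abs_diff["MI"]

print(f"AC to AI: {round(ratio_ac_to_ai)}/1")  # AC to AI: 26/1
print(f"AC to MI: {round(ratio_ac_to_mi)}/1")  # AC to MI: 94/1
```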
Illustration of Metric 2: ASM

| Critical Item | Source of Change | No. Records Changed | Percent of Total | Average Absolute Difference | Ratio of AC to MI |
|---|---|---|---|---|---|
| Cost of Materials | AC | 940 | 26.7% | $188,435 | -- |
| | MI | 2,580 | 73.3% | $63,468 | 3/1 |
| Receipts | AC | 2,723 | 65.5% | $270,852 | -- |
| | MI | 1,434 | 34.5% | $74,262 | 4/1 |

AC = Analyst Correction; MI = Machine Impute
Key Findings With Metric 3: AWTS
• Analyst corrections accounted for the majority of the changes to all three critical items
• Correction of “rounding” errors
  – Corrected by analysts
  – Most substantive impact on tabulations
  – Relatively few records
Key Findings Metric 3: ASM
• A high percentage of changes to reported data fell into the “small change” categories.
  – For Cost of Materials, machine imputes made the majority of these small changes (74.7 percent).
  – For Receipts, analysts made the majority of these changes (68.4 percent).
• Correction of “rounding” errors:
  – Corrected equally by analyst and machine
  – Most substantive impact on tabulations
  – Relatively few records
Study Highlights/Key Findings
• Importance of rounding errors:
  – Small number of cases
  – Resolved generally by analysts in AWTS
  – Resolved by analysts and machine in ASM
• Large proportion of small changes in ASM:
  – Identified potential edit parameter problems
Advantages of Standardized Metrics
• Allowed for direct comparisons between different programs.
• Uncovered different areas of investigation in different programs.
• Facilitated “buy-in” from all parties via development process.
• Provides baseline measures for future investigation.
Future Research
• Apply metrics at various processing stages (AWTS).
• Apply metrics at industry level.
• Examine the number of times the records are subjected to changes.