1Spatial Source Data Validation Analysis

Upload: joanna-pitts

Post on 31-Dec-2015


Highway Performance Monitoring System (HPMS)

• Field manual analysis describing how data is collected and submitted

• Cooperation between:
  – State Highway Agencies (SHAs)
  – Local Governments
  – Metropolitan Planning Organisations (MPOs)

• Understanding of original and target data models and quality

Understanding HPMS

Background

• Utilise the State-submitted data

• Focus on two data submission components:
  – Routes: a linear referencing system (LRS) representing the State's road network
  – Sections: the State's HPMS attribute dataset

• Data quality issues arise from:
  – Decoupling of spatial Routes and attribute Sections during collection
  – State-by-State business processes for data collection
  – Varying degrees of data quality between States

Preliminary Analysis Project

Objectives

• To better understand HPMS

• To identify the potential workflow to ingest, transform and quality assure incoming data

• To assess the initial viability of a cloud-based solution

• To make an initial assessment of the potential for FHWA to:
  – Reduce the processing overhead of incoming source data
  – Reduce the development requirement for data validation
  – Have control of the validation rules in a flexible environment

Parameters

• Supplied with data from two States:
  – Ohio & Arizona

• Two different data types:
  – Routes & Sections

• Data did not contain SAMPLE_PANEL_ID
  – No conformance analysis could be performed against this criterion

• The data model and validation criteria were extracted from the HPMS field manual

Potential Workflow

Ingest Validation Service

• Ingest the raw data into the workflow
• Report on:
  – HPMS model and attribute conformance
  – LRS validity
  – Table Foreign Key conformance

Spatial & Attribute Validation Rule Service

• Generate topology on the ROUTES geometry to spatially articulate the recorded SECTIONS content
• Produce rules which articulate constraints and logic that go beyond simple model definitions
• Report on underlying data validity and model conformance both spatially and a-spatially

Ingest Validation Service Preliminary Results

SECTIONS data conforms to the HPMS data model, with one exception:

• 0.2% of the values reported against COUNTER_PEAK_LANES are returned as 0
• 15 non-conformances

All non-conformances are from Ohio.


LRS ROUTES data is not well-formed (valid):

• Duplicate measure (m) values within the same line segment

• Geometry problems (zero length, multiple nodes, etc.)

The problems were most severe in the Arizona data, which made it difficult to extract the geometry for each SECTION segment.
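The checks listed above could be sketched roughly as follows. This is a minimal illustration, not the service's actual implementation: the function name `lrs_issues`, the `(x, y, m)` vertex layout, the tolerance and the reading of "multiple nodes" as coincident consecutive vertices are all assumptions.

```python
from math import hypot

def lrs_issues(vertices, tol=1e-9):
    """Return a list of issue descriptions for one route polyline.

    `vertices` is a list of (x, y, m) tuples in route order (assumed layout).
    """
    issues = []

    # Total 2D length of the polyline; zero means a degenerate geometry.
    length = sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(vertices, vertices[1:])
    )
    if len(vertices) < 2 or length <= tol:
        issues.append("zero-length geometry")

    for i, (a, b) in enumerate(zip(vertices, vertices[1:])):
        # Two consecutive vertices carrying the same measure value.
        if abs(b[2] - a[2]) <= tol:
            issues.append(f"repeated measure between vertices {i} and {i + 1}")
        # Two consecutive vertices at the same location (assumed meaning
        # of "multiple nodes").
        if hypot(b[0] - a[0], b[1] - a[1]) <= tol:
            issues.append(f"coincident nodes at vertices {i} and {i + 1}")
    return issues
```

A route that returns an empty list passes these particular checks; any other result would be reported before SECTION geometry is derived from it.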


Foreign Key joins between the SECTIONS and ROUTES table:

• For Arizona, 31.64% of the foreign key values (YEAR_RECORD, STATE_CODE and ROUTE_ID) reported in the SECTIONS table could not be found in the ROUTES table.

This means that observations have been recorded against road sections, but the polyline associated with those observations cannot be identified.

Ohio reported 100% conformance.
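As a rough sketch of the foreign key check, each SECTIONS row can be tested against the set of composite keys present in ROUTES. The field names come from the slides; the function names and the sample rows are made up for illustration and are not taken from the Ohio or Arizona submissions.

```python
def fk_key(row):
    """Composite key shared by SECTIONS and ROUTES rows."""
    return (row["YEAR_RECORD"], row["STATE_CODE"], row["ROUTE_ID"])

def unmatched_sections(sections, routes):
    """SECTIONS rows whose key has no counterpart in ROUTES."""
    route_keys = {fk_key(r) for r in routes}
    return [s for s in sections if fk_key(s) not in route_keys]

def unmatched_percent(sections, routes):
    """Percentage of SECTIONS rows that cannot be joined to ROUTES."""
    if not sections:
        return 0.0
    return 100.0 * len(unmatched_sections(sections, routes)) / len(sections)

# Made-up rows for illustration only.
routes = [{"YEAR_RECORD": 2013, "STATE_CODE": 4, "ROUTE_ID": "R1"}]
sections = [
    {"YEAR_RECORD": 2013, "STATE_CODE": 4, "ROUTE_ID": "R1"},
    {"YEAR_RECORD": 2013, "STATE_CODE": 4, "ROUTE_ID": "R2"},
]
print(unmatched_percent(sections, routes))
```

Building the key set once keeps the check linear in the number of rows, which matters for tables of the size reported here.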


Post-Ingest Processing

• Generate geometry for each SECTION based on the BEGIN_POINT and END_POINT and the corresponding geometry line in ROUTES.

• Generate topology on the new SECTIONS content to spatially articulate content embedded in multiple SECTION observations.
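The first step, deriving a SECTION geometry from BEGIN_POINT, END_POINT and the matching ROUTES line, amounts to clipping a measured polyline between two measure values. The sketch below assumes vertices are `(x, y, m)` tuples with strictly increasing measures and uses linear interpolation between vertices; the function names are illustrative, not part of any product API.

```python
def interpolate(a, b, m):
    """Point at measure m on the segment a -> b; vertices are (x, y, m)."""
    t = (m - a[2]) / (b[2] - a[2])
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]), m)

def section_geometry(route, begin_point, end_point):
    """Clip a route polyline to the measure range [begin_point, end_point].

    Assumes measures increase strictly along `route`.
    """
    if not (route[0][2] <= begin_point < end_point <= route[-1][2]):
        raise ValueError("section measures fall outside the route")
    out = []
    for a, b in zip(route, route[1:]):
        if b[2] <= begin_point or a[2] >= end_point:
            continue  # segment lies wholly outside the section
        start = a if a[2] >= begin_point else interpolate(a, b, begin_point)
        end = b if b[2] <= end_point else interpolate(a, b, end_point)
        if not out or out[-1] != start:
            out.append(start)
        out.append(end)
    return out

# Made-up route: 10 units east, then 10 units north.
route = [(0.0, 0.0, 0.0), (10.0, 0.0, 10.0), (10.0, 10.0, 20.0)]
print(section_geometry(route, 5.0, 15.0))
```

The strict-ordering assumption is the reason route measures need to be validated before this step is attempted.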

Spatial & Attribute Validation Rule Service

The following validation checks were suggested:

• Coverage
  – CRACKING_PERCENT: Surface_Type must be in (2,3,4,5,6,7,8,9,10) and Sample
  – This has been done, but augmented with a RUTTING rule as no valid entries were in the data

• Cross
  – Cross, AADT_Single_Unit, AADT_Single_Unit < AADT/2.5

• LRS
  – LRS, Route ID Not Found, Section data references a Route ID that does not exist in the LRS file


The following validation checks were utilised:

1. All SECTIONS have a matching YEAR_RECORD, STATE_CODE and ROUTE_ID in ROUTES.

2. For all records in SECTIONS which have AADT_SINGLE_UNIT then AADT/2.5 > AADT_SINGLE_UNIT should be TRUE.

3. For all records in SECTIONS which have STRUCTURE_TYPE then FACILITY_TYPE >= 1 AND FACILITY_TYPE <= 4 should be TRUE.

4. For all records in SECTIONS which have anything in CURVES_AtoF then the SUM of CURVES_AtoF equals the length of the section (END_POINT - BEGIN_POINT) should be TRUE.

5. For all records in SECTIONS which have anything in GRADES_AtoF then the SUM of GRADES_AtoF equals the length of the section (END_POINT - BEGIN_POINT) should be TRUE.


6. For all records in SECTIONS which have PEAK_LANES then PEAK_LANES + COUNTER_PEAK_LANES >= THROUGH_LANES should be TRUE.

7. For all instances of CRACKING_PERCENT, the SECTION with the CRACKING_PERCENT value should have a SURFACE_TYPE of 2-10 or, if there is no SURFACE_TYPE value in that section, it must lie within another SECTION with a SURFACE_TYPE of 2-10.

8. For all instances of RUTTING, the SECTION with the RUTTING value should have a SURFACE_TYPE of 2, 6, 7 or 8 or, if there is no SURFACE_TYPE value in that section, it must lie within another SECTION with a SURFACE_TYPE of 2, 6, 7 or 8 (calculated using topology).

9. For all instances of RUTTING, the SECTION with the RUTTING value should have a SURFACE_TYPE of 2, 6, 7 or 8 or, if there is no SURFACE_TYPE value in that section, it must lie within another SECTION with a SURFACE_TYPE of 2, 6, 7 or 8 (calculated without topology).
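The attribute rules above (2, 4 and 6 are shown) can be sketched as simple predicates over a SECTIONS record. This is an illustrative sketch only: records are assumed to be dictionaries, CURVES_AtoF is assumed to expand to fields CURVES_A to CURVES_F, a missing COUNTER_PEAK_LANES is assumed to count as 0, and the length tolerance is arbitrary. Each predicate returns None when the rule does not apply to the record.

```python
import math

# Assumed expansion of CURVES_AtoF into six fields.
CURVE_FIELDS = [f"CURVES_{c}" for c in "ABCDEF"]

def rule_2(rec):
    """AADT / 2.5 > AADT_SINGLE_UNIT when AADT_SINGLE_UNIT is present."""
    if rec.get("AADT_SINGLE_UNIT") is None:
        return None
    return rec["AADT"] / 2.5 > rec["AADT_SINGLE_UNIT"]

def rule_4(rec, tol=1e-3):
    """Sum of the curve values equals END_POINT - BEGIN_POINT."""
    values = [rec[f] for f in CURVE_FIELDS if rec.get(f) is not None]
    if not values:
        return None
    length = rec["END_POINT"] - rec["BEGIN_POINT"]
    return math.isclose(sum(values), length, abs_tol=tol)

def rule_6(rec):
    """PEAK_LANES + COUNTER_PEAK_LANES >= THROUGH_LANES."""
    if rec.get("PEAK_LANES") is None:
        return None
    counter = rec.get("COUNTER_PEAK_LANES") or 0  # assumption: missing means 0
    return rec["PEAK_LANES"] + counter >= rec["THROUGH_LANES"]

def summarise(records, rule):
    """Return (records checked, non-conformances) for one rule."""
    results = [r for r in map(rule, records) if r is not None]
    return len(results), results.count(False)
```

Keeping "not applicable" separate from "failed" means the records-checked count only includes records the rule actually tested.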


Spatial & Attribute Validation Rule Service Results

Rule  Checking  Dataset  Records Checked  Non-Conformances  Conformance (%)
1     Model     All               354442             82228            76.80
1     Model     39                 94592                 0           100.00
1     Model     4                 259850             82228            68.36
2     Model     All                 9453              3520            62.76
2     Model     39                  7957              3481            56.25
2     Model     4                   1496                39            97.39
3     Model     All                    0                 0           100.00
3     Model     39                     0                 0           100.00
3     Model     4                      0                 0           100.00
4     Model     All                 3558               568            84.04
4     Model     39                  1415                78            94.49
4     Model     4                   2143               490            77.13
5     Model     All                 3474              1271            63.41
5     Model     39                  1415               282            80.07
5     Model     4                   2059               989            51.97
6     Model     All                29657             25264            14.81
6     Model     39                 21052             20267             3.73
6     Model     4                   8605              4997            41.93
7     Spatial   All               248238                 0           100.00
7     Spatial   39                 61447                 0           100.00
7     Spatial   4                 186837                 0           100.00
8     Spatial   All                 4391               126            97.13
8     Spatial   39                  3940                 1            99.97
8     Spatial   4                    452               125            72.35
9     Spatial   All               248238              1201            99.52
9     Spatial   39                 61447                 1           100.00
9     Spatial   4                 186837              1201            99.36

Dataset codes: 4 = Arizona, 39 = Ohio, All = combined
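The conformance percentage appears to be the share of checked records with no non-conformance, rounded to two decimal places. That formula is inferred from the figures rather than stated in the slides, and the function name is illustrative. A short sketch that reproduces several rows:

```python
def conformance_percent(records_checked, non_conformances):
    """Percentage of checked records that conform, to two decimal places."""
    if records_checked == 0:
        # The table reports 100.00 where no records were checked (rule 3).
        return 100.0
    conforming = records_checked - non_conformances
    return round(100.0 * conforming / records_checked, 2)

print(conformance_percent(354442, 82228))  # rule 1, All
print(conformance_percent(21052, 20267))   # rule 6, Ohio
print(conformance_percent(2059, 989))      # rule 5, Arizona
```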

Rule 2 - attribute
For all records in SECTIONS which have AADT_SINGLE_UNIT then AADT/2.5 > AADT_SINGLE_UNIT should be TRUE.

Rule 5 - attribute
For all records in SECTIONS which have anything in GRADES_AtoF then the SUM of GRADES_AtoF equals the length of the section (END_POINT - BEGIN_POINT) should be TRUE.

Rule 8 - topology
For all instances of RUTTING, the SECTION with the RUTTING value should have a SURFACE_TYPE of 2, 6, 7 or 8 or, if there is no SURFACE_TYPE value in that section, it must lie within another SECTION with a SURFACE_TYPE of 2, 6, 7 or 8.

Conclusions

The problems encountered in the pilot were caused by ingesting data that was not 'well-formed'.

This was particularly relevant for the LRS ROUTES data supplied by Arizona. Although it could be read by a GIS, it contained embedded non-conformities, so the data was not well-formed and could not be processed easily. Essentially, it was not valid LRS data.

The ingest validation service is designed to ensure that downstream processing, analysis and rule extraction are conducted on well-formed data, so that reporting can accurately inform decision making.
