Automated Classification Based on Video Data at Intersections with Heavy Pedestrian and Bicycle Traffic Sohail Zangenehpour Luis Miranda-Moreno Nicolas Saunier Department of Civil Engineering and Applied Mechanics McGill University ITS Canada, Annual Conference and General Meeting June 1-4, 2014


Automated Classification Based on Video Data

at Intersections with

Heavy Pedestrian and Bicycle Traffic

Sohail Zangenehpour Luis Miranda-Moreno Nicolas Saunier

Department of Civil Engineering and Applied Mechanics

McGill University

ITS Canada, Annual Conference and General Meeting

June 1-4, 2014


Introduction

• Two main approaches for studying road safety:

– Traditional crash and injury data

– Surrogate measures

• Problems with crash data:

– Small sample sizes over short observation periods

– Lack of detail on the causes of accidents

– A significant number of crashes must be recorded before action can be taken

• Detecting and treating safety deficiencies before they cause accidents → using surrogate measures

• Examples of surrogate measures:

– Time To Collision (TTC)

– Post Encroachment Time (PET)


• Shortcomings in the availability and quality of data for non-motorized modes

• Few automated methods for collecting microscopic data separately for different road users

• Low accuracy of classification for pedestrians and cyclists

• Problems with classifying pedestrians and cyclists:

– Non-rigidity

– Varied appearance

– Less organized movements

– Moving in groups close to each other

• The main objective of this work: design an automated method to track and classify objects in video


Object Classification

Training Step

Training pipeline (flowchart): Ordinary video camera → Video → Traffic Intelligence (Tracker) → Trajectory & Image Boxes of all Moving Objects → Manual Classification → Dataset of Training Images → Resizing & HOG → HOG Feature Descriptors → Train the SVM Model


1- Individual pixels (features) are detected and tracked frame to frame

2- Features are grouped based on consistent common motion to make moving objects
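The grouping step above can be illustrated with a minimal sketch. It is not the Traffic Intelligence algorithm: the function name `group_features`, the two pixel thresholds, and the pairwise union-find grouping are all assumptions chosen to show the idea of merging features that stay close and move alike.

```python
from itertools import combinations


def group_features(tracks, max_dist=30.0, max_motion_diff=2.0):
    """Group feature tracks that stay close together and move alike.

    tracks maps a feature id to a list of (x, y) positions, one per frame,
    all of the same length. Both thresholds are hypothetical, in pixels.
    """
    parent = {fid: fid for fid in tracks}

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def displacement(track):
        (x0, y0), (x1, y1) = track[0], track[-1]
        return (x1 - x0, y1 - y0)

    for a, b in combinations(tracks, 2):
        # Largest distance between the two features over the frames.
        gap = max(((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5
                  for (xa, ya), (xb, yb) in zip(tracks[a], tracks[b]))
        da, db = displacement(tracks[a]), displacement(tracks[b])
        motion_diff = ((da[0] - db[0]) ** 2 + (da[1] - db[1]) ** 2) ** 0.5
        if gap <= max_dist and motion_diff <= max_motion_diff:
            parent[find(a)] = find(b)

    groups = {}
    for fid in tracks:
        groups.setdefault(find(fid), []).append(fid)
    return sorted(sorted(g) for g in groups.values())


example_tracks = {
    1: [(0, 0), (5, 0)],
    2: [(3, 1), (8, 1)],        # close to feature 1, same motion
    3: [(100, 50), (100, 45)],  # far away, different motion
}
print(group_features(example_tracks))
```

Each resulting group stands for one moving object; a real tracker would also handle features that appear and disappear over time.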


1500 manually classified sample images (training set) for each class: Pedestrian, Cyclist, Vehicle

[Figure: sample images of each class with their HOG vectors]


Dimension = 1764

Dimension = 2

Object Classification

Prediction

Prediction pipeline (flowchart): Video → Traffic Intelligence (Tracker) → Trajectory & Image Boxes of all Moving Objects → Resizing & HOG → HOG Feature Descriptors → Trained SVM Model → Integrate Speed → Predict the Class

Tracking and HOG feature extraction: the same as in the training step


For each frame, the most probable class (one prediction per frame):

1- Pedestrian

2- Cyclist

3- Vehicle

$$P(\text{pedestrian} \mid \text{appearance}) = \frac{\#\text{ of frames as pedestrian}}{\#\text{ of frames}}$$

$$P(\text{cyclist} \mid \text{appearance}) = \frac{\#\text{ of frames as cyclist}}{\#\text{ of frames}}$$

$$P(\text{vehicle} \mid \text{appearance}) = \frac{\#\text{ of frames as vehicle}}{\#\text{ of frames}}$$
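The per-frame voting above can be sketched as follows. The function name `appearance_probabilities` and the label strings are hypothetical; the only assumption is that the SVM has already produced one class label per frame for a tracked object.

```python
from collections import Counter

CLASSES = ("pedestrian", "cyclist", "vehicle")


def appearance_probabilities(frame_labels):
    """Return P(class | appearance) as the share of frames voted for each class."""
    if not frame_labels:
        raise ValueError("at least one frame prediction is required")
    counts = Counter(frame_labels)
    n_frames = len(frame_labels)
    return {c: counts[c] / n_frames for c in CLASSES}


# An object tracked over 10 frames: 6 cyclist votes, 3 pedestrian, 1 vehicle.
labels = ["cyclist"] * 6 + ["pedestrian"] * 3 + ["vehicle"]
probs = appearance_probabilities(labels)
print(probs)
```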

[Figure: P(Speed | Class) as a function of the median of speed for Pedestrian, Cyclist and Vehicle, with histograms of frequency vs. median of speed for pedestrians, cyclists and vehicles]

• Four methods for integrating speed:

1. Without using appearance: classification based only on speed, with two speed thresholds

[Figure: P(Speed | Class) vs. median of speed for Pedestrian, Cyclist and Vehicle; the two thresholds split the speed axis into pedestrian, cyclist and vehicle ranges]
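Method 1 can be sketched as a two-threshold rule on the median speed of a tracked object. The threshold values `pedestrian_max` and `cyclist_max` below are placeholders, not the values used in the study, and the speed unit is whatever the speed plot uses.

```python
import statistics


def classify_by_speed(frame_speeds, pedestrian_max=6.5, cyclist_max=14.5):
    """Classify a tracked object from the median of its per-frame speeds.

    Both thresholds are hypothetical example values.
    """
    median_speed = statistics.median(frame_speeds)
    if median_speed < pedestrian_max:
        return "pedestrian"
    if median_speed < cyclist_max:
        return "cyclist"
    return "vehicle"


print(classify_by_speed([4, 5, 6]))     # median 5
print(classify_by_speed([9, 12, 30]))   # median 12
print(classify_by_speed([20, 25, 40]))  # median 25
```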

2. Without using speed, based only on appearance:

Predicted class is the class with maximum P(class | appearance), where for each class $c$ (pedestrian, cyclist, vehicle)

$$P(c \mid \text{appearance}) = \frac{\#\text{ of frames as } c}{\#\text{ of frames}}$$

3. Using speed thresholds for switching between different SVM models:

– Is the speed of the tracked object lower than the threshold for pedestrian speed? Yes → Three Class HOG-SVM (Pedestrian, Cyclist, Vehicle)

– No → Is the speed of the tracked object lower than the threshold for cyclist speed? Yes → Two Class HOG-SVM (Cyclist, Vehicle)

– No → The object is a Vehicle
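The switching logic of method 3 can be written as a short cascade. This is a sketch under assumptions: the two classifiers are passed in as plain callables (stand-ins for trained HOG-SVM models), and the threshold values in the usage lines are invented for illustration.

```python
def classify_with_switching(median_speed, hog_vector,
                            three_class_svm, two_class_svm,
                            pedestrian_max, cyclist_max):
    """Pick the SVM model to apply according to the object's median speed."""
    if median_speed < pedestrian_max:
        # Slow object: any of the three classes is possible.
        return three_class_svm(hog_vector)
    if median_speed < cyclist_max:
        # Too fast for a pedestrian: cyclist or vehicle only.
        return two_class_svm(hog_vector)
    # Faster than the cyclist threshold.
    return "vehicle"


# Stand-in classifiers (hypothetical); real ones would be trained SVMs.
def fake_three_class(hog_vector):
    return "pedestrian"


def fake_two_class(hog_vector):
    return "cyclist"


for speed in (3.0, 10.0, 30.0):
    print(speed, classify_with_switching(speed, [0.0] * 1764,
                                         fake_three_class, fake_two_class,
                                         pedestrian_max=6.5, cyclist_max=14.5))
```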

4. Combining the probability taken from appearance with the probability taken from speed:

$$P(\text{Class} \mid \text{Speed}, \text{Appearance}) \propto P(\text{Class} \mid \text{Appearance})\, P(\text{Speed} \mid \text{Class})$$

[Figure: P(Speed | Class) vs. median of speed for Pedestrian, Cyclist and Vehicle]

Predicted class is the class with the highest P(Class | Speed, Appearance)
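A minimal sketch of method 4 follows. The speed likelihoods are modelled with normal and log-normal densities, but which class uses which distribution, and every parameter in `SPEED_MODELS`, are assumptions made for illustration only.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def lognormal_pdf(x, mu, sigma):
    if x <= 0:
        return 0.0
    return (math.exp(-((math.log(x) - mu) ** 2) / (2 * sigma ** 2))
            / (x * sigma * math.sqrt(2 * math.pi)))


# Hypothetical P(Speed | Class) models; parameters are NOT from the study.
SPEED_MODELS = {
    "pedestrian": lambda s: lognormal_pdf(s, math.log(5.0), 0.3),
    "cyclist": lambda s: lognormal_pdf(s, math.log(15.0), 0.4),
    "vehicle": lambda s: normal_pdf(s, 25.0, 10.0),
}


def combine(p_appearance, median_speed, speed_models=SPEED_MODELS):
    """Return (predicted class, normalized posterior) for one tracked object."""
    scores = {c: p_appearance[c] * speed_models[c](median_speed) for c in p_appearance}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all classes have zero probability")
    posterior = {c: s / total for c, s in scores.items()}
    return max(posterior, key=posterior.get), posterior


# Appearance slightly favours "pedestrian", but the speed says otherwise.
label, posterior = combine({"pedestrian": 0.5, "cyclist": 0.4, "vehicle": 0.1}, 15.0)
print(label, posterior)
```

Normalizing the scores is optional for picking the maximum; it only makes the output readable as probabilities.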


Object Classification

Accuracy

Confusion matrices (rows: predicted class, columns: ground truth)

Classifier I (accuracy: 72.4 %)

Predicted    Pedestrian     Bike   Vehicle    Total   Precision
Pedestrian          946       86       277     1309      72.3 %
Bike                 77      324       793     1194      27.1 %
Vehicle               0       78      2175     2253      96.5 %
Total              1023      488      3245     4756
Recall           92.5 %   66.4 %    67.0 %

Classifier II (accuracy: 75.9 %)

Predicted    Pedestrian     Bike   Vehicle    Total   Precision
Pedestrian          742      191       584     1517      48.9 %
Bike                121      244        37      402      60.7 %
Vehicle             160       53      2624     2837      92.5 %
Total              1023      488      3245     4756
Recall           72.5 %   50.0 %    80.9 %

Classifier III (accuracy: 86.3 %)

Predicted    Pedestrian     Bike   Vehicle    Total   Precision
Pedestrian          726       43        64      833      87.2 %
Bike                131      373       177      681      54.8 %
Vehicle             166       72      3004     3242      92.7 %
Total              1023      488      3245     4756
Recall           71.0 %   76.4 %    92.6 %

Classifier IV (accuracy: 88.5 %)

Predicted    Pedestrian     Bike   Vehicle    Total   Precision
Pedestrian          969       53       180     1202      80.6 %
Bike                 42      371       198      611      60.7 %
Vehicle              12       64      2867     2943      97.4 %
Total              1023      488      3245     4756
Recall           94.7 %   76.0 %    88.4 %


Case Studies on Cyclist Safety


1st Study: Cycle Track

[Figure: study sites without cycle track and with cycle track (Saint-Urbain / Mont-Royal and Saint-Urbain / Pins)]

Measure                          Without bicycle facility   With bicycle facility
Hours of video                                       2.57                    3.88
Cyclists                                              119                     438
Right-turning vehicles                                263                     622
Average cyclist speed                                11.8                    15.2
Average vehicle speed                                12.3                    13.7
TTC15 < 5 seconds                                       4                      13
TTC15 < 1.5 seconds                                     2                       4
PET < 5 seconds                                        37                     161
PET < 1.5 seconds                                       2                      10
TTC conflict rate*                                    328                     185
TTC dangerous conflict rate*                          164                      57
PET conflict rate*                                   3038                    2293
PET dangerous conflict rate*                          164                     142

[Figure: conflict rate and cumulative conflict rate as a function of PET (seconds, 0 to 5), without and with bicycle facility]

* Conflict rate:

$$\text{Conflict Rate} = \frac{(\text{Frequency of Conflicts, per Hour}) \times 10^{6}}{(\text{Tracked Cyclists, per Hour}) \times (\text{Tracked Vehicles, per Hour})}$$
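As a check on the conflict rate definition, the sketch below recomputes the starred table values from the raw counts. The function name `conflict_rate` is hypothetical, and it assumes the "Cyclists" and "Right-turning vehicles" rows are the tracked counts that enter the formula.

```python
def conflict_rate(conflicts, cyclists, vehicles, hours):
    """Conflicts per hour, scaled by 1e6, per (cyclists/hour x vehicles/hour)."""
    conflicts_per_hour = conflicts / hours
    cyclists_per_hour = cyclists / hours
    vehicles_per_hour = vehicles / hours
    return conflicts_per_hour * 1e6 / (cyclists_per_hour * vehicles_per_hour)


# Without bicycle facility: 2.57 h, 119 cyclists, 263 right-turning vehicles.
ttc_without = conflict_rate(4, 119, 263, 2.57)
pet_without = conflict_rate(37, 119, 263, 2.57)
# With bicycle facility: 3.88 h, 438 cyclists, 622 right-turning vehicles.
ttc_with = conflict_rate(13, 438, 622, 3.88)
pet_with = conflict_rate(161, 438, 622, 3.88)

print(round(ttc_without), round(pet_without), round(ttc_with), round(pet_with))
```

The raw number of conflicts is higher with the bicycle facility, but so is the exposure (cyclists and vehicles per hour), which is why the rates come out lower.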

2nd Study: Bicycle Box

11.7h of video for intersections without bicycle box (3 intersections)

10.1h of video for intersections with bicycle box (2 intersections)

Two types of conflicts:

• Conflict Type 1: Cyclist (green) with Vehicle (red)

• Conflict Type 2: Cyclist (green) with Vehicle (blue)

Modelling conflicts with a logit model; variables considered:

• Number of lanes

• Red and green times

• Land use

• Presence of bicycle box

• Any other bicycle facility

• Traffic flow of cyclists (30s before)

• Traffic flow of vehicles (30s before)

Logit model results. Type 1: Green - Red; Type 2: Green - Blue. Conflict: PET < 5s; dangerous conflict: PET < 1.5s. Cells with two values give the coefficient with its p-value in parentheses.

Variable                                  Type 1 conflict     Type 1 dangerous    Type 2 conflict     Type 2 dangerous
Constant                                  -2.99 (0.00)        -4.35 (0.00)        -0.56 (0.00)        -1.95 (0.00)
Cyclist Flow (green) passing 30s before   -                   -                   0.4230 (0.00)       0.4340 (0.00)
Vehicle Flow 1 (red) passing 30s before   0.1170 (0.00)       0.0970 (0.00)       -0.0857 (0.00)      -0.0823 (0.01)
Vehicle Flow 2 (blue) passing 30s before  0.0628 (0.00)       -                   0.0908 (0.00)       0.0399 (0.04)
Presence of Bicycle Box                   -0.726 (0.00)       -2.050 (0.00)       -0.739 (0.00)       -1.230 (0.00)
Number of total observations              1074                1074                1074                1074
Number of positive observations           103                 14                  291                 79
Final log-likelihood                      -299.85             -66.44              -544.00             -251.48
Constant log-likelihood                   -339.37             -74.67              -627.43             -282.19
Adjusted Rho²                             0.592               0.907               0.263               0.655
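To show how such coefficients are read, here is a sketch that turns the Type 1 conflict (PET < 5s) coefficients into a probability. It assumes the usual binary logit form P = 1 / (1 + exp(-utility)); the flow values passed in are invented inputs, and the variable names are hypothetical.

```python
import math

# Type 1 conflict (PET < 5s) coefficients from the results table.
COEF_TYPE1_CONFLICT = {
    "vehicle_flow_red": 0.1170,
    "vehicle_flow_blue": 0.0628,
    "bicycle_box": -0.726,
}
CONSTANT_TYPE1_CONFLICT = -2.99


def conflict_probability(**x):
    """Binary logit probability for the given explanatory variable values."""
    utility = CONSTANT_TYPE1_CONFLICT + sum(
        COEF_TYPE1_CONFLICT[name] * value for name, value in x.items())
    return 1.0 / (1.0 + math.exp(-utility))


# Invented example: 5 vehicles in each stream during the 30 s before.
p_without = conflict_probability(vehicle_flow_red=5, vehicle_flow_blue=5, bicycle_box=0)
p_with = conflict_probability(vehicle_flow_red=5, vehicle_flow_blue=5, bicycle_box=1)

odds_ratio = (p_with / (1 - p_with)) / (p_without / (1 - p_without))
print(p_without, p_with, odds_ratio)
```

Under this form, the bicycle box coefficient multiplies the odds of a conflict by exp(-0.726), roughly 0.48, whatever the flows are.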


Thank You!


Bayes’ Rule

$$P(\text{Class} \mid \text{Speed}, \text{Appearance}) = \frac{P(\text{Class})\, P(\text{Speed}, \text{Appearance} \mid \text{Class})}{P(\text{Speed}, \text{Appearance})}$$

Treating speed and appearance as independent:

$$P(\text{Class} \mid \text{Speed}, \text{Appearance}) = \frac{P(\text{Class})\, P(\text{Speed} \mid \text{Class})\, P(\text{Appearance} \mid \text{Class})}{P(\text{Speed})\, P(\text{Appearance})}$$

$$P(\text{Appearance} \mid \text{Class})\, P(\text{Class}) = P(\text{Class} \mid \text{Appearance})\, P(\text{Appearance})$$

$$P(\text{Class} \mid \text{Speed}, \text{Appearance}) = \frac{P(\text{Class} \mid \text{Appearance})\, P(\text{Speed} \mid \text{Class})}{P(\text{Speed})}$$

$$P(\text{Class} \mid \text{Speed}, \text{Appearance}) \propto P(\text{Class} \mid \text{Appearance})\, P(\text{Speed} \mid \text{Class})$$

[Figure: P(Speed | Class) vs. median of speed for Pedestrian, Cyclist and Vehicle]

Normal, Log-Normal Distribution

Normal:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Log-normal:

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln x-\mu)^2}{2\sigma^2}}$$

[Figure: histograms of frequency vs. median of speed for pedestrians, cyclists and vehicles]
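One way to obtain the parameters μ and σ of these distributions from observed median speeds is sketched below. The estimators (sample mean and sample standard deviation, taken on the logarithms for the log-normal case) are a common choice and an assumption here; the slides do not say how the distributions were fitted.

```python
import math
import statistics


def fit_normal(speeds):
    """Estimate (mu, sigma) of a normal distribution from a sample."""
    return statistics.fmean(speeds), statistics.stdev(speeds)


def fit_lognormal(speeds):
    """Estimate (mu, sigma) of a log-normal distribution: fit the log speeds."""
    logs = [math.log(s) for s in speeds]
    return statistics.fmean(logs), statistics.stdev(logs)


# Made-up samples, only to exercise the functions.
mu_n, sigma_n = fit_normal([10, 20, 30])
mu_ln, sigma_ln = fit_lognormal([math.exp(1), math.exp(2), math.exp(3)])
print(mu_n, sigma_n, mu_ln, sigma_ln)
```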

HOG

• Normalized image size: 64x64 pixels

• Number of pixels per cell: 8x8

• Number of cells per block: 2x2

• Number of orientations: 9

• Normalization over the blocks for each cell: $v \leftarrow \dfrac{v}{\lVert v \rVert_2 + \varepsilon}$

• Vector dimension: 49 x 4 x 9 = 1764

[Figure: cells of 8x8 pixels, blocks of 2x2 cells, and 9 orientation bins spanning 0 to 180 degrees]
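The vector dimension follows from the HOG parameters listed above. The sketch assumes blocks slide by one cell at a time, which is what the factor of 49 (7 x 7 blocks over 8 x 8 cells) implies; the function name `hog_dimension` is hypothetical.

```python
def hog_dimension(image_size=64, cell_size=8, block_size=2, orientations=9):
    """Length of the HOG vector for a square image with overlapping blocks."""
    cells_per_side = image_size // cell_size             # 64 / 8 = 8
    blocks_per_side = cells_per_side - block_size + 1    # stride of one cell: 7
    n_blocks = blocks_per_side ** 2                      # 49
    cells_per_block = block_size ** 2                    # 4
    return n_blocks * cells_per_block * orientations


print(hog_dimension())
```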


SVM

• Non-linear SVM

• Here we used the RBF (Radial Basis Function) kernel
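For reference, the RBF kernel measures similarity between two feature vectors as K(x, y) = exp(-γ ‖x − y‖²). The sketch below evaluates it directly; the value of `gamma` is a free parameter and the ones used here are arbitrary.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


print(rbf_kernel([0.0, 0.0], [0.0, 0.0], gamma=0.5))  # identical vectors
print(rbf_kernel([1.0, 0.0], [0.0, 0.0], gamma=1.0))  # distance 1
```

Identical vectors give a similarity of 1, and the value decays towards 0 as the vectors move apart, faster for larger `gamma`.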


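Recall – Precision - Accuracy

Confusion matrix for Classifier IV (rows: predicted class, columns: ground truth; accuracy: 88.5 %)

Predicted    Pedestrian     Bike   Vehicle    Total   Precision
Pedestrian          969       53       180     1202      80.6 %
Bike                 42      371       198      611      60.7 %
Vehicle              12       64      2867     2943      97.4 %
Total              1023      488      3245     4756
Recall           94.7 %   76.0 %    88.4 %

With $c_{ij}$ the number of objects of ground-truth class $i$ predicted as class $j$:

$$\text{Precision}_k = \frac{c_{kk}}{\sum_i c_{ik}}$$

$$\text{Recall}_k = \frac{c_{kk}}{\sum_j c_{kj}}$$

$$\text{Accuracy} = \frac{\sum_k c_{kk}}{\sum_i \sum_j c_{ij}}$$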
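The three measures can be recomputed from the Classifier IV counts as a check. Note that the list below is stored in the table's layout (rows are predicted classes, columns are ground truth), so precision uses row sums and recall uses column sums; the names `TABLE_IV` and `metrics` are hypothetical.

```python
CLASSES = ("pedestrian", "bike", "vehicle")

# Classifier IV, table layout: TABLE_IV[predicted][ground_truth].
TABLE_IV = [
    [969, 53, 180],
    [42, 371, 198],
    [12, 64, 2867],
]


def metrics(table):
    """Return (precision per class, recall per class, overall accuracy)."""
    n = len(table)
    total = sum(sum(row) for row in table)
    precision = [table[k][k] / sum(table[k]) for k in range(n)]
    recall = [table[k][k] / sum(row[k] for row in table) for k in range(n)]
    accuracy = sum(table[k][k] for k in range(n)) / total
    return precision, recall, accuracy


precision, recall, accuracy = metrics(TABLE_IV)
for name, p, r in zip(CLASSES, precision, recall):
    print(f"{name}: precision {100 * p:.1f} %, recall {100 * r:.1f} %")
print(f"accuracy {100 * accuracy:.1f} %")
```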


ROC

• True positive rate: true positives out of all the positives

• False positive rate: false positives out of all the negatives

• For example, for pedestrians (Classifier IV):

True Positive Rate (pedestrian) = Recall = 969 / 1023 = 94.7 %

False Positive Rate (pedestrian) = (53 + 180) / (488 + 3245) = 6.2 %
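The two rates can be computed for any class by treating it as "positive" and the other two classes as "negative". The sketch below does this for the Classifier IV counts; the helper name `roc_point` is hypothetical, and it yields a single operating point rather than a full ROC curve, which would need the classifier to be re-run over a range of parameter values.

```python
# Classifier IV, table layout: TABLE_IV[predicted][ground_truth],
# class order: pedestrian, bike, vehicle.
TABLE_IV = [
    [969, 53, 180],
    [42, 371, 198],
    [12, 64, 2867],
]


def roc_point(table, k):
    """Return (true positive rate, false positive rate) for class index k."""
    true_positives = table[k][k]
    false_positives = sum(table[k]) - true_positives   # predicted k, truly another class
    positives = sum(row[k] for row in table)            # all objects truly of class k
    negatives = sum(sum(row) for row in table) - positives
    return true_positives / positives, false_positives / negatives


tpr, fpr = roc_point(TABLE_IV, 0)  # pedestrian
print(f"TPR {100 * tpr:.1f} %, FPR {100 * fpr:.1f} %")
```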


Object Classification

Accuracy

• Receiver Operating Characteristic (ROC)

• To reduce the effect of a poor choice of parameters


TTC

• Time To Collision

– A measure of the time remaining (at any time t) before two objects collide if neither of them reacts

[Figure: collision point diagram with velocity $V_{b,t}$]
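A simple way to compute TTC under the "no reaction" assumption is to extrapolate both objects at constant velocity and find when they come within a given distance of each other. This is a sketch, not the method used in the study: the point-object model, the `collision_distance` parameter and the function name are all assumptions.

```python
import math


def time_to_collision(p_a, v_a, p_b, v_b, collision_distance=1.0):
    """Time until two constant-velocity points come within collision_distance.

    Positions and velocities are (x, y) tuples. Returns None if the objects
    never get that close while continuing on their current motion.
    """
    dx, dy = p_b[0] - p_a[0], p_b[1] - p_a[1]
    dvx, dvy = v_b[0] - v_a[0], v_b[1] - v_a[1]
    # Solve |dp + dv * t|^2 = collision_distance^2 for the earliest t >= 0.
    a = dvx ** 2 + dvy ** 2
    b = 2 * (dx * dvx + dy * dvy)
    c = dx ** 2 + dy ** 2 - collision_distance ** 2
    if c <= 0:
        return 0.0          # already within the collision distance
    if a == 0:
        return None         # same velocity: the gap never changes
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return None         # paths never come close enough
    t = (-b - math.sqrt(discriminant)) / (2 * a)
    return t if t >= 0 else None


# A follows B along the x axis, closing at 10 units per second from 50 away.
print(time_to_collision((0, 0), (10, 0), (50, 0), (0, 0)))
# Crossing paths: both reach the origin after 2 seconds.
print(time_to_collision((0, -20), (0, 10), (-30, 0), (15, 0), collision_distance=0.0))
```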