flight delays and cancellations

Post on 23-Jan-2018

Flight Delays and Cancellations

Asad Zaidi, Soubhi Hadri

Department of Electrical and Computer Engineering

The University of Oklahoma

December, 2017

EDA & Flight delay prediction

Introduction:

• The data covers 2015 and was collected and published by the U.S. Department of Transportation.

• It is available on Kaggle.

• The question to answer:

• Which airline should you fly on?

Dataset Discovery:

The dataset contains three CSV files:

1- airlines.csv

2- airports.csv

3- flights.csv
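As a minimal sketch of how the three files could be read with pandas: the function name `load_dataset` and the `data_dir` argument are hypothetical, and the only thing taken from the slides is the three file names.

```python
from pathlib import Path

import pandas as pd


def load_dataset(data_dir):
    """Read airlines.csv, airports.csv and flights.csv from data_dir."""
    data_dir = Path(data_dir)
    airlines = pd.read_csv(data_dir / "airlines.csv")
    airports = pd.read_csv(data_dir / "airports.csv")
    # low_memory=False makes pandas infer column types from the whole file
    # instead of chunk by chunk, which helps on a large flights table.
    flights = pd.read_csv(data_dir / "flights.csv", low_memory=False)
    return airlines, airports, flights
```

Typical use would be `airlines, airports, flights = load_dataset("path/to/csvs")`, where the path is wherever the Kaggle download was unpacked.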

Exploratory Analysis:

Missing Data:

Many NaNs!
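One way to quantify the missing data is to count NaNs per column. The sketch below uses a hypothetical helper `missing_summary` and a tiny made-up table; the column names in the toy data are only placeholders for the real flights table.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Return the NaN count and NaN percentage per column, worst first."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "missing_pct": 100.0 * counts / len(df),
    })
    return summary.sort_values("missing", ascending=False)


# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "AIRLINE": ["AA", "DL", "UA", "AA"],
    "DEPARTURE_DELAY": [5.0, np.nan, -3.0, 12.0],
    "CANCELLATION_REASON": [np.nan, "B", np.nan, np.nan],
})
summary = missing_summary(toy)
```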

Negative Delay!

Flights that departed ahead of schedule.
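A negative DEPARTURE_DELAY can be read as a flight that left early. A small sketch of separating those flights from the rest, assuming the delay is stored in minutes and that a missing delay means the flight never departed (both are assumptions, and `split_by_departure` is a hypothetical name):

```python
import numpy as np
import pandas as pd


def split_by_departure(flights):
    """Split flights into early departures and on-time-or-late departures."""
    delay = flights["DEPARTURE_DELAY"]
    ahead = flights[delay < 0]              # left before the scheduled time
    on_time_or_late = flights[delay >= 0]   # left on schedule or after it
    # Rows with a NaN delay match neither comparison and are left out.
    return ahead, on_time_or_late


# Toy data, invented for illustration only.
toy = pd.DataFrame({"DEPARTURE_DELAY": [-5.0, 0.0, 12.0, np.nan, -2.0]})
ahead, on_time_or_late = split_by_departure(toy)
```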

Best Airlines: ON_TIME_PER

Best Airlines: MEAN_DEPARTURE_DELAY

Best Airlines: MEAN_DEPARTURE_AHEAD

Best Airlines: CANCELLED_PERCENTAGE

Best Airlines: CANCELLATION_REASONS

Best Airlines: DIVERTED_FLIGHTS
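The slides do not define the per-airline metrics, so the sketch below is one plausible reading rather than the authors' computation. Assumptions: ON_TIME_PER is the share of departed flights with a delay of zero or less, MEAN_DEPARTURE_DELAY averages only positive delays, MEAN_DEPARTURE_AHEAD averages the minutes gained by early departures, and CANCELLED and DIVERTED are 0/1 flags.

```python
import numpy as np
import pandas as pd


def airline_metrics(flights):
    """Compute per-airline summary metrics under the assumptions above."""
    key = flights["AIRLINE"]
    delay = flights["DEPARTURE_DELAY"]
    # 1.0 if on time or early, 0.0 if late, NaN if the delay is unknown.
    on_time = (delay <= 0).astype(float).where(delay.notna())
    late = delay.where(delay > 0)        # keep only positive delays
    early = (-delay).where(delay < 0)    # minutes ahead of schedule
    return pd.DataFrame({
        "ON_TIME_PER": 100.0 * on_time.groupby(key).mean(),
        "MEAN_DEPARTURE_DELAY": late.groupby(key).mean(),
        "MEAN_DEPARTURE_AHEAD": early.groupby(key).mean(),
        "CANCELLED_PERCENTAGE": 100.0 * flights["CANCELLED"].groupby(key).mean(),
        "DIVERTED_FLIGHTS": flights["DIVERTED"].groupby(key).sum(),
    })


# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "AIRLINE": ["AA", "AA", "AA", "AA", "DL", "DL", "DL"],
    "DEPARTURE_DELAY": [-5.0, 10.0, 20.0, np.nan, -3.0, -1.0, 6.0],
    "CANCELLED": [0, 0, 0, 1, 0, 0, 0],
    "DIVERTED": [0, 1, 0, 0, 0, 0, 0],
})
metrics = airline_metrics(toy)
```

Because groupby means skip NaN, flights without a recorded delay do not affect the delay-based columns.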

The same for ARRIVAL_TIME

Best Airlines: MEAN_SPEED
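The slides do not say how MEAN_SPEED was derived. A natural guess is distance divided by time in the air, sketched below. The AIR_TIME column name, the units (DISTANCE in miles, AIR_TIME in minutes) and the function name are all assumptions.

```python
import numpy as np
import pandas as pd


def mean_speed_by_airline(flights):
    """Average speed per airline in miles per hour, under the assumed units."""
    # Keep only flights with a positive air time (drops NaN and zero).
    flown = flights[flights["AIR_TIME"] > 0]
    speed_mph = flown["DISTANCE"] / (flown["AIR_TIME"] / 60.0)
    return speed_mph.groupby(flown["AIRLINE"]).mean().rename("MEAN_SPEED")


# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "AIRLINE": ["AA", "AA", "AA", "DL"],
    "DISTANCE": [600, 300, 900, 500],
    "AIR_TIME": [60.0, 45.0, np.nan, 75.0],
})
mean_speed = mean_speed_by_airline(toy)
```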

Best Airlines:

Simple ranking using:

• MEAN_SPEED

• MEAN_DEPARTURE_DELAY

• MEAN_DEPARTURE_AHEAD

• CANCELLED_PERCENTAGE

• DIVERTED_FLIGHTS
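How the five metrics were combined is not stated. One simple possibility, shown here as an assumption, is to rank the airlines on each metric separately and add up the ranks, so that the lowest total wins. Which direction counts as "better" for each metric is also an assumption.

```python
import pandas as pd

# True means a larger value is better (assumed direction per metric).
HIGHER_IS_BETTER = {
    "MEAN_SPEED": True,
    "MEAN_DEPARTURE_DELAY": False,
    "MEAN_DEPARTURE_AHEAD": True,
    "CANCELLED_PERCENTAGE": False,
    "DIVERTED_FLIGHTS": False,
}


def simple_ranking(metrics):
    """Rank airlines per metric (1 = best) and sort by the sum of ranks."""
    ranks = pd.DataFrame(index=metrics.index)
    for column, higher_is_better in HIGHER_IS_BETTER.items():
        # ascending=False gives rank 1 to the largest value.
        ranks[column] = metrics[column].rank(
            ascending=not higher_is_better, method="min"
        )
    ranks["SCORE"] = ranks.sum(axis=1)
    return ranks.sort_values("SCORE")


# Toy metrics for three hypothetical airlines A, B and C.
toy_metrics = pd.DataFrame(
    {
        "MEAN_SPEED": [500.0, 450.0, 480.0],
        "MEAN_DEPARTURE_DELAY": [10.0, 20.0, 15.0],
        "MEAN_DEPARTURE_AHEAD": [5.0, 3.0, 4.0],
        "CANCELLED_PERCENTAGE": [1.0, 2.0, 0.5],
        "DIVERTED_FLIGHTS": [2, 5, 1],
    },
    index=["A", "B", "C"],
)
ranking = simple_ranking(toy_metrics)
```

A rank sum treats all five metrics as equally important; weighting them differently would change the order.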

Flight Delay Prediction

• Convolutional neural network (CNN).

• TensorFlow (Python).

• Columns:

  • AIRLINE

  • DAY_OF_WEEK

  • ORIGIN_AIRPORT

  • DESTINATION_AIRPORT

  • DISTANCE

  • DEPARTURE_DELAY

Steps:

• Remove:

  • DEPARTURE_DELAY < 0

  • CANCELLED

  • DIVERTED

• Encode (using one-hot encoding):

  • AIRLINE

  • ORIGIN_AIRPORT

  • DESTINATION_AIRPORT
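The filtering and encoding steps can be sketched with pandas as follows. This is an illustration under assumptions, not the authors' script: it treats CANCELLED and DIVERTED as 0/1 flags, takes DEPARTURE_DELAY as the prediction target, and uses hypothetical names (`FEATURES`, `TARGET`, `prepare`).

```python
import numpy as np
import pandas as pd

FEATURES = ["AIRLINE", "DAY_OF_WEEK", "ORIGIN_AIRPORT",
            "DESTINATION_AIRPORT", "DISTANCE"]
TARGET = "DEPARTURE_DELAY"
CATEGORICAL = ["AIRLINE", "ORIGIN_AIRPORT", "DESTINATION_AIRPORT"]


def prepare(flights):
    """Drop early, cancelled and diverted flights, then one-hot encode."""
    keep = (
        (flights[TARGET] >= 0)          # also drops rows with a NaN delay
        & (flights["CANCELLED"] == 0)
        & (flights["DIVERTED"] == 0)
    )
    data = flights.loc[keep, FEATURES + [TARGET]]
    # One indicator column per distinct airline / airport value.
    encoded = pd.get_dummies(data, columns=CATEGORICAL)
    return encoded.drop(columns=[TARGET]), encoded[TARGET]


# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "AIRLINE": ["AA", "DL", "AA", "UA", "DL"],
    "DAY_OF_WEEK": [1, 2, 3, 4, 5],
    "ORIGIN_AIRPORT": ["JFK", "ATL", "LAX", "ORD", "ATL"],
    "DESTINATION_AIRPORT": ["LAX", "JFK", "JFK", "ATL", "ORD"],
    "DISTANCE": [2475, 760, 2475, 606, 606],
    "DEPARTURE_DELAY": [12.0, -4.0, 0.0, np.nan, 30.0],
    "CANCELLED": [0, 0, 0, 1, 0],
    "DIVERTED": [0, 0, 0, 0, 1],
})
X, y = prepare(toy)
```

With several hundred airports, one-hot encoding produces a very wide and sparse feature matrix, which may be related to the "inappropriate encoding" concern raised later in the slides.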

First Try:

• Regression:

  • 5 convolution layers.

  • 2 pooling layers.

  • 2 fully connected layers + dropout.

  • Loss function: mean squared error.

• Bad results!

• Reasons (maybe):

  • Not able to use the full dataset.

  • Inappropriate encoding.

  • Network structure.
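The loss named for the regression attempt is read here as the usual mean squared error, i.e. the average of the squared differences between predicted and actual delays. A small NumPy sketch of that quantity, independent of any TensorFlow code the authors used:

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between predictions and targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))


# Toy delays in minutes, invented for illustration only.
loss = mean_squared_error([10, 0, 30], [12, 4, 30])
```

Because the errors are squared, a few very long delays can dominate this loss.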

Second Try:

• Convert the problem from regression to classification.

• Spread delay values into 5 levels.

• Use a CNN structure similar to AlexNet.

• Result:

  • Still running :D
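Turning the delay into 5 levels can be done by binning. The slides do not give the level boundaries, so the edges below (15, 30, 60 and 120 minutes) are purely hypothetical, as are the names `BIN_EDGES` and `delay_to_level`.

```python
import numpy as np
import pandas as pd

# Hypothetical bin edges in minutes; the original boundaries are unknown.
BIN_EDGES = [0, 15, 30, 60, 120, np.inf]


def delay_to_level(delays):
    """Map non-negative delays to integer class labels 0..4."""
    levels = pd.cut(
        delays,
        bins=BIN_EDGES,
        labels=[0, 1, 2, 3, 4],
        include_lowest=True,   # so a delay of exactly 0 falls in level 0
    )
    return levels.astype(int)


# Toy delays in minutes, invented for illustration only.
toy_delays = pd.Series([0, 10, 15, 16, 45, 100, 500])
levels = delay_to_level(toy_delays)
```

Fixed edges like these usually give unbalanced classes, since short delays are far more common than long ones; quantile-based bins would be an alternative.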

Script on GitHub: https://github.com/SubhiH/Flight-Delays-and-Cancellations-EDA

Thank you
