CONCEPTUAL FRAMEWORK: USING DL FOR AIRPORT CEP
Wei Lin, Ph.D.
Chief Data Scientist, Senior Manager, Americas Consulting Practice
Application and Big Data/IoT Transformation
Dell EMC
[email protected]
Bill Schmarzo
CTO, Big Data Analytics Consulting Practice
Dell EMC
[email protected]
Knowledge Sharing Article © 2017 Dell Inc. or its subsidiaries.
2018 Dell EMC Proven Professional Knowledge Sharing 2
Table of Contents
Abstract
1. Introduction
2. Conceptual Prototype for Airport Event Processing
2.1 Airport
2.2 Deep Learning to Process Image Data Introduction
3 Feature Recognition and Event Trigger Processing
3.1 Features Selection
3.2 DL Recognition
3.3 LSTM Responses Trigger
4 Conclusion and Future Work
References
Disclaimer: The views, processes or methodologies published in this article are those of the authors. They do
not necessarily reflect Dell EMC’s views, processes or methodologies.
Abstract
Airport event processing is known for its complexity and is termed complex event processing (CEP). In this conceptual framework, Deep Learning (DL) techniques are leveraged to simplify the processing by encapsulating the event sequence and responses: recognizing the event and then generating logical responses via supervised learning. Otsu's method, CNN, and LSTM are used in this conceptual framework to perform feature extraction, pattern recognition, and response prediction.
The emergency scenario is a fire on an airliner parked on the tarmac, captured via video feeds. The task is to first recognize the event and then generate four responses: 1. trigger an alert to the Fire Department (to extinguish the fire); 2. trigger an alert to Law Enforcement (to barricade the scene of the fire and evacuate passengers); 3. trigger an alert to the Command Post (to coordinate the air traffic tower and flights); 4. trigger an alert to Airport Operations (to clear paths and facilitate Fire Department and Law Enforcement activities).
1. Introduction
Airport event streaming analysis tracks and analyzes continuous information to identify areas of interest and derive conclusions from them. Airport complex event processing (CEP) is event processing that combines data from a wide variety of diverse data streams to infer events or patterns that suggest more complicated and/or dire situations. The goal of CEP is to identify meaningful events (such as opportunities or threats) and respond to them quickly and with the most relevant response.
Airports in particular provide a number of well-known examples of CEP opportunities. Airport events may be happening across the various layers of an airport, such as the airport lobby, ticket counters, Customs gates, terminals, tarmac, airliners, ground operations, customer service, and rental car counters, and in the form of text messages, sensor inputs, video, audio, images, audio communications, equipment locations, traffic reports, weather reports, or other kinds of data.
An event may also be defined as a "change of state": when a measurement exceeds a predefined threshold of time, temperature, or another value, new events could be triggered. CEP provides organizations a correlated way to analyze patterns in real time and helps the business and operations communicate better with service departments. This Knowledge Sharing paper lays out an approach for "mining" official manuals and documents to codify and enforce policies during emergency events.
In this paper, we present a simplified airport CEP scenario with an airliner on fire on the tarmac. The sensor
inputs will focus on identifying, analyzing and classifying an image and then generating (predicting) the most
relevant event responses. The steps include
1. Event-pattern detection (feature extraction)
2. Event abstraction (convert image to knowledge)
3. Event aggregation (aggregate across time stamp)
4. Event response triggers (trigger alerts to departments)
The analytical cycles used in this context consist of Descriptive, Exploratory, Predictive, and Prescriptive Analytics. These four essential analytics are used to understand emerging patterns, convert outliers into controllable variables, estimate time to events, and make real-time predictions that lead to analytics-driven smart interactions.
This paper is arranged in sections.
Section 1 is an introduction to the analytics framework for complex event processing.
Section 2 describes the overall entities (airport, airport CEP architecture, CNN, LSTM) in this study.
Section 3 describes the details of image (video frame) feature extraction, recognition, and response forecast analytics.
Section 4 is the conclusion and future work.
2. Conceptual Prototype for Airport Event Processing
This section will introduce the entities in the study: the airport, the airport CEP functional architecture, the Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM).
2.1 Airport
Airports are divided into landside and airside areas. Landside areas include parking lots, public transportation, train stations and access roads. Airside areas include all areas accessible to aircraft, including runways, taxiways and aprons. Access from landside areas to airside areas is tightly controlled at airports. Passengers on commercial flights access airside areas through terminals, where passengers can purchase tickets, clear security checks, or claim luggage, and board aircraft through gates. The waiting areas which provide passenger access to aircraft are typically called concourses, although this term is often used interchangeably with terminal. The area where aircraft park next to a terminal to load passengers and baggage is known as a ramp (or tarmac). Parking areas for aircraft away from terminals are called aprons. Airports may be towered or non-towered, depending on air traffic density and available funds. Due to their high capacity and busy airspace, many international airports have air traffic control located on site. Airports with international flights have customs and immigration facilities. International flights often require a higher level of physical security, although in recent years, many countries have adopted the same level of security for international and domestic travel. Airports provide commercial outlets for products and services. Most of the vendors are located within the departure areas. These include clothing boutiques and major fast food chains. Some airport restaurants offer regional cuisine specialties for those in transit so that they may sample local food or culture without leaving the airport.
Aircraft and passenger boarding bridge maintenance, pilot operations, commissioning, training services, aircraft rental, and hangar rental are most often performed by a fixed-base operator (FBO). At major airports, particularly those used as hubs, airlines may operate their own support facilities. Figure 1 shows the airport functional areas.
Figure 1. Airport functional areas (using SFO as an example)
The airport conceptual complex event processing framework [3] consists of three layers. The base layer is a data integration layer, the second layer is the application integration layer, and the third layer is the presentation layer. The three layers are shown in Figure 2.
Figure 2. Airport CEP functional architecture [3]
(1) Hadoop will be used as the data content and modeling repository and the API service facility. Hadoop will also be used for analytical model development, performance monitoring, and continuous training against historical data.
(2) The data layer's primary function is to integrate or blend data from the different data sources. Two universal dimensions (time and location) are used to further align asynchronous data. Data from (6) could be aligned along three core aspects, e.g. people (airport employees, airline employees, passengers), processes (staff schedules, staff training, customer services) and technologies (servers, hardware, software), and then correlated to events or events' proximity to establish relationships, heat maps, or seasonality.
(3) The application layer interfaces with services to perform orchestration and decision support. The core of this layer is a business rule engine which determines the events' possible next best actions, generates alerts, and routes them to the corresponding parties with location (GIS) and event content along with location and time. The operational decisions will be transmitted via external system connectors, which obtain the receiving status of the external systems.
(4) The presentation layer is like a single pane of glass containing the status of airline schedules, gate occupancies, passenger throughput, airport RFID device streams, and facility status logs. The event monitoring could be drilled down to equipment heartbeats, equipment streaming logs, system operational ranges, KPIs, and GIS/timestamp-enriched information. The KPIs are used as a baseline to measure outliers and the ROI of improvements.
(5) The external systems contain two categories, e.g. external streaming data service sources such as weather and traffic, and integrated auxiliary (response) systems such as the Fire Department, TSA, and Law Enforcement. The auxiliary systems could receive alerts and respond to alert requests from the airport.
(6) Airport internal streaming data sources are ingested through the data layer. This is a bi-directional data transmission, e.g. CEP could adjust sensor parameters such as a camera's lens angle. The types of sensor data include, but are not limited to, WiFi, RFID, video, audio, and images.
For the scenario in this paper, the entities involved are Video, Data Layers, Business Rule Engine, Alert Service, External System Connectors, and Requested Response Units. These entities are highlighted in Figure 2.
The functional architecture could be translated into generic Hadoop solution architecture as shown in Figure 3.
The complex event processes could map into the core data platform highlighted in red.
For internal data ingestion, external data integration, and alert communications, this reference architecture could impose different security requirements. This matches the external system connectors' requirements as well.
The machine learning real-time analytics containers could process data feeds independently and/or collectively. Owned/shared data distributed across physical, virtual, and cloud environments could reside in the data marts, with a visualization environment and adjustable computing power allocation optimized in hardware. This matches the business rule engine and alert services well.
The Dev/Test/Prod clusters could support DL/ML development and performance monitoring.
The presentation layer will map to subject-area workgroups to track KPIs and the ROI of business actions through their feedback.
The inbound channels will ingest airport sensor inputs.
Figure 3. Hadoop reference architecture
2.2 Deep Learning to Process Image Data Introduction
Video inputs are one of the key airport data sources, and it consumes significant time and effort for the airport monitoring center to properly monitor and analyze the video feeds. Correlating video streaming content (connected timestamps T1 to Tn) with the objects within each image is even more complicated.
Advances in Deep Learning (DL) make general image recognition possible. The basic construct of DL is connected layers of artificial neural networks, where each layer is responsible for extracting features of the image constructed in a prior layer (or the input layer).
There are two basic ways to prepare an image. 1. Greyscale: the image is converted to greyscale (a range of gray shades from white to black). Each pixel has a value based on its darkness, and the image is converted into an array for computing. 2. RGB values: an image's color can be represented as RGB values (a combination of red, green and blue ranging from 0 to 255). Each RGB value is extracted and put in an array for interpretation.
The basic procedure when matching a new image to known, annotated images is to convert the image to an array using the same technique, then compare the number patterns against the already-known objects. A confidence score is then computed for each class, and the class with the highest confidence score is the predicted one. DL extends the pattern matching to feature matching, increasing the confidence significantly.
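As a minimal sketch of the greyscale path described above, assuming the image is already a NumPy array of RGB values (the helper names are illustrative, not from the paper's code):

```python
import numpy as np

# Convert an (H, W, 3) RGB array to greyscale using the standard
# luminosity weights, then compare pixel arrays for matching.
def to_greyscale(rgb):
    weights = np.array([0.299, 0.587, 0.114])  # R, G, B luminosity weights
    return rgb @ weights

def match_score(candidate, known):
    # Mean absolute pixel difference: smaller means a closer match.
    return np.abs(candidate - known).mean()

red_image = np.zeros((2, 2, 3))
red_image[..., 0] = 255            # a pure-red test image
grey = to_greyscale(red_image)     # every pixel becomes 255 * 0.299
```

A real pipeline would compare `match_score` across many annotated reference arrays and pick the smallest distance (or, as in DL, learn features instead of raw pixels).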
One of the most popular techniques used to improve the accuracy of image classification is the Convolutional Neural Network (CNN) [7]. A CNN is a special type of neural network that works in the same way as a regular neural network, except that it has a convolution layer at the beginning.
Instead of feeding the entire image as an array of numbers, the image is broken up into a number of tiles and
the machine then tries to predict what each tile is. Finally, the computer tries to predict what’s in the picture
based on the prediction of all the tiles. This allows the computer to parallelize the operations and detect the
object regardless of where it is located in the image. In general, in a deep convolutional neural network, several
layers are stacked and are trained to the task at hand. The network learns several low/mid/high level features at
the end of its layers.
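The tile-by-tile scanning described above can be sketched as a plain 2D convolution. This NumPy version is illustrative only, not the paper's CNN code:

```python
import numpy as np

# Slide a small window (tile) over the image and score each tile with the
# same filter, so an object is detected wherever it sits in the image.
def convolve2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for c in range(ow):
            # Elementwise product of the tile and the filter, summed.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

img = np.zeros((5, 5))
img[2, 2] = 1.0                    # a single bright pixel
k = np.ones((3, 3))                # a simple 3x3 filter
resp = convolve2d(img, k)          # response map over all tiles
```

In a trained CNN the filter values are learned, and many filters are stacked per layer; this sketch only shows the tiling mechanics.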
The Residual Network (ResNet) [2] in Figure 4 is a variant of the CNN. In residual learning, instead of trying to learn the mapped features directly, the network tries to learn the residual. The residual can be simply understood as the subtraction of the input of a layer from the features learned in that layer. Thus, the feature residual would be H(x) − x. The layers learn the approximate residual function F(x) = H(x) − x, so H(x) = F(x) + x, where F(x, {W}) could represent multiple CNN layers. ResNet does this using shortcut connections (directly connecting the input of the nth layer to some (n+x)th layer). It has been shown that training this form of network is easier than training plain deep convolutional neural networks, and that it also resolves the problem of degrading accuracy.
Figure 4. ResNet configuration [2]
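A minimal numeric sketch of the identity shortcut H(x) = F(x) + x, with F(x) as a stand-in linear map rather than trained CNN layers:

```python
import numpy as np

# The block learns the residual F(x) = W·x; the shortcut connection then
# adds the input back, giving H(x) = F(x) + x.
def residual_block(x, weight):
    fx = weight @ x            # F(x): the learned residual mapping
    return fx + x              # H(x) = F(x) + x via the shortcut

x = np.array([1.0, 2.0])
W_zero = np.zeros((2, 2))      # if F(x) = 0, the block reduces to identity
h = residual_block(x, W_zero)
```

The zero-weight case illustrates why residual networks are easier to train: a layer that has learned nothing still passes its input through unchanged instead of degrading it.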
A recurrent neural network (RNN) [8] is a type of artificial neural network that uses its internal memory to process sequential inputs. This makes it applicable to tasks such as predicting the most likely next events. A basic RNN is configured as a network of neuron-like nodes, each with a directed (one-way) connection to other nodes. Each node has a time-varying activation and each connection has a modifiable weight. Nodes are either input nodes, output nodes or hidden nodes.
For RNNs in supervised learning in discrete-time settings, sequences of real-valued input vectors arrive at the input nodes one vector at a time. At any given time step, each non-input unit computes its current activation (result) as a nonlinear function of the weighted sum of the activations of all units that connect to it. Supervisor-given target activations can be supplied for output units at later time steps. For example, if the input sequence is a speech signal corresponding to a spoken digit, the final target output at the end of the sequence may be a label classifying the digit.
Each sequence produces an error as the sum of the deviations of all target signals from the corresponding
activations computed by the network. For a training set of numerous sequences, the total error is the sum of the
errors of all individual sequences.
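As a small illustration of this error definition (using squared deviations, a common choice that the text does not specify):

```python
import numpy as np

# Each sequence's error sums the deviations between target signals and the
# network's computed activations; the total training error sums over all
# sequences in the training set.
def sequence_error(targets, activations):
    return np.sum((np.asarray(targets) - np.asarray(activations)) ** 2)

def total_error(batch):
    return sum(sequence_error(t, a) for t, a in batch)

batch = [([1.0, 0.0], [0.8, 0.1]),     # (targets, activations) per sequence
         ([0.0, 1.0], [0.2, 0.7])]
err = total_error(batch)
```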
3 Feature Recognition and Event Trigger Processing
Complex event analysis focuses on recognition and prediction. One important aspect in enhancing event analysis is semantic-level video analysis of activity and event understanding, which aims at accurately describing video contents using key semantic elements, such as activities and events.
Unconstrained videos are qualitatively very different from, and even more challenging than, widely-used video datasets, in which video clips contain a fairly coherent single action or atomic event occurring within a short duration. A series of temporal structure analysis methods could be specifically designed to tackle these complexities. Integrated with other vision techniques, such an approach can extend the domains of video that can be understood by machine vision systems.
The approaches in this section lay out the steps that simplify airport complex event analysis using DL. The given time sequence of images depicts the scenario of an airplane on fire. The steps include: 1. Feature selection: identify the "fire" section of the image. 2. Sequential image recognition via CNN: recognize T1 image content through T4 image content, with or without using the feature selection region. 3. Response prediction via LSTM: based on the event, raise alerts to the corresponding auxiliary departments (e.g. Fire Department, Command Post, Law Enforcement and others).
3.1 Features Selection
For feature selection, Otsu's method [9] is used to automatically perform clustering-based image thresholding via the reduction of a gray-level image to a binary image. The algorithm assumes that the image contains two classes of pixels following a bi-modal histogram (foreground pixels and background pixels). It then calculates the optimum threshold separating the two classes so that their combined spread (intra-class variance) is minimal, or equivalently, their inter-class variance is maximal.
Otsu's method exhaustively searches for the threshold that minimizes the intra-class variance (the variance within the classes), defined as a weighted sum of the variances of the two classes:

σ_w²(t) = ω₀(t)σ₀²(t) + ω₁(t)σ₁²(t) (EQ 1)

where σ₀² and σ₁² are the variances of the two classes (0 and 1), ω₀ and ω₁ are the probabilities of the two classes, and t is the threshold separating them. The class probabilities ω(t) are computed from the L bins of the histogram:

ω₀(t) = Σ_{i=0}^{t−1} p(i),  ω₁(t) = Σ_{i=t}^{L−1} p(i) (EQ 2)

Otsu's method shows that minimizing the intra-class variance is the same as maximizing the inter-class variance:

σ_b²(t) = σ² − σ_w²(t) = ω₀(t)ω₁(t)[μ₀(t) − μ₁(t)]² (EQ 3)

which is expressed in terms of the class probabilities ω and class means μ. The class means μ₀(t), μ₁(t) and the total mean μ_T would be

μ₀(t) = Σ_{i=0}^{t−1} i·p(i) / ω₀(t),  μ₁(t) = Σ_{i=t}^{L−1} i·p(i) / ω₁(t),  μ_T = Σ_{i=0}^{L−1} i·p(i) (EQ 4)

The following relations can be easily verified:

ω₀μ₀ + ω₁μ₁ = μ_T,  ω₀ + ω₁ = 1 (EQ 5)

The class probabilities and class means can be computed iteratively. This yields an effective iterative algorithm.
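A compact NumPy sketch of this search, maximizing the inter-class variance of EQ 3 over all histogram bins (illustrative only; the paper's actual code appears in Figures 7 through 12):

```python
import numpy as np

def otsu_threshold(gray, bins=256):
    """Return the threshold that maximizes inter-class variance (EQ 3)."""
    hist, edges = np.histogram(gray.ravel(), bins=bins)
    p = hist / hist.sum()                # normalized histogram p(i)
    i = np.arange(bins)
    omega0 = np.cumsum(p)                # class 0 probability ω0(t)
    omega1 = 1.0 - omega0                # class 1 probability ω1(t)
    mu_cum = np.cumsum(i * p)            # cumulative mean up to bin t
    mu_T = mu_cum[-1]                    # total mean μT
    # σb²(t) = ω0 ω1 (μ0 − μ1)²; empty classes yield NaN and are ignored.
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = mu_cum / omega0
        mu1 = (mu_T - mu_cum) / omega1
        sigma_b = omega0 * omega1 * (mu0 - mu1) ** 2
    t = int(np.nanargmax(sigma_b))
    return edges[t + 1]                  # map bin index back to value range
```

For a clearly bimodal image, the returned threshold falls between the two intensity clusters, separating foreground from background.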
The code segment below is used to process an image (video frame) of an airplane cargo fire. The objective is to identify the range of pixels that represents the anomaly, which in this case is the cargo fire.
The list of libraries in Figure 7 is used in the Otsu image segmentation algorithm.
Figure 7. Python Libraries used in the Otsu algorithms
Given a frame (airplane_fire.png), the first step is to normalize the intensity to scale it between 0 and 1.
Figure 8. Image normalizing
R, G and B are extracted for processing. R is selected to distinguish the flame from other objects. Figure 10 and Figure 11 show the original image and its R/G/B filtering.
Figure 9. Image R/G/B filtering
Figure 10 shows the original image and Figure 11 shows the results of executing the R/G/B filtering.
Figure 10. Image of an airplane on fire.
Figure 11. Image R/G/B filtering
Next, the Otsu algorithm is used to calculate the threshold, defined as the threshold between the image's front and background segments. It is assumed that the image has a binomial distribution between front and background. The histogram is binned into 1000 bins. Figure 12 shows the code and the front/background segmentation at otsu threshold = 0.5134.
Figure 12. Image front/background segmentation at otsu threshold =0.5134
By scaling the threshold by factors from 1.2 to 2.0 in increments of 0.2 (1.2, 1.4, 1.6, 1.8, 2.0), Figures 13 to 17 display the front/background segmentations at the incremental otsu thresholds.
Figure 13. Image front/background segmentation at otsu threshold =0.6160
Figure 14. Image front/background segmentation at otsu threshold =0.7187
Figure 15. Image front/background segmentation at otsu threshold =0.8214
Figure 16. Image front/background segmentation at otsu threshold =0.9241
Figure 17. Image front/background segmentation at otsu threshold =1.0268
Once the front/background image segment is identified with the 1.8× threshold (otsu threshold = 0.9241), the image is mapped back to the original image.
Figure 18. Image front/background segmentation at otsu threshold =0.9241
The process is to overlay the region of the image (with otsu threshold = 0.9241) and return the four corner pixels (350, 1047, 440, 1203) related to the foreground object identified in Figure 18.
Figure 19. Image overlay to obtain object identified with otsu threshold =0.9241
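The overlay-and-crop step might look like the following sketch. The interpretation of the four values as (row_min, col_min, row_max, col_max) is an assumption, as is the frame size:

```python
import numpy as np

# Hypothetical bounding box (row_min, col_min, row_max, col_max) as returned
# by the overlay step; the exact ordering is an assumption.
box = (350, 1047, 440, 1203)

def crop(img, box):
    r0, c0, r1, c1 = box
    return img[r0:r1, c0:c1]   # NumPy slicing keeps all color channels

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)   # stand-in video frame
patch = crop(frame, box)       # the isolated region to feed to recognition
```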
Figure 20 shows the result.
Figure 20. The airplane cargo fire segmentation.
The selected object is further cropped and ready for recognition, as shown in Figure 21.
Figure 21. Isolated airplane cargo fire segmentation
3.2 DL Recognition
Image recognition uses pre-trained CNNs (VGG16, VGG19, Inception, Xception, and ResNet). The computing stack is Python, Spark, TensorFlow, and Keras [5]. The stack is shown in Figure 22.
Figure 22. The computing stack.
For the pre-trained CNN Keras code, the following code segments represent the types of models under consideration.
Figure 23. First two segments of the Keras pre-trained CNN code
Figure 24. Last segment of the Keras pre-trained CNN code
The timestamped frames in temporal format [1] show the first 4 intervals. Figure 25 shows the first 4 timestamped (T) images.
Figure 25. Airplane cargo fire (T1 – T4)
A voting model is leveraged to increase the accuracy of classification via the general ImageNet models. For T1, 5 models (VGG16, VGG19, Inception, Xception, and ResNet) [6] are executed, and "Airliner" is the top-ranked match. Figure 26 to Figure 30 show the results.
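The voting across the five models can be sketched as follows. The combination rule here (averaging each label's confidence across models) is one reasonable choice, not necessarily the paper's exact rule:

```python
from collections import defaultdict

def vote(predictions):
    """Combine top-ranked (label, confidence) pairs from several models by
    averaging the confidence each model assigns to a label."""
    totals = defaultdict(float)
    for label, confidence in predictions:
        totals[label] += confidence
    n = len(predictions)               # a label missing from a model scores 0
    scores = {label: s / n for label, s in totals.items()}
    return max(scores, key=scores.get)

# T1 top-ranked matches reported in Figures 26-30.
t1 = [("airliner", 0.3638), ("airliner", 0.9425), ("airliner", 0.9259),
      ("airliner", 0.6961), ("airliner", 0.9517)]
winner = vote(t1)
```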
Vgg16: T1 top ranked match is Airliner 36.38%
Figure 26. Vgg16 T1 top ranked match is Airliner 36.38%
Vgg19: T1 top ranked match is Airliner 94.25%
Figure 27. Vgg19 T1 top ranked match is Airliner 94.25%
Inception: T1 top ranked match is Airliner 92.59%
Figure 28. Inception T1 top ranked match is Airliner 92.59%
Xception: T1 top ranked match is Airliner 69.61%
Figure 29. Xception T1 top ranked match is Airliner 69.61%
Resnet: T1 top ranked match is Airliner 95.17%
Figure 30. Resnet T1 top ranked match is Airliner 95.17%
T= 1 example
Figure 31. Example of T1.
The process is repeated for all timestamped images. T4 contains additional image segmentation, as shown in Figure 21.
T=4
Figure 32. Resnet T4 top ranked match is fire_screen 99.35%
The "fire" is recognized as a fire screen by the out-of-the-box pre-trained model. Additional transfer learning would need to be conducted.
Thus, combining T1 + T4 and the video telemetry, the event will be described as:
Airliner is on fire at <location> at time T1.
3.3 LSTM Responses Trigger
To predict whom to involve and what to do when an airliner is on fire, we use airport emergency manuals [4] as training samples for an LSTM to learn, predict and enforce the official airport policies to follow during an emergency situation. We decompose the manual into individual letters in order to be able to predict the next letter for a seed sentence and determine or score its relevance to the emergency situation.
The emergency manual's alphanumeric characters and special characters are translated into real numbers as shown below.
{'\n': 0, ' ': 1, '"': 2, '#': 3, "'": 4, '(': 5, ')': 6, '*': 7, ',': 8, '-': 9, '.': 10, '/': 11, '0': 12, '1': 13, '2': 14, '3': 15, '4': 16, '5':
17, '6': 18, '7': 19, '8': 20, '9': 21, ':': 22, ';': 23, '?': 24, '[': 25, ']': 26, 'a': 27, 'b': 28, 'c': 29, 'd': 30, 'e': 31, 'f': 32, 'g':
33, 'h': 34, 'i': 35, 'j': 36, 'k': 37, 'l': 38, 'm': 39, 'n': 40, 'o': 41, 'p': 42, 'q': 43, 'r': 44, 's': 45, 't': 46, 'u': 47, 'v': 48, 'w':
49, 'x': 50, 'y': 51, 'z': 52, '–': 53, '’': 54, '“': 55, '”': 56, '•': 57}
Thus, for a sentence like "command post for all emergencies, a joint fixed-position and/or mobile command post will be established", its letter/character-to-numerical translation matches
pattern = [29, 41, 39, 39, 27, 40, 30, 1, 42, 41, 45, 46, 1, 32, 41, 44, 1, 27, 38, 38, 1, 31, 39, 31, 44, 33, 31, 40,
29, 35, 31, 45, 8, 1, 27, 1, 36, 41, 35, 40, 46, 1, 32, 35, 50, 31, 30, 9, 42, 41, 45, 35, 46, 35, 41, 40, 1, 27, 40,
30, 11, 41, 44, 1, 39, 41, 28, 35, 38, 31, 1, 29, 41, 39, 39, 27, 40, 30, 1, 42, 41, 45, 46, 1, 49, 35, 38, 38, 1, 28,
31, 1, 31, 45, 46, 27, 28, 38, 35, 45].
For example, "command" is (c → 29), (o → 41), (m → 39), (m → 39), (a → 27), (n → 40), (d → 30).
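Using an excerpt of the published mapping, the translation step can be reproduced directly:

```python
# Excerpt of the published character-to-integer mapping built from the
# emergency manual's vocabulary.
char_to_int = {' ': 1, ',': 8, '-': 9, '/': 11, 'a': 27, 'b': 28, 'c': 29,
               'd': 30, 'e': 31, 'f': 32, 'i': 35, 'j': 36, 'l': 38, 'm': 39,
               'n': 40, 'o': 41, 'p': 42, 'r': 44, 's': 45, 't': 46, 'u': 47,
               'w': 49, 'x': 50}

sentence = "command post"
pattern = [char_to_int[c] for c in sentence]   # the LSTM's numeric input
```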
The LSTM construct uses two connected layers with 256 LSTM nodes in each layer. The dropout is set to 0.2. Figures 33 and 34 show the code segments in training mode.
Figure 33. Code segment of LSTM training using airport emergency procedure manual
The code segment shown above uses 100 characters as each input/output pair. The training data set assigns 64 pairs to a batch and runs 50 epochs for the training.
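The 100-character windowing can be sketched as follows. The corpus here is a stand-in, since the paper trains on the full emergency manual:

```python
import numpy as np

# Stand-in corpus and mapping; the paper uses the full airport manual.
corpus = "airliner is on fire " * 20
chars = sorted(set(corpus))
char_to_int = {c: i for i, c in enumerate(chars)}

seq_length = 100
X, y = [], []
for i in range(len(corpus) - seq_length):
    window = corpus[i:i + seq_length]      # 100-character input sequence
    target = corpus[i + seq_length]        # the next character to predict
    X.append([char_to_int[c] for c in window])
    y.append(char_to_int[target])

X = np.asarray(X, dtype=np.float32) / len(chars)   # normalize for the LSTM
X = X.reshape((len(X), seq_length, 1))             # (samples, timesteps, features)
```

Each row of `X` paired with its entry in `y` is one of the input/output pairs batched 64 at a time during the 50 training epochs.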
Figure 34. Code segment of LSTM construct and training
The epochs in the training mode (the first 5 epochs are shown in Figure 35) display the monotonic loss decrease shown in Figure 36. Multiple training cycles are conducted to obtain the best performance. At each epoch, the weights matrix is saved to a predefined filename.
Figure 35. LSTM training epochs and corresponding “loss” measurements
Figure 36. LSTM training epochs and corresponding “loss” reduction
For retrieval, most of the code is the same, except for loading the weights back into the LSTM.
Figure 37. The LSTM common code segments for training and retrieval
The best weights matrix (weights-improvement-50-0.6724.hdf5) is loaded back into the LSTM model, which is then executed with the seed phrase.
Figure 38. The LSTM common code segments for retrieval
The event trigger uses the key phrase "Airliner is on fire ……". The seeds were selected as follows:
1. " command during all fire or medical related emergencies, the senior los angeles fire department offic "
2. " above with the los angeles fire department. provide barricades or means to secure contaminated area
"
The outputs of the LSTM are shown in Figures 39 and 40.
Figure 39. LSTM output by “fire” seed 1.
Figure 40. LSTM output by “fire” seed 2.
By combining the outputs and performing noun ranking, Figure 41 shows the next step.
Figure 41. Extract high frequency nouns from LSTM outputs
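The noun-ranking step can be sketched with a frequency count. Real noun extraction would need a part-of-speech tagger, so the candidate noun set here is a stand-in:

```python
from collections import Counter

def rank_nouns(text, nouns):
    """Count how often each candidate noun appears in the LSTM output and
    return them ranked by frequency, highest first."""
    words = (w.strip(".,") for w in text.lower().split())
    counts = Counter(w for w in words if w in nouns)
    return counts.most_common()

# Stand-in LSTM output and candidate nouns, not the paper's actual output.
output = ("the senior los angeles fire department officer will coordinate "
          "with law enforcement and law enforcement will secure the area")
candidates = {"fire", "department", "law", "enforcement", "officer", "area"}
ranking = rank_nouns(output, candidates)
```

The top-ranked nouns then become the seeds for the next cascaded prediction.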
Using the noun ranking shown above, law enforcement is selected as the next candidate for cascaded prediction. For Law Enforcement, the seeds are selected as below.
1. " law enforcement, firefighting and rescue agencies, medical resources, the principal tenants at the "
2. " h the law enforcement officer-in-charge to ensure adequate clear zones are maintained and airport op
"
Figure 42. LSTM output by “law enforcement” seed 1.
Figure 43. LSTM output by “law enforcement” seed 2.
By combining the outputs and performing noun ranking:
Figure 44. Extract high frequency nouns from LSTM outputs
Using the noun ranking shown above, command post and airport operations are selected as the next candidates for cascaded prediction.
For Command Post:
1. " command post for all emergencies, a joint fixed-position and/or mobile command post will be establis"
2. “ emergency will report to the command post to assist in liaison and coordination, control tower fun”
Figure 45. LSTM output by “command post” seed 1.
Figure 46. LSTM output by “command post” seed 2.
For Operations:
Figure 47. LSTM output by “airport operations” seed 1.
Figure 48. LSTM output by “airport operations” seed 2.
From the aggregations, for an airliner on fire, the suggested sequence of event triggers is:
1. Fire Department: extinguish fire, manage fuel containment
2. Law Enforcement: crowd control, scene control and passenger evacuation
3. Command Post: coordinate air traffic, active decision center
4. Airport Operations: clear paths and facilitate Fire Department and Law Enforcement activities and
airport workforce
4 Conclusion and Future Work
This study showcases the potential value of, and an approach to, integrating Deep Learning into airport complex event processing support. The emergency scenario is to identify a landed airliner's fire via video feeds. The situation is first recognized and four responses are generated: 1. trigger an alert to the Fire Department (to extinguish the fire); 2. trigger an alert to Law Enforcement (to barricade the scene of the fire and evacuate passengers); 3. trigger an alert to the Command Post (to coordinate the air traffic tower and flights); 4. trigger an alert to Airport Operations (to clear paths and facilitate Fire Department and Law Enforcement activities).
Airport event processing is known for its complexity. Traditional tree/graph decision support for complex decision making would grow and require consolidation, whereas deep learning could consume the new material through training. In this conceptual framework, Deep Learning techniques are leveraged to simplify the processing by encapsulating the event sequence and responses: recognizing the event and generating logical responses via supervised learning. Otsu's method, CNN, and LSTM are used in this conceptual framework to perform feature extraction, pattern recognition, and response prediction.
The future work includes:
1. Training the LSTM with two additional types of documents. One is additional airport emergency documents, for better response generation. The other is detailed investigation reports, to train the system to predict events in addition to responses.
2. Obtaining detailed airport video footage to perform transfer learning by injecting airport-specific video/images when training the CNN. This will help solve issues like "fire" vs. "fire screen".
3. Enhancing feature selection by integrating the CNN with the Otsu algorithm.
References
[1] Kang Li, Video Event Recognition and Prediction based on Temporal Structure Analysis, Dissertation, Department of Electrical and Computer Engineering, Northeastern University, January 2015
[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Deep Residual Learning for Image Recognition, Microsoft
Research, arXiv:1512.03385v1 [cs.CV] 10 December 2015
[3] Gabriel Pestana, Sebastian Heuchler, Augusto Casaca, Pedro Reis, and Joachim Metter, Complex Event Processing for Decision Support in an Airport Environment, International Journal on Advances in Software, Vol 6 no 3 & 4, 2013, pp. 246-260
[4] Emergency Procedure, Los Angeles World Airport VNY Rules and Regulations, March 2005, Section 5
[5] Keras Documentation, https://keras.io/
[6] Trained image classification models for Keras, https://github.com/fchollet/deep-learning-models
[7] Convolutional Neural Networks (LeNet) – DeepLearning 0.1 documentation, DeepLearning 0.1. LISA Lab.
Retrieved 31 August 2013.
[8] Sepp Hochreiter; Jürgen Schmidhuber, Long Short-Term Memory, Neural Computation. 9 (8): 1997, pp 1735–
1780.
[9] Nobuyuki Otsu, A threshold selection method from gray-level histograms, IEEE Transactions on Systems
Man and Cybernetics, 9 (1): 1979, pp 62–66
Dell EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” DELL EMC MAKES NO
REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS
PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.
Use, copying and distribution of any Dell EMC software described in this publication requires an applicable
software license.
Dell, EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries.