reinforcement learning-based traffic control to … · 2019-01-16 · traffic control use cases...


1. HIGH-PERFORMANCE COMPUTING AND BIG DATA

1.1 REINFORCEMENT LEARNING-BASED TRAFFIC CONTROL TO OPTIMIZE ENERGY USAGE AND THROUGHPUT (ORNL)

Thomas P Karnowski, Principal Investigator
Oak Ridge National Laboratory
One Bethel Valley Road
Oak Ridge, TN 37831
[email protected]

David Anderson, DOE EEMS Program Manager
U.S. Department of Energy
E-mail: [DOE [Program/Technology Development] Manager Email]

Start Date: February 22, 2018
End Date: April 2019
Project Funding (FY18): $310K
DOE share: $250K
Non-DOE share: $60K

Project Introduction

US roadways are critical to meeting the mobility and economic needs of the nation. The United States uses 28% of its energy to move goods and people, with approximately 60% of that used by cars, light trucks, and motorcycles. Thus, improved transportation efficiency is vital to America’s economic progress. The increasing congestion and energy requirements of metropolitan transportation systems call for research into improved and optimized control methods. Coordinating and optimizing traffic in urban areas can involve hundreds of thousands of vehicles and traffic management systems, which can require high-performance computing (HPC) resources to model and manage. In this work, we seek to use machine learning, computer vision, and HPC to improve the energy efficiency of traffic control by leveraging GRIDSMART traffic cameras as sensors for adaptive traffic control, with a sensitivity to the fuel consumption characteristics of the traffic in the camera’s visual field. Traffic control use cases based on reinforcement learning have been published and have achieved good results. Surveys from DOE national laboratories estimate that idling wastes six billion gallons of fuel annually [1]. GRIDSMART cameras (an existing, fielded commercial product) sense the presence of vehicles at intersections and replace more conventional sensors (such as inductive loops) to issue calls to traffic control. These cameras, which have a horizon-to-horizon view, offer the potential for an improved view of the traffic environment, which can be used to generate better control algorithms.

Objectives

This project has two primary objectives. The first is to develop algorithms that essentially teach GRIDSMART cameras to estimate the fuel consumption of vehicles in their visual field. The second is to use this capability to improve energy efficiency by changing the timing and phasing of traffic lights while minimizing penalties to throughput and mobility. HPC can play a role in both objectives by allowing more complete exploration of the machine learning architectures, parameters, and methods that enable the capability to determine vehicle types. HPC-based simulations that model traffic and capture the performance of GRIDSMART cameras in estimating the visual field (extrapolated from real data using developed algorithms and models) serve as training and testing data for reinforcement learning algorithms that learn policies for traffic camera control. The key outcome of this work will be control strategies generated through a novel reinforcement learning framework, with performance measured through simulations and validation data and oriented toward the GRIDSMART sensing capability. Other important outcomes include projections of the sensing capabilities required to achieve these control strategies. This will pave the way for future research to expand the number of studied intersections, investigate the potential of wide-range coordinated control, add naturalistic driving study data for higher resolution and simulation detail, extend sensing capabilities to other technologies such as RFID/cellular and/or connected vehicle technology, and incorporate direct vehicle emissions sensing to minimize cumulative measured emissions.

Approach

The GRIDSMART cameras will be trained to estimate fuel consumption by using a ground-based camera system located under a GRIDSMART-instrumented intersection. ORNL has three GRIDSMART cameras on site, and these will be used to collect data. The simultaneous capture of the ground-based camera image with the GRIDSMART camera image will allow a view from the “GRIDSMART perspective” along with a view from the ground camera. The latter will then be classified into a vehicle class (i.e., make and model), ideally using a commercial application procured for this purpose. ORNL will leverage an existing, ongoing project that is collecting data on the reservation. The data used here will allow the creation of a training set of images, taken from the unique GRIDSMART view, that will be used to create a machine-learning model to classify vehicle make and model and therefore estimate fuel consumption. There are contingencies built into this process. First, there could be better methods to estimate fuel consumption than simply estimating the make and model, and these approaches will be explored. Second, if sufficient data is not collected, an estimate will be made using ground data from existing data sets [2]. Finally, we will also attempt to leverage coarse statistics such as vehicle size to determine whether there can be a reasonable substitute for true vehicle classification. The approach is shown in Figure 1.

Figure 1. Data collection to estimate fuel consumption using GRIDSMART cameras. GRIDSMART installations at ORNL will be used in conjunction with a ground imaging system under development for another project. The simultaneous captures will be used to produce a library of images suitable for training a machine learning algorithm to estimate fuel consumption.

The GRIDSMART cameras will be trained to control timing and phasing for improved fuel efficiency by reinforcement learning and simulations (on HPC). The HPC simulations will create derived training, validation, and testing data to explore control strategies based on reinforcement learning (RL) with automated vehicle identification algorithms at varying resolution using the Participant video data. The control strategy development will start with single intersections and expand to multiple intersections, with studies on scalability and impact. The simulations will be performed on HPC resources, but with the goal of producing control strategies that can be deployed in environments with a small computational footprint, such as a distributed network of GRIDSMART cameras.

RL finds solutions to problems where an actor or set of actors learns to respond to dynamic environmental conditions to achieve an overall optimized solution, such as winning a game or controlling a process. In this collaboration, the actions are the activation of one or more traffic signals in response to sensed vehicle types (and corresponding fuel economy metrics), vehicle dynamics, and throughput objectives. The optimization goal is a combination of throughput and energy efficiency. The huge input space (combinations of vehicle types, vehicular dynamics, and multiple signal lights) represents a high-dimensional problem that will require HPC for simulations and deep RL for solutions. Given the scope of the proposed work, our initial planned approach is to develop a custom simulation environment for the vehicle simulation as a simple proof of concept.
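To make the RL formulation concrete, the following is a minimal sketch of tabular Q-learning on a toy two-approach intersection. It is not the project's simulator or algorithm: the idle fuel rates in `IDLE_FUEL`, the reward weights, the arrival probabilities, and all function names are hypothetical, chosen only to illustrate a reward that combines throughput with a fuel penalty driven by sensed vehicle types.

```python
import random
from collections import defaultdict

# Hypothetical idle fuel rates (arbitrary units per step) by sensed vehicle class.
IDLE_FUEL = {"car": 1.0, "truck": 4.0}
ACTIONS = (0, 1)  # 0 = keep the current phase, 1 = switch phase


def reward(served, queues, w_throughput=1.0, w_energy=0.5):
    """Throughput term minus a fuel penalty for every vehicle left idling."""
    idle = sum(IDLE_FUEL[v] for q in queues for v in q)
    return w_throughput * served - w_energy * idle


def observe(queues, phase):
    """Coarse state: capped idle-fuel load per approach plus the active phase."""
    loads = tuple(min(int(sum(IDLE_FUEL[v] for v in q)), 8) for q in queues)
    return loads + (phase,)


def step(queues, phase, action, rng, arrival_p=0.3, truck_p=0.2):
    """Advance the toy intersection one time step; returns (phase, reward)."""
    if action == 1:
        phase = 1 - phase
        served = 0  # switching costs one step of green time
    else:
        served = 1 if queues[phase] else 0
        if served:
            queues[phase].pop(0)
    for q in queues:  # random arrivals on both approaches
        if rng.random() < arrival_p:
            q.append("truck" if rng.random() < truck_p else "car")
    return phase, reward(served, queues)


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Standard tabular Q-learning update."""
    best_next = max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def train(episodes=200, horizon=100, epsilon=0.1, seed=0):
    """Epsilon-greedy training loop over short simulated episodes."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        queues, phase = [[], []], 0
        s = observe(queues, phase)
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: Q[(s, b)])
            phase, r = step(queues, phase, a, rng)
            s_next = observe(queues, phase)
            q_update(Q, s, a, r, s_next)
            s = s_next
    return Q


Q = train()
print(len(Q), "state-action values learned")
```

A tabular method only works here because the toy state is tiny. The input space described above (vehicle types, dynamics, and multiple signals) is the reason the project anticipates deep RL instead.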

Results

We created algorithms and a process to capture simultaneous GRIDSMART images and ground imager system images, align them, and label them. GRIDSMART data at the ORNL locations must be captured using a USB hard drive plugged into the controller. GRIDSMART personnel helped ORNL confirm the method for this capture and also helped ensure the controllers were time synchronized, which was critical for using timing data to correlate the ground imager captures with the GRIDSMART data. Computer vision algorithms were developed to segment the vehicles from the background using a process similar to GRIDSMART’s implementation. Figure 2 shows a simultaneous capture with the GRIDSMART imager and the ground truth imager. The commercial application labeled this ground capture as a “Ford Transit Connect,” and this label is inserted into the image for illustration in the upper left corner. Multiple images such as these have been collected, and collection will continue into 2019. As of the end of September 2018, images of approximately 12,600 vehicles had been collected. We note that a percentage of these have ground truth labels spanning 474 classifications. Although this is substantial, more data is needed to create deep learning models for effective classification, so collection is ongoing to expand the set.
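The timestamp correlation described above can be sketched as a nearest-neighbor match between two time-synchronized capture logs. This is an illustrative sketch only: the function name `match_captures`, the 0.5 s tolerance, and the sample timestamps are assumptions, not details of the project's pipeline.

```python
from bisect import bisect_left


def match_captures(ground_times, gridsmart_times, tolerance_s=0.5):
    """Pair each ground-imager timestamp with the nearest GRIDSMART timestamp.

    Timestamps are in seconds, and gridsmart_times must be sorted.
    Returns (ground_index, gridsmart_index) pairs whose time difference is
    within tolerance_s; ground captures with no close match are skipped.
    """
    pairs = []
    for gi, t in enumerate(ground_times):
        pos = bisect_left(gridsmart_times, t)
        # Only the neighbors just before and just after t can be nearest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(gridsmart_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda i: abs(gridsmart_times[i] - t))
        if abs(gridsmart_times[best] - t) <= tolerance_s:
            pairs.append((gi, best))
    return pairs


# Made-up timestamps for illustration.
ground = [10.02, 25.40, 61.00]
gridsmart = [9.90, 10.10, 25.35, 40.00]
print(match_captures(ground, gridsmart))  # [(0, 1), (1, 2)]
```

The third ground capture has no GRIDSMART frame within the tolerance, so it is dropped rather than paired with a distant frame. This is why clock synchronization between the controllers matters.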

Given that more data is needed to effectively create a model for vehicle classification from the GRIDSMART view, we used two contingencies to estimate vehicle classification performance for our simulations. First, we used a table of vehicle types and the length and width estimates of the vehicles to try to estimate vehicle fuel efficiency using a linear regression model. Second, we used an open-source database of vehicle images to estimate classification performance and its impact on fuel efficiency estimation. (Note that our initial analysis did not include large commercial trucks, which will have a definite impact on system performance when we are able to include them.) With perfect measurements, the regression model estimates fuel economy with an RMS error of 2.85 MPG, but conversations with GRIDSMART indicated that there can be substantial error in such a measurement from any computer vision platform. We found that the regression model degrades rapidly with small measurement error; even a 500 mm mean error creates an RMS error of approximately 5.8 MPG. Therefore, we believe the utility of a measurement-based system will largely be found in discriminating commercial trucks (particularly “18 wheelers”) from passenger vehicles.
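The sensitivity of a size-based regression to measurement error can be illustrated with a small sketch. Everything in it is an assumption: the data are synthetic (not the project's vehicle table), the coefficients are invented, and the measurement error is modeled as zero-mean Gaussian noise with a 0.5 m standard deviation. It does not reproduce the 2.85 MPG or 5.8 MPG figures; it only shows the mechanism by which dimension errors inflate the RMS error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a vehicle table (NOT project data).
n = 500
length_m = rng.uniform(3.5, 5.5, n)
width_m = rng.uniform(1.6, 2.1, n)
# Invented relationship: larger vehicles get fewer MPG, plus unexplained scatter.
mpg = 60.0 - 6.0 * length_m - 5.0 * width_m + rng.normal(0.0, 3.0, n)


def design(length, width):
    """Design matrix with an intercept column."""
    return np.column_stack([np.ones_like(length), length, width])


# Ordinary least-squares fit on the error-free dimensions.
coef, *_ = np.linalg.lstsq(design(length_m, width_m), mpg, rcond=None)


def rms_error(length, width):
    """RMS difference between predicted and true MPG."""
    pred = design(length, width) @ coef
    return float(np.sqrt(np.mean((pred - mpg) ** 2)))


rms_clean = rms_error(length_m, width_m)

# Perturb the measured dimensions to mimic camera measurement error.
sigma_m = 0.5  # assumed error scale (500 mm)
rms_noisy = rms_error(
    length_m + rng.normal(0.0, sigma_m, n),
    width_m + rng.normal(0.0, sigma_m, n),
)

print(f"RMS error, perfect measurements: {rms_clean:.2f} MPG")
print(f"RMS error, noisy measurements:   {rms_noisy:.2f} MPG")
```

Because the fitted slopes multiply any error in length or width, a half-meter error is comparable to the full spread of passenger-vehicle sizes and dominates the prediction error.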

Our second effort used the data set from [2]. Although this was taken from the “ground view,” we believe it serves as a good estimate for what might be possible with a full data set from the GRIDSMART vantage. We retrained a convolutional neural network based on the AlexNet topology [3] to act as a vehicle make/model classifier. This was inspired by the example of [2], which served as a good baseline for the exercise. We split the data into 70% training, 15% validation, and 15% testing and evaluated performance on the held-out test set. We also degraded the image resolution to simulate the actual degradation of image quality from the GRIDSMART imager at ORNL at ranges of 0 m, 20 m, 40 m, and 60 m. Finally, we used the classifier to estimate fuel efficiency visually: if the make and model of the vehicle were identified correctly, the error was taken to be 0 MPG; otherwise, the MPG of the erroneously predicted class was used as the estimate. We then measured the RMS error between this estimate and the actual value. The results are summarized in Table 1. An example of a tracked vehicle from 60 m to the stop light is shown in Figure 3.
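The evaluation procedure above can be sketched in a few lines. The helper names (`split_indices`, `mpg_rms_error`), the class names, and the MPG values are hypothetical; only the 70/15/15 split and the rule "0 MPG error when the class is correct, otherwise use the MPG of the predicted class" come from the text.

```python
import math
import random


def split_indices(n, seed=0):
    """Shuffle indices 0..n-1 and split them 70% / 15% / 15%."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n * 70 // 100
    n_val = n * 15 // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def mpg_rms_error(true_labels, predicted_labels, mpg_by_class):
    """RMS error of the MPG implied by the predicted class.

    A correct prediction contributes zero error; a wrong one contributes the
    difference between the predicted class's MPG and the true class's MPG.
    """
    squared = [
        (mpg_by_class[p] - mpg_by_class[t]) ** 2
        for t, p in zip(true_labels, predicted_labels)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical classes and MPG values, for illustration only.
mpg_by_class = {"sedan_a": 30.0, "suv_b": 22.0, "truck_c": 18.0}
truth = ["sedan_a", "sedan_a", "suv_b", "truck_c"]
preds = ["sedan_a", "suv_b", "suv_b", "sedan_a"]

accuracy = sum(t == p for t, p in zip(truth, preds)) / len(truth)
print(f"accuracy = {accuracy:.2f}")
print(f"RMS MPG error = {mpg_rms_error(truth, preds, mpg_by_class):.2f}")
```

One consequence of this metric is that a misclassification between two vehicles with similar fuel economy costs little, so classifier accuracy and RMS MPG error need not degrade at the same rate.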

Table 1. Estimates of Fuel Economy Using CNN Baseline Model

Range to Vehicle (m)    Classifier Accuracy (%)    RMS MPG Error
0                       33                         3.5
20                      16                         5.1
40                      3                          6.7
60                      1                          10.0

Figure 2. GRIDSMART capture (overview) with simultaneous ground imager collection (bottom left). The commercial application labeled the ground imager data as a “Ford Transit Connect,” which has an estimated fuel efficiency of 28 MPG. The bottom right depicts an image of the same vehicle, found independently, confirming this labeling. Using this and many more captures we can begin to teach the GRIDSMART cameras how to estimate vehicle fuel consumption.

Given the error based on the make and model, we are interested in determining whether there are better methods for vehicle classification that are more robust in terms of the RMS error of the MPG estimate. These include estimating the length and width with convolutional neural networks (CNNs) and developing a topology that focuses on improving the overall MPG estimates. At the end of September 2018, we used the MENNDL processing engine [4] on ORNL’s Titan supercomputer to attempt to evolve a better topology for fuel efficiency estimates. Those initial results were indeterminate, but we plan to continue this effort in FY 2019, which would give HPC a role in the project's first objective.

Figure 3. Example of an actual tracked vehicle from the ORNL GRIDSMART camera. The image degradation due to resolution and the fish-eye lens is profound at longer distances and degrades classifier performance, as shown in Table 1.

Conclusions

In this period of performance, we have identified data sources at ORNL and successfully collected camera images with the assistance of GRIDSMART. The data has been correlated with ground-level collections, and we have used a commercial application to classify the vehicles, allowing us to begin building a data set for our first objective. We have developed and deployed computer vision algorithms to segment vehicles from the background, which allows us to capture a view of the identified vehicle type at multiple ranges from the camera. Vehicle collections are ongoing and are expected to continue through the duration of the work with weekly data pulls. Given the limited scope of the project, we took the approach of having contingencies for our estimation method. Estimates based on vehicle size for passenger vehicles can produce a good estimate of fuel consumption characteristics, with RMS error under 3 MPG, but these estimates are susceptible to measurement error. We believe relying on vehicle size estimates will still be beneficial when we consider commercial vehicles, which are typically larger and much less fuel efficient. We have also elected to project an estimate of classification performance using an open-source data set, with an estimated RMS error in fuel economy of 3.5 MPG for noncommercial passenger vehicles at close range, with degradation as the range increases. We used the MENNDL HPC algorithms to attempt to improve the CNNs that estimate fuel consumption, with limited success in this performance period but with more work to be attempted in FY 2019. Finally, our simulation efforts will be our primary focus in FY 2019, with three potential approaches: a cell transmission model, an open-source traffic simulator, and a simplified custom simulator.
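For readers unfamiliar with the first of these approaches, the following is a minimal sketch of a textbook cell transmission model update for a single road with one signalized cell boundary. It is a generic illustration, not the project's simulator: the function name `ctm_step` and the parameter values (capacity `Q`, maximum occupancy `N`, and wave-speed ratio `delta`) are assumptions.

```python
def ctm_step(n, inflow, green, signal_cell, Q=2.0, N=10.0, delta=1.0):
    """One cell transmission model update for a single road.

    n           : vehicle counts per cell, upstream to downstream
    inflow      : vehicles wanting to enter the first cell this step
    green       : True if the signal at the exit of signal_cell is green
    Q, N, delta : max flow per step, max vehicles per cell, and the ratio of
                  backward wave speed to free-flow speed (all assumed values)
    """
    cells = len(n)
    y = [0.0] * (cells + 1)  # y[i] = flow into cell i; y[cells] = exit flow
    y[0] = min(inflow, Q, delta * (N - n[0]))
    for i in range(1, cells):
        # Flow is limited by upstream supply, capacity, and downstream space.
        y[i] = min(n[i - 1], Q, delta * (N - n[i]))
    y[cells] = min(n[-1], Q)
    if not green:
        y[signal_cell + 1] = 0.0  # a red signal blocks flow out of signal_cell
    return [n[i] + y[i] - y[i + 1] for i in range(cells)]


# Ten steps of red with one vehicle arriving per step: a queue builds upstream.
state = [0.0, 0.0, 0.0, 0.0]
for _ in range(10):
    state = ctm_step(state, inflow=1.0, green=False, signal_cell=1)
print(state)

# One step of green with no new arrivals: the queue begins to discharge.
state = ctm_step(state, inflow=0.0, green=True, signal_cell=1)
print(state)
```

The model tracks only vehicle counts per cell, which makes it inexpensive to evaluate many times inside an RL training loop, at the cost of the per-vehicle detail a microscopic simulator would provide.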

Key Publications

None to date.

References

[1] https://www.anl.gov/energy-systems/project/reducing-vehicle-idling.

[2] Gebru, Timnit, Jonathan Krause, Yilun Wang, Duyun Chen, Jia Deng, Erez Lieberman Aiden, and Li Fei-Fei. 2017. “Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US.” arXiv preprint arXiv:1702.06683.

[3] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. 2012. “ImageNet Classification with Deep Convolutional Neural Networks.” Advances in Neural Information Processing Systems. pp. 1097–1105.

[4] Young, Steven R., Derek C. Rose, Thomas P. Karnowski, Seung-Hwan Lim, and Robert M. Patton. 2015. “Optimizing deep learning hyper-parameters through an evolutionary algorithm.” In Proceedings of the Workshop on Machine Learning in High-Performance Computing Environments. p. 4. ACM.

Acknowledgements

We acknowledge the contributions of the ORNL team, including Matt Eicholtz, Russ Henderson, Johnathan Sewell, Eva Freer, Regina Ferrell, Travis Johnston, Sean Oesch, Thomas Naughton, Wael Elwasif, Steven Young, and Derek Rose.