Method Development for Measuring Black Carbon (BC) using a Smartphone Camera

by

Gang Chen

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science

Department of Chemical Engineering & Applied Chemistry University of Toronto

© Copyright by Gang Chen 2018

Abstract

Black carbon (BC) is one of the major components of atmospheric particulate matter (PM), which can cause adverse health impacts and contribute significantly to climate change. Poor understanding of BC sources and concentrations is the main obstacle to reducing BC emissions, and current commercial BC sensors remain too costly to deploy widely. In this work, a fast, cost-effective, and easily accessible method based on a smartphone camera was used to quantify the colour information of PM collected on filters and thereby estimate BC and elemental carbon (EC) loading. When applied to 1266 ambient PM2.5 samples collected from six sites across Ontario, Canada, the RGB-based BC model showed strong predictive ability, with R² = 0.95 between predicted BC concentrations and those measured by an aethalometer. The RGB-based EC model was trained using 478 personal PM2.5 samples collected from pre-diabetic subjects in Beijing, with R² = 0.91 between predicted EC concentrations and those measured by an OC/EC analyzer.

Acknowledgments

First and foremost, I would like to thank my supervisor, Dr. Arthur Chan, for his contributions to this thesis over the past two years. As my mentor, he kept his office door open whenever I had trouble with my research or even my writing. His diligent, enthusiastic, and meticulous nature always inspires me to challenge myself and to be my best self. He continuously encouraged me when I was stuck in my research at the beginning. What is more, I have learned many things from him in these past two years besides research. It is not easy to express all my appreciation to Arthur in a single paragraph. Throughout two years of study in Arthur’s group, I came to realize that I could not have found a better supervisor.

During my graduate studies at the University of Toronto, I have had the opportunity to work with a large number of colleagues. This work would not have been possible without their contributions. I want to thank the members of the Southern Ontario Centre for Atmospheric Aerosol Research (SOCAAR). The “so what” and “who cares” questions that Professor Greg Evans asks all the time have become the guidance for my research. I want to thank Cheol-Heon Jeong for offering me much assistance with technical issues in fixing and operating the DustTrak, SHARP, and Aethalometer. I would also like to thank Dr. Bruce Urch for his valuable advice on my thesis and his technical support in operating the particle concentrator. In addition, I learned a great deal about presentation skills by practicing at the SOCAAR meetings, and feedback from SOCAAR members was always a guiding star for my thesis. I would also like to acknowledge our collaborators, Dr. Yushan Su at the Ministry of the Environment and Climate Change Ontario (MOECC), Dr. Mi Tian at the Chinese Academy of Sciences, and Professor Tong Zhu at Peking University. This work would not have been possible without the samples they supplied.

It is my great honour to have had the opportunity to work with such intelligent lab mates: Mengxuan Cai, Jianhuai Ye, Manpreet Takhar, Meng Meng, Shunyao Wang, Tian Mi, Lukas Kohl, Alicia Hill-Turner, Anthony Tuccitto, and Rui Zeng. I still remember the great times we had when we stayed late in the lab and went out for meals. I want to give special thanks to Jackie. As the first Ph.D. in our group, he is like a big brother to me. I will never forget his advice on my research and future career paths.

Finally, none of my achievements would have been possible without the unconditional love and the emotional and financial support, throughout both my undergraduate and graduate studies, of my parents, Shuanglu Chen and Yingqun Huang; my sibling, Haoyun Chen; my maternal grandparents, Mingchun Huang and Chunpei Li; my departed paternal grandparents, Xingnan Chen and Cailiu Zhu; and the rest of my family. This is your achievement as much as mine.

Table of Contents

Acknowledgments
Table of Contents
List of Tables
List of Equations
List of Figures
Chapter 1 Introduction
1.1. Black Carbon
1.2. Principles of Commercialized Black Carbon Instruments
1.3. Deployment of Smartphone
1.4. Image Processing
1.5. Literature Review
1.5.1. Quantifying BC/EC using optical methods
1.5.2. Research Gaps
1.6. Research Goals
Chapter 2 Methods
2.1. Sample Collection and Instrumentation
2.1.1. Ontario Samples
2.1.2. Beijing Samples
2.2. Image Capturing and Image Processing
2.2.1. Capturing Raw Images
2.2.2. Demosaicing
2.2.3. White Balancing
2.2.4. Colour Transformation
2.3. Detection Algorithms for Colour Extraction
2.3.1. Detection of 24 Patches in ColorChecker (BRISK Point Feature Matching)
2.3.2. Detection of Filter Sample
2.4. Model Buildup
Chapter 3 Results and Discussion
3.1. Effectiveness of Image Processing
3.2. RGB-based Model to Predict PM2.5 Loading
3.3. RGB-based Model to Predict BC Loading
3.3.1. Assessments of the Model
3.3.2. Diagnostics of Systematic Bias from Different Sources of PM2.5
3.4. RGB-based Model to Predict EC Loading
3.5. Integrated RGB Model for All Samples
Chapter 4 Conclusions and Recommendation
4.1. Conclusions
4.2. Recommendation
Bibliography

List of Tables

Table 1 Summary of the current black carbon (BC) instrument market (Du et al., 2011). Current BC instruments are too expensive for community use.

Table 2 Research to date using optical techniques (scanner, conventional cellphone, and colorimeter) to quantify BC/EC loading (µg/cm²).

Table 3 Detailed description of the sampling sites, training and test datasets, reference instruments, and filter types.

Table 4 List of materials used in this work

List of Equations

Equation 1
Equation 2
Equation 3
Equation 4
Equation 5
Equation 6
Equation 7
Equation 8
Equation 9
Equation 10

List of Figures

Figure 1 Flow diagram of the camera preprocessing program (Chakrabarti et al., 2009). The camera’s image preprocessing system alters the colour, white balance, contrast, and brightness of photographs. It also converts images to the sRGB colour space and irreversibly compresses them into JPEG format. These nonlinear operations make it impossible to derive device-independent colour information from compressed JPEG images, because the nonlinearities are usually unknown and cannot be inverted.

Figure 2 Flow diagram of the experimental test

Figure 3 a) Schematic diagram of the synchronized hybrid ambient real-time particulate (SHARP) monitor. It combines light scattering (nephelometer) and β-ray attenuation to measure PM2.5 concentrations precisely and accurately with a one-minute resolution. In this study, all the SHARP monitors in Ontario monitoring stations were programmed to advance one filter spot every 8 hours, following the U.S. EPA standard. The only exception is the SHARP monitor in Downtown Toronto (located on the rooftop of the Wallberg Building), which advanced the filter tape every 24 hours (Thermo Fisher Scientific, 2013). b) Picture of a PM2.5-loaded SHARP filter (16-mm glass fiber). c) Photo of the SHARP monitor.

Figure 4 a) Schematic of a Libra Model L-4 Personal Sampler placed in the participant’s breathing zone (30 cm from the nose). b) Photo of a 37-mm PM2.5-loaded quartz filter.

Figure 5 Image-processing workflow used in this study. Colour information can be used for research purposes if raw images are captured and processed manually with linear operations, so that a linear relationship with scene reflectance is maintained.

Figure 6 Set-up of the raw image capture

Figure 7 a) Reference image of the ColorChecker in the scene. b) Matched putative points between the reference image (left side) and the target image (right side). c) Output of the point feature detection program.

Figure 8 a) Scanned image of Ontario samples. b) Detected images.

Figure 9 a) One of the annotated images before training. b) Test image for the Mask R-CNN. c) Output of the Mask R-CNN. Masks are shown in colour with bounding boxes.

Figure 10 a) Four pre-processed images from two iPhone 6s cameras. The two top photos (S352 and S604) are Beijing samples captured using the same iPhone 6s but under two different light conditions (S352 is brighter than S604). The two bottom photos (HWY28 and WW49) are Ontario samples captured at two distinct locations using another iPhone 6s. b) The four photos after processing with the image processing program in MATLAB. The results show that this image processing program can effectively account for different light conditions and devices.

Figure 11 Calibrated RGB values of the ColorChecker versus published RGB values of the ColorChecker. Panels a), b), and c) show the correlation between the published RGB values and the RGB values of the 24 colour patches obtained from images processed with the image processing program described in Chapter 2. The program calibrates RGB values to the ground-truth RGB values effectively, despite different light conditions and devices.

Figure 12 a) One of the two thousand hold-out validations for the linear interactions regression model. This randomly chosen 20% testing dataset does not show good agreement with the model trained using the remaining 80% of the data. b) Distribution of the PM2.5 loading for all the Ontario samples and distribution of the 2000 RMSEs from the hold-out validations. The mean of the 2000 R² values is 0.50, and the CV(RMSE) is 77.3%. Also, the detection limit (LOD) of this model is 61.0 µg/cm², which means that nearly 91.6% of the dataset is below the LOD.

Figure 13 a) PM2.5 loading (µg/cm²) reported by the linear interactions regression model versus actual PM2.5 loading measured by the SHARP monitor for all Ontario samples (N=1266). The R² for this model is 0.50, and the RMSE is 33.5 µg/cm². b) Residuals of the selected model versus the BC to PM2.5 ratio. The plot demonstrates that this model cannot predict PM2.5 loading when BC loading is relatively small. Moreover, the residuals show a trend as BC/PM increases, which indicates that this model measures BC loading instead of PM2.5 loading.

Figure 14 a) One of the two thousand hold-out validations for the linear interactions regression model. This randomly chosen 20% testing dataset shows strong agreement with the model trained using the remaining 80% of the data. b) Distribution of the BC loading for all the Ontario samples and distribution of the 2000 RMSEs from the 2000 hold-out validations. This model shows strong predictability of BC loading over the 2000 tests, with the means of the 2000 R² values and 2000 RMSEs equal to 0.95 and 0.6 µg/cm², respectively. Also, the detection limit (LOD) of this model is 0.3 µg/cm², which means that the entire dataset is above the LOD.

Figure 15 a) BC loading reported by the linear interactions regression model versus actual BC loading measured by the aethalometer (AE33/AE31) for all Ontario samples (N=1266). b) Residuals of the selected model versus the PM2.5 to BC ratio. The residuals are randomly distributed, which indicates that the PM2.5 (other than BC) deposited on the filter does not affect the predictability of this model.

Figure 16 Boxplots of residuals for all Ontario samples: three categories, three time periods in the day, and weekdays vs. weekends. All categories agree with the model without any systematic bias, which means that this RGB-based model can measure BC loading consistently and accurately despite the variety of PM2.5 sources.

Figure 17 a) One of the two thousand hold-out validations for the linear interactions regression model. This randomly chosen 20% testing dataset shows strong agreement with the model trained using the remaining 80% of the data. b) Distribution of the BC loading for all the Ontario samples and distribution of the 2000 RMSEs from the 2000 hold-out validations. This model shows strong predictability of BC loading over the 2000 tests, with the medians of the 2000 R² values and 2000 RMSEs equal to 0.91 and 0.9 µg/cm², respectively. Also, the detection limit (LOD) of this model is 0.5 µg/cm², which means that nearly 0.07% of the dataset is below the LOD.

Figure 18 a) EC loading reported by the linear interactions regression model versus actual EC loading measured by the Sunset OC/EC analyzer for Beijing samples (N=478). b) Residuals of the selected model versus the OC to EC ratio. The residuals are randomly distributed on both sides of the y=0 line, which indicates that the OC deposited on the filter does not affect the predictability of this model.

Figure 19 Predicted BC using the BC model vs. actual EC loading for Beijing samples. Agreement is relatively poor when the BC model is used to predict EC, with an R² of 0.84 and an RMSE of 0.9 µg/cm². However, the slope of the trendline (0.99) shows that smartphone image analysis is consistent to within a 1% error. Thus, the relatively poor predictability of EC loading using the BC model is due to the differences in the measuring techniques.

Figure 20 a) BC/EC loading predicted by the linear interactions regression model trained using the whole dataset versus actual BC/EC loading measured by reference instruments. b) BC loading predicted by the Ontario BC model and EC loading predicted by the Beijing EC model versus actual BC/EC loading measured by reference instruments.

Figure 21 a) Sketch of the experimental set-up. b) Manikin wearing a facemask and “breathing” using a “breathing pump” (which can inhale and exhale at a flow rate of 8 L/min). c) Ambient fine particle (PM2.5) concentrator (which concentrates the PM2.5 in the chamber by 64 times).

Chapter 1 Introduction

1.1. Black Carbon

Black carbon (BC) is one of the essential components of particulate matter (PM). It is primarily emitted from the incomplete combustion of carbonaceous fuels during residential heating and cooking, transportation, power generation, and other industrial processes. It has drawn significant attention in recent decades from the climate change, air pollution, and health research communities (Petzold et al., 2013). Quantifying BC emissions and concentrations is required to understand the effects of BC on climate change and on the health of humans and ecosystems. However, owing to the complexity of BC sources, BC emissions remain difficult to quantify (Du et al., 2011). In addition, the high capital and operating costs of BC measurements prohibit their widespread deployment and limit the ability to capture the spatial and temporal variability of emissions. Thus, a cost-effective, easily accessible, and relatively accurate BC measurement method is required. In this study, we develop a new method to indicate BC exposure using smartphone image analysis, which is a first step toward making BC sensing available to the general public.

Petzold et al. (2013) formally defined BC and elemental carbon (EC) by reference to measurement techniques rather than formation processes. BC refers to an ideally light-absorbing carbonaceous material; this definition is based on the optical properties of the material (Petzold et al., 2013). In contrast, EC is defined by its chemical properties and refers to a chemical substance that contains only carbon in its elemental form, but potentially exists in different allotropic forms (Schwartz et al., 2012). In other words, BC and EC are distinguished by their measuring principles: BC is measured using light attenuation, laser-induced incandescence (LII), and photoacoustic methods, while EC is measured using a thermal optical technique.

1.2. Principles of Commercialized Black Carbon Instruments

In general, there are three widely used methods for measuring BC/EC concentration: thermal optical measurement, photoacoustic spectroscopy, and light attenuation. Two commercial thermal optical instruments measure both elemental carbon (EC) and organic carbon (OC): the Sunset Laboratory Thermal Optical Analysis system and the DRI Thermal/Optical Reflectance carbon analyzer (Chow et al., 1993). Instrument operation often follows one of two protocols: Interagency Monitoring of PROtected Visual Environments (IMPROVE) or National Institute for Occupational Safety and Health (NIOSH). The main differences between IMPROVE and NIOSH are the operating temperatures and the timing of the anoxic phase of the analysis.

Two of the most common light attenuation-based instruments are the Magee Scientific Aethalometer and the Particle Soot Absorption Photometer (PSAP). An instrument based on the photoacoustic method is the Photoacoustic Soot Spectrometer (PASS). In addition, the Single Particle Soot Photometer (SP2), which is based on the thermal emission of incandescent BC, can measure BC at the single-particle level. However, all these instruments are too expensive for the general public, as shown in Table 1.

Table 1 Summary of the current black carbon (BC) instrument market (Du et al., 2011). Current BC instruments are too expensive for community use.

| Sensor | Aethalometer AE31 | Micro-aethalometer AE51 | Off-line OC/EC Analyzer |
|---|---|---|---|
| Manufacturer | Magee Scientific | Aethlabs | Sunset |
| Principles | Light attenuation | Light attenuation | Thermal/Optical |
| Price (USD) | $40,000 | $6,500 | $70,000 |

1.3. Deployment of Smartphone

According to a 2016 global survey, nearly 43% of the world’s population owns a smartphone, and the smartphone market is growing rapidly, particularly in developing countries (Poushter et al., 2016). Smartphones have been deployed in various scientific fields, including diagnosis based on Chinese medicine using pictures of the human tongue (Cheng et al., 2017), diagnosis of skin diseases based on skin photos (Kuzmina et al., 2015), and quantification of the temperature distribution on a surface using smartphone images (Treibitz et al., 2015). Moreover, current smartphones are capable of storing raw images, which allows colour information to be extracted on the basis of a linear relationship with scene radiance. In other words, a smartphone camera can capture true colours for research purposes with an acceptable level of accuracy. Thus, off-the-shelf smartphones can be deployed to assess BC/EC loading by photographing collected PM filter samples, with the distinct advantages of low cost and easy accessibility. Widespread adoption of smartphones to measure BC/EC will provide abundant and meaningful data on BC/EC exposure and raise awareness of the health and climate effects of BC/EC.

1.4. Image Processing

One of the main challenges in using cameras to capture the true colour of an object is the significant variation in the image quality of photos of the same object taken under different light conditions, from different angles, at different distances, and with different cameras. Thus, there is an urgent need for an effective image processing algorithm that can take these variables into account. In general, every camera has a built-in image system for preprocessing photos (Figure 1). These proprietary systems often include operations that alter the colour, white balance, brightness, and contrast of the images. They convert the images into the standard RGB¹ (sRGB) colour space, which has a nonlinear relationship with scene radiance (the intensity of light captured by the camera sensors). The images are then compressed irreversibly into JPEG format so that a nonlinear RGB image can be shown on the screen (Chakrabarti et al., 2009). As a result of this nonlinear processing, built-in image processing programs often do not preserve the linearity between the RGB values and the scene radiance. To obtain the true colour for scientific purposes, the raw image needs to be processed by algorithms that operate linearly and in a device-independent manner (Chakrabarti et al., 2009).

¹ RGB stands for “Red Green Blue”. It refers to the intensities of three hues of light, which can be used to indicate colours.
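To illustrate why sRGB-encoded pixel values are not proportional to scene radiance, the following minimal Python sketch applies the standard sRGB transfer function to two linear intensities. It is an illustration only: a real camera pipeline adds further proprietary steps (tone curves, compression) that are generally unknown, and the function names `srgb_encode` and `srgb_decode` are chosen here for the example and do not come from this thesis.

```python
import numpy as np

def srgb_encode(linear):
    """Apply the standard sRGB transfer function to linear values in [0, 1]."""
    linear = np.asarray(linear, dtype=float)
    return np.where(
        linear <= 0.0031308,
        12.92 * linear,
        1.055 * np.power(linear, 1.0 / 2.4) - 0.055,
    )

def srgb_decode(encoded):
    """Invert the sRGB transfer function.

    This only recovers linear values when no other processing
    (tone curves, lossy compression) has been applied.
    """
    encoded = np.asarray(encoded, dtype=float)
    return np.where(
        encoded <= 0.04045,
        encoded / 12.92,
        np.power((encoded + 0.055) / 1.055, 2.4),
    )

# Doubling the scene radiance does not double the encoded pixel value.
radiance = np.array([0.25, 0.50])
encoded = srgb_encode(radiance)
ratio = encoded[1] / encoded[0]   # about 1.37 rather than 2
```

Because the encoded ratio differs from the radiance ratio, a regression built on sRGB values would mix the camera's encoding with the physical signal, which is the motivation for working from raw, linearly processed images.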

Figure 1 Flow diagram of the camera preprocessing program (Chakrabarti et al., 2009). The camera’s image preprocessing system alters the colour, white balance, contrast, and brightness of photographs. It also converts images to the sRGB colour space and irreversibly compresses them into JPEG format. These nonlinear operations make it impossible to derive device-independent colour information from compressed JPEG images, because the nonlinearities are usually unknown and cannot be inverted.

In some cases, raw photos are captured by cameras and then processed manually using three linear operations. Firstly, because most digital cameras capture images with a single sensor overlaid with a colour filter array (CFA), colour reconstruction is required to convert the CFA data into a viewable image that has a full set of colour triples at every pixel. The two missing intensities at each pixel location are estimated through an interpolation process known as demosaicing, colour reconstruction, or CFA interpolation. Secondly, white balancing is used to adjust for different light conditions. Human eyes have evolved to distinguish the colours of objects based on their spectral properties even when the objects are viewed under various illuminants. Digital cameras, however, can only record the light actually reflected by the objects and have considerable difficulty judging what is white, even with their automatic white balance process. Thus, it is important to apply a separate white balance algorithm manually to extract the true colour of objects in the photos. Lastly, two different cameras will record different colour information for the same object because of differences in the colour sensitivities of their sensors. It is therefore essential to transform the images to a device-independent colour space using calibration targets in the scene. The linearized colour information can then be extracted after this operation. The specific image processing algorithms used in this study are discussed in the following chapter.
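The three linear operations described above can be sketched in Python as follows. This is a simplified illustration on synthetic data, not the MATLAB implementation used in this study: the demosaicing step uses 2x2 block binning of an assumed RGGB layout rather than interpolation, and the camera mixing matrix, illuminant gains, and all function names are hypothetical values chosen for the example.

```python
import numpy as np

def demosaic_rggb_binned(cfa):
    """Half-resolution demosaic: collapse each 2x2 RGGB block into one RGB pixel."""
    r = cfa[0::2, 0::2]
    g = 0.5 * (cfa[0::2, 1::2] + cfa[1::2, 0::2])  # average the two green sites
    b = cfa[1::2, 1::2]
    return np.stack([r, g, b], axis=-1)

def white_balance(rgb, neutral_rgb):
    """Scale each channel so that a known neutral patch has equal R, G, and B."""
    gains = neutral_rgb.mean() / neutral_rgb
    return rgb * gains

def fit_colour_matrix(measured, reference):
    """Least-squares 3x3 matrix M such that measured @ M approximates reference."""
    m, *_ = np.linalg.lstsq(measured, reference, rcond=None)
    return m

rng = np.random.default_rng(0)

# 1) Demosaic a synthetic 4x4 RGGB mosaic of a uniform colour (0.2, 0.5, 0.8).
cfa = np.empty((4, 4))
cfa[0::2, 0::2] = 0.2
cfa[0::2, 1::2] = 0.5
cfa[1::2, 0::2] = 0.5
cfa[1::2, 1::2] = 0.8
rgb = demosaic_rggb_binned(cfa)              # shape (2, 2, 3)

# 2) and 3) Simulate 24 calibration patches seen by a hypothetical camera.
reference = rng.uniform(0.05, 0.95, size=(24, 3))
reference[0] = [0.5, 0.5, 0.5]               # a neutral grey patch
camera_mix = np.array([[0.9, 0.1, 0.0],      # assumed sensor cross-talk
                       [0.1, 0.8, 0.1],
                       [0.0, 0.2, 0.8]])
illuminant = np.array([1.2, 1.0, 0.7])       # assumed warm light source
measured = (reference @ camera_mix) * illuminant

balanced = white_balance(measured, measured[0])
colour_matrix = fit_colour_matrix(balanced, reference)
corrected = balanced @ colour_matrix
max_error = np.abs(corrected - reference).max()
```

Because every step is linear and the simulated data are noise-free, the fitted matrix recovers the reference colours almost exactly; with real photographs, sensor noise and residual nonlinearity would leave a nonzero calibration error.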

1.5. Literature Review

1.5.1. Quantifying BC/EC using optical methods

In recent years, there has been increasing interest in quantifying BC/EC loading (µg/cm²) on a filter substrate using optical techniques, such as scanners, conventional cellphone cameras, digital cameras, and colorimeters. This body of research has demonstrated that optical sensing has distinct advantages in the measurement of BC/EC loading: it is non-destructive, low cost, and fast. Cheng et al. (2011) showed that the reflectance (min{R, G, B}) of particles collected on quartz filters, measured with a scanner, could be used to estimate EC loading in ambient urban Hong Kong air to within an error of about 10%. Ramanathan et al. (2011) showed that a conventional cellphone could be used to obtain the reflectance at the red wavelength (ρ-red), from which BC loading can be predicted. Along the same lines, Olson et al. (2016) investigated the predictability of a Hue, Saturation, and Value (HSV)-based model using both a conventional cellphone and a professional colorimeter (i1Pro), obtaining promising results when the latter was used. Khuzestani et al. (2016) extended the analysis to another colour space (CIELAB), which is the most representative of human vision. Their CIELAB model demonstrates strong predictability for EC collected on both quartz and Teflon filters. All these studies concluded that the reflectance of particles collected on the filter is positively related to the BC/EC loading, with promising predictability.

1.5.2. Research Gaps

While the feasibility of using digital images to quantify BC has been demonstrated, smartphone cameras have not yet been used to quantify BC/EC loading. Also, the methods described above still cannot estimate BC/EC loading consistently for PM from various sources. In addition, the cellphone model still shows relatively low accuracy, as assessed by De la Sota et al. (2017). Furthermore, it is difficult to build a robust model with the limited number of samples used in these studies. Moreover, the correlation between the colour information of particle-loaded filters and BC loading is still poorly understood. Lastly, the professional colorimeter remains too expensive (>$2,000 USD) to be widely adopted by the general public.

1.6. Research Goals

Recognizing these gaps in knowledge, this work proposes a novel RGB model based on

smartphone image analysis. Specifically, the goals of the research are threefold: (1) to develop an image processing program that can take into account the variability between devices and light conditions, (2) to develop a model that can predict BC/EC loading precisely and accurately, and (3) to assess the accuracy, precision, and consistency of the model in different urban environments.

Since Oct 2016, 1266 ambient PM samples with an aerodynamic size smaller than 2.5 µm (PM2.5)

were obtained at the Ministry of the Environment and Climate Change (MOECC), and Southern

Ontario Centre for Atmospheric Aerosol Research (SOCAAR) sites located across southwestern

Ontario, Canada. In addition, 478 personal PM2.5 samples, collected in 2013 from pre-diabetic participants living in downtown Beijing, China, were obtained from Peking University.

Raw images of all these samples were captured under relatively consistent light conditions. Colour information was extracted from these photos after applying image processing and edge detection programs in MATLAB. By training and verifying a model in MATLAB using the colour information of these filter samples and the corresponding reference BC/EC data, a robust polynomial model based on RGB values was built to estimate BC/EC loading accurately and precisely in two different urban environments.


Table 2 Research to date using optical techniques (scanner, conventional cellphone, and colorimeter) to quantify BC/EC loading (µg/cm2).

| Devices | Reference Instruments | Color Space | # of Samples | Sources of Samples | CV(RMSE)* | Researchers |
|---|---|---|---|---|---|---|
| Scanner | Sunset OC/EC analyzer, IMPROVE | RGB | 79 | Ambient urban samples in China | N/A (~10.3 error) | (Cheng et al., 2011) |
| Conventional Cellphone | Aethalometer (AE31) | RGB | 126 | Rural LA, rural India indoor, and rural India outdoor | N/A | (Ramanathan et al., 2011) |
| Conventional Cellphone | Sunset OC/EC analyzer, NIOSH | HSV | 120 | Rural China, urban Iraq, and urban California | 22.6% | (Olson et al., 2016) |
| Professional Colorimeter (i1Pro) | Sunset OC/EC analyzer, NIOSH | HSV | 93 | Rural China and urban California | 30.8% | (Olson et al., 2016) |
| Professional Colorimeter (i1Pro) | Sunset OC/EC analyzer, IMPROVE | HSV | 315 | Urban LA, urban Riverside, and urban Denver | 16.1% | (Olson et al., 2016) |
| Professional Colorimeter (3nh) | Sunset OC/EC analyzer, NIOSH | CIELAB | 226 | Urban and sub-urban China | 17.4% | (Khuzestani et al., 2017) |

* $\mathrm{CV(RMSE)} = \mathrm{RMSE} / \mathrm{Mean}$


Chapter 2 Methods

To achieve these research goals, the following experiments were conducted (Figure 2). First, 1777 PM2.5 filter samples were obtained from Canada and China. Raw images of these samples, with the ColorChecker in the scene, were then captured using a smartphone running the Adobe Lightroom app under relatively consistent light conditions. Adobe DNG Converter was used to demosaic the raw images, and dcraw² was used to convert the DNG files to TIFF, a format readable by MATLAB. These images were then processed with the image processing program in MATLAB, which applies white balancing and colour transformation. The calibrated linear RGB values of the particle-loaded filters were extracted by the object detection programs. Finally, a model was trained and assessed using the Regression Learner app in MATLAB with the colour information and the reference BC/EC data. In the future, the MATLAB procedures and model could be translated into a smartphone app, which would make it possible for a smartphone to measure BC/EC loading off-line.

² dcraw is an open-source image processing program that can read numerous raw image formats.


Figure 2 Flow diagram of the experimental test

2.1. Sample Collection and Instrumentation

2.1.1. Ontario Samples

Since April 2017, 1266 PM2.5 samples were obtained by MOECC and SOCAAR across six sites in the Air Quality Monitoring Network (AQMN), located in northern Toronto, downtown Toronto, the Highway 401 roadside, downtown Hamilton, western Windsor, and downtown Windsor in southwestern Ontario, Canada. A detailed description of these samples is presented in Table 3.

These samples can be classified into three categories based on their locations: highway, near-road, and suburban residential areas. Sites located within 100 m of a major roadway with average traffic volumes greater than 10,000 vehicles per day are classified as “Near-Road.” The Windsor West site does not meet these criteria but is surrounded by residential houses; thus, it is classified as a “residential region” (Healy et al., 2017). The HWY 401 station is located directly to the southeast of Ontario's Highway 401, which passes through Toronto and is considered one of the busiest highways in North America (~400,000 vehicles/day) (Ministry of Transportation, 2012). Three of the sites are located in the Greater Toronto Area (GTA), which has a population of 6.3 million (Statistics Canada, 2018).

Figure 3 a) Schematic diagram of the synchronized hybrid ambient real-time particulate (SHARP) monitor. It combines light scattering (nephelometer) and β-ray attenuation to measure PM2.5 concentrations precisely and accurately with a one-minute resolution. In this study, all the SHARP monitors in Ontario monitoring stations were programmed to advance one filter spot every 8 hours, following the U.S. EPA standard. The only exception is the SHARP monitor in Downtown Toronto (located on the rooftop of the Wallberg Building), which advanced the filter tape every 24 hr (Thermo Fisher Scientific, 2013); b) Picture of a PM2.5-loaded SHARP filter (16-mm glass fiber); c) Photo of the SHARP monitor.

All the samples were collected onto glass fiber filters using the Synchronized Hybrid Ambient Real-Time Particulate (SHARP) monitor (Figure 3). Collocated aethalometers (AE33 or AE31) monitored the real-time BC concentration with a 1-min resolution, so the corresponding BC loading for these 1266 samples can be calculated using Equation 1, where [BC]Aethalometer_i is the BC concentration measured by the aethalometer at the 880 nm wavelength in minute i, Vf_i is the volumetric flow rate of the SHARP monitor in that minute, and A is the effective area of the filter (2.01 cm2 in this study). The sum runs over the 480 min of an 8-hr sample or the 1440 min of a 24-hr sample.

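$$\mathrm{BC}_{loading}\ [\mu g/cm^2] = \frac{\sum_{i=1}^{N} [\mathrm{BC}]_{Aethalometer,i}\ [\mu g/m^3] \times V_{f,i}\ [m^3/hr] \times \frac{1}{60}\ [hr/min]}{A\ [cm^2]}, \quad N = 480 \text{ or } 1440\ [min] \qquad \text{Equation 1}$$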
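The summation in Equation 1 can be sketched in a few lines of Python. This is only an illustration, not the code used in the thesis: the function name `filter_loading` and the constant concentration and flow values are hypothetical, and the default area is the 2.01 cm2 quoted above.

```python
def filter_loading(conc_ug_m3, flow_m3_hr, area_cm2=2.01):
    """Integrate minute-resolution data into a filter loading in ug/cm2."""
    if len(conc_ug_m3) != len(flow_m3_hr):
        raise ValueError("need one flow reading per concentration reading")
    # Each minute deposits conc [ug/m3] * flow [m3/hr] * (1/60) [hr/min] * 1 [min].
    mass_ug = sum(c * v / 60.0 for c, v in zip(conc_ug_m3, flow_m3_hr))
    return mass_ug / area_cm2


# Hypothetical 8-hr sample: constant 1.5 ug/m3 BC at a constant 1.0 m3/hr flow.
minutes = 480
loading = filter_loading([1.5] * minutes, [1.0] * minutes)
print(round(loading, 2))
```

For a 24-hr sample the same function would simply receive 1440 readings instead of 480.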


Table 3 Detailed description of the sampling sites, training and test datasets, reference instruments, and filter types.

| Sample sites | Site characteristics | Location | Sampling period | BC/EC Range (µg/cm2) | Training samples | Testing samples | All samples | Reference instruments | Filter types |
|---|---|---|---|---|---|---|---|---|---|
| HWY 401 | Highway | 43°42'39.6" N; 79°32'34.8" W | July 6th to July 26th, 2017 | 1.1-18 | 49 | 12 | 61 | Aethalometer, AE33 | Glass Fiber |
| Toronto North | Near-Road | 43°46'53.8" N; 79°25'03.8" W | Jan 1st to July 7th, 2017 | 0.6-33.1 | 207 | 52 | 259 | Aethalometer, AE33 | Glass Fiber |
| Windsor Downtown | Near-Road | 42°18'56.8" N; 83°02'37.2" W | April 30th to July 25th, 2017 | 0.8-7.1 | 190 | 47 | 237 | Aethalometer, AE31 | Glass Fiber |
| Hamilton Downtown | Near-Road | 43°15'28.0" N; 79°51'42.0" W | April 18th to Sept 27th, 2017 | 0.9-12.3 | 293 | 73 | 366 | Aethalometer, AE31 | Glass Fiber |
| Toronto Downtown * | Near-Road | 43°39'32.1" N; 79°23'45.7" W | Sep 17th, 2016 to Mar 21st, 2017 | 1.2-20.6 | 86 | 21 | 107 | Aethalometer, AE33 | Glass Fiber |
| Windsor West | Residential region | 42°17'34.4" N; 83°04'23.3" W | May 2nd to July 26th, 2017 | 0.9-9.4 | 189 | 47 | 236 | Aethalometer, AE33 | Glass Fiber |
| Beijing | Pre-diabetic subjects' personal samples in an urban area | 39°59'23" N; 116°18'19" E | 2013 | 0.2-19.7 | 382 | 96 | 478 | Sunset OC/EC Analyzer, NIOSH | Quartz Filter |
| Total | | | | | 1395 | 349 | 1744 | | |

* All Ontario samples are 8-hr samples, except the 107 downtown Toronto samples, which are 24-hr samples.


2.1.2. Beijing Samples

China contributes around 25% of the annual global total BC emission in recent years (Bond et al.,

2004; Cooke et al., 1999; Qin & Xie, 2012; Wang et al., 2012). Moreover, the annual mean BC

concentrations are around 11.2 µg/m3 at the urban sites, 3.6 µg/m3 at the rural sites, and 0.35 µg/m3

at the remote background sites in China (Zhang et al., 2008). Beijing, one of the most polluted cities in China, has a population of 21.707 million (Beijing Municipal Bureau of Statistics, 2018).

In this work, 478 PM2.5 samples were collected on 37-mm quartz filters using Libra Model L-4

personal samplers (manufactured by A.P. Buck, Inc.) in Beijing. A detailed description of these

samples is presented in Table 3. The personal samplers were attached to pre-diabetic subjects

(with a fasting plasma glucose level of 6.1-7.0 mmol/L in the most recent annual health

examination) living in downtown Beijing during 2013. The participant recruitment criteria and the demographics of the study subjects are described in detail by Wang et al. (2018). A schematic of the equipment and a photo of a PM2.5-loaded quartz filter are shown in Figure 4.

Before each sampling run, the personal sampler was calibrated to sample at a flow rate of 4 L/min.

A blank filter was pre-baked in a muffle furnace held at 550 °C for 5.5 hr to eliminate background

organic carbon (OC). 24 hr after prebaking, the filter was pre-weighed in a super clean lab at 25

°C and 40% relative humidity (RH). Each patient was asked to turn on the sampler 24 hr before his/her doctor’s appointment and to wear it at all times. It was also recommended that the sampler be placed as close as possible to the breathing zone (within a 30-cm radius of the nose), except when the sampler would interfere with activities such as cooking and sleeping. After sampling,

the quartz filters were weighed again to obtain the mass of loaded PM2.5 during the sampling period

(Wang et al., 2018). Then, they were analyzed off-line using a semi-continuous OC/EC analyzer

(Model 4, Sunset Laboratory Inc., USA) following the NIOSH protocol.


Figure 4 a) A schematic of a Libra Model L-4 Personal Sampler placed in the

participant’s breathing zone (30 cm from the nose); b) Photo of a 37-mm PM2.5 loaded

quartz filter.


2.2. Image Capturing and Image Processing

To obtain linear colour information, manual image processing of raw photographs is required. The workflow of image capture and image processing used in this study is shown in Figure 5. The details of the image processing pipeline are described as follows.

Figure 5 Image-processing workflow used in this study. Colour information can be

used for research purposes if raw images are captured and processed manually with

linear operations so that linear relationship with scene reflectance can be maintained.

2.2.1. Capturing Raw Images

Retaining the raw image makes it possible to extract colour information that has a linear relationship with the scene reflectance. Therefore, Adobe Lightroom on an iPhone 6s was used to capture and store the raw images. To characterize the linear relationship between scene reflectance and the values recorded in the raw images, calibration targets (Macbeth ColorChecker) are required in the scene. To simplify the image processing and object detection programs, the raw photographs of these PM2.5 samples were taken with the Adobe Lightroom app under relatively consistent light conditions and distances. The set-up of image capture is shown in Figure 6, and the hardware and software used are listed in Table 4.

Table 4 List of materials used in this work

| Hardware and Software | Manufacturer | Parameters |
|---|---|---|
| Macbeth ColorChecker | X-Rite | ColorChecker Passport |
| Lamp | AUKEY | LT-T11, 7V |
| Stand | N/A | N/A |
| iPhone 6s | Apple | N/A |
| Adobe Photoshop Lightroom | Adobe | N/A |

Figure 6 Set-up of the raw image capture

2.2.2. Demosaicing

After raw images of these filters were captured, Adobe DNG Converter was used to demosaic them. Because raw images in DNG format are not readable by MATLAB, dcraw was used to convert these DNG images into TIFF format.


2.2.3. White Balancing

The human visual system is capable of adapting to slight changes in colour that result from differences in illumination. Through the process of chromatic adaptation, the human eye is also able to effectively discern the spectral properties of objects in the scene despite various light conditions. In contrast, cameras can only capture the actual reflectance of the objects in the scene. Therefore, white balancing of photographs is necessary to capture the spectral properties of the objects in the scene (Reinhard et al., 2008). Two concepts are widely used in white balancing: RGB equalization and the chromatic adaptation transform (CAT). In this study, the RGB equalization method was chosen over CAT because high perceptual accuracy is not necessary for scientific data acquisition and RGB equalization is easier to implement (Treibitz et al., 2015). Through RGB equalization, also known as the “wrong von Kries model” (Westland & Ripamonti, 2004), the RGB values of the grey calibration targets in the scene are corrected based on the published RGB values of the six grey patches in the ColorChecker Passport. Mathematically, each pixel in each colour channel of a linear image is calibrated using the following equation (Treibitz et al., 2015):

$$p_i^{WB} = \frac{p_i - KS_i}{WS_i - KS_i}, \quad i = R, G, B \qquad \text{Equation 2}$$

where $p_i^{WB}$ is the intensity of the white-balanced pixel in the ith channel, $p_i$ is the intensity of the linear image in the ith channel (i.e., R value, G value, and B value), and $KS_i$ and $WS_i$ are the intensities of the published darkest and whitest standards in the ith channel, respectively.
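As a minimal sketch of Equation 2, the per-channel rescaling can be written as below. The function name `white_balance` and the patch intensities are hypothetical values chosen for illustration; they are not the published ColorChecker values or the MATLAB code used in this work.

```python
def white_balance(pixel, dark, white):
    """Apply Equation 2 to one linear RGB pixel (all tuples ordered R, G, B)."""
    return tuple((p - k) / (w - k) for p, k, w in zip(pixel, dark, white))


# Hypothetical linear intensities for the darkest (KS) and whitest (WS) standards.
dark_patch = (0.02, 0.03, 0.04)
white_patch = (0.82, 0.93, 0.74)

# A pixel halfway between the two standards in every channel maps to about 0.5.
balanced = white_balance((0.42, 0.48, 0.39), dark_patch, white_patch)
print(balanced)
```

By construction, the darkest standard maps to 0 and the whitest standard maps to 1 in every channel, which is what equalizes the three channels on the grey patches.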

2.2.4. Colour Transformation

As explained in the first chapter, owing to the variations among camera sensors, two different cameras may record different RGB values for the same scene. To establish the linear relationship between the images and scene radiance, colour transformation is required so that device-independent colour information can be extracted. It is performed according to the Macbeth ColorChecker chart, which consists of 24 colour patches that represent the majority of natural reflectance spectra (Westland et al., 2004). Treibitz et al. (2015) showed that the total error is minimized when 18 patches of the ColorChecker were used in colour transformation. In this study, we included all 24 patches to ensure transformation accuracy. The matrix T in Equation 3 is the standard 3×3 matrix for the linear transformation, so a total of 9 coefficients must be determined, as shown in Equation 4. The 3×24 matrices $C_{linear}^{RGB}$ and $C_{reference}^{XYZ}$ contain the RGB values obtained from the linear RGB image of the 24 patches in the ColorChecker and the published XYZ³ tri-stimulus values for the 24 patches, respectively. Furthermore, because $C_{linear}^{RGB}$ is not a square matrix, T is calculated using the pseudoinverse matrix $[C_{linear}^{RGB}]^{+}$, as shown in Equation 5 (Westland et al., 2004).

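$$C_{reference}^{XYZ} = T \times C_{linear}^{RGB} \qquad \text{Equation 3}$$

$$\begin{aligned} X &= a_{11}R + a_{12}G + a_{13}B \\ Y &= a_{21}R + a_{22}G + a_{23}B \\ Z &= a_{31}R + a_{32}G + a_{33}B \end{aligned} \qquad \text{Equation 4}$$

$$T = C_{reference}^{XYZ}\,[C_{linear}^{RGB}]^{+} \qquad \text{Equation 5}$$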
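Equation 5 can be illustrated with a short pure-Python sketch. It assumes the RGB matrix has full row rank, in which case the pseudoinverse equals $C^{T}(CC^{T})^{-1}$; the helper names, the four synthetic patches, and the use of the standard sRGB-to-XYZ matrix as a known target are all assumptions made for this example rather than data from the thesis.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_colour_transform(c_rgb, c_xyz):
    """Return the 3x3 T of Equation 5; c_rgb and c_xyz are 3xN lists of rows."""
    rgb_t = transpose(c_rgb)
    pinv = matmul(rgb_t, inverse(matmul(c_rgb, rgb_t)))  # N x 3 pseudoinverse
    return matmul(c_xyz, pinv)


# Synthetic check: build XYZ values from a known matrix, then recover it.
t_true = [[0.4124, 0.3576, 0.1805],
          [0.2126, 0.7152, 0.0722],
          [0.0193, 0.1192, 0.9505]]
patches_rgb = [[0.9, 0.1, 0.2, 0.5],   # R of four hypothetical patches
               [0.1, 0.8, 0.3, 0.5],   # G
               [0.2, 0.1, 0.7, 0.5]]   # B
patches_xyz = matmul(t_true, patches_rgb)
t_fit = fit_colour_transform(patches_rgb, patches_xyz)
```

With real measurements the 24 patches would not be reproduced exactly, and T would be the least-squares fit rather than an exact recovery.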

3 XYZ is a colour space developed by the International Commission on Illumination (CIE) to denote how much

three different types of human cone cells are stimulated at three different wavelengths to quantify colours.


2.3. Detection Algorithms for Colour Extraction

2.3.1. Detection of 24 Patches in ColorChecker (BRISK Point Feature Matching)

To extract the RGB values of these 24 patches for colour transformation, it is more efficient to detect the ColorChecker in the scene automatically. Because the ColorChecker in the scene is unique, as shown in Figure 6, a point feature matching algorithm in MATLAB was used, which is highly accurate at detecting a specific object that does not have repeating patterns. To do so, a reference picture of the ColorChecker (Figure 7 a)) is required as an input. The program extracts the feature points in the reference picture and then finds putative point matches (Figure 7 b)) in the target image containing a cluttered scene to locate the ColorChecker in the scene. The MATLAB code was adapted from the point feature matching example posted by MathWorks (2014).

Figure 7 a) Reference image of ColorChecker in the scene. b) Matched putative points

between the reference image (left side) and the target image (right side). c) The output

of the point feature detection program.


2.3.2. Detection of Filter Sample

2.3.2.1. Detection of PM2.5 Loaded Spots in Scanned Pictures

Prior to using the smartphone, the Ontario samples were scanned using a Canon imageClass MF4890dw office scanner for the preliminary tests. Scanning is more efficient than photographing every spot with a smartphone because 50 SHARP samples can be scanned in one image. Also, since each spot in the image is uniformly illuminated, as shown in Figure 8 a), there is less variation than with a smartphone camera. In addition, the scanner is capable of capturing an uncompressed raw image in TIFF format, which is a linear RGB image. Therefore, there is no need to apply an image processing algorithm to these scanned images.

A Python-based program was developed to find the filter spots in the scene. The OpenCV function “adaptiveThreshold” was used to detect particle-loaded filters. Because the ratio of area to perimeter is consistent across these spots, a reasonable default ratio is useful for excluding grey regions in the scene that are not filter spots. In addition, the threshold, the size of the detected object, and the ratio of area to perimeter can be adjusted by dragging sliders in the user interface to make sure all the desired spots are detected regardless of variability in the size of the scanned picture. Furthermore, to avoid the two straight white lines present on all the SHARP samples when extracting RGB values, three rectangles are drawn in each spot, as shown in Figure 8 b), based on their positions relative to the centroid of the spot’s contour, which is computed using the “moments” function in OpenCV. Lastly, the average of the three rectangles’ mean RGB values is reported.
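The centroid and three-rectangle averaging described above can be sketched without OpenCV as follows. This is a simplified stand-in, not the program developed in this work: the binary mask plays the role of a thresholded spot, the raw moments are accumulated by hand instead of calling OpenCV, and the image, rectangle offsets, and rectangle size are hypothetical.

```python
def centroid_from_moments(mask):
    """Centroid (cx, cy) of a binary mask from raw moments m00, m10, m01."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, on in enumerate(row):
            if on:
                m00 += 1
                m10 += x
                m01 += y
    if m00 == 0:
        raise ValueError("mask is empty")
    return m10 / m00, m01 / m00


def mean_rgb(image, x0, y0, width, height):
    """Mean (R, G, B) over a rectangle whose top-left corner is (x0, y0)."""
    pixels = [image[y][x] for y in range(y0, y0 + height) for x in range(x0, x0 + width)]
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))


def spot_rgb(image, mask, offsets, size):
    """Average the mean RGB of square patches placed at offsets from the centroid."""
    cx, cy = centroid_from_moments(mask)
    means = [mean_rgb(image, round(cx) + dx, round(cy) + dy, size, size)
             for dx, dy in offsets]
    return tuple(sum(m[c] for m in means) / len(means) for c in range(3))


# Hypothetical 11x11 spot: grey deposit crossed by two white lines at x = 3 and x = 7.
grid = 11
white, grey = (255, 255, 255), (40, 50, 60)
image = [[white if x in (3, 7) else grey for x in range(grid)] for y in range(grid)]
mask = [[1 <= x <= 9 and 1 <= y <= 9 for x in range(grid)] for y in range(grid)]

# Three 2x2 rectangles placed left of, between, and right of the white lines.
offsets = [(-4, -1), (-1, -1), (3, -1)]
print(spot_rgb(image, mask, offsets, size=2))
```

Placing the rectangles relative to the centroid keeps them clear of the white lines even when the spot shifts within the scanned image.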


Figure 8 a) Scanned image of Ontario samples. b) Detected images

2.3.2.2. Detection of PM2.5 Loaded Spots in Smartphone Pictures

Since samples from Beijing were punched and analyzed using the OC/EC analyzer before photos were captured, most of these samples do not present perfectly circular spots (Figure 9 b)). Therefore, common edge-detection algorithms cannot find the sample spot in the scene effectively because of the complex morphologies of our samples. In this study, a machine learning technique, the Mask Region-based Convolutional Neural Network (Mask R-CNN), was used to detect the sample spot and the ColorChecker in the scene because of its distinct advantages: it is fast, flexible, and simple (He et al., 2017). An open-source implementation of this method is available at https://github.com/matterport/Mask_RCNN. In this study, 45 images of samples were randomly selected as the input training dataset for Mask R-CNN. Accordingly, the sample spot and the ColorChecker in each training image were labeled and annotated, as shown in Figure 9 a). After these 45 images and their labels were input, an object (filter and ColorChecker) detection program was trained with the aid of a Graphics Processing Unit (GPU). Then, all available images were used for model testing. The program shows great consistency and efficiency (within 1 s) in detecting the sample spots and the ColorChecker, as shown in Figure 9 c).

Figure 9 a) One of the annotated images before training. b) Test image of the Mask R-CNN. c) The output of the Mask R-CNN. Masks are shown in colour with bounding boxes.

2.4. Model Buildup

RGB values extracted from processed images using the image processing program have a linear

relationship with the scene radiance, from which spectral properties of the object can be obtained.

Using these RGB values and reference BC/EC loading data, a model was trained to predict BC or

EC loadings based on extracted colours. Previous work demonstrated that the darkness of the PM2.5

loaded filter is positively related to the BC/EC loading. Cheng et al. (2011) built an exponential model between EC loading and min{R, G, B}. Furthermore, Ramanathan et al. (2011) also built an exponential model between the R value and BC loading, because their images used gamma correction, which is a nonlinear image processing operation. However, the relationship between linear RGB values and BC/EC loading is poorly understood.


In this study, multiple models were tested using Regression Learner (one of the machine learning toolboxes in MATLAB), including linear regression models, regression trees, support vector machines, Gaussian process regression models, and ensembles of trees. This toolbox is capable of training and validating the models simultaneously. Hold out validation (80% of the data was randomly chosen for training, and the remaining 20% was used for testing) was used because of the large number of data points. The predictability of the trained models was assessed using the following figures of merit: Root Mean Square Error (RMSE), R-Squared (R2), and Mean Absolute Error (MAE). The best model, the interactions regression model, was chosen because it had the smallest RMSE and MAE and an R2 close to 1.
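The three figures of merit and a single hold out split can be sketched as below. The functions are generic textbook definitions written for illustration; they are not the Regression Learner implementation, and the random seed and example values are arbitrary.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) for one random hold out split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = round(n_samples * test_fraction)
    return idx[n_test:], idx[:n_test]


# Arbitrary example loadings (ug/cm2): measured versus predicted.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(rmse(measured, predicted), mae(measured, predicted), r_squared(measured, predicted))

train_idx, test_idx = holdout_split(10)
```

Repeating `holdout_split` with different seeds and refitting the model each time would yield a distribution of RMSE values rather than a single number.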


Chapter 3 Results and Discussion

3.1. Effectiveness of Image Processing

Four sets of images were captured under different light conditions using two different iPhones to examine the image processing program, as shown in Figure 10. The four example images on the left were pre-processed by the iPhone 6s camera; the calibration targets in the four images all differ in colour as captured by the smartphone. The proprietary image processing system of the iPhone 6s camera therefore does not perform well and fails to account for different light conditions and devices. This shortfall motivates the manual image processing conducted in this study, as described in Chapter 2.

With the ColorChecker in the scene, the RGB values of the 24 colour patches were extracted from

each processed photograph to compare with the published RGB values of these 24 calibration

targets. As shown in Figure 11, the calibrated R, G, B values of these colour patches have strong

correlations (R2>0.92) with the published R, G, B values of the calibration targets. The calibrated RGB values from the four cases, which cover different light conditions and devices, also agree with each other (R2>0.95). Therefore, this image processing program takes different devices and light conditions into account effectively.


Figure 10 a) Four images pre-processed by two iPhone 6s cameras. The two top photos (S352 and S604) are Beijing samples captured using the same iPhone 6s but under two different light conditions (S352 is brighter than S604). The two bottom photos (HWY28 and WW49) are Ontario samples captured at two distinct locations using another iPhone 6s. b) The four photos processed using the image processing program in MATLAB, which indicates that this program can take different light conditions and devices into account effectively.


Figure 11 Calibrated RGB values of ColorChecker versus published RGB values of

ColorChecker. a), b), and c) represent the correlation between published RGB values

and RGB values of the 24 colour patches obtained from processed images using the

image processing program mentioned in Chapter 2. It shows that this program can

calibrate RGB values into the ground truth RGB effectively, despite different light

conditions and devices.


3.2. RGB-based Model to Predict PM2.5 Loading

The correlation between PM2.5 loading and RGB values was investigated for all the Ontario samples (N=1266) using minute-resolution PM2.5 concentrations and flow rates obtained from the SHARP monitor. The PM2.5 loading was calculated using Equation 6, where [PM2.5]SHARP_i is the PM2.5 concentration measured by the SHARP monitor in minute i, Vf_i is the volumetric flow rate of the SHARP monitor in that minute, and A is the effective area of the filter (2.01 cm2 in this study).

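$$\mathrm{PM}_{2.5,loading}\ [\mu g/cm^2] = \frac{\sum_{i=1}^{N} [\mathrm{PM}_{2.5}]_{SHARP,i}\ [\mu g/m^3] \times V_{f,i}\ [m^3/hr] \times \frac{1}{60}\ [hr/min]}{A\ [cm^2]}, \quad N = 480 \text{ or } 1440\ [min] \qquad \text{Equation 6}$$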

The best model, a linear interactions regression model (Equation 7), was trained and assessed using hold out validation in MATLAB. The interaction terms in Equation 7 (i.e., R×G, G×B, and R×B) were introduced to account for colour saturation (the colour of the filter samples tends to saturate at very high loadings). The performance of the selected model using the whole dataset of Ontario samples is shown in Figure 13 a). PM2.5 loading correlates poorly with RGB values, with an R2 of 0.50 and an RMSE of 33.5 [µg/cm2].

As mentioned in Chapter 2, hold out validation was repeated 2000 times; the distributions of both the PM2.5 loadings and the 2000 RMSEs obtained are shown in Figure 12 b). It is evident that both distributions are normally distributed. Therefore, the Coefficient of Variation of the RMSE (CV(RMSE)) was calculated using Equation 9, by dividing the mean of these 2000 RMSEs by the mean PM2.5 loading, and has a value of 77.3%. Also, 25 blank filters were tested to estimate the detection limit of this model with 99% confidence using Equation 8, where $x_{blank}$ and $\sigma_{blank}$ are the mean and standard deviation of the 25 blank-filter PM loadings, respectively. The resulting detection limit is 61.0 [µg/cm2], which means that nearly 91.6% of the Ontario samples are below the LOD for detecting PM2.5. Overall, this model performed very poorly in predicting PM2.5, indicating that the light-absorbing properties and PM2.5 mass do not correlate with each other.

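$$\mathrm{PM}_{2.5,loading}\ [\mu g/cm^2] \sim 1 + R + G + B + R \times G + G \times B + R \times B \qquad \text{Equation 7}$$

$$\text{Limit of Detection (LOD)} = x_{blank} + 3.14\,\sigma_{blank} \quad (\text{with 99\% confidence}) \qquad \text{Equation 8}$$

$$\mathrm{CV(RMSE)} = \frac{\mathrm{RMSE}}{\mathrm{mean}} \qquad \text{Equation 9}$$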
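Equations 8 and 9 amount to two short calculations, sketched below. Two points are assumptions of this example rather than statements from the text: the sample standard deviation (n - 1 denominator) is used for the blanks, and the blank loadings and RMSE values are invented for illustration.

```python
import statistics


def limit_of_detection(blank_loadings, k=3.14):
    """Equation 8: mean of the blank loadings plus k standard deviations."""
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.mean(blank_loadings) + k * statistics.stdev(blank_loadings)


def cv_rmse(rmse_values, measured_loadings):
    """Equation 9, using the mean RMSE over repeated hold out validations."""
    return statistics.mean(rmse_values) / statistics.mean(measured_loadings)


# Invented values in ug/cm2, for illustration only.
blanks = [0.1, 0.2, 0.3]
print(limit_of_detection(blanks))
print(cv_rmse([0.5, 0.7], [2.0, 4.0]))
```

Multiplying the second result by 100 expresses CV(RMSE) as a percentage, which is how it is reported in this chapter.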

Figure 12 a) One of two thousand hold out validations for the linear interactions

regression model. This randomly chosen 20% testing dataset does not show a good

agreement with the model trained using the remaining 80% data. b) Distribution of

the PM2.5 loading for all the Ontario samples and the distribution of the 2000 RMSE

from hold out validations. The mean of 2000 R2 is 0.50, and the CV(RMSE) is 77.3%.

Also, the detection limit of this model is 61.0 [µg/cm2], which means nearly 91.6% of

the dataset was smaller than LOD.


This RGB-based model shows poor predictability of PM2.5 loading because the complexity of PM2.5 cannot be fully explained by the light absorption of particles deposited on the filter. Different components of PM2.5 have different colours: for example, BC is black, whereas OC and other components may be brown or even colourless. With the help of the collocated aethalometer, the residual of this model is plotted against BC/PM in Figure 13 b). The model cannot predict PM2.5 loading when the BC loading is relatively small. Moreover, the residual becomes negative as BC/PM increases, because this model relies on the light-absorbing properties of the particles deposited on the filter and BC contributes more to light absorption than any other component of PM2.5. Therefore, this model consistently overestimates PM2.5 loading when BC/PM is high. In summary, all these results indicate that RGB values are not predictive of PM2.5 loading; subsequently, we focus on the relationship between the BC loading and RGB values.

Figure 13 a) PM2.5 loading (µg/cm2) reported by the linear interactions regression model versus actual PM2.5 loading measured by the SHARP monitor for all Ontario samples (N=1266). The R2 for this model is 0.50, and the RMSE is 33.5 [µg/cm2]. b) The residuals of the selected model versus the BC to PM2.5 ratio. The model cannot predict PM2.5 loading when the BC loading is relatively small. Moreover, the residual shows a trend as BC/PM increases, which indicates that this model measures BC loading instead of PM2.5 loading.

3.3. RGB-based Model to Predict BC Loading

3.3.1. Assessments of the Model

In this study, 1266 samples were collected across Ontario, Canada (Table 3) with corresponding aethalometer data; this large number of samples makes it possible to build a robust model for BC loading prediction. The results in Figure 13 b) suggest that the RGB-based linear interactions model is capable of measuring BC. This hypothesis was tested by training and assessing several models in MATLAB. The best model, the linear interactions regression model (Equation 10), was selected again, and the performance of the selected model using the whole dataset is shown in Figure 15 a). BC loadings correlate strongly with RGB values, with an R2 of 0.95 and an RMSE of 0.6 [µg/cm2].

Hold out validation was repeated 2000 times for model assessment, and the RMSE of each validation was computed and stored. The distributions of both the BC loading of the 1266 samples and the 2000 RMSEs obtained from the hold out validations are shown in Figure 14 b). This model can predict BC loading precisely and accurately with a CV(RMSE) of 18.1%.

The limit of detection (LOD) is comparable to LOD (0.3 [µg/cm2]) of the reference instrument

(off-line Sunset OC/EC analyzer) used in this study (Karanasiou et al., 2015).

$$\mathrm{BC/EC}_{loading}\ [\mu g/cm^2] \sim 1 + R + G + B + R \times G + G \times B + R \times B \qquad \text{Equation 10}$$
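The model form in Equation 10 can be fitted by ordinary least squares, as sketched below in pure Python. The coefficients, the synthetic RGB grid, and the use of the normal equations are assumptions made for this illustration; the thesis fitted the model with Regression Learner in MATLAB, and none of the numbers here are its results.

```python
import itertools


def interaction_features(r, g, b):
    """Design-matrix row for Equation 10: 1, R, G, B, R*G, G*B, R*B."""
    return [1.0, r, g, b, r * g, g * b, r * b]


def solve(a, y):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            aug[i] = [v - factor * p for v, p in zip(aug[i], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def fit_interactions(rgb, loading):
    """Least squares via the normal equations (X^T X) beta = X^T y."""
    rows = [interaction_features(*c) for c in rgb]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, loading)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, r, g, b):
    """Predicted loading for one calibrated (R, G, B) triple."""
    return sum(c * f for c, f in zip(beta, interaction_features(r, g, b)))


# Synthetic check with made-up coefficients: darker filters give higher loading.
true_beta = [10.0, -4.0, -3.0, -2.0, 1.5, 0.5, 1.0]
rgb = list(itertools.product([0.2, 0.5, 0.8], repeat=3))
loading = [predict(true_beta, *c) for c in rgb]
beta = fit_interactions(rgb, loading)
```

On noise-free synthetic data the fit recovers the seven coefficients; with measured loadings the residuals would feed the RMSE, MAE, and R2 calculations described in Chapter 2.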


Figure 14 a) One of two thousand hold out validations for the linear interactions regression model. This randomly chosen 20% testing dataset shows strong agreement with the model trained using the remaining 80% of the data. b) Distribution of the BC loading for all the Ontario samples and the distribution of the 2000 RMSEs from the 2000 hold out validations. This model shows strong predictability of BC loading over the 2000 tests, with the means of the 2000 R2 values and 2000 RMSEs equal to 0.95 and 0.6 [µg/cm2], respectively. Also, the detection limit of this model is 0.3 [µg/cm2], which means all samples in the dataset are above the LOD.

As shown in Figure 15 b), the residuals are randomly scattered regardless of the PM2.5 loading to BC loading ratio, which indicates that this RGB-based linear interactions model is not biased when measuring BC loading at various BC fractions.


Figure 15 a) BC loading reported by the linear interactions regression model versus

actual BC loading measured by the aethalometer (AE33/AE31) for all Ontario

samples (N=1266). b) The residuals of the selected model versus PM2.5 to BC ratio. It

exhibits a random distribution of the residuals, which indicates that the PM2.5 (except

BC) deposited on the filter does not affect the predictability of this model.

3.3.2. Diagnostics of Systematic Bias from Different Sources of PM2.5

To identify if this model can predict BC loading consistently across various sources of BC, samples

were classified based on the potential differences in sources, including the location, time period,

and weekdays vs. weekends. As shown in Table 3, 1266 samples were collected from six different

sites across Ontario, Canada, and were classified into three groups based on location: Near-Road, Highway, and Residential. For the 8-hr samples collected across Ontario, we also classified the samples by the time of day (0:00-8:00; 8:00-16:00; and 16:00-0:00), as the vehicular traffic patterns may differ among these time periods. Lastly, given the different patterns on weekdays and weekends (e.g., differences in diesel truck traffic), the samples were classified into weekday (N=1012) and weekend (N=157) samples. For each category, we investigate the model residuals (as shown in Figure 16) and identify any systematic differences between the samples in each category.
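The time-of-day and weekday/weekend grouping can be sketched as a small classification function. Classifying by the sample start time and the function name `classify_sample` are assumptions of this example; the text does not state how the grouping was implemented.

```python
from datetime import datetime


def classify_sample(start):
    """Return (time period, day type) for a sample's start time."""
    if start.hour < 8:
        period = "0:00-8:00"
    elif start.hour < 16:
        period = "8:00-16:00"
    else:
        period = "16:00-0:00"
    # datetime.weekday() returns 5 for Saturday and 6 for Sunday.
    day_type = "weekend" if start.weekday() >= 5 else "weekday"
    return period, day_type


# Hypothetical start times within the Ontario sampling period.
print(classify_sample(datetime(2017, 7, 8, 8, 0)))
print(classify_sample(datetime(2017, 7, 6, 16, 0)))
```

Residuals can then be collected per label and compared, for example with the boxplots used in Figure 16.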

Figure 16 Boxplots of residuals for all Ontario samples: three location categories, three time periods of the day, and weekdays vs. weekends. All categories agree with the model without any systematic bias, which suggests that this RGB-based model can measure BC loading consistently and accurately despite the variety of PM2.5 sources.

As shown in Figure 16, no systematic difference in the residuals is observed in any of the categories, which suggests that the proposed RGB-based model can measure BC loading consistently and accurately for samples from various potential BC sources. It is likely that BC exhibits similar spectral properties regardless of the emission source, thus enabling the proposed model to predict BC across different sources relatively accurately.
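The group-wise comparison behind these boxplots can be sketched as follows. This is a minimal Python illustration, not the MATLAB program used in this study; the function name `residual_summary` and the synthetic residuals and labels are assumptions made for the example. For each category it reports the sample count, the median residual, and the interquartile range, so a systematic bias would appear as a median clearly offset from zero.

```python
import numpy as np


def residual_summary(residuals, labels):
    """Summarize model residuals per category (count, median, IQR)."""
    summary = {}
    for label in np.unique(labels):
        group = residuals[labels == label]
        q1, median, q3 = np.percentile(group, [25, 50, 75])
        summary[str(label)] = {
            "n": int(group.size),
            "median": float(median),
            "iqr": float(q3 - q1),
        }
    return summary


# Synthetic stand-in data (assumption): unbiased residuals in ug/cm2
# assigned at random to the three location categories.
rng = np.random.default_rng(0)
categories = np.array(["Near-Road", "Highway", "Residential"])
labels = categories[rng.integers(0, 3, size=300)]
residuals = rng.normal(0.0, 0.6, size=300)

summary = residual_summary(residuals, labels)
for name, stats in summary.items():
    print(name, stats)
```

The same function could be applied to the time-of-day and weekday/weekend labels; only the `labels` array changes.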

3.4. RGB-based Model to Predict EC Loading

Because BC and EC are measured on different principles, a separate model was trained and assessed for the Beijing samples. As shown in Table 3, 478 quartz filter samples were collected from pre-diabetic participants living in downtown Beijing, China. All of these samples were analyzed using a Sunset OC/EC analyzer with the NIOSH protocol, as mentioned in Chapter 1. In this case, the linear interactions model (Equation 10) was chosen. As shown in Figure 17 a), this model performed best in hold-out validation repeated 2000 times, with a mean R2 of 0.91 and a mean RMSE of 0.9 [µg/cm2], which is 21.1% of the mean EC loading. The LOD, determined using Equation 8, is 0.5 [µg/cm2]. This is comparable to the LOD (0.15 [µg/cm2]) of the reference instrument (off-line Sunset OC/EC analyzer) used for EC quantification (Karanasiou et al., 2015).
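The repeated hold-out procedure described above can be sketched as below. This is a minimal Python illustration rather than the MATLAB program used in this study. Equation 10 is not reproduced here, so the model form (intercept, R, G, B, and the three pairwise products) is an assumption, as are the function names and the synthetic data. CV(RMSE) is taken as the RMSE divided by the mean loading, consistent with the 21.1% figure being reported as a percentage of the mean EC loading.

```python
import numpy as np


def design_matrix(rgb):
    """Intercept, R, G, B and pairwise products (assumed model form)."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.column_stack([np.ones(len(rgb)), r, g, b, r * g, r * b, g * b])


def repeated_holdout(rgb, loading, n_repeats=2000, test_fraction=0.2, seed=0):
    """Return mean R2 and mean RMSE over random train/test splits."""
    rng = np.random.default_rng(seed)
    X = design_matrix(rgb)
    n = len(loading)
    n_test = int(round(test_fraction * n))
    r2_values, rmse_values = [], []
    for _ in range(n_repeats):
        order = rng.permutation(n)
        test, train = order[:n_test], order[n_test:]
        # Ordinary least squares fit on the training split only.
        coef, *_ = np.linalg.lstsq(X[train], loading[train], rcond=None)
        resid = loading[test] - X[test] @ coef
        ss_res = np.sum(resid ** 2)
        ss_tot = np.sum((loading[test] - loading[test].mean()) ** 2)
        r2_values.append(1.0 - ss_res / ss_tot)
        rmse_values.append(np.sqrt(np.mean(resid ** 2)))
    return float(np.mean(r2_values)), float(np.mean(rmse_values))


# Synthetic stand-in data (assumption): 478 samples with normalized RGB
# values and a loading that darkens the filter, plus measurement noise.
rng = np.random.default_rng(1)
rgb = rng.uniform(0.2, 1.0, size=(478, 3))
true_loading = (12.0 - 4.0 * rgb[:, 0] - 3.0 * rgb[:, 1] - 3.0 * rgb[:, 2]
                + 1.5 * rgb[:, 0] * rgb[:, 1])
loading = true_loading + rng.normal(0.0, 0.3, size=478)

mean_r2, mean_rmse = repeated_holdout(rgb, loading, n_repeats=200)
cv_rmse = 100.0 * mean_rmse / loading.mean()  # percent of the mean loading
print(f"mean R2 = {mean_r2:.2f}, mean RMSE = {mean_rmse:.2f}, "
      f"CV(RMSE) = {cv_rmse:.1f}%")
```

Refitting on every split, rather than fitting once and resampling the test set, keeps each test sample unseen by the model that predicts it.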

The validated model performs very well for the whole dataset, with an R2 of 0.91 and an RMSE of 0.9 [µg/cm2], as shown in Figure 18 a). Furthermore, the residuals are randomly scattered regardless of the OC to EC ratio, as shown in Figure 18 b), which suggests that this model is not influenced by OC when predicting EC loading.
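One way to put a number on "randomly scattered" is to check that the residuals have no linear trend against the OC to EC ratio. The sketch below is a hypothetical Python check, not part of the analysis in this study; the function name and the synthetic residuals and ratios are assumptions. A correlation coefficient and slope both close to zero would be consistent with the visual impression from the residual plot.

```python
import numpy as np


def ratio_dependence(residuals, ratio):
    """Return the Pearson r and least-squares slope of residuals vs. ratio."""
    r = float(np.corrcoef(ratio, residuals)[0, 1])
    slope, _intercept = np.polyfit(ratio, residuals, 1)
    return r, float(slope)


# Synthetic stand-in data (assumption): residuals drawn independently of
# the OC/EC ratio, as expected for an unbiased model.
rng = np.random.default_rng(2)
oc_ec_ratio = rng.uniform(1.0, 10.0, size=478)
model_residuals = rng.normal(0.0, 0.9, size=478)

r, slope = ratio_dependence(model_residuals, oc_ec_ratio)
print(f"Pearson r = {r:.3f}, slope = {slope:.3f}")
```

A linear check of this kind would miss a curved dependence, so it complements the residual plot rather than replacing it.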

Figure 17 a) One of the 2000 hold-out validations for the linear interactions regression model. This randomly chosen 20% testing dataset shows strong agreement with the model trained using the remaining 80% of the data. b) Distribution of the EC loading for all the Beijing samples and the distribution of the 2000 RMSEs from the 2000 hold-out validations. This model shows strong predictability of EC loading over the 2000 tests, with the means of the 2000 R2 values and 2000 RMSEs equal to 0.91 and 0.9 [µg/cm2], respectively. Also, the detection limit of this model is 0.5 [µg/cm2], which means nearly 0.07% of the dataset is below the LOD.

Figure 18 a) EC loading reported by the linear interactions regression model versus actual EC loading measured by the Sunset OC/EC analyzer for the Beijing samples (N=478). b) The residuals of the selected model versus the OC to EC ratio. The residuals are randomly distributed on both sides of the y=0 line, which indicates that the OC deposited on the filter does not affect the predictability of this model.

The Beijing samples in this study have very different sources of PM2.5 because patterns of personal exposure may vary significantly among individuals. The consistency and precision of the model predictions suggest that the selected EC model can quantify EC loadings independent of potential variability in the sample sources and the amount of OC. However, the EC model does not perform as well as the BC model for the Ontario samples, likely because of potential errors arising from colour information extraction and the difference between the measurement principle of the smartphone and those of the reference instruments (aethalometer and Sunset OC/EC analyzer). Smartphone image analysis is based on the optical properties of the BC/EC loaded on the filter, a technique similar to that used by an aethalometer. In contrast, the Sunset OC/EC analyzer operates based on the chemical properties of the EC loaded on the filter.

To investigate this, the Ontario BC model was applied to the Beijing samples to predict the EC loadings (Figure 19). The slope (0.99) of the trendline indicates that smartphone image analysis is consistent for both the Ontario and Beijing samples, deviating from unity by only 1%. Thus, the relatively poor predictability of the EC model may be due solely to the difference between the measurement techniques deployed by the smartphone (light absorption) and the Sunset OC/EC analyzer (the thermal-optical method). Furthermore, because an aethalometer and smartphone image analysis operate on similar principles, the BC model is expected to agree better with BC measured using light attenuation than with EC measured by the thermal-optical technique.

Figure 19 Predicted BC using the BC model vs. actual EC loading for the Beijing samples. Using the BC model to predict EC does not give good agreement, with an R2 of 0.84 and an RMSE of 0.9 [µg/cm2]. However, the slope (0.99) of the trendline indicates that smartphone image analysis is consistent, deviating from unity by only 1%. Thus, the relatively poor predictability of EC loading using the BC model is likely due to the differences in the measuring techniques.

Lastly, as shown in Figure 19, using the BC model to predict EC does not give good agreement (R2 of 0.84 and RMSE of 0.9 [µg/cm2]), which is reasonable given the different measurement principles underlying BC and EC. Numerous studies have shown that BC measured by aethalometer and EC measured by thermal-optical OC/EC analysis do not correlate well with each other (R2=0.65-0.85) (Healy et al., 2017).

3.5. Integrated RGB Model for All Samples

An integrated RGB model was trained to investigate the possibility of using a single model to predict both the BC and EC loadings collected in this study. As with the separate BC and EC models, the linear interactions regression model was applied because it performed best in hold-out validation. As shown in Figure 20 a), the integrated RGB model shows good predictability, with an R2 of 0.92 and an RMSE of 1.0 [µg/cm2], which is comparable with the previous studies listed in Table 2. However, a comparison of Figure 20 a) and Figure 20 b) indicates that the integrated model cannot predict EC loading as robustly as the Beijing EC model does, although it still has strong predictability for BC quantification. It is therefore reasonable and worthwhile to train a separate model for measuring EC loading.

Figure 20 a) BC/EC loading predicted by the linear interactions regression model

trained using the whole data set versus actual BC/EC loadings measured by reference

instruments. b) BC loading predicted by Ontario BC model and EC loading predicted

by Beijing EC model versus actual BC/EC loadings measured by reference

instruments.

Overall, in this study, a MATLAB program was developed for image analysis and BC/EC quantification, which will be translated into a smartphone app for both the iOS and Android platforms in the future. Demonstration videos of the MATLAB program and the iOS app are available at https://drive.google.com/drive/folders/1kqDysjEmi_G5jaqiR2iqff5QgSo8EnO8?usp=sharing. Furthermore, the RGB-based BC and EC models were trained and assessed, and the results provide sufficient evidence that the BC and EC models can quantify BC and EC loading with accuracy (CV(RMSE)=18.1% and 21.1%, respectively) comparable to the previous studies listed in Table 2. These two models are also robust enough to predict BC/EC loadings consistently despite the various sources and compositions of BC, with LODs of 0.27 and 0.50 [µg/cm2], respectively. In other words, given the predictability of these two models, any filter sample exposed at a flow rate of 8 [L/min] (the breathing rate of a healthy adult while sitting) to a BC concentration of 0.84 [µg/m3] (the annual mean BC concentration in downtown Toronto) for 13.3 hr will be detectable by this method. Moreover, the integrated RGB model shows promising results, but it is reasonable to use a separate model for EC quantification.
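The detectability statement can be expressed as a short calculation: the mass collected per unit filter area must reach the LOD. The sketch below is a hypothetical Python helper, not part of the MATLAB program; the deposit area is not stated in this section, so the 10 cm2 used here is a placeholder assumption and the result is not intended to reproduce the 13.3 hr figure.

```python
def min_sampling_hours(lod_ug_cm2, deposit_area_cm2, flow_l_min, conc_ug_m3):
    """Hours of sampling needed for the filter loading to reach the LOD."""
    mass_needed_ug = lod_ug_cm2 * deposit_area_cm2
    # Convert L/min to m3/hr: 60 min per hour, 1000 L per m3.
    volume_rate_m3_hr = flow_l_min * 60.0 / 1000.0
    mass_rate_ug_hr = volume_rate_m3_hr * conc_ug_m3
    return mass_needed_ug / mass_rate_ug_hr


# Flow rate and concentration from the text; the deposit area is assumed.
hours = min_sampling_hours(
    lod_ug_cm2=0.50,
    deposit_area_cm2=10.0,  # placeholder assumption, not from the thesis
    flow_l_min=8.0,
    conc_ug_m3=0.84,
)
print(f"minimum sampling time: {hours:.1f} hr")
```

Because the required time scales linearly with both the LOD and the deposit area, a smaller deposit area or a lower LOD shortens the minimum exposure proportionally.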

Chapter 4 Conclusions and Recommendation

4.1. Conclusions

The main contribution of this study is an affordable, accessible, and relatively accurate method for BC and EC quantification, developed using smartphone images of particle-loaded filters with the aid of an image processing program. Moreover, this method is capable of predicting BC/EC loadings consistently and precisely for various sources of black carbon.

This method is based on the light absorption of particles loaded on the filter substrate, which is similar to the principle of one of the reference instruments, the aethalometer (AE31/33). This is why the method predicts BC loadings more accurately than EC loadings. However, despite the differences between our method and the thermal-optical technique of the other reference instrument, the Sunset OC/EC analyzer, the predictability of the EC model is still comparable with the previous literature.

A smartphone offers distinct advantages in the measurement of BC/EC loading: it is non-destructive, easily accessible, off-the-shelf, low cost, and fast. The use of smartphones makes it possible to bring a BC/EC sensor to the wider community, which would allow more data on BC/EC exposure to be collected and would raise awareness of the adverse effects of black carbon on both public health and climate change.

4.2. Recommendation

Based on the results of this study, several recommendations are presented. More work should be done to train models based on different colour spaces (e.g., CIELAB, CIEXYZ) to investigate whether different representations of the colour information affect the predictability of the BC/EC models.

To commercialize and popularize our method, more smartphone cameras should be tested, and the MATLAB program should be optimized and then translated into a smartphone app for both the Android and iOS platforms.

Figure 21 a) Sketch of the experimental set-up. b) Manikin wearing a face mask and “breathing” using a “breathing pump” (which can inhale and exhale at a flow rate of 8 L/min). c) Ambient fine particle (PM2.5) concentrator (which concentrates the PM2.5 in the chamber by a factor of 64).

Furthermore, feasibility tests of other, simpler sampling processes for PM2.5 are required, such as sampling of particles onto face masks. This can be done with a face mask exposure experiment in our facility, as shown in Figure 21. The “breathing pump” is plugged into the back of the manikin’s head and inhales and exhales at a flow rate of 8 L/min (close to the human breathing rate at rest) to simulate human breathing. Moreover, the BC concentration in the chamber can be monitored using an aethalometer. With the raw images of the exposed face masks and reference data of BC loading, a new “face mask model” could be built.

Bibliography

Beijing Municipal Bureau of Statistics. (2018). Economic Development of Beijing Maintained a

Stable and Good Momentum in 2017. Retrieved July 5, 2018, from

http://tjj.beijing.gov.cn/English/PR/201801/t20180125_391609.html

Bond, T. C., Streets, D. G., Yarber, K. F., Nelson, S. M., Woo, J. H., & Klimont, Z. (2004). A

technology-based global inventory of black and organic carbon emissions from combustion.

Journal of Geophysical Research: Atmospheres, 109(14), 1–43.

https://doi.org/10.1029/2003JD003697

Chakrabarti, A., Scharstein, D., & Zickler, T. (2009). An Empirical Camera Model for Internet

Color Vision. Proceedings of the British Machine Vision Conference 2009, 51.1-51.11.

https://doi.org/10.5244/C.23.51

Cheng, J. Y. W., Chan, C. K., & Lau, A. P. S. (2011). Quantification of airborne elemental

carbon by digital imaging. Aerosol Science and Technology, 45(5), 581–586.

https://doi.org/10.1080/02786826.2010.550960

Cheng, M. H., Hu, M. C., & Lan, K. C. (2017). Tongue fur detection on the smartphone.

Proceedings - 2016 IEEE International Conference on Bioinformatics and Biomedicine,

BIBM 2016, 1365–1371. https://doi.org/10.1109/BIBM.2016.7822719

Chow, J. C., Watson, J. G., Pritchett, L. C., Pierson, W. R., Frazier, C. A., & Purcell, R. G. (1993). The DRI thermal/optical reflectance carbon analysis system: Description, evaluation and applications in U.S. air quality studies. Atmospheric Environment. Part A. General Topics, 27(8), 1185–1201. https://doi.org/10.1016/0960-1686(93)90245-T

Cooke, W. F., Liousse, C., Cachier, H., & Feichter, J. (1999). Construction of a 1° × 1° fossil fuel emission data set for carbonaceous aerosol and implementation and radiative impact in the ECHAM4 model. Journal of Geophysical Research: Atmospheres, 104(D18), 22137–22162.

de la Sota, C., Kane, M., Mazorra, J., Lumbreras, J., Youm, I., & Viana, M. (2017).

Intercomparison of methods to estimate black carbon emissions from cookstoves. Science of

the Total Environment, 595, 886–893. https://doi.org/10.1016/j.scitotenv.2017.03.247

Du, K., Wang, Y., Chen, B., Wang, K., Chen, J., & Zhang, F. (2011). Digital photographic

method to quantify black carbon in ambient aerosols. Atmospheric Environment, 45(39),

7113–7120. https://doi.org/10.1016/j.atmosenv.2011.09.035

He, K., Gkioxari, G., Dollar, P., & Girshick, R. (2017). Mask R-CNN. Proceedings of the IEEE

International Conference on Computer Vision, 2017-October, 2980–2988.

https://doi.org/10.1109/ICCV.2017.322

Healy, R. M., Sofowote, U., Su, Y., Debosz, J., Noble, M., Jeong, C. H., … Munoz, A. (2017).

Ambient measurements and source apportionment of fossil fuel and biomass burning black

carbon in Ontario. Atmospheric Environment, 161, 34–47.

https://doi.org/10.1016/j.atmosenv.2017.04.034

Karanasiou, A., Minguillón, M. C., Viana, M., Alastuey, A., Putaud, J.-P., Maenhaut, W., … Kuhlbusch, T. A. J. (2015). Thermal-optical analysis for the measurement of elemental carbon (EC) and organic carbon (OC) in ambient air: a literature review. Atmospheric Measurement Techniques Discussions, 8(9), 9649–9712. https://doi.org/10.5194/amtd-8-9649-2015

Khuzestani, R. B., Schauer, J. J., Wei, Y., Zhang, Y., & Zhang, Y. (2017). A non-destructive

optical color space sensing system to quantify elemental and organic carbon in atmospheric

particulate matter on Teflon and quartz filters. Atmospheric Environment, 149, 84–94.

https://doi.org/10.1016/j.atmosenv.2016.11.002

Kuzmina, I., Lacis, M., Spigulis, J., Berzina, A., & Valeine, L. (2015). Study of smartphone

suitability for mapping of skin chromophores. Journal of Biomedical Optics, 20(9), 090503.

https://doi.org/10.1117/1.JBO.20.9.090503

MathWorks. (2014). Object Detection in a Cluttered Scene Using Point Feature Matching -

MATLAB & Simulink. Retrieved May 18, 2018, from

https://www.mathworks.com/help/vision/examples/object-detection-in-a-cluttered-scene-

using-point-feature-matching.html#d119e845

Ministry of Transportation. (2012). Provincial Highways Traffic Volumes 1998-2012 (Vol. i).

Olson, M. R., Graham, E., Hamad, S., Uchupalanun, P., Ramanathan, N., & Schauer, J. J.

(2016). Quantification of elemental and organic carbon in atmospheric particulate matter

using color space sensing-hue, saturation, and value (HSV) coordinates. Science of the Total

Environment, 548–549, 252–259. https://doi.org/10.1016/j.scitotenv.2016.01.032

Petzold, A., Ogren, J. A., Fiebig, M., Laj, P., Li, S. M., Baltensperger, U., … Zhang, X. Y.

(2013). Recommendations for reporting black carbon measurements. Atmospheric

Chemistry and Physics, 13(16), 8365–8379. https://doi.org/10.5194/acp-13-8365-2013

Poushter, J., & Stewart, R. (2016). Smartphone Ownership and Internet Usage Continues

to Climb in Emerging Economies.

Qin, Y., & Xie, S. D. (2012). Spatial and temporal variation of anthropogenic black carbon

emissions in China for the period 1980-2009. Atmospheric Chemistry and Physics, 12(11),

4825–4841. https://doi.org/10.5194/acp-12-4825-2012

Ramanathan, N., Lukac, M., Ahmed, T., Kar, A., Praveen, P. S., Honles, T., … Ramanathan, V.

(2011). A cellphone based system for large-scale monitoring of black carbon. Atmospheric

Environment, 45(26), 4481–4487. https://doi.org/10.1016/j.atmosenv.2011.05.030

Reinhard, E., Khan, E., Akyuz, A., & Johnson, G. (2008). Color Imaging: Fundamentals and

Applications (2nd ed.). CRC Press.

Schwartz E. R., S. E. and L. (2012). Interactive comment on “Are black carbon and soot the

same?” by P. R. Buseck et al.: Disagreement on proposed nomen
