Method Development for Measuring Black Carbon (BC) using a Smartphone Camera by Gang Chen A thesis submitted in conformity with the requirements for the degree of Master of Applied Science Department of Chemical Engineering & Applied Chemistry University of Toronto © Copyright by Gang Chen 2018


  • Method Development for Measuring Black Carbon (BC) using a Smartphone Camera

    by

    Gang Chen

    A thesis submitted in conformity with the requirements for the degree of Master of Applied Science

    Department of Chemical Engineering & Applied Chemistry University of Toronto

    © Copyright by Gang Chen 2018

  • ii

    Method Development for Measuring Black Carbon (BC) Using a

    Smartphone Camera

    Gang Chen

    Master of Applied Science

    Department of Chemical Engineering & Applied Chemistry

    University of Toronto

    2018

    Abstract

    Black carbon (BC) is one of the major components of atmospheric particulate matter (PM),

    which can cause adverse health impacts and contribute significantly to climate change. Poor

    understanding of BC sources and concentrations is the main obstacle to reducing BC emissions.

    Current commercial BC sensors remain too costly to deploy widely. In this work, a smartphone

    camera was deployed as a fast, cost-effective, and easily accessible tool to quantify the colour

    information of PM collected on filters and thereby estimate BC and elemental carbon (EC)

    loading. When applied to 1266 ambient PM2.5 samples collected from six sites across Ontario,

    Canada, the RGB-based BC model showed strong predictive power, with R² = 0.95 between predicted

    BC concentrations and those measured by an aethalometer. The RGB-based EC model was trained

    using 478 personal PM2.5 samples collected from pre-diabetic subjects in Beijing, with R² = 0.91

    between predicted EC concentrations and those measured by an OC/EC analyzer.

  • iii

    Acknowledgments

    First and foremost, I would like to thank my supervisor, Dr. Arthur Chan, for his contributions to

    this thesis in the past two years. As a mentor of mine, his office was always open whenever I had

    trouble with my research or even writing. His diligent, enthusiastic, and meticulous nature

    always inspires me to challenge myself and to be my best self. He continuously encouraged me when

    I was stuck in my research at the beginning. What’s more, I have learned so many things from him

    in these past two years besides research. It is not easy to express all my appreciation to Arthur in

    a single paragraph. Throughout my two years of study in Arthur’s group, I realized that I might

    not be able to find a better supervisor than him.

    During the entirety of my graduate studies at the University of Toronto, I have had the opportunity

    to work with a large number of colleagues. This work could not have been possible without the

    contribution from them. I want to thank the Southern Ontario Centre for Atmospheric Aerosol

    Research (SOCAAR) members. The “so what” and “who cares” questions that Professor Greg

    Evans asks all the time have become the guidance for my research. I want to thank

    Cheol-Heon Jeong for offering me much assistance regarding technical issues of fixing and operating the

    DustTrak, SHARP, and Aethalometer. Also, I would like to thank Dr. Bruce Urch for the valuable

    advice on my thesis and the technical support in operating the particle concentrator. In addition, I have

    learned so much about presentation skills by practicing at the SOCAAR meetings, and

    feedback from SOCAAR members was always the guiding star for my thesis. I would also like

    to acknowledge our collaborators, Dr. Yushan Su at the Ministry of the Environment and Climate

    Change Ontario (MOECC), Dr. Mi Tian at the Chinese Academy of Sciences, and Professor Tong Zhu

    at Peking University. This work would not have been possible without their support on sample

    supplies.

  • iv

    It is my great honor to have such an excellent opportunity to work with intelligent people like my

    lab mates, Mengxuan Cai, Jianhuai Ye, Manpreet Takhar, Meng Meng, Shunyao Wang, Tian Mi,

    Lukas Kohl, Alicia Hill-Turner, Anthony Tuccitto and Rui Zeng. I can still remember the great

    time we had when we stayed late in the lab and hung out for meals. I want to give special thanks

    to Jackie. As the first Ph.D. in our group, he is like a big brother to me. I will never forget his

    advice on my research and future career paths.

    Finally, none of my achievements would have been possible without the unconditional love and the

    emotional and financial support for both my undergraduate and graduate studies from my parents,

    Shuanglu Chen and Yingqun Huang; my sibling, Haoyun Chen; my maternal grandparents, Mingchun

    Huang and Chunpei Li; my departed paternal grandparents, Xingnan Chen and Cailiu Zhu; and the rest

    of my family. This is your achievement as much as mine.

  • v

    Table of Contents

    Acknowledgments.......................................................................................................................... iii

    Table of Contents .............................................................................................................................v

    List of Tables ................................................................................................................................ vii

    List of Equations .......................................................................................................................... viii

    List of Figures ................................................................................................................................ ix

    Chapter 1 Introduction .....................................................................................................................1

    1.1. Black Carbon .......................................................................................................................1

    1.2. Principles of Commercialized Black Carbon Instruments ...................................................2

    1.3. Deployment of Smartphone .................................................................................................3

    1.4. Image Processing .................................................................................................................4

    1.5. Literature Review.................................................................................................................6

    1.5.1. Quantifying BC/EC using optical methods ..............................................................6

    1.5.2. Research Gaps ..........................................................................................................7

    1.6. Research Goals .....................................................................................................................7

    Chapter 2 Methods .........................................................................................................................10

    2.1. Sample Collection and Instrumentation .............................................................................11

    2.1.1. Ontario Samples .....................................................................................................11

    2.1.2. Beijing Samples .....................................................................................................15

    2.2. Image Capturing and Image Processing ............................................................................17

    2.2.1. Capturing Raw Images ...........................................................................................17

    2.2.2. Demosaicing ..........................................................................................................18

    2.2.3. White Balancing .....................................................................................................19

    2.2.4. Colour Transformation ...........................................................................................19

  • vi

    2.3. Detection Algorithms for Colour Extraction .....................................................................21

    2.3.1. Detection of 24 Patches in ColorChecker (BRISK Point Feature Matching) ........21

    2.3.2. Detection of Filter Sample .....................................................................................22

    2.4. Model Buildup ...................................................................................................................24

    Chapter 3 Results and Discussion ..................................................................................................26

    3.1. Effectiveness of Image Processing ....................................................................................26

    3.2. RGB-based Model to Predict PM2.5 Loading .....................................................................29

    3.3. RGB-based Model to Predict BC Loading .........................................................................32

    3.3.1. Assessments of the Model .....................................................................................32

    3.3.2. Diagnostics of Systematic Bias from Different Sources of PM2.5 .........................34

    3.4. RGB-based Model to Predict EC Loading .........................................................................35

    3.5. Integrated RGB Model for All Samples ............................................................................40

    Chapter 4 Conclusions and Recommendation ...............................................................................42

    4.1. Conclusions ........................................................................................................................42

    4.2. Recommendation ...............................................................................................................43

    Bibliography ..................................................................................................................................45

  • vii

    List of Tables

    Table 1 Summary of current black carbon (BC) instruments market (Du et al., 2011).

    Current BC instruments are too expensive for the community to afford. .............................. 3

    Table 2 Research to date using optical techniques (scanner, conventional cellphone, and

    colorimeter) to quantify BC/EC loading (µg/cm2). .................................................................... 9

    Table 3 Detailed description of the sampling sites, training and test datasets, reference

    instruments, and filter types. ..................................................................................................... 14

    Table 4 List of materials used in this work ............................................................................... 18

  • viii

    List of Equations

    Equation 1 .................................................................................................................................... 13

    Equation 2 .................................................................................................................................... 19

    Equation 3 .................................................................................................................................... 20

    Equation 4 .................................................................................................................................... 20

    Equation 5 .................................................................................................................................... 20

    Equation 6 .................................................................................................................................... 29

    Equation 7 .................................................................................................................................... 30

    Equation 8 .................................................................................................................................... 30

    Equation 9 .................................................................................................................................... 30

    Equation 10 .................................................................................................................................. 32

  • ix

    List of Figures

    Figure 1 Flow diagram of camera preprocessing program (Chakrabarti et al., 2009).

    Preprocessing image systems of the camera alter colour, white balance, contrast, and

    brightness of camera photographs. They also further convert images to sRGB colour space

    and irreversibly compress them into jpg format. These nonlinear operations make it

    impossible to derive device-independent colour information from compressed jpg images

    because these non-linearities are usually unknown and cannot be inverted. .......................... 5

    Figure 2 Flow diagram of the experimental test ...................................................................... 11

    Figure 3 a) Schematic diagram of the synchronized hybrid ambient real-time particulate

    (SHARP) monitor. It combines light scattering (nephelometer) and β-ray attenuation to

    measure PM2.5 concentrations precisely and accurately with a one-minute resolution. In this

    study, all the SHARP monitors in Ontario monitoring stations were programmed to advance

    one filter spot every 8 hours, following the U.S. EPA standard. The only exception is the

    SHARP monitor in Downtown Toronto (located on the rooftop of the Wallberg Building),

    which advanced each filter tape every 24 hr. (Thermo Fisher Scientific, 2013); b) Picture of

    PM2.5 loaded SHARP filter (16-mm glass fiber); c) Photo of the SHARP monitor. ............. 12

    Figure 4 a) A schematic of a Libra Model L-4 Personal Sampler placed in the participant’s

    breathing zone (30 cm from the nose); b) Photo of a 37-mm PM2.5 loaded quartz filter. .... 16

    Figure 5 Image-processing workflow used in this study. Colour information can be used for

    research purposes if raw images are captured and processed manually with linear operations

    so that linear relationship with scene reflectance can be maintained. ................................... 17

    Figure 6 Set-up of the raw image capture ................................................................................ 18

    Figure 7 a) Reference image of ColorChecker in the scene. b) Matched putative points

    between the reference image (left side) and the target image (right side). c) The output of the

    point feature detection program. ............................................................................................... 21

    Figure 8 a) Scanned image of Ontario samples. b) Detected images...................................... 23

  • x

    Figure 9 a) One of the annotated images before training. b) Test image of the Mask R-CNN.

    c) The output of the Mask R-CNN. Masks are shown in colours and bounding box. .......... 24

    Figure 10 a) Four pre-processed images captured by two iPhone 6s cameras. The two top photos (S352

    and S604) are Beijing samples captured using the same iPhone 6s but under two different

    light conditions (S352 is brighter than S604). The two bottom photos (HWY28 and WW49)

    are Ontario samples captured at two distinct locations using another iPhone 6s. b) The same four

    photos after processing with the image processing program in MATLAB. This shows that the image

    processing program can effectively account for different light conditions and devices.

    ....................................................................................................................................................... 27

    Figure 11 Calibrated RGB values of ColorChecker versus published RGB values of

    ColorChecker. a), b), and c) represent the correlation between published RGB values and

    RGB values of the 24 colour patches obtained from processed images using the image

    processing program mentioned in Chapter 2. It shows that this program can calibrate RGB

    values into the ground truth RGB effectively, despite different light conditions and devices.

    ....................................................................................................................................................... 28

    Figure 12 a) One of two thousand hold out validations for the linear interactions regression

    model. This randomly chosen 20% testing dataset does not show a good agreement with the

    model trained using the remaining 80% data. b) Distribution of the PM2.5 loading for all the

    Ontario samples and the distribution of the 2000 RMSE from hold out validations. The mean

    of 2000 R2 is 0.50, and the CV(RMSE) is 77.3%. Also, the detection limit of this model is 61.0

    [µg/cm2], which means nearly 91.6% of the dataset was smaller than LOD. ........................ 30

    Figure 13 a) PM2.5 loading (µg/cm2) reported by the linear interactions regression model

    versus actual PM2.5 loading measured by the SHARP monitor for all Ontario samples

    (N=1266). The R2 for this model is 0.50, and RMSE is 33.5 [µg/cm2]. b) The residuals of the

    selected model versus the BC to PM2.5 ratio. This demonstrates that the model cannot predict PM2.5

    loading when BC loading is relatively small. Moreover, the residuals show a trend as the BC/PM2.5 ratio

    increases, which indicates that this model measures BC loading rather than PM2.5 loading.

    ....................................................................................................................................................... 31

  • xi

    Figure 14 a) One of two thousand hold-out validations for the linear interactions regression

    model. This randomly chosen 20% testing dataset shows strong agreement with the model

    trained using the remaining 80% data. b) Distribution of the BC loading for all the Ontario

    samples and the distribution of the 2000 RMSEs from 2000 times hold out validations. This

    model shows strong predictability of BC loading for 2000 tests with the means of 2000 R2

    and 2000 RMSEs equal to 0.95 and 0.6 [µg/cm2], respectively. Also, the detection limit of this

    model is 0.3 [µg/cm²], which means all samples in the dataset are above the LOD. ............................. 33

    Figure 15 a) BC loading reported by the linear interactions regression model versus actual

    BC loading measured by the aethalometer (AE33/AE31) for all Ontario samples (N=1266).

    b) The residuals of the selected model versus PM2.5 to BC ratio. It exhibits a random

    distribution of the residuals, which indicates that the PM2.5 (except BC) deposited on the

    filter does not affect the predictability of this model. .............................................................. 34

    Figure 16 Boxplots of residuals for all Ontario samples: three categories, three time periods

    in the day, and Weekdays vs. Weekends. It shows all categories agree with the model without

    any systematic bias, which means that this RGB-based model can measure BC loading

    consistently and accurately despite the variety of the PM2.5 sources. .................................. 35

    Figure 17 a) One of two thousand hold-out validations for the linear interactions regression

    model. This randomly chosen 20% testing dataset shows strong agreement with the model

    trained using the remaining 80% data. b) Distribution of the BC loading for all the Ontario

    samples and the distribution of the 2000 RMSEs from 2000 times hold out validations. This

    model shows strong predictability of BC loading for 2000 tests with the median of 2000 R2

    and 2000 RMSEs equal to 0.91 and 0.9 [µg/cm2], respectively. Also, the detection limit of this

    model is 0.5 [µg/cm2], which means nearly 0.07% of the dataset is smaller than LOD. ...... 36

    Figure 18 a) EC loading reported by the linear interactions regression model versus actual

    EC loadings measured by Sunset OC/EC analyzer for Beijing samples (N=478). b) The

    residuals of the selected model versus OC to EC ratio. It shows that the residuals are

    randomly distributed in two sides of the y=0 line, which indicates that the OC deposited on

    the filter does not affect the predictability of this model. ........................................................ 37

  • xii

    Figure 19 BC loading predicted by the BC model vs. actual EC loading for Beijing samples. The

    agreement is not good when the BC model is used to predict EC, with an R² of 0.84

    and an RMSE of 0.9 (µg/cm²). However, the slope (0.99) of the trendline indicates that

    smartphone image analysis is consistent to within a 1% error. Thus, the relatively poor

    predictability of EC loading using the BC model is due to the differences in their measuring

    techniques. ................................................................................................................................... 39

    Figure 20 a) BC/EC loading predicted by the linear interactions regression model trained

    using the whole data set versus actual BC/EC loadings measured by reference instruments.

    b) BC loading predicted by Ontario BC model and EC loading predicted by Beijing EC

    model versus actual BC/EC loadings measured by reference instruments. .......................... 40

    Figure 21 a) Sketch of the experimental set-up. b) Manikin wearing a facemask and

    “breathing” using a “breathing pump” (which can inhale and exhale at a flow rate of 8 L/min). c)

    Ambient fine particle (PM2.5) concentrator (which increases the PM2.5 concentration in the

    chamber by a factor of 64). ................................................................................................................ 43

  • 1

    Chapter 1 Introduction

    1.1. Black Carbon

    Black carbon (BC) is one of the major components of particulate matter (PM). It is primarily

    emitted from incomplete combustion of carbonaceous fuels during residential heating and cooking,

    transportation, power generation, and other industrial processes. It has drawn significant attention

    in recent decades from climate change, air pollution, and health research communities (Petzold et

    al., 2013). Quantification of BC emissions and concentration is required to understand the effects

    of BC on climate change and the health of humans and ecosystems. However, owing to the

    complexity in BC sources, BC emissions remain difficult to quantify (Du et al., 2011). Also, both

    the high capital costs and operating costs of BC measurements prohibit their widespread

    deployment and limit the ability to capture the spatial and temporal variability of the emissions.

    Thus, a cost-effective, easily accessible, and relatively accurate BC measurement method is

    required. In this study, we develop a new method to indicate BC exposure using smartphone image

    analysis, which is a first step toward making BC sensing accessible to the general public.

    Petzold et al. (2013) formally defined BC and Elemental Carbon (EC) by referring to measurement

    techniques instead of formation processes. BC refers to an ideally light-absorbing carbonaceous

    material; this definition is based on the optical properties of the material (Petzold et al., 2013). In

    contrast, Elemental Carbon (EC) is defined by its chemical properties and refers to a chemical

    substance which only contains carbon in its elemental form, but potentially exists in different

    allotropic forms (Schwartz et al., 2012). In other words, BC and EC are defined based on their

    different measuring principles. BC is measured using light attenuation, laser-induced

  • 2

    incandescence (LII), and photoacoustic methods, while EC is measured using a thermal optical

    technique.

    1.2. Principles of Commercialized Black Carbon Instruments

    In general, there are three widely used methods for measuring BC/EC concentration: thermal

    optical measurement, photoacoustic spectroscopy, and light attenuation. There are two different

    commercialized thermal optical based instruments to measure both Elemental Carbon (EC) and

    Organic Carbon (OC): Sunset Laboratory Thermal Optical Analysis system and DRI

    Thermal/Optical Reflectance carbon analyzer (Chow et al., 1993). Instrument operation often

    follows one of two protocols: Interagency Monitoring of PROtected Visual Environments

    (IMPROVE) and National Institute for Occupational Safety and Health (NIOSH). The main

    differences between IMPROVE and NIOSH are the operating temperatures and the timing of the anoxic

    phase of the analysis.

    Two of the most common light attenuation-based instruments are the Magee Scientific

    Aethalometer and the Particle Soot Absorption Photometer (PSAP). An instrument based on the

    photoacoustic method is the Photoacoustic Soot Spectrometer (PASS). An instrument based

    on the thermal emission of incandescent BC is the Single Particle Soot Photometer (SP2), which

    can measure BC at the single-particle level. However, all these instruments are too expensive for

    the general public, as shown in Table 1 below.

  • 3

    Table 1 Summary of current black carbon (BC) instruments market (Du et al., 2011).

    Current BC instruments are too expensive for the community to afford.

    Sensor          Aethalometer AE31    Micro-aethalometer AE51    Off-line OC/EC Analyzer

    Manufacturer    Magee Scientific     Aethlabs                   Sunset

    Principles      Light attenuation    Light attenuation          Thermal/Optical

    Price (USD)     $40,000              $6,500                     $70,000

    1.3. Deployment of Smartphone

    According to a 2016 global survey, nearly 43% of the world’s population owns a smartphone,

    and the smartphone market is growing rapidly, particularly in developing countries (Poushter et al.,

    2016). Smartphones have been deployed in various scientific fields, including diagnosis based on

    Chinese medicine using pictures of the human tongue (Cheng et al., 2017), diagnosis of skin

    diseases based on skin photos (Kuzmina et al., 2015), as well as quantifying temperature

    distribution on a surface using smartphone images (Treibitz et al., 2015). Moreover, current

    smartphones are capable of storing raw images, which allows colour information to be extracted that

    is linearly related to scene radiance. In other words, a smartphone camera is capable of

    capturing true colours for research purposes with an acceptable level of accuracy. Thus, it is

    possible to deploy off-the-shelf smartphones to assess BC/EC loading by taking photos of collected

    PM filter samples, an approach with distinct advantages such as low cost and easy accessibility. Widespread

    adoption of smartphones to measure BC/EC will provide abundant and meaningful data on BC/EC

    exposure, and raise awareness of the health and climate effects of BC/EC.

  • 4

    1.4. Image Processing

    One of the main challenges in deploying cameras to capture the true colour of an object is the

    significant variation in the image quality of photos of the same object taken under different light

    conditions, from different angles, at different distances, and with different cameras. Thus, there is

    an urgent need for an effective image processing algorithm that can take these variables into

    account. In general, every camera has a built-in image system for preprocessing photos (Figure

    1). These proprietary systems often include operations that can alter the colour, white balance,

    brightness, and contrast of the images. They also convert the images into the standard RGB¹ (sRGB) colour

    space, which has a nonlinear relationship with scene radiance (the intensity of light captured by

    the camera sensors). The images are then compressed irreversibly into a jpg format so that a

    nonlinear RGB image can be shown on the screen (Chakrabarti et al., 2009). However, as a result

    of nonlinear processing, built-in image processing programs often do not preserve the linearity

    between the RGB values and the scene radiance. To obtain the true colour for scientific purposes,

    the raw image will need to be processed by algorithms that operate linearly and in a device-

    independent manner (Chakrabarti et al., 2009).
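
    The loss of linearity described above can be illustrated with the published sRGB transfer curve alone. The following is a minimal sketch, not the camera pipeline itself: real built-in pipelines add further proprietary, usually unknown steps, and the two patch intensities are hypothetical values chosen for illustration.

```python
def srgb_encode(linear):
    """Standard sRGB transfer curve: linear intensity in [0, 1] -> encoded value."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055


def srgb_decode(encoded):
    """Inverse of srgb_encode; possible only because this curve is known exactly."""
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4


# Two hypothetical patches, one sending twice as much light to the sensor.
dark, bright = 0.2, 0.4
ratio_linear = bright / dark
ratio_encoded = srgb_encode(bright) / srgb_encode(dark)
print(f"linear ratio: {ratio_linear:.2f}, sRGB-encoded ratio: {ratio_encoded:.2f}")
```

    The encoded ratio comes out well below 2, so a model fitted to encoded values would not see the factor-of-two difference in radiance. The standard curve can be inverted, as srgb_decode shows, but the additional tone and colour adjustments of a proprietary pipeline generally cannot, which is the motivation for starting from raw images.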

    1 RGB stands for “Red Green Blue”. It refers to the intensities in three hues of light, which can be used to indicate

    colours.

  • 5

    Figure 1 Flow diagram of camera preprocessing program (Chakrabarti et al., 2009).

    Preprocessing image systems of the camera alter colour, white balance, contrast, and

    brightness of camera photographs. They also further convert images to sRGB colour

    space and irreversibly compress them into jpg format. These nonlinear operations

    make it impossible to derive device-independent colour information from compressed

    jpg images because these non-linearities are usually unknown and cannot be inverted.

    In some cases, raw photos are captured by cameras and further processed manually using three

    linear operations. Firstly, because most digital cameras capture images with a single sensor overlaid

    with a colour filter array (CFA), colour reconstruction is required to convert the CFA data into a

    viewable image with a full set of colour triples. The two missing intensities at each pixel

    location are estimated through an interpolation process known as demosaicing, colour

    reconstruction, or CFA interpolation. Secondly, white balance is used to adjust for different

    light conditions. Human eyes have evolved to distinguish the colours of objects based on their spectral

    properties even when the objects are viewed under various illuminants. However, digital cameras can

    only record the light actually reflected from the objects and have considerable difficulty judging what

    is white even with their automatic white balance process. Thus, it is important to introduce another

  • 6

    white balance algorithm manually to extract the true colour of objects in the photos. Lastly, two

    different cameras will record different colour information for the same object because of the

    difference in colour sensitivities of two cameras’ sensors. It is essential to transform the images to

    a device-independent colour space using calibration targets in the scene. The linearized colour

    information can then be extracted after this operation. The specific image processing algorithms

    used in this study will be discussed in the following chapter.
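
    The three linear operations described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the program used in this study: the RGGB layout, the half-resolution block collapse (standing in for true demosaicing interpolation), the neutral-patch gain, the least-squares 3x3 matrix, and all function names and synthetic values are hypothetical.

```python
import numpy as np


def demosaic_rggb_half(raw):
    """Collapse each 2x2 RGGB block into one RGB pixel (half resolution).

    A simplification: real demosaicing interpolates the two missing
    intensities at every pixel instead of shrinking the image.
    """
    r = raw[0::2, 0::2]
    g = (raw[0::2, 1::2] + raw[1::2, 0::2]) / 2.0
    b = raw[1::2, 1::2]
    return np.stack([r, g, b], axis=-1)


def white_balance(rgb, neutral_rgb):
    """Scale R and B so that a known neutral patch has equal R, G, B."""
    gains = neutral_rgb[1] / neutral_rgb  # normalise to the green channel
    return rgb * gains


def fit_colour_matrix(measured, reference):
    """Least-squares 3x3 matrix M such that measured @ M approximates reference."""
    matrix, *_ = np.linalg.lstsq(measured, reference, rcond=None)
    return matrix


# Synthetic check with 24 hypothetical patches: simulate a camera that mixes
# channels linearly under a coloured illuminant, then undo both effects.
rng = np.random.default_rng(0)
reference = rng.uniform(0.05, 0.95, size=(24, 3))
camera_mix = np.array([[0.90, 0.10, 0.00],
                       [0.05, 0.85, 0.10],
                       [0.00, 0.15, 0.85]])
illuminant = np.array([1.2, 1.0, 0.7])
measured = (reference @ camera_mix) * illuminant
neutral = (np.array([0.5, 0.5, 0.5]) @ camera_mix) * illuminant

balanced = white_balance(measured, neutral)
colour_matrix = fit_colour_matrix(balanced, reference)
calibrated = balanced @ colour_matrix
print("max calibration error:", np.abs(calibrated - reference).max())
```

    Because every step is a linear map, the simulated device and illuminant effects are recovered almost exactly in this noise-free example; with real photos, sensor noise and any residual nonlinearity would leave a larger error.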

    1.5. Literature Review

    1.5.1. Quantifying BC/EC using optical methods

    In recent years, there has been increasing interest in quantifying BC/EC loading (µg/cm2) on a

    filter substrate using optical techniques, such as scanners, smartphone cameras, digital cameras,

    and colorimeters. This body of research has demonstrated that optical sensing has distinct

    advantages in the measurement of BC/EC loading: it is non-destructive, low cost, and fast. Cheng

    et al. (2011) showed that the reflectance (min{R, G, B}) of particles collected on quartz filters

    measured by a scanner could be used to estimate EC loading in ambient urban Hong Kong air to

    within an error of 10%. Ramanathan et al. (2011) showed that a conventional cellphone could be

    used to obtain reflectance at the red wavelength (ρ-red), from which BC loading can be predicted.

    Along the same lines, Olson et al. (2016) investigated the predictability of a Hue, Saturation, and

    Value (HSV)-based model using both a conventional cellphone and a professional colorimeter

    (i1Pro), obtaining promising results when the latter was used. Khuzestani et al. (2016) extended the

    analysis to another colour space (CIELAB), which is the most representative of human vision. Their

    CIELAB model demonstrates strong predictability for EC collected on both quartz and Teflon

    filters. All these studies concluded that the reflectance of particles collected on the filter is

    positively related to the BC/EC loading with promising predictabilities.

    1.5.2. Research Gaps

    While the feasibility of using digital images to quantify BC has been demonstrated, smartphone

    cameras have never been used in the quantification of BC/EC loading. Also, the methods described

    above still cannot estimate BC/EC loading consistently from various PM sources. In addition, the

    cellphone model still shows relatively low accuracy, which has been assessed by De la Sota et al.

    (2017). Furthermore, it is not easy to build a robust model with limited samples in these studies.

    Moreover, the correlation between the colour information of particle-loaded filters and BC

    loading is still poorly understood. Lastly, the professional colorimeter remains too expensive

    (>$2,000 USD) to be widely adopted by the general public.

    1.6. Research Goals

    Recognizing these gaps in knowledge, this work proposes a novel RGB model based on

    smartphone image analysis. Specifically, the goals of the research are threefold: (1) to develop an

    image processing program that can take into account the variabilities between devices and light

    conditions, (2) to develop a model that can predict BC/EC loading precisely and accurately, and

    (3) to assess the accuracy, precision, and consistency of the model in different urban environments.

    Since Oct 2016, 1266 ambient PM samples with an aerodynamic size smaller than 2.5 µm (PM2.5)

    were obtained at the Ministry of the Environment and Climate Change (MOECC), and Southern

    Ontario Centre for Atmospheric Aerosol Research (SOCAAR) sites located across southwestern

    Ontario, Canada. Also, since 2013, 478 personal PM2.5 samples were obtained from Peking

    University, which were collected from pre-diabetic participants living in downtown Beijing,

    China.

    Raw images of all these samples were captured under relatively consistent light conditions. Colour

    information of these photos was extracted after applying image processing and edge detection

    programs in MATLAB. By training and verifying a model in MATLAB using the colour

    information of these filter samples and corresponding reference BC/EC data, a robust polynomial

    model based on RGB values was built to estimate BC/EC loading accurately and precisely in two

    different urban environments.

    Table 2 Research to date using optical techniques (scanner, conventional cellphone,
    and colorimeter) to quantify BC/EC loading (µg/cm2).

    | Devices | Reference Instruments | Color Space | # of Samples | Sources of Samples | CV(RMSE)* | Researchers |
    |---|---|---|---|---|---|---|
    | Scanner | Sunset OC/EC analyzer, IMPROVE | RGB | 79 | Ambient urban samples in China | N/A (~10.3 error) | (Cheng et al., 2011) |
    | Conventional Cellphone | Aethalometer (AE31) | RGB | 126 | Rural LA, rural India indoor and rural India outdoor | N/A | (Ramanathan et al., 2011) |
    | Conventional Cellphone | Sunset OC/EC analyzer, NIOSH | HSV | 120 | Rural China, urban Iraq, and urban California | 22.6% | (Olson et al., 2016) |
    | Professional Colorimeter (i1Pro) | Sunset OC/EC analyzer, NIOSH | | 93 | Rural China and urban California | 30.8% | |
    | | Sunset OC/EC analyzer, IMPROVE | | 315 | Urban LA, Urban Riverside, and urban Denver | 16.1% | |
    | Professional Colorimeter (3nh) | Sunset OC/EC analyzer, NIOSH | CIELAB | 226 | Urban and sub-urban China | 17.4% | (Khuzestani, et al., 2017) |

    * CV(RMSE) = RMSE / Mean

    Chapter 2 Methods

    To achieve these research goals, the following experiments were conducted (Figure 2). First,

    1777 PM2.5 filter samples were obtained from Canada and China in this study. Raw images of these

    samples with the ColorChecker in the scene were then captured using a smartphone with Adobe

    Lightroom app under relatively consistent light conditions. Adobe DNG Converter was used to

    demosaic the raw images, and “dcraw2” was implemented to convert DNG format to a MATLAB

    readable format, TIFF. Lastly, these images were processed using the image processing program

    in MATLAB with white balancing and colour transformation. The calibrated linear RGB values

    of the particle-loaded filters were extracted by the object detection programs. A model was trained

    and assessed using the Regression Learner app in MATLAB with the colour information and the

    reference BC/EC data. In the future, the MATLAB procedures and model can be easily translated

    into a smartphone app, which will make it possible for the smartphone to measure BC/EC loading

    off-line.

    2 dcraw is an open-source image processing program which can read numerous raw image format files.

    Figure 2 Flow diagram of the experimental test

    2.1. Sample Collection and Instrumentation

    2.1.1. Ontario Samples

    Since April 2017, 1266 PM2.5 samples were obtained by MOECC and SOCAAR across six sites

    in the Air Quality Monitoring Network (AQMN), located in northern Toronto, downtown Toronto,

    Highway 401 roadside, downtown Hamilton, western Windsor, and downtown Windsor in southwestern

    Ontario, Canada. A detailed description of these samples is presented in Table 3.

    These samples can be classified into three categories based on their locations, including highway,

    near-road, and suburban residential areas. Sites that are located within 100 m of a major roadway

    with average traffic volumes greater than 10,000 vehicles per day are classified as “Near-Road.”

    The Windsor West site does not meet the above criteria, but it is surrounded by residential houses.

    Thus, it is classified as “residential region” (Healy et al., 2017). The HWY 401 station is located

    directly to the southeast of Ontario's Highway 401, which passes through Toronto and is

    considered one of the busiest highways in North America (~400,000 vehicles/day) (Ministry of

    Transportation, 2012). Three of the sites are located in the Greater Toronto Area (GTA), which

    has a population of 6.3 million (Statistics Canada, 2018).

    Figure 3 a) Schematic diagram of the synchronized hybrid ambient real-time

    particulate (SHARP) monitor. It combines light scattering (nephelometer) and β-ray

    attenuation to measure PM2.5 concentrations precisely and accurately with a one-minute

    resolution. In this study, all the SHARP monitors in Ontario monitoring

    stations were programmed to advance one filter spot every 8 hours, following the U.S.

    EPA standard. The only exception is the SHARP monitor in Downtown Toronto

    (located on the rooftop of the Wallberg Building), which advanced the filter tape

    every 24 hr (Thermo Fisher Scientific, 2013); b) Picture of PM2.5 loaded SHARP filter

    (16-mm glass fiber); c) Photo of the SHARP monitor.

    All the samples were collected onto glass fiber filters using the Synchronized Hybrid Ambient

    Real-Time Particulate (SHARP) monitor (Figure 3). With the aid of collocated aethalometers

    (AE33 or AE31) to monitor the real-time BC concentration with a 1-min resolution, the

    corresponding BC loading for these 1266 samples can be calculated using Equation 1, where

    [BC]Aethalometer is the BC concentration measured by aethalometer at 880nm wavelength, Vf_i is the

    volumetric flow rate of the SHARP monitor at every minute, and A [cm2] is the effective area of

    the filter, in this study, 2.01[cm2].

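    $$\mathrm{BC_{loading}}\ [\mu g/cm^2] = \frac{\sum_{i=1}^{480\ \text{or}\ 1440\ [min]} [\mathrm{BC}]_{\mathrm{Aethalometer}\_i}\ [\mu g/m^3] \times V_{f\_i}\ [m^3/hr] \times \frac{1}{60}\ [hr/min]}{A\ [cm^2]}$$    Equation 1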
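    The unit bookkeeping in Equation 1 can be illustrated with a minimal Python sketch. The function
    name, argument names, and the constant example readings are hypothetical and are not taken from the
    MATLAB programs used in this study; the sketch only assumes one BC reading and one flow reading per
    minute, as described above.

```python
def bc_loading(bc_ug_m3, flow_m3_hr, area_cm2=2.01):
    """Return BC loading in ug/cm^2 from per-minute readings (Equation 1)."""
    if len(bc_ug_m3) != len(flow_m3_hr):
        raise ValueError("need one flow reading per BC reading")
    # Each one-minute term: ug/m^3 * m^3/hr * (1/60) hr/min = ug deposited in that minute.
    mass_ug = sum(c * q / 60.0 for c, q in zip(bc_ug_m3, flow_m3_hr))
    # Divide the deposited mass by the effective filter area.
    return mass_ug / area_cm2


# Hypothetical 8-hr spot: constant 1.0 ug/m^3 BC and 1.0 m^3/hr flow for 480 minutes.
example = bc_loading([1.0] * 480, [1.0] * 480)
```

    A 24-hr sample would simply pass 1440 readings instead of 480.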

    Table 3 Detailed description of the sampling sites, training and test datasets, reference
    instruments, and filter types.

    | Sample sites | Site characteristics | Location | Sampling period | BC/EC Range (µg/cm2) | Training samples | Testing samples | All samples | Reference instruments | Filter types |
    |---|---|---|---|---|---|---|---|---|---|
    | HWY 401 | Highway | 43°42'39.6" N; 79°32'34.8" W | July 6th to July 26th, 2017 | 1.1-18 | 49 | 12 | 61 | Aethalometer, AE33 | Glass Fiber |
    | Toronto North | Near-Road | 43°46'53.8" N; 79°25'03.8" W | Jan 1st to July 7th, 2017 | 0.6-33.1 | 207 | 52 | 259 | | |
    | Windsor Downtown | | 42°18'56.8" N; 83°02'37.2" W | April 30th to July 25th, 2017 | 0.8-7.1 | 190 | 47 | 237 | Aethalometer, AE31 | |
    | Hamilton Downtown | | 43°15'28.0" N; 79°51'42.0" W | April 18th to Sept 27th, 2017 | 0.9-12.3 | 293 | 73 | 366 | | |
    | Toronto Downtown * | | 43°39'32.1" N; 79°23'45.7" W | Sep 17th, 2016 to Mar 21st, 2017 | 1.2-20.6 | 86 | 21 | 107 | Aethalometer, AE33 | |
    | Windsor West | Residential region | 42°17'34.4" N; 83°04'23.3" W | May 2nd to July 26th, 2017 | 0.9-9.4 | 189 | 47 | 236 | | |
    | Beijing | Pre-diabetic subjects’ personal samples in an urban area | 39°59'23" N; 116°18'19" E | 2013 | 0.2-19.7 | 382 | 96 | 478 | Sunset OC/EC Analyzer, NIOSH | Quartz Filter |
    | Total | | | | | 1395 | 349 | 1744 | | |

    * All Ontario samples are 8-hr samples, except the 107 downtown Toronto samples, which are 24-hr samples.

    2.1.2. Beijing Samples

    China contributes around 25% of the annual global total BC emission in recent years (Bond et al.,

    2004; Cooke et al., 1999; Qin & Xie, 2012; Wang et al., 2012). Moreover, the annual mean BC

    concentrations are around 11.2 µg/m3 at the urban sites, 3.6 µg/m3 at the rural sites, and 0.35 µg/m3

    at the remote background sites in China (Zhang et al., 2008). Beijing, one of the most polluted

    cities in China, has a population of 21.707 million (Beijing Municipal Bureau of Statistics, 2018).

    In this work, 478 PM2.5 samples were collected on 37-mm quartz filters using Libra Model L-4

    personal samplers (manufactured by A.P. Buck, Inc.) in Beijing. A detailed description of these

    samples is presented in Table 3. The personal samplers were attached to pre-diabetic subjects

    (with a fasting plasma glucose level of 6.1-7.0 mmol/L in the most recent annual health

    examination) living in downtown Beijing during 2013. Details of the participant

    recruitment criteria and the demographics of the study subjects are described by Wang et al. (2018). A

    schematic of the equipment and a photo of a PM2.5 loaded quartz filter are shown in Figure 4.

    Before each sampling run, the personal sampler was calibrated to sample at a flow rate of 4 L/min.

    A blank filter was pre-baked in a muffle furnace held at 550 °C for 5.5 hr to eliminate background

    organic carbon (OC). 24 hr after pre-baking, the filter was pre-weighed in a super clean lab at 25

    °C and 40% relative humidity (RH). Each patient was asked to turn on the sampler 24 hr before

    his/her doctor’s appointment and to wear it at all times. It was also recommended that the sampler

    be placed as close as possible to the breathing zone (within a 30-cm radius of the nose) except

    when the sampler would interfere with activities, such as cooking and sleeping. After sampling,

    the quartz filters were weighed again to obtain the mass of loaded PM2.5 during the sampling period

    (Wang et al., 2018). Then, they were analyzed off-line using a semi-continuous OC/EC analyzer

    (Model 4, Sunset Laboratory Inc., USA) following the NIOSH protocol.

    Figure 4 a) A schematic of a Libra Model L-4 Personal Sampler placed in the

    participant’s breathing zone (30 cm from the nose); b) Photo of a 37-mm PM2.5 loaded

    quartz filter.

    2.2. Image Capturing and Image Processing

    To obtain linear colour information, manual image processing of raw photographs is required. The

    workflow of the image capture and image processing used in this study is shown in Figure 5.

    The details of the image processing pipeline are described as follows.

    Figure 5 Image-processing workflow used in this study. Colour information can be

    used for research purposes if raw images are captured and processed manually with

    linear operations so that linear relationship with scene reflectance can be maintained.

    2.2.1. Capturing Raw Images

    Retaining the raw image makes it possible to extract the colour information which has a linear

    relationship with the scene reflectance. Therefore, Adobe Lightroom on an iPhone 6s was used to

    obtain and store the raw images. To establish the linear relationship between scene reflectance

    and values recorded in the raw images, calibration targets (Macbeth ColorChecker) are required

    in the scene. To simplify the image processing and object detection programs, the

    raw photographs of these PM2.5 samples were captured with the Adobe Lightroom app under

    relatively consistent light conditions and distances. The set-up of image capture is shown in

    Figure 6, and the hardware and software used are listed in Table 4.

    Table 4 List of materials used in this work

    | Hardware and Software | Manufacturer | Parameters |
    |---|---|---|
    | Macbeth ColorChecker | X-Rite | ColorChecker Passport |
    | Lamp | AUKEY | LT-T11, 7V |
    | Stand | N/A | N/A |
    | iPhone 6s | Apple | N/A |
    | Adobe Photoshop Lightroom | Adobe | N/A |

    Figure 6 Set-up of the raw image capture

    2.2.2. Demosaicing

    After raw images of these filters were captured, Adobe DNG Converter was used to demosaic the raw

    images. Also, because raw images in DNG format are not readable by MATLAB, dcraw was

    used to convert these DNG images into TIFF format.

    2.2.3. White Balancing

    The human vision system is capable of adapting to slight changes in colour that result from differences

    in illumination. Through the process of chromatic adaptation, the human eye is also able to

    effectively discern the spectral properties of objects in the scene despite various light conditions.

    In contrast, cameras can only capture the actual reflectance of the objects in the scene.

    Therefore, white balancing for photographs is necessary to capture the spectral properties of the

    objects in the scene (Reinhard et al., 2008). There are two different concepts widely used in white

    balancing: RGB equalization and chromatic adaptation transform (CAT). In this study, RGB

    equalization method was chosen over CAT, because high perceptual accuracy is not necessary for

    scientific data acquisition, and RGB equalization is easier to implement (Treibitz et al., 2015).

    Through RGB equalization, also known as the “wrong von Kries model” (Westland & Ripamonti,

    2004), RGB values of grey calibration targets in the scene are corrected based on the published

    RGB values of the six grey patches in the ColorChecker Passport. Mathematically, each pixel in

    each colour channel of a linear image is calibrated using the following equation (Treibitz et al.,

    2015):

    $$p_i^{WB} = \frac{p_i - KS_i}{WS_i - KS_i}, \quad i = R, G, B$$    Equation 2

    where $p_i^{WB}$ is the intensity of the white-balanced pixel in the ith channel, $p_i$ is the intensity of the

    linear image in the ith channel (i.e., R value, G value, and B value), and $KS_i$ and $WS_i$ are the

    intensities of the published darkest and whitest standards in the ith channel, respectively.
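    As an illustration of Equation 2, the following NumPy sketch applies the per-channel correction to
    a small linear image. It is not the MATLAB program used in this study; the function name and the
    dark and white standard values are hypothetical placeholders chosen only to make the arithmetic
    easy to follow.

```python
import numpy as np


def white_balance(linear_rgb, dark_rgb, white_rgb):
    """Apply Equation 2 per channel to a linear H x W x 3 image."""
    img = np.asarray(linear_rgb, dtype=float)
    ks = np.asarray(dark_rgb, dtype=float)   # KS_i, one value per channel
    ws = np.asarray(white_rgb, dtype=float)  # WS_i, one value per channel
    if np.any(ws == ks):
        raise ValueError("white and dark standards must differ in every channel")
    # Broadcasting applies the per-channel offset and scale along the last axis.
    return (img - ks) / (ws - ks)


# Hypothetical 1 x 2 pixel image and hypothetical standard intensities.
pixels = np.array([[[0.10, 0.20, 0.30], [0.50, 0.60, 0.70]]])
balanced = white_balance(pixels, dark_rgb=[0.10, 0.20, 0.30], white_rgb=[0.90, 1.00, 1.10])
```

    Because the operation is a per-channel offset and scale, it preserves the linear relationship
    between pixel values and scene reflectance.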

    2.2.4. Colour Transformation

    As explained in the first chapter, owing to the variations in different camera sensors, two different

    cameras may record different RGB values for the same scene. To know the linear relationship

    between the images and scene radiance, colour transformation is required, such that the device-

    independent colour information can be extracted. It is performed according to the Macbeth

    ColorChecker chart, which consists of 24 colour patches that provide the majority of natural

    reflectance spectra (Westland et al., 2004). Treibitz et al. (2015) showed that the total error is

    minimized when 18 patches of the ColorChecker were used in colour transformation. In this study,

    we included all 24 patches to ensure transformation accuracy. The matrix T in Equation 3 is the

    standard matrix for the linear transformation. In this case, a total of 9 coefficients must be

    determined as shown in Equation 4. The 3×24 matrices, $C_{linear}^{RGB}$ and $C_{reference}^{XYZ}$, contain the RGB

    values obtained from the linear RGB image of 24 patches in the ColorChecker and the published

    XYZ3 tri-stimulus values for the 24 patches, respectively. Furthermore, because $C_{linear}^{RGB}$ is not a

    square matrix, T is calculated using the pseudoinverse matrix ($[C_{linear}^{RGB}]^{+}$), as shown in Equation

    5 (Westland et al., 2004).

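    $$C_{reference}^{XYZ} = T \times C_{linear}^{RGB}$$    Equation 3

    $$\begin{aligned} X &= a_{11}R + a_{12}G + a_{13}B \\ Y &= a_{21}R + a_{22}G + a_{23}B \\ Z &= a_{31}R + a_{32}G + a_{33}B \end{aligned}$$    Equation 4

    $$T = C_{reference}^{XYZ}\,[C_{linear}^{RGB}]^{+}$$    Equation 5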
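    The pseudoinverse step in Equation 5 can be sketched in a few lines of NumPy. This is only an
    illustration under stated assumptions: the function name is hypothetical, and the patch values and
    the matrix `true_t` are synthetic numbers rather than measured or published ColorChecker data.

```python
import numpy as np


def fit_colour_transform(c_linear_rgb, c_reference_xyz):
    """Solve Equation 5, T = C_xyz @ pinv(C_rgb), with both inputs shaped 3 x N."""
    c_rgb = np.asarray(c_linear_rgb, dtype=float)
    c_xyz = np.asarray(c_reference_xyz, dtype=float)
    # The Moore-Penrose pseudoinverse gives the least-squares 3 x 3 matrix.
    return c_xyz @ np.linalg.pinv(c_rgb)


# Synthetic check: build XYZ values from a known 3 x 3 matrix and recover it.
rng = np.random.default_rng(0)
true_t = np.array([[0.41, 0.36, 0.18],
                   [0.21, 0.72, 0.07],
                   [0.02, 0.12, 0.95]])
patch_rgb = rng.uniform(0.05, 0.95, size=(3, 24))  # stand-in for 24 patches
patch_xyz = true_t @ patch_rgb
t_hat = fit_colour_transform(patch_rgb, patch_xyz)
```

    With noisy measurements, the same call returns the matrix that minimizes the squared error over
    all patches, which is why using more patches than the nine unknown coefficients is helpful.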

    3 XYZ is a colour space developed by the International Commission on Illumination (CIE) to denote how much

    three different types of human cone cells are stimulated at three different wavelengths to quantify colours.

    2.3. Detection Algorithms for Colour Extraction

    2.3.1. Detection of 24 Patches in ColorChecker (BRISK Point Feature Matching)

    To extract the RGB values of these 24 patches for colour transformation, it is more efficient to

    detect the ColorChecker in the scene automatically. Because the ColorChecker in the scene is

    unique as shown in Figure 6, a point feature matching algorithm in MATLAB was used, which

    is highly accurate at detecting a specific object that does not have repeating patterns. To do so, a

    reference picture of the ColorChecker (Figure 7 a)) is required as an input. The program

    extracts the feature points in the reference picture and then finds putative point matches (Figure

    7 b)) in the target image containing a cluttered scene to locate the ColorChecker in the scene. The

    MATLAB code was adapted from the point feature matching example posted by MathWorks

    (2014).

    Figure 7 a) Reference image of ColorChecker in the scene. b) Matched putative points

    between the reference image (left side) and the target image (right side). c) The output

    of the point feature detection program.

    2.3.2. Detection of Filter Sample

    2.3.2.1. Detection of PM2.5 Loaded Spots in Scanned Pictures

    Prior to using the smartphone, the Ontario samples were scanned using the Canon imageClass

    MF4890dw office scanner for the preliminary tests. The scanner is more efficient than using a

    smartphone for taking photos for every spot because it can scan 50 SHARP samples in one image.

    Also, since each spot in the image is uniformly illuminated, as shown in Figure 8 a), there is less

    variation than using a smartphone camera. In addition, the scanner is capable of capturing an

    uncompressed raw image in tiff format, which is a linear RGB image. Therefore, there is no need

    to apply an image processing algorithm for these scanned photos.

    A Python-based program was developed to find the filter spots in the scene. A function in OpenCV

    called “adaptiveThreshold” was used to detect particle-loaded filters. Because the ratio of area

    to perimeter is consistent across these spots, a reasonable default ratio helps to exclude

    grey regions in the scene that are not filter spots. In addition, the threshold, the size of the detected

    object, and the ratio of area to perimeter can be adjusted by dragging sliders in the user interface to

    make sure all the desired spots are detected regardless of variability in the size of the scanned

    picture. Furthermore, to avoid the two straight white lines present on all the SHARP samples when

    extracting RGB values, three rectangles are drawn in each spot, as shown in Figure 8 b), based

    on their positions relative to the centroid of the spot’s contour, which is computed using the “moments”

    function in OpenCV. Lastly, the average of the three rectangles’ mean RGB values

    is reported.
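    The final averaging step can be sketched as follows. This is a simplified NumPy illustration, not
    the program developed in this study: the function name, the rectangle offsets and sizes, and the
    synthetic test image are all assumptions, and the centroid is passed in directly rather than being
    computed from a contour.

```python
import numpy as np


def mean_rgb_of_rectangles(image, centroid, offsets, half_size):
    """Average the mean RGB of small rectangles placed relative to a spot centroid."""
    cy, cx = centroid
    hh, hw = half_size
    means = []
    for dy, dx in offsets:
        y, x = cy + dy, cx + dx
        # Rectangle of (2*hh + 1) x (2*hw + 1) pixels centred on (y, x).
        patch = image[y - hh:y + hh + 1, x - hw:x + hw + 1]
        means.append(patch.reshape(-1, 3).mean(axis=0))
    # Report the average of the per-rectangle mean RGB values.
    return np.mean(means, axis=0)


# Synthetic spot: uniform grey with two bright horizontal lines at rows 7 and 13.
image = np.full((21, 21, 3), 0.2)
image[[7, 13], :, :] = 1.0
spot_rgb = mean_rgb_of_rectangles(
    image, centroid=(10, 10), offsets=[(-6, 0), (0, 0), (6, 0)], half_size=(1, 3)
)
```

    In this synthetic case the three rectangles fall between and outside the bright lines, so the
    reported value reflects only the grey deposit.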

    Figure 8 a) Scanned image of Ontario samples. b) Detected images

    2.3.2.2. Detection of PM2.5 Loaded Spots in Smartphone Pictures

    Since samples from Beijing were punched and analyzed using OC/EC analyzer before photos were

    captured, most of these samples do not present perfectly circular spots (Figure 9 b) ). Therefore,

    common edge-detection algorithms are not capable of finding the sample spot in the scene

    effectively due to the complex morphologies of our samples. In this study, a novel

    machine learning technique, the Mask Region-based Convolutional Neural Network (Mask R-CNN),

    was used to detect the sample spot and the ColorChecker in the scene because it is

    fast, flexible, and simple (He et al., 2017). An open-source implementation of this method is

    available at https://github.com/matterport/Mask_RCNN. In this study, 45 images of samples were

    randomly selected as an input training dataset for Mask R-CNN. Accordingly, the sample spot and

    the ColorChecker in each training image were labeled and annotated as shown in Figure 9 a).

    After these 45 images and their labels were input, with the aid of a Graphics Processing Unit

    (GPU), an object (filter and ColorChecker) detection program was trained. Then, all available

    images were used for model testing. It shows great consistency and efficiency (within 1s) in

    detecting the sample spots and the ColorChecker as shown in Figure 9 c).

    Figure 9 a) One of the annotated images before training. b) Test image of the Mask

    R-CNN. c) The output of the Mask R-CNN. Masks are shown in colour with bounding

    boxes.

    2.4. Model Buildup

    RGB values extracted from processed images using the image processing program have a linear

    relationship with the scene radiance, from which spectral properties of the object can be obtained.

    Using these RGB values and reference BC/EC loading data, a model was trained to predict BC or

    EC loadings based on extracted colours. Previous work demonstrated that the darkness of the PM2.5

    loaded filter is positively related to the BC/EC loading. Cheng et al. (2011) built an exponential

    model between EC loading and min{R, G, B}. Furthermore, Ramanathan et al. (2011) also built

    an exponential model between R-value and BC loading, because their images used gamma

    correction, which is a nonlinear image processing operation. However, the relationship

    between linear RGB values and BC/EC loading is poorly understood.

    In this study, multiple models were tested using Regression Learner (one of the machine learning

    toolboxes in MATLAB), including linear regression models, regression trees, support vector

    machines, Gaussian process regression models, and ensembles of trees. Also, this toolbox is

    capable of training and validating the models simultaneously. In this study, hold out validation

    (80% of the data were randomly chosen for training, and the remaining 20% were used for testing) was used

    because of the large number of data points. The predictability of all these trained models was assessed

    using figures of merit including Root Mean Square Error (RMSE), R-Squared (R2),

    and Mean Absolute Error (MAE). The best model, the interactions regression model, was chosen because

    it had the smallest RMSE and MAE and an R2 close to 1.
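    The hold out split and the three figures of merit can be written compactly. The sketch below is a
    NumPy illustration rather than the Regression Learner workflow itself; the function names, the
    random seed, and the small example values are hypothetical, and R2 is assumed to be the usual
    coefficient of determination.

```python
import numpy as np


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Return shuffled (train_idx, test_idx) index arrays for hold out validation."""
    order = np.random.default_rng(seed).permutation(n_samples)
    n_test = int(round(test_fraction * n_samples))
    return order[n_test:], order[:n_test]


def figures_of_merit(measured, predicted):
    """Return RMSE, R-squared and MAE for paired measured and predicted values."""
    y = np.asarray(measured, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    err = y - y_hat
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    # Coefficient of determination: 1 - SS_residual / SS_total.
    r2 = float(1.0 - np.sum(err ** 2) / np.sum((y - y.mean()) ** 2))
    return rmse, r2, mae


train_idx, test_idx = holdout_split(1266)
rmse, r2, mae = figures_of_merit([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

    Repeating the split with different seeds yields a distribution of RMSE values from which a mean
    RMSE can be taken.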

    Chapter 3 Results and Discussion

    3.1. Effectiveness of Image Processing

    Four sets of images were captured under different light conditions using two different iPhones for

    examination of the image processing program, as shown in Figure 10. All four example

    images on the left were pre-processed by the iPhone 6s camera; it can be seen that the calibration

    targets of the four images all differ in colour as captured by the smartphone. As shown, the

    proprietary image processing system of the iPhone 6s camera does not perform well and fails

    to take various light conditions and devices into account. This shortfall serves as motivation for

    the manual image processing conducted in this study as described in Chapter 2.

    With the ColorChecker in the scene, the RGB values of the 24 colour patches were extracted from

    each processed photograph to compare with the published RGB values of these 24 calibration

    targets. As shown in Figure 11, the calibrated R, G, B values of these colour patches have strong

    correlations (R2>0.92) with the published R, G, B values of the calibration targets. Also, the calibrated

    RGB values from the four cases, which cover different light conditions and devices, agree

    with each other (R2>0.95). Therefore, this image processing program takes into account different

    devices and light conditions effectively.

    Figure 10 a) Four pre-processed images by two iPhone 6s cameras. The two top photos

    (S352 and S604) are Beijing samples captured using the same iPhone 6s but under

    two different light conditions (S352 is brighter than S604). The two bottom photos

    (HWY28 and WW49) are Ontario samples captured at two distinct locations using

    another iPhone 6s. b) Four processed photos using the image processing program in

    MATLAB. It proves that this image processing program can take into account

    different light conditions and devices effectively.

    Figure 11 Calibrated RGB values of ColorChecker versus published RGB values of

    ColorChecker. a), b), and c) represent the correlation between published RGB values

    and RGB values of the 24 colour patches obtained from processed images using the

    image processing program mentioned in Chapter 2. It shows that this program can

    calibrate RGB values into the ground truth RGB effectively, despite different light

    conditions and devices.

    3.2. RGB-based Model to Predict PM2.5 Loading

    The correlation between PM2.5 loading and RGB values was investigated using all the Ontario

    samples (N=1266) and the per-minute PM2.5 concentration and flow rate data obtained from the

    SHARP monitor. The PM2.5 loading was calculated using Equation 6, where Vf_i is the volumetric

    flow rate of the SHARP monitor each minute, and A [cm2] is the effective area of the filter; in this

    study, 2.0[cm2].

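    $$\mathrm{PM_{2.5\_loading}}\ [\mu g/cm^2] = \frac{\sum_{i=1}^{480\ \text{or}\ 1440\ [min]} [\mathrm{PM_{2.5}}]_{\mathrm{SHARP}\_i}\ [\mu g/m^3] \times V_{f\_i}\ [m^3/hr] \times \frac{1}{60}\ [hr/min]}{A\ [cm^2]}$$    Equation 6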

    The best model, linear interactions regression model (Equation 7) was trained and assessed using

    hold out validation in MATLAB. The interaction terms in Equation 7 (i.e., R×G, G×B, and R×B)

    were introduced to take the saturation of colour (the colour of the filter samples tends to saturate at

    very high loadings) into account. The performance of the selected model using the whole dataset

    of Ontario samples is demonstrated in Figure 13 a). It shows that PM2.5 loading poorly correlates

    with RGB values with an R2 of 0.50 and an RMSE of 33.5 [µg/cm2].

    As mentioned in Chapter 2, hold out validation was conducted and repeated 2000 times; the

    distributions of both the PM2.5 loadings and the 2000 RMSEs obtained are shown in Figure 12

    b). The Coefficient of Variation in RMSE (CV(RMSE)) was calculated using Equation 9. It is

    evident that both distributions are normally distributed. Therefore, the CV(RMSE) was

    calculated by dividing the mean of these 2000 RMSEs by the mean of PM2.5 loading and has a

    value of 77.3%. Also, 25 blank filters were tested to estimate the detection limit of this model

    with 99% confidence using Equation 8, where $x_{blank}$ and $\sigma_{blank}$ are the mean and standard

    deviation of the 25 blank-filter PM loadings, respectively. Thus, the detection limit of this model is

    61.0 [µg/cm2], which means that nearly 91.6% of the Ontario samples are below the LOD for

    detecting PM2.5. Overall, this model performed very poorly in predicting PM2.5, indicating that the

    light absorbing properties and PM2.5 mass do not correlate with each other.

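    $$\mathrm{PM_{2.5\_loading}}\ [\mu g/cm^2] \sim 1 + R + G + B + R \times G + G \times B + R \times B$$    Equation 7

    $$\text{Limit of Detection (LOD)} = x_{blank} + 3.14\,\sigma_{blank} \quad (\text{with 99\% confidence})$$    Equation 8

    $$\mathrm{CV(RMSE)} = \frac{RMSE}{mean}$$    Equation 9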
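    Equations 8 and 9 reduce to a few lines of arithmetic, sketched here with the Python standard
    library. The function names and the example numbers are hypothetical, and the sample standard
    deviation (n - 1 denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev


def limit_of_detection(blank_loadings, k=3.14):
    """Equation 8: mean of the blank loadings plus k sample standard deviations."""
    return mean(blank_loadings) + k * stdev(blank_loadings)


def cv_rmse(rmse_values, measured_loadings):
    """Equation 9, using the mean RMSE over repeated hold out validations."""
    return mean(rmse_values) / mean(measured_loadings)


# Hypothetical blank-filter loadings and hold out results, for illustration only.
lod = limit_of_detection([1.0, 2.0, 3.0])
cv = cv_rmse([0.5, 0.7], [2.0, 4.0])
```

    Dividing the RMSE by the mean loading makes the error comparable across datasets with different
    loading ranges.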

    Figure 12 a) One of two thousand hold out validations for the linear interactions

    regression model. This randomly chosen 20% testing dataset does not show a good

    agreement with the model trained using the remaining 80% data. b) Distribution of

    the PM2.5 loading for all the Ontario samples and the distribution of the 2000 RMSE

    from hold out validations. The mean of 2000 R2 is 0.50, and the CV(RMSE) is 77.3%.

    Also, the detection limit of this model is 61.0 [µg/cm2], which means nearly 91.6% of

    the dataset was smaller than LOD.

    The reason this RGB-based model shows poor predictability of PM2.5 loading is that the

    complexities of PM2.5 cannot be fully explained by the light absorption of particles deposited on

    the filter. In addition, different components in PM2.5 have different colours: BC is black, whereas

    OC and other components may be brown or even colourless. With the help of the collocated

    aethalometer, the residual of this model versus BC/PM is shown in Figure 13. It shows that this

    model cannot predict PM2.5 loading when BC loading is relatively small. Moreover, the residual

    becomes negative as BC/PM increases, because this model relies on light absorbing properties of

    the particles deposited on the filter. In addition, BC contributes more to light absorption than any

    other components in PM2.5 do. Therefore, this model consistently overestimates PM2.5 loading

    when BC/PM is high. In summary, all these results point to the fact that RGB values are not

    predictive of PM2.5 loading, and subsequently, we focus on the relationship between the BC

    loading and RGB values.

    Figure 13 a) PM2.5 loading (µg/cm2) reported by the linear interactions regression

    model versus actual PM2.5 loading measured by the SHARP monitor for all Ontario

    samples (N=1266). The R2 for this model is 0.50, and RMSE is 33.5 [µg/cm2]. b) The

    residuals of the selected model versus BC to PM2.5 ratio. It demonstrates that this

    model cannot predict PM2.5 loading when BC loading is relatively small. Moreover,

    the residual shows a trend as BC/PM increases, which indicates that this model

    measures BC loading instead of the PM2.5 loading.

    3.3. RGB-based Model to Predict BC Loading

    3.3.1. Assessments of the Model

    In this study, 1266 samples were collected across Ontario, Canada (Table 3) with corresponding
    aethalometer data; this large number of samples makes it possible to build a robust model for
    BC loading prediction. The residual trend in Figure 13 b) suggests that the RGB-based linear
    interactions model is capable of measuring BC. This hypothesis was tested by training and
    assessing several models in MATLAB. The linear interactions regression model (Equation 10)
    again performed best and was selected; its performance on the whole dataset is shown in
    Figure 15 a). BC loadings correlate strongly with RGB values, with an R2 of 0.95 and an RMSE
    of 0.6 [µg/cm2].

    Hold-out validation was repeated 2000 times for model assessment, and the RMSE of each
    validation was computed and stored. The distributions of the BC loading of the 1266 samples
    and of the 2000 RMSEs obtained from the hold-out validations are shown in Figure 14 b). The
    model predicts BC loading precisely and accurately, with a CV(RMSE) of 18.1%. The limit of
    detection (LOD) is comparable to the LOD (0.3 [µg/cm2]) of the reference instrument
    (off-line Sunset OC/EC analyzer) used in this study (Karanasiou et al., 2015).

    BC/EC loading [µg/cm2] ~ 1 + R + G + B + R×G + G×B + R×B        (Equation 10)
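As a rough illustration of how a model with the terms of Equation 10 can be fitted and assessed by repeated hold-out validation, the following Python sketch uses ordinary least squares on synthetic data. It is not the MATLAB program used in this study: the function names, the synthetic RGB values and loadings, and the number of repeats are assumptions made only for this example. CV(RMSE) is taken as the RMSE divided by the mean loading, as in the text.

```python
import numpy as np

def design_matrix(rgb):
    """Columns: 1, R, G, B, R*G, G*B, R*B (the terms of Equation 10)."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.column_stack([np.ones(len(rgb)), r, g, b, r * g, g * b, r * b])

def fit(rgb, loading):
    """Ordinary least-squares coefficients for the interactions model."""
    coef, *_ = np.linalg.lstsq(design_matrix(rgb), loading, rcond=None)
    return coef

def predict(coef, rgb):
    return design_matrix(rgb) @ coef

def repeated_holdout(rgb, loading, n_repeats=2000, test_frac=0.2, seed=0):
    """Return arrays of test-set RMSE and R2, one entry per random split."""
    rng = np.random.default_rng(seed)
    n = len(loading)
    n_test = int(round(test_frac * n))
    rmse, r2 = np.empty(n_repeats), np.empty(n_repeats)
    for i in range(n_repeats):
        idx = rng.permutation(n)
        test, train = idx[:n_test], idx[n_test:]
        coef = fit(rgb[train], loading[train])
        resid = loading[test] - predict(coef, rgb[test])
        total = np.sum((loading[test] - loading[test].mean()) ** 2)
        rmse[i] = np.sqrt(np.mean(resid ** 2))
        r2[i] = 1.0 - np.sum(resid ** 2) / total
    return rmse, r2

# Synthetic stand-in data (NOT the thesis data): darker filters carry more BC.
rng = np.random.default_rng(1)
rgb = rng.uniform(0.2, 1.0, size=(300, 3))
loading = (12.0 - 4.0 * rgb.sum(axis=1) + 2.0 * rgb[:, 0] * rgb[:, 1]
           + rng.normal(0, 0.3, 300))

rmse, r2 = repeated_holdout(rgb, loading, n_repeats=200)
cv_rmse = 100.0 * rmse.mean() / loading.mean()
print(f"mean RMSE={rmse.mean():.2f}, mean R2={r2.mean():.2f}, CV(RMSE)={cv_rmse:.1f}%")
```

Storing one RMSE per split, rather than a single pooled value, is what allows the distribution of RMSEs to be inspected afterwards.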

    Figure 14 a) One of the 2000 hold-out validations for the linear interactions
    regression model. This randomly chosen 20% testing dataset shows strong agreement
    with the model trained using the remaining 80% of the data. b) Distribution of the BC
    loading for all the Ontario samples and distribution of the 2000 RMSEs from the 2000
    hold-out validations. The model shows strong predictability of BC loading across the
    2000 tests, with the means of the 2000 R2 values and 2000 RMSEs equal to 0.95 and
    0.6 [µg/cm2], respectively. Also, the detection limit of this model is 0.3 [µg/cm2],
    which means that all samples in the dataset are above the LOD.

    As shown in Figure 15 b), the residuals are randomly scattered regardless of the PM2.5 to BC
    loading ratio, which indicates that this RGB-based linear interactions model is not biased
    when measuring BC loading across different BC fractions.

    Figure 15 a) BC loading predicted by the linear interactions regression model versus
    actual BC loading measured by the aethalometer (AE33/AE31) for all Ontario
    samples (N=1266). b) The residuals of the selected model versus the PM2.5 to BC ratio.
    The residuals are randomly distributed, which indicates that the non-BC PM2.5
    deposited on the filter does not affect the predictability of this model.

    3.3.2. Diagnostics of Systematic Bias from Different Sources of PM2.5

    To determine whether this model can predict BC loading consistently across various sources of
    BC, the samples were classified according to potential differences in sources, including
    location, time period, and weekdays vs. weekends. As shown in Table 3, 1266 samples were
    collected from six different sites across Ontario, Canada, and were classified into three
    groups based on location: Near-Road, Highway, and Residential. The 8-hour samples collected
    across Ontario were also classified by time of day (0:00-8:00, 8:00-16:00, and 16:00-0:00), as
    vehicular traffic patterns may differ among these periods. Lastly, given the different patterns
    on weekdays and weekends (e.g., differences in diesel truck traffic), the samples were
    classified into weekday (N=1012) and weekend (N=157) samples. For each classification, we
    examined the model residuals (Figure 16) to identify any systematic differences among the
    categories.

    Figure 16 Boxplots of residuals for all Ontario samples: three site categories, three
    time periods in the day, and weekdays vs. weekends. All categories agree with the model
    without any systematic bias, which suggests that this RGB-based model can measure BC
    loading consistently and accurately despite the variety of PM2.5 sources.

    As shown in Figure 16, no systematic difference in the residuals is observed in any of the
    categories, which suggests that the proposed RGB-based model can measure BC loading
    consistently and accurately for samples from various potential BC sources. It is likely that BC
    exhibits similar spectral properties regardless of the emission source, which enables the
    proposed model to predict BC across different sources relatively accurately.
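The category comparison described above can be sketched in Python as follows. This is only an illustration of the idea behind the boxplots, not the analysis code of this study: the helper names, the tolerance used to flag a shifted group, and the synthetic residuals (with hypothetical groups "A", "B", and "C") are all assumptions.

```python
import numpy as np

def residual_summary(residuals, labels):
    """Median and interquartile range of the residuals for each category label."""
    residuals = np.asarray(residuals, dtype=float)
    labels = np.asarray(labels)
    summary = {}
    for label in np.unique(labels):
        group = residuals[labels == label]
        q1, med, q3 = np.percentile(group, [25, 50, 75])
        summary[str(label)] = {"n": int(group.size), "median": med, "iqr": q3 - q1}
    return summary

def flag_biased(summary, tolerance):
    """Categories whose median residual lies farther from zero than `tolerance`."""
    return [name for name, s in summary.items() if abs(s["median"]) > tolerance]

# Synthetic residuals (NOT thesis data): groups A and B are unbiased,
# group C is deliberately shifted by +1 to show what a systematic bias looks like.
rng = np.random.default_rng(0)
residuals = np.concatenate([rng.normal(0.0, 0.5, 200),
                            rng.normal(0.0, 0.5, 200),
                            rng.normal(1.0, 0.5, 200)])
labels = np.array(["A"] * 200 + ["B"] * 200 + ["C"] * 200)

summary = residual_summary(residuals, labels)
for name, stats in summary.items():
    print(name, stats)
print("flagged:", flag_biased(summary, tolerance=0.3))
```

A model without systematic bias would leave the flagged list empty for every classification (site type, time of day, weekday vs. weekend).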

    3.4. RGB-based Model to Predict EC Loading

    Because BC and EC are defined by different measurement principles, a separate model was
    trained and assessed for the Beijing samples. As shown in Table 3, 478 quartz filter samples
    were collected from pre-diabetic participants living in downtown Beijing, China. All of these
    samples were analyzed using the Sunset OC/EC analyzer with the NIOSH protocol, as mentioned in
    Chapter 1. In this case, the linear interactions model (Equation 10) was chosen. As shown in
    Figure 17 a), this model performed best during the hold-out validations repeated 2000 times,
    with a mean R2 of 0.91 and a mean RMSE of 0.9 [µg/cm2], which is 21.1% of the mean EC
    loading. The LOD, determined using Equation 8, is 0.5 [µg/cm2]. This is comparable to the LOD
    (0.15 [µg/cm2]) of the reference instrument (off-line Sunset OC/EC analyzer) used for EC
    quantification (Karanasiou et al., 2015).

    The validated model performs very well for the whole dataset, with an R2 of 0.91 and an RMSE
    of 0.9 [µg/cm2], as shown in Figure 18 a). Furthermore, the residuals are randomly scattered
    regardless of the OC to EC ratio, as shown in Figure 18 b), which suggests that this model is
    not influenced by OC when predicting EC loading.

    Figure 17 a) One of the 2000 hold-out validations for the linear interactions
    regression model. This randomly chosen 20% testing dataset shows strong agreement
    with the model trained using the remaining 80% of the data. b) Distribution of the EC
    loading for all the Beijing samples and distribution of the 2000 RMSEs from the 2000
    hold-out validations. The model shows strong predictability of EC loading across the
    2000 tests, with the medians of the 2000 R2 values and 2000 RMSEs equal to 0.91 and
    0.9 [µg/cm2], respectively. Also, the detection limit of this model is 0.5 [µg/cm2],
    which means that nearly 0.07% of the dataset is below the LOD.

    Figure 18 a) EC loading predicted by the linear interactions regression model versus
    actual EC loading measured by the Sunset OC/EC analyzer for Beijing samples (N=478).
    b) The residuals of the selected model versus the OC to EC ratio. The residuals are
    randomly distributed on both sides of the y=0 line, which indicates that the OC
    deposited on the filter does not affect the predictability of this model.

    The Beijing samples in this study reflect very different sources of PM2.5, because patterns of
    personal exposure may vary significantly among individuals. The consistency and precision of
    the model predictions suggest that the selected EC model can quantify EC loadings
    independently of potential variability in the sample sources and in the amount of OC. However,
    the EC model does not perform as well as the BC model does for the Ontario samples, likely
    because of potential errors arising from colour information extraction and because of the
    difference between the measurement principles underlying the reference instruments
    (aethalometer and Sunset OC/EC analyzer) and that used by the smartphone. Smartphone image
    analysis is based on the optical properties of the BC/EC loaded on the filter, a technique
    similar to that used by an aethalometer. In contrast, the Sunset OC/EC analyzer operates based
    on the chemical properties of the EC loaded on the filter.

    To investigate this difference, the BC model trained on the Ontario samples was applied to the
    Beijing samples to predict the EC loadings (Figure 19). The slope of the trendline (0.99)
    differs from unity by only 1%, which indicates that the smartphone image analysis is
    consistent between the Ontario and Beijing samples. Thus, the relatively poor predictability
    of the EC model may be due solely to the difference between the measurement techniques used by
    the smartphone (light absorption) and the Sunset OC/EC analyzer (the thermal-optical method).
    Furthermore, because the operating principles of the aethalometer and smartphone image
    analysis are similar, the BC model is expected to agree better with BC measured using light
    attenuation than with EC measured by the thermal-optical technique.
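The argument above rests on the trendline slope and the R2 capturing different things: the slope reflects proportional bias, whereas R2 and RMSE reflect scatter. A minimal Python sketch, assuming synthetic data and a hypothetical helper `agreement_metrics` (a least-squares trendline with an intercept is assumed here, which may differ from the fit used for Figure 19), shows that a slope close to 1 can coexist with a lower R2:

```python
import numpy as np

def agreement_metrics(measured, predicted):
    """Trendline slope/intercept of predicted vs. measured, plus R2 and RMSE."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    slope, intercept = np.polyfit(measured, predicted, 1)
    resid = measured - predicted
    rmse = np.sqrt(np.mean(resid ** 2))
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((measured - measured.mean()) ** 2)
    return {"slope": slope, "intercept": intercept, "r2": r2, "rmse": rmse}

# Synthetic example (NOT thesis data): unbiased predictions at two noise levels.
rng = np.random.default_rng(0)
measured = rng.uniform(0.5, 10.0, 500)
tight = agreement_metrics(measured, measured + rng.normal(0, 0.3, 500))
loose = agreement_metrics(measured, measured + rng.normal(0, 1.2, 500))
print("low scatter: ", tight)
print("high scatter:", loose)
```

In both cases the slope stays near 1 because the predictions are unbiased on average; only the scatter, and therefore R2 and RMSE, changes.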

    Figure 19 Predicted BC using the BC model vs. actual EC loading for Beijing samples.
    The BC model does not agree well with the measured EC, with an R2 of 0.84 and an RMSE
    of 0.9 (µg/cm2). However, the slope of the trendline (0.99) shows that the smartphone
    image analysis is consistent to within 1%. Thus, the relatively poor predictability of
    EC loading using the BC model is attributed to the differences in the measuring
    techniques.

    Lastly, as shown in Figure 19, the BC model does not agree well with the measured EC (R2 of
    0.84 and RMSE of 0.9 (µg/cm2)), which is reasonable given the difference between the operating
    principles behind BC and EC measurements. Numerous studies have shown that BC measured by an
    aethalometer and EC measured by thermal-optical OC/EC analysis do not correlate well with each
    other (R2=0.65-0.85) (Healy et al., 2017).

    3.5. Integrated RGB Model for All Samples

    An integrated RGB model was trained to investigate the possibility of using a single model to
    predict both the BC and EC loadings of the samples collected in this study. As with the
    separate BC and EC models, the linear interactions regression model was applied because it
    performed best in hold-out validations. As shown in Figure 20 a), the integrated RGB model
    shows good predictability, with an R2 of 0.92 and an RMSE of 1.0 µg/cm2, which is comparable
    with the previous studies listed in Table 2. However, a comparison of Figure 20 a) and
    Figure 20 b) indicates that the integrated model cannot predict EC loading as robustly as the
    Beijing EC model does, although it retains strong predictability for BC quantification. It is
    therefore reasonable and worthwhile to train a separate model for measuring EC loading.

    Figure 20 a) BC/EC loading predicted by the linear interactions regression model

    trained using the whole data set versus actual BC/EC loadings measured by reference

    instruments. b) BC loading predicted by Ontario BC model and EC loading predicted

    by Beijing EC model versus actual BC/EC loadings measured by reference

    instruments.

    Overall, in this study, a MATLAB program was developed for image analysis and BC/EC
    quantification; it will be translated into a smartphone app for both the iOS and Android
    platforms in the future. Demonstration videos of the MATLAB program and the iOS app are
    available at
    https://drive.google.com/drive/folders/1kqDysjEmi_G5jaqiR2iqff5QgSo8EnO8?usp=sharing.
    Furthermore, the RGB-based BC and EC models were trained and assessed, and the results provide
    sufficient evidence that these models can quantify BC and EC loading with an accuracy
    (CV(RMSE)=18.1% and 21.1%, respectively) comparable with that of the previous studies listed
    in Table 2. These two models are also robust enough to predict BC/EC loadings consistently
    despite the various sources and compositions of BC, with LODs of 0.27 and 0.50 [µg/cm2],
    respectively. In other words, given the predictability of these two models, a filter sample
    exposed at a flowrate of 8 [L/min] (the breathing rate of a healthy adult while sitting) under
    a BC concentration of 0.84 [µg/m3] (the annual mean BC concentration of downtown Toronto) for
    13.3 hr will be detectable by this method. Moreover, the integrated RGB model shows a
    promising result, but it is reasonable to use a separate model for EC quantification.
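The detectability statement follows from a simple mass balance: the deposited BC mass is the concentration multiplied by the sampled air volume, and the loading is that mass divided by the deposit area. The Python sketch below reproduces the arithmetic with the values quoted in the text. The deposit area is not given in this section, so it is back-calculated here as an assumption; the function names are also hypothetical.

```python
def deposited_mass_ug(conc_ug_m3, flow_l_min, hours):
    """Mass of BC drawn onto the filter: concentration x sampled air volume."""
    volume_m3 = flow_l_min / 1000.0 * 60.0 * hours  # L/min -> m3 over the exposure
    return conc_ug_m3 * volume_m3

def min_exposure_hours(lod_ug_cm2, area_cm2, conc_ug_m3, flow_l_min):
    """Exposure time at which the filter loading reaches the LOD."""
    mass_needed_ug = lod_ug_cm2 * area_cm2
    mass_per_hour_ug = deposited_mass_ug(conc_ug_m3, flow_l_min, 1.0)
    return mass_needed_ug / mass_per_hour_ug

# Values quoted in the text: 0.84 ug/m3, 8 L/min, 13.3 hr.
mass = deposited_mass_ug(conc_ug_m3=0.84, flow_l_min=8.0, hours=13.3)

# ASSUMPTION: the deposit area is inferred, not stated in this section. It is
# chosen so that the loading after 13.3 hr equals the BC model LOD of 0.27 ug/cm2.
implied_area_cm2 = mass / 0.27
hours_to_lod = min_exposure_hours(0.27, implied_area_cm2, 0.84, 8.0)

print(f"deposited BC mass: {mass:.2f} ug")
print(f"implied deposit area: {implied_area_cm2:.1f} cm2")
print(f"hours to reach LOD: {hours_to_lod:.1f}")
```

With a smaller deposit area, a higher BC concentration, or a higher flow rate, the same LOD would be reached in proportionally less time.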

    Chapter 4 Conclusions and Recommendation

    4.1. Conclusions

    The main contribution of this study is an affordable, accessible, and relatively accurate
    method for BC and EC quantification, developed with the aid of an image processing program and
    based on smartphone images of particle-loaded filters. Moreover, this method is capable of
    predicting BC/EC loadings consistently and precisely for various sources of black carbon.

    The method is based on the light absorption of the particles loaded on the filter substrate,
    which is similar to the principle of one of the reference instruments, the aethalometer
    (AE31/33). This is why the method predicts BC loadings more accurately than EC loadings.
    However, despite the differences between our method and the thermal-optical technique of the
    other reference instrument, the Sunset OC/EC analyzer, the predictability of the EC model is
    still comparable with that reported in the previous literature.

    A smartphone offers distinct advantages for measuring BC/EC loading: it is non-destructive,
    easily accessible, off-the-shelf, low cost, and fast. The use of smartphones makes it possible
    to bring a BC/EC sensor to the wider community and thereby collect more data on BC/EC
    exposure, which will raise awareness of the adverse effects of black carbon on both public
    health and climate change.

    4.2. Recommendation

    Based on the results of this study, several recommendations are presented. Models should be
    trained on other colour spaces (e.g., CIELAB, CIEXYZ) to investigate whether a different
    representation of the colour information affects the predictability of the BC/EC models.
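As a starting point for such a comparison, the Python sketch below converts RGB values to CIEXYZ and CIELAB using the standard sRGB (D65) definitions. It assumes that the smartphone images are sRGB-encoded, which may not hold for every camera pipeline; the function names and the two example colours are hypothetical and are not taken from the study data.

```python
import numpy as np

# Linear-sRGB to CIEXYZ matrix and reference white for the D65 illuminant.
SRGB_TO_XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                        [0.2126729, 0.7151522, 0.0721750],
                        [0.0193339, 0.1191920, 0.9503041]])
D65_WHITE = np.array([0.95047, 1.00000, 1.08883])

def srgb_to_xyz(rgb8):
    """Convert 8-bit sRGB values (0-255) to CIEXYZ with Y scaled to [0, 1]."""
    c = np.asarray(rgb8, dtype=float) / 255.0
    # Undo the sRGB transfer function to obtain linear-light values.
    linear = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    return linear @ SRGB_TO_XYZ.T

def xyz_to_lab(xyz):
    """Convert CIEXYZ to CIELAB relative to the D65 reference white."""
    t = np.asarray(xyz, dtype=float) / D65_WHITE
    delta = 6.0 / 29.0
    f = np.where(t > delta ** 3, np.cbrt(t), t / (3 * delta ** 2) + 4.0 / 29.0)
    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)

# Hypothetical mean filter colours: a clean (white) filter and a grey, loaded one.
colours = np.array([[255, 255, 255], [120, 118, 115]])
lab = xyz_to_lab(srgb_to_xyz(colours))
print(np.round(lab, 1))
```

The resulting L*, a*, and b* (or X, Y, and Z) columns could then replace R, G, and B as predictors in a regression of the same form, so that the models can be compared on equal terms.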

    To commercialize and popularize our method, more smartphone cameras should be tested, and the
    MATLAB program should be optimized and then translated into a smartphone app for both the
    Android and iOS platforms.

    Figure 21 a) Sketch of the experimental set-up. b) Manikin wearing a face mask and
    “breathing” using a “breathing pump” (which can inhale and exhale at a flow rate of
    8 L/min). c) Ambient fine particle (PM2.5) concentrator (which increases the PM2.5
    concentration in the chamber by a factor of 64).

    Furthermore, feasibility tests on other, easier sampling processes for PM2.5 are required,
    such as the sampling of particles onto face masks. This can be done with a face mask exposure
    experiment in our facility, as shown in Figure 21. The “breathing pump” is connected to the
    back of the manikin's head and inhales and exhales at a flow rate of 8 L/min (close to the
    human breathing rate at rest) to simulate human breathing. Moreover, the BC concentration in
    the chamber can be monitored using an aethalometer. With the raw images of the exposed face
    masks and reference data for BC loading, a new “face mask model” could be built.

    Bibliography

    Beijing Municipal Bureau of Statistics. (2018). Economic Development of Beijing Maintained a

    Stable and Good Momentum in 2017. Retrieved July 5, 2018, from

    http://tjj.beijing.gov.cn/English/PR/201801/t20180125_391609.html

    Bond, T. C., Streets, D. G., Yarber, K. F., Nelson, S. M., Woo, J. H., & Klimont, Z. (2004). A

    technology-based global inventory of black and organic carbon emissions from combustion.

    Journal of Geophysical Research: Atmospheres, 109(14), 1–43.

    https://doi.org/10.1029/2003JD003697

    Chakrabarti, A., Scharstein, D., & Zickler, T. (2009). An Empirical Camera Model for Internet

    Color Vision. Proceedings of the British Machine Vision Conference 2009, 51.1-51.11.

    https://doi.org/10.5244/C.23.51

    Cheng, J. Y. W., Chan, C. K., & Lau, A. P. S. (2011). Quantification of airborne elemental

    carbon by digital imaging. Aerosol Science and Technology, 45(5), 581–586.

    https://doi.org/10.1080/02786826.2010.550960

    Cheng, M. H., Hu, M. C., & Lan, K. C. (2017). Tongue fur detection on the smartphone.

    Proceedings - 2016 IEEE International Conference on Bioinformatics and Biomedicine,

    BIBM 2016, 1365–1371. https://doi.org/10.1109/BIBM.2016.7822719

    Chow, J. C., Watson, J. G., Pritchett, L. C., Pierson, W. R., Frazier, C. A., & Purcell, R. G.
    (1993). The DRI thermal/optical reflectance carbon analysis system: Description,
    evaluation and applications in U.S. air quality studies. Atmospheric Environment.
    Part A. General Topics, 27(8), 1185–1201.
    https://doi.org/10.1016/0960-1686(93)90245-t

    Cooke, W. F., Liousse, C., Cachier, H., & Feichter, J. (1999). Construction of a 1° × 1°
    fossil fuel emission data set for carbonaceous aerosol and implementation and radiative
    impact in the ECHAM4 model. Journal of Geophysical Research: Atmospheres, 104(D18),
    22137–22162.

    de la Sota, C., Kane, M., Mazorra, J., Lumbreras, J., Youm, I., & Viana, M. (2017).

    Intercomparison of methods to estimate black carbon emissions from cookstoves. Science of

    the Total Environment, 595, 886–893. https://doi.org/10.1016/j.scitotenv.2017.03.247

    Du, K., Wang, Y., Chen, B., Wang, K., Chen, J., & Zhang, F. (2011). Digital photographic

    method to quantify black carbon in ambient aerosols. Atmospheric Environment, 45(39),

    7113–7120. https://doi.org/10.1016/j.atmosenv.2011.09.035

    He, K., Gkioxari, G., Dollar, P., & Girshick, R. (2017). Mask R-CNN. Proceedings of the IEEE

    International Conference on Computer Vision, 2017-October, 2980–2988.

    https://doi.org/10.1109/ICCV.2017.322

    Healy, R. M., Sofowote, U., Su, Y., Debosz, J., Noble, M., Jeong, C. H., … Munoz, A. (2017).

    Ambient measurements and source apportionment of fossil fuel and biomass burning black

    carbon in Ontario. Atmospheric Environment, 161, 34–47.

    https://doi.org/10.1016/j.atmosenv.2017.04.034

    Karanasiou, A., Minguillón, M. C., Viana, M., Alastuey, A., Putaud, J.-P., Maenhaut, W., …
    Kuhlbusch, T. A. J. (2015). Thermal-optical analysis for the measurement of elemental
    carbon (EC) and organic carbon (OC) in ambient air: a literature review. Atmospheric
    Measurement Techniques Discussions, 8(9), 9649–9712.
    https://doi.org/10.5194/amtd-8-9649-2015

    Khuzestani, R. B., Schauer, J. J., Wei, Y., Zhang, Y., & Zhang, Y. (2017). A non-destructive

    optical color space sensing system to quantify elemental and organic carbon in atmospheric

    particulate matter on Teflon and quartz filters. Atmospheric Environment, 149, 84–94.

    https://doi.org/10.1016/j.atmosenv.2016.11.002

    Kuzmina, I., Lacis, M., Spigulis, J., Berzina, A., & Valeine, L. (2015). Study of smartphone

    suitability for mapping of skin chromophores. Journal of Biomedical Optics, 20(9), 090503.

    https://doi.org/10.1117/1.JBO.20.9.090503

    MathWorks. (2014). Object Detection in a Cluttered Scene Using Point Feature Matching -
    MATLAB & Simulink. Retrieved May 18, 2018, from
    https://www.mathworks.com/help/vision/examples/object-detection-in-a-cluttered-scene-using-point-feature-matching.html#d119e845

    Ministry of Transportation. (2012). Provincial Highways Traffic Volumes 1998-2012 (Vol. i).

    Olson, M. R., Graham, E., Hamad, S., Uchupalanun, P., Ramanathan, N., & Schauer, J. J.

    (2016). Quantification of elemental and organic carbon in atmospheric particulate matter

    using color space sensing-hue, saturation, and value (HSV) coordinates. Science of the Total

    Environment, 548–549, 252–259. https://doi.org/10.1016/j.scitotenv.2016.01.032

    Petzold, A., Ogren, J. A., Fiebig, M., Laj, P., Li, S. M., Baltensperger, U., … Zhang, X. Y.

    (2013). Recommendations for reporting black carbon measurements. Atmospheric

    Chemistry and Physics, 13(16), 8365–8379. https://doi.org/10.5194/acp-13-8365-2013

    Poushter, J., & Stewart, R. (2016). Smartphone Ownership and Internet Usage Continues
    to Climb in Emerging Economies.

    Qin, Y., & Xie, S. D. (2012). Spatial and temporal variation of anthropogenic black carbon

    emissions in China for the period 1980-2009. Atmospheric Chemistry and Physics, 12(11),

    4825–4841. https://doi.org/10.5194/acp-12-4825-2012

    Ramanathan, N., Lukac, M., Ahmed, T., Kar, A., Praveen, P. S., Honles, T., … Ramanathan, V.

    (2011). A cellphone based system for large-scale monitoring of black carbon. Atmospheric

    Environment, 45(26), 4481–4487. https://doi.org/10.1016/j.atmosenv.2011.05.030

    Reinhard, E., Khan, E., Akyuz, A., & Johnson, G. (2008). Color Imaging: Fundamentals and

    Applications (2nd ed.). CRC Press.

    Schwartz E. R., S. E. and L. (2012). Interactive comment on “Are black carbon and soot the

    same?” by P. R. Buseck et al.: Disagreement on proposed nomen