Near Infrared Spectroscopy and Chemometrics: a Marriage Made in Heaven

Tom Fearn
UCL, London
tom@stats.ucl.ac.uk

Introduction

• Quantitative near infrared spectroscopy needs chemometrics, but both sides of this partnership have benefited from the interaction:
– NIR has provided chemometrics with a success story,
– NIR applications have inspired new chemometric methodology and have supplied a market for the development and sale of chemometric software.

• Over the last 30 years the two methodologies have prospered together.

Near infrared spectra of 40 biscuit doughs

Spectra with the mean subtracted
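
The mean subtraction named in the caption above can be sketched as follows. This is only an illustration: it assumes numpy, the 700-wavelength grid is an arbitrary choice, and the matrix is filled with random placeholder values rather than the biscuit dough spectra.

```python
import numpy as np

# Placeholder spectral matrix: 40 samples (rows) by 700 wavelengths (columns).
# The values are random stand-ins, not the biscuit dough data.
rng = np.random.default_rng(0)
spectra = rng.normal(loc=1.0, scale=0.1, size=(40, 700))

# Mean spectrum across samples: one value per wavelength.
mean_spectrum = spectra.mean(axis=0)

# Subtract the mean spectrum from every sample (numpy broadcasts over rows).
centred = spectra - mean_spectrum
```

After this step every wavelength column has mean zero, so what remains is the sample-to-sample variation that a calibration has to work with.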

Early history (1970s and 80s) - MLR

• Sometime in the 1970s, Karl Norris started using multiple linear regression to make quantitative NIR calibrations.

• This led directly to a generation of filter instruments, calibrated by MLR, with many successful applications in food and agriculture.

• MLR combined with variable (i.e. wavelength) selection was the routine method for scanning instruments.

Five wavelengths for biscuit composition

MLR and variable selection - comments

• The regression can be ill-conditioned, because of high correlations between the predictors.

• In the early days wavelength selection was hampered by slow computers. It went out of fashion when methods like PLS appeared. However:
– It has enjoyed a revival: faster computers, stochastic search, genetic algorithms, etc.
– If you want to build a cheap and fast instrument, it is useful to reduce the measurement to a few wavelengths.
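
The ill-conditioning point can be illustrated with a small sketch. It assumes numpy and uses synthetic two-band "spectra". The helper `mlr_fit`, the band shapes, the noise level and the chosen column indices are all hypothetical, and only two wavelengths are used (rather than the five in the biscuit example) to keep the comparison easy to read.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_wavelengths = 40, 200

# Synthetic smooth spectra: two broad bands with sample-specific amounts plus noise.
grid = np.linspace(0.0, 1.0, n_wavelengths)
band_a = np.exp(-((grid - 0.3) ** 2) / 0.02)
band_b = np.exp(-((grid - 0.7) ** 2) / 0.02)
conc = rng.uniform(0.0, 1.0, size=(n_samples, 2))
X = conc @ np.vstack([band_a, band_b]) + 0.001 * rng.normal(size=(n_samples, n_wavelengths))
y = conc[:, 0]  # hypothetical constituent to calibrate for

def mlr_fit(X, y, columns):
    """Least-squares fit of y on an intercept plus the chosen wavelength columns."""
    A = np.column_stack([np.ones(len(y)), X[:, columns]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    # The condition number of the design matrix measures how ill-conditioned the fit is.
    return coef, np.linalg.cond(A)

# Two neighbouring wavelengths are almost collinear ...
coef_adjacent, cond_adjacent = mlr_fit(X, y, [60, 61])
# ... while one wavelength from each band gives nearly independent predictors.
coef_spread, cond_spread = mlr_fit(X, y, [60, 140])

print(f"condition number, adjacent wavelengths: {cond_adjacent:.3g}")
print(f"condition number, spread wavelengths:   {cond_spread:.3g}")
```

On data of this kind the adjacent pair should give a much larger condition number than the spread pair, which is one reason the choice of wavelengths matters for MLR.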

Consolidation (1980s - …) – PLS and PCR

• In the early 1980s the first commercial scanning instruments appeared.

• This prompted a wave of research investigating new applications, especially in food & agriculture. Calibration methods that used the full spectrum soon began to be employed:
– Principal component regression (PCR), Cowe and McNicol in Scotland,
– Partial least squares regression (PLSR), Wold, Martens even further north.

• These have become the standard approaches.
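
A minimal sketch of PCR, assuming numpy and synthetic low-rank spectra. The function name `pcr_fit`, the simulated data and the choice of two components are illustrative assumptions, not the calibrations shown in these slides.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: regress y on the leading PC scores of X."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal component loadings of the centred spectra.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T          # shape (n_wavelengths, n_components)
    scores = Xc @ V                  # shape (n_samples, n_components)
    # The scores are mutually orthogonal with squared lengths s**2,
    # so least squares reduces to a component-wise division.
    gamma = (scores.T @ (y - y_mean)) / (s[:n_components] ** 2)
    b = V @ gamma                    # coefficients on the original wavelengths
    b0 = y_mean - x_mean @ b         # intercept
    return b0, b

# Synthetic two-band spectra (placeholders, not real NIR data).
rng = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 200)
bands = np.vstack([np.exp(-((grid - 0.3) ** 2) / 0.02),
                   np.exp(-((grid - 0.7) ** 2) / 0.02)])
conc = rng.uniform(0.0, 1.0, size=(40, 2))
X = conc @ bands + 0.001 * rng.normal(size=(40, 200))
y = conc[:, 0]

b0, b = pcr_fit(X, y, n_components=2)
y_hat = b0 + X @ b
max_abs_error = np.max(np.abs(y_hat - y))
```

The error computed here is a fit to the training samples only. In practice the number of components would be chosen by cross-validation or with a separate validation set.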

Coefficients for PCR(7) and PLSR(5) for fat

PCR and PLSR - comments

• Although they involve more complex maths, they are actually easier to implement (especially semi-automatically) than MLR with variable selection.

• In this application there are sound reasons for believing in low-dimensional bases for the spectra, so using factor-type methods makes sense.

• There is lots of scope for informative plots and (eventually) there is software that draws them.

• The plots allow (over-?) interpretation of the basis of a calibration.
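
To give a sense of how little code a factor method needs, here is a sketch of PLS regression for a single response (PLS1) in the classical deflation form. It assumes numpy and synthetic data. The function name `pls1_fit` and the choice of two factors are hypothetical, and this is not the commercial software alluded to above.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 by successive deflation; returns intercept, coefficients and loadings."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)       # weight vector: direction of maximum covariance with y
        t = Xk @ w                   # scores
        tt = t @ t
        p = Xk.T @ t / tt            # X loadings
        qa = yk @ t / tt             # y loading
        Xk = Xk - np.outer(t, p)     # deflate X
        yk = yk - qa * t             # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Regression coefficients on the original wavelengths: b = W (P'W)^{-1} q.
    b = W @ np.linalg.solve(P.T @ W, q)
    b0 = y_mean - x_mean @ b
    return b0, b, P

# Synthetic two-band spectra (placeholders, not real NIR data).
rng = np.random.default_rng(3)
grid = np.linspace(0.0, 1.0, 200)
bands = np.vstack([np.exp(-((grid - 0.3) ** 2) / 0.02),
                   np.exp(-((grid - 0.7) ** 2) / 0.02)])
conc = rng.uniform(0.0, 1.0, size=(40, 2))
X = conc @ bands + 0.001 * rng.normal(size=(40, 200))
y = conc[:, 0]

b0, b, loadings = pls1_fit(X, y, n_components=2)
max_abs_error = np.max(np.abs(b0 + X @ b - y))
```

The columns of `loadings` and the vector `b` are the quantities usually plotted against wavelength when interpreting a calibration.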

Loadings for PCR and PLSR for fat

Getting cleverer (?) – ANN and SVM

• More recent developments have focussed on the use of nonlinear calibration methods borrowed from Computer Science.

• First came artificial neural networks (ANN)
– Very fashionable for a while
– A lot of work using small training sets and inadequate validation caused some disillusion
– A bit more of a black box than PLS
– Still being used in some major applications, with large training sets

. . . ANN and SVM, continued

• Then came support vector machines (SVM)
– Still the height of fashion
– Not being abused so badly (lessons learned?)

• Then came ??
– Look and see what the machine learning community is doing now. NIR will be doing it in a few years.

• Advice: try PLSR or PCR first!

Current challenges

• Large training sets – is it better to use local linear or global nonlinear calibration methods?

• Imaging spectroscopy
– How to deal with a data cube of 80k spectra for one sample?
– How to calibrate when you don’t know what each pixel is?
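
One common piece of bookkeeping for a data cube is to unfold it into an ordinary spectra-by-wavelengths matrix, sketched below with numpy. The 320 x 250 pixel layout (80,000 spectra), the 50 wavelengths and the coefficient vector are all hypothetical. The sketch addresses only data handling, not the harder question of what each pixel is.

```python
import numpy as np

# Hypothetical image: 320 x 250 pixels = 80,000 spectra, 50 wavelengths each.
rows, cols, n_wavelengths = 320, 250, 50
rng = np.random.default_rng(4)
cube = rng.random((rows, cols, n_wavelengths), dtype=np.float32)

# Unfold: every pixel becomes one row of a samples-by-wavelengths matrix.
pixels = cube.reshape(rows * cols, n_wavelengths)

# Apply a (hypothetical) linear calibration b0 + x @ b to all pixels at once ...
b0 = 0.0
b = rng.normal(size=n_wavelengths).astype(np.float32)
predicted = b0 + pixels @ b

# ... and fold the predictions back into an image of predicted values.
prediction_map = predicted.reshape(rows, cols)
```

Once unfolded, the pixel matrix can be passed to the same centring, PCA or regression code used for conventional spectra, subject to memory limits.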

Thanks for listening.

Questions?
