
Assessing the repeatability of panels and panellists in sensory evaluations through their free-text comments

Interest and use of textual data in sensometrics

NTTS 2011

Mónica Bécue-Bertaut, Marine Cadoret, Belchin Kostov

Jordi Torrens, Pilar Urpi

Jérôme Pagès


Outline

1. Introduction

2. Data structures

3. Illustrative example

4. Statistical analysis. Correspondence analysis

5. Concluding remarks


1. Introduction

Food and wine industries use sensory tests to assess product quality and acceptance by panels of judges (experts or consumers)

Other fields increasingly resort to this kind of tool

Sensometrics, that is, sensory data collection and analysis through statistical methods, is gaining importance as a tool for quality in industry

Among the data, textual data are increasingly used, and textual statistics are involved in their analysis. Statistical methods have to take into account the particular features of the corpora issued from hall test sessions (short corpora composed of very short texts, enriched by substantial information on the products)


2. Complex data structures

a. Data are collected on (few) products by asking panels of experts or consumers to:

scale a series of descriptors,
give liking scores,
group the products into clusters (sorting task),
provide a two-dimensional configuration of the products depending on their similarity (napping),
add free-text comments,
etc.


Thus, the data that describe the products are aggregated data (means, etc.)

[Tables: individual scores x_ijk (product i, judge j, descriptor k, e.g. rank, intensity, …, global assessment) are aggregated over the judges into a products × descriptors table of means x̄_ik.]
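The aggregation step described on this slide can be sketched in a few lines. This is only an illustration: the scores, the descriptor names and the helper `aggregate_means` are hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical individual scores x_ijk, keyed by (product i, judge j, descriptor k).
scores = {
    (1, 1, "intensity"): 2.0,
    (1, 2, "intensity"): 3.0,
    (2, 1, "intensity"): 5.0,
    (2, 2, "intensity"): 4.0,
    (1, 1, "global"): 8.0,
    (1, 2, "global"): 6.0,
    (2, 1, "global"): 3.0,
    (2, 2, "global"): 5.0,
}

def aggregate_means(x):
    """Average x_ijk over the judges j: one mean per (product, descriptor)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (product, _judge, descriptor), value in x.items():
        totals[(product, descriptor)] += value
        counts[(product, descriptor)] += 1
    return {key: totals[key] / counts[key] for key in totals}

# Products x descriptors table of means (the aggregated data of the slide).
product_means = aggregate_means(scores)
```

Other aggregations (medians, frequencies) would follow the same pattern; the mean is used here because the slide mentions it explicitly.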


b. The assessments of experts and/or consumers are enriched by objective characteristics (chemical data, etc.), leading to complex multiple tables (rows = products; columns = aggregated data):

[Figure: multiple table with I rows (products) whose columns are organized into J_q sets of quantitative variables, J_c sets of categorical variables and J_f frequency tables; generic entries are denoted x_ikj, z_ikj and f_ikj.]


3. Illustrative example

Hall test session: 8 cavas are tasted by 17 experts on two occasions

Catalan “Cavas”, that is, “Champagne‐type” wines

…


The judges have to follow the “labelled categorized napping” protocol


4. Statistical analysis

Objectives

To measure the repeatability of the results across both sessions, either for the whole panel or for every judge/panellist

In our case, to study the contribution of textual data to this problem


Corpus issued from the free‐text comments

Length of the corpus: N = 1297 occurrences

Composed of V = 190 different words

V’ = 62 words used at least 4 times

[Figure: categorized napping configurations of Judge B, with the words associated with groups of cavas. Panel B.1: groups G1-fresc, G2-envellit, G3-sulfur; words: 1. poc aroma, poma; 2. reducció, evolucionat, fondo brut; 3. cítric, àcid; 4. melmelada; 5. torrat lleu; 6. mel, golós; 7. molt torrats, confitura; 8. torrat. Panel B.2: groups G1-envellit, G2-sulfur, G3-fresc-afruitat; words: 2. fruita madura, inici fons brut; 3. cític, àcid; 4. torrat […]; 5. molt torrat, confitura; 7. melmelada; 8. poc aroma, poma.]

Textual data structure

Global table, crossing cavas×words

Keeping only the words used at least 4 times

[Figure: the global cavas × words table, and the two per-session cavas × words tables (Session 1, Session 2).]
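As a rough sketch of how such a cavas × words frequency table could be assembled from free-text comments, the snippet below counts words per product and drops the rare ones. The comments are made-up examples that reuse a few words seen on the slides, the tokenisation (lower-casing, splitting on commas and spaces) is a simplifying assumption, and `build_lexical_table` is a hypothetical helper; the threshold is lowered to 2 only because the toy corpus is tiny.

```python
from collections import Counter

def build_lexical_table(comments, min_count=4):
    """Build a products x words frequency table from (product, comment) pairs.

    Only words whose total frequency in the corpus is at least `min_count`
    are kept as columns.
    """
    tokenised = [
        (product, text.lower().replace(",", " ").split())
        for product, text in comments
    ]
    totals = Counter(word for _, words in tokenised for word in words)
    kept = sorted(word for word, n in totals.items() if n >= min_count)
    products = sorted({product for product, _ in tokenised})
    col = {word: j for j, word in enumerate(kept)}
    row = {product: i for i, product in enumerate(products)}
    table = [[0] * len(kept) for _ in products]
    for product, words in tokenised:
        for word in words:
            if word in col:
                table[row[product]][col[word]] += 1
    return products, kept, table

# Hypothetical comments, not the study's corpus.
comments = [
    ("Cava 1", "poc aroma, poma"),
    ("Cava 1", "poma, fresc"),
    ("Cava 1", "fresc"),
    ("Cava 2", "torrat, mel"),
    ("Cava 2", "molt torrat, confitura"),
]
products, words, table = build_lexical_table(comments, min_count=2)
```

Running the same function on the comments of each session separately would give the per-session tables.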

Global representation through correspondence analysis of the cavas×words table

[Figure: first factorial plane showing the eight cavas. Factor 1: 37.24 %; Factor 2: 18.69 %.]
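For readers unfamiliar with the method, correspondence analysis of a frequency table can be sketched through the singular value decomposition of the standardized residuals. The code below is a minimal textbook version using NumPy, not the software used for the slides; the 3 × 3 table is hypothetical, and the function assumes every row and column has a non-zero total.

```python
import numpy as np

def correspondence_analysis(table):
    """Plain correspondence analysis of a two-way frequency table."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals from the independence model r c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    eig = sv ** 2                        # principal inertias (eigenvalues)
    rows = (U * sv) / np.sqrt(r)[:, None]      # principal row coordinates
    cols = (Vt.T * sv) / np.sqrt(c)[:, None]   # principal column coordinates
    inertia_pct = 100 * eig / eig.sum()  # percentage of inertia per factor
    return rows, cols, eig, inertia_pct

# Hypothetical products x words counts, for illustration only.
table = [[10, 2, 1],
         [3, 8, 2],
         [1, 3, 9]]
rows, cols, eig, inertia_pct = correspondence_analysis(table)
```

The first two columns of `rows` and `cols` are what a factorial plane such as the one on this slide displays, and `inertia_pct` corresponds to the percentages printed on the axes.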


Global representation. Superimposed representation of rows and columns

[Figure: first factorial plane (Factor 1, Factor 2) with the eight cavas and the retained words superimposed, e.g. fresc, floral, àcid, torrat, sulfur, evolucionat, reduit, mel.]

Comparison of the global representations issued from

correspondence analysis of the cavas×words table

hierarchical multiple factor analysis of categorized napping

[Figure: two panels showing the eight cavas. Hierarchical multiple factor analysis: Dim 1 (23.43 %), Dim 2 (18.29 %). Correspondence analysis: Dimension 1 (37.24 %), Dimension 2 (18.69 %).]

Repeatability

High repeatability between both sessions

For each session, the wines are projected at the centroid of the words used in the free-text comments of the session

RV = 0.833, p-value < 0.01

To compare with RV = 0.7 when the repeatability is studied from categorized napping

[Figure: first factorial plane (Factor 1: 37.24 %, Factor 2: 18.69 %) showing each cava once per session, labelled Cava1-1 to Cava8-1 and Cava1-2 to Cava8-2.]
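The RV coefficient quoted above measures how similar two configurations of the same products are, up to rotation and scaling. A minimal sketch follows; the coordinates are invented, and the permutation test is an assumption about how a p-value might be obtained, since the slides do not state which test was used.

```python
import numpy as np

def rv_coefficient(X, Y):
    """RV coefficient between two configurations of the same n products."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)              # column-centre each configuration
    Yc = Y - Y.mean(axis=0)
    Wx = Xc @ Xc.T                       # n x n cross-product matrices
    Wy = Yc @ Yc.T
    return np.trace(Wx @ Wy) / np.sqrt(np.trace(Wx @ Wx) * np.trace(Wy @ Wy))

def rv_permutation_pvalue(X, Y, n_perm=199, seed=0):
    """Permutation p-value: shuffle the products of Y and recompute RV."""
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y, dtype=float)
    observed = rv_coefficient(X, Y)
    hits = sum(
        rv_coefficient(X, Y[rng.permutation(len(Y))]) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical coordinates of 8 products on two factors, one array per session.
session_1 = np.array([[-1.2, -0.6], [-0.9, -0.5], [0.1, -0.2], [0.3, 0.7],
                      [0.5, 0.4], [0.2, -0.1], [0.4, 0.0], [0.6, 0.3]])
session_2 = np.array([[-1.0, -0.7], [-1.1, -0.4], [0.0, -0.3], [0.4, 0.8],
                      [0.4, 0.3], [0.3, 0.1], [0.5, -0.1], [0.5, 0.3]])
rv, p_value = rv_permutation_pvalue(session_1, session_2)
```

An RV close to 1 indicates that the two sessions place the products in nearly the same relative positions.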

Study of the repeatability of every judge

Judge P: RV = 0.293 (in categorized napping, RV = 0.408)

Judge F: RV = 0.683, p-value < 0.05 (in categorized napping, RV = 0.674)

[Figure: two panels on the first factorial plane (Factor 1: 37.24 %, Factor 2: 18.69 %), one per judge (point labels T4-… and T14-…), each showing the cavas in both sessions.]

We have compared the free‐text assessments given by one panel on the same products in two different sessions

We can also compare the free‐text assessments given by two panels on the same products, even if they use different languages, by using an extension of correspondence analysis (reproducibility)


Eight Catalan wines are tasted by one French and one Catalan panel (labelled sorting task).

[Figure: three panels. Wines (PG05, PG06, PS05, PS06, ES05, ES06, EG05, EG06) on Dim 1 (28.5 %) and Dim 2 (20.72 %); words used by the Catalan panel (e.g. fruita, torrat, fusta, astringent, cítric); words used by the French panel (e.g. fruit, toasté, bois, astringence, acide).]

5. Concluding remarks

Short corpora, often composed of very short texts, can provide useful information

provided that they are produced from stimuli that structure them

Cross-cultural studies in different languages can be easily performed

Important economic spin‐offs depend on sensory tests and on their analysis through sensometrics.

The European norm for the “extra olive oil” label relies on panel assessments

Cross‐cultural studies are increasing

The use of free‐text comments is increasing in sensory studies

A similar approach can be used for open‐ended questions collected in different languages


