Assessing the repeatability of panels and panellists in sensory evaluations through their free-text comments
Interest and use of textual data in sensometrics
NTTS 2011
Mónica Bécue-Bertaut, Marine Cadoret, Belchin Kostov
Jordi Torrens, Pilar Urpi
Jérôme Pagès
Outline
1. Introduction
2. Data structures
3. Illustrative example
4. Statistical analysis. Correspondence analysis
5. Conclusive remarks
1. Introduction
Food and wine industries use sensory tests in which panels of judges (experts or consumers) assess product quality and acceptance.
Other fields increasingly resort to this kind of tool.
Sensometrics, that is, sensory data collection and analysis through statistical methods, is gaining importance as a tool for quality in industry.
Among these data, textual data are increasingly used, and textual statistics are involved in their analysis. Statistical methods have to take into account the particular features of the corpora issued from hall-test sessions: short corpora composed of very short texts, enriched by substantial information on the products.
2. Complex data structures
a. Data are collected on (few) products by asking panels of experts or consumers to:
scale a series of descriptors,
give liking scores,
group the products into clusters (sorting task),
provide a two-dimensional configuration of the products depending on their similarity (napping),
add free-text comments,
etc.
Thus, the data that describe the products are aggregated data (means, etc.)
[Table: individual assessments. Rows: product i (Pr.) × judge j; columns: descriptors k (rank, intensity, …, global assessment); entries x_ijk.]
[Table: aggregated assessments. Rows: product i (Pr.); columns: descriptors k (intensity, …, global assessment); entries x̄_ik, the mean of x_ijk over the judges.]
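The aggregation from individual scores x_ijk to product means x̄_ik can be sketched in a few lines of Python. This is a minimal illustration, assuming the scores are held in an array indexed as (product, judge, descriptor); the numbers and the function name `aggregate_over_judges` are made up for the example and are not the study's data.

```python
import numpy as np

# Hypothetical scores x[i, j, k]: 2 products, 3 judges, 2 descriptors.
x = np.array([
    [[2.0, 8.0], [3.0, 7.0], [4.0, 9.0]],   # product 1, judges 1..3
    [[5.0, 4.0], [6.0, 5.0], [7.0, 3.0]],   # product 2, judges 1..3
])

def aggregate_over_judges(scores):
    """Return the products x descriptors table of means over judges."""
    return scores.mean(axis=1)

x_bar = aggregate_over_judges(x)
print(x_bar)
```

Other aggregates (medians, counts) would replace `mean` in the same place.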
b. The assessments of experts and/or consumers are enriched by objective characteristics (chemical data, etc.), leading to complex multiple tables (rows = products; columns = aggregated data):
[Figure: multiple table with I rows (products) and J sets of columns: Jq sets of quantitative variables (entries x_ikj), Jc sets of categorical variables (entries z_ikj) and Jf frequency tables (entries f_ikj). Within a set j, columns are indexed 1, …, Kj or 1, …, Qj.]
3. Illustrative example
Hall test session: 8 cavas are tasted by 17 experts on two occasions
Catalan “Cavas”, that is, “Champagne‐type” wines
The judges have to follow the “labelled categorized napping protocol”
4. Statistical analysis
Objectives
To measure the repeatability of the results across both sessions, either for the whole panel or for every judge/panellist
In our case, to study the contribution of textual data to this problem
Corpus issued from the free-text comments
Length of the corpus: N = 1297 occurrences
Composed of V = 190 different words
V' = 62 words used at least 4 times
[Figure: napping sheets of Judge B in the two sessions (B.1 and B.2), with axes from 0 to 60 and from 0 to 40; the eight cavas are positioned on each sheet, grouped, and described by words.]
B.1, words associated to groups: G1-fresc, G2-envellit, G3-sulfur
B.1, words associated to cavas: 1. poc aroma, poma; 2. reducció, evolucionat, fondo brut; 3. cítric, àcid; 4. melmelada; 5. torrat lleu; 6. mel, golós; 7. molt torrats, confitura; 8. torrat
B.2, words associated to groups: G1-envellit, G2-sulfur, G3-fresc-afruitat
B.2, words associated to cavas: 1. (none); 2. fruita madura, inici fons brut; 3. cític, àcid; 4. torrat […]; 5. molt torrat, confitura; 6. (none); 7. melmelada; 8. poc aroma, poma
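The three corpus counts (occurrences N, distinct words V, and the words kept above a frequency threshold) can be obtained as in the sketch below. It assumes plain whitespace tokenisation and lower-casing, which the slides do not specify; the mini-corpus and the function name `corpus_summary` are hypothetical.

```python
from collections import Counter

def corpus_summary(comments, min_count=4):
    """Return (N, V, kept): occurrences, distinct words, words used >= min_count times."""
    # Assumption: words are separated by whitespace and compared in lower case.
    tokens = [w for text in comments for w in text.lower().split()]
    counts = Counter(tokens)
    kept = sorted(w for w, c in counts.items() if c >= min_count)
    return len(tokens), len(counts), kept

# Hypothetical mini-corpus, not the study's comments.
comments = ["torrat mel", "torrat poma", "torrat", "mel torrat"]
n_occ, n_words, kept = corpus_summary(comments, min_count=4)
print(n_occ, n_words, kept)
```

Multi-word expressions such as "poc aroma" would need a different tokeniser if they are to be counted as single units.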
Textual data structure
Global table, crossing cavas×words
Keeping only the words used at least 4 times
[Figure: the global cavas×words table and the two cavas×words tables built separately for Session 1 and Session 2.]
Global representation through correspondence analysis of the cavas×words table
[Figure: first factorial plane of the correspondence analysis, Factor 1 (37.24 %) × Factor 2 (18.69 %), showing the eight cavas.]
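Correspondence analysis of a contingency table can be computed from a singular value decomposition of the standardised residuals. The sketch below follows that textbook formulation with NumPy; it is not the software used for the slides, the small table is made up, and it assumes every row and column has a non-zero total.

```python
import numpy as np

def correspondence_analysis(table, tol=1e-10):
    """CA of a contingency table via SVD of the standardised residuals.

    Returns row and column principal coordinates, the inertia of each axis
    and the percentage of inertia per axis.
    """
    n = np.asarray(table, dtype=float)
    p = n / n.sum()                      # correspondence matrix
    r = p.sum(axis=1)                    # row masses
    c = p.sum(axis=0)                    # column masses
    s = (p - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    u, sv, vt = np.linalg.svd(s, full_matrices=False)
    keep = sv > tol                      # drop the trivial (null) axis
    u, sv, vt = u[:, keep], sv[keep], vt[keep]
    row_coords = (u / np.sqrt(r)[:, None]) * sv
    col_coords = (vt.T / np.sqrt(c)[:, None]) * sv
    inertia = sv ** 2
    return row_coords, col_coords, inertia, 100 * inertia / inertia.sum()

# Hypothetical 3 products x 3 words table.
table = np.array([[10, 2, 1], [3, 8, 2], [1, 3, 9]])
row_coords, col_coords, inertia, pct = correspondence_analysis(table)
print(np.round(pct, 2))
```

The percentages play the role of the "Factor 1" and "Factor 2" shares shown on the plot axes, and the total inertia equals the chi-square statistic of the table divided by its grand total.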
Global representation. Superimposed representation of rows and columns
[Figure: plane Factor 1 × Factor 2 of the correspondence analysis, with the eight cavas and the retained words (in Catalan) plotted together.]
Comparison of the global representations issued from
correspondence analysis of the cavas×words table
hierarchical multiple factor analysis of categorized napping
[Figure: two planes showing the eight cavas. Hierarchical multiple factor analysis: Dim 1 (23.43 %) × Dim 2 (18.29 %). Correspondence analysis: Dimension 1 (37.24 %) × Dimension 2 (18.69 %).]
Repeatability
High repeatability between both sessions
For each session, the wines are projected at the centroid of the words used in the free comments of the session
RV = 0.833, p-value < 0.01
To compare with RV = 0.7 when the repeatability is studied from categorized napping
[Figure: plane Factor 1 (37.24 %) × Factor 2 (18.69 %); each cava appears twice, once per session (labels CavaN-1 and CavaN-2).]
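The projection of session-specific rows and the RV coefficient between the two resulting configurations can be sketched as below. The sketch assumes that the session rows are treated as supplementary rows of the global correspondence analysis (each row placed at the barycentre of its words in standard coordinates) and that the RV coefficient is computed on column-centred coordinates of the first two axes; the slides do not spell out these details, and the tables and function names are hypothetical.

```python
import numpy as np

def column_standard_coords(table, n_axes=2):
    """Standard coordinates of the columns from the CA of the global table."""
    n = np.asarray(table, dtype=float)
    p = n / n.sum()
    r, c = p.sum(axis=1), p.sum(axis=0)
    s = (p - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    _, _, vt = np.linalg.svd(s, full_matrices=False)
    return vt[:n_axes].T / np.sqrt(c)[:, None]

def project_rows(table, col_std):
    """Place each row at the barycentre of its words (supplementary rows)."""
    n = np.asarray(table, dtype=float)
    profiles = n / n.sum(axis=1, keepdims=True)
    return profiles @ col_std

def rv_coefficient(x, y):
    """RV coefficient between two configurations of the same objects."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    sx, sy = x @ x.T, y @ y.T
    return np.trace(sx @ sy) / np.sqrt(np.trace(sx @ sx) * np.trace(sy @ sy))

# Hypothetical session tables (3 products x 3 words) sharing one vocabulary.
session_1 = np.array([[6, 1, 1], [1, 5, 2], [0, 2, 7]])
session_2 = np.array([[5, 2, 0], [2, 6, 1], [1, 1, 6]])
col_std = column_standard_coords(session_1 + session_2)
coords_1 = project_rows(session_1, col_std)
coords_2 = project_rows(session_2, col_std)
rv = rv_coefficient(coords_1, coords_2)
print(round(float(rv), 3))
```

An RV close to 1 indicates that the two sessions give nearly the same configuration of products up to rotation and scaling.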
Study of the repeatability of every judge
[Figure: two panels on the plane Factor 1 (37.24 %) × Factor 2 (18.69 %), one per judge, with points labelled T4-… and T14-… for each cava and session.]
Judge P: RV = 0.293 (in categorized napping, RV = 0.408)
Judge F: RV = 0.683 (p-value < 0.05) (in categorized napping, RV = 0.674)
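The slides report p-values for the RV coefficients without stating how they were obtained. One common option, shown here purely as an assumption, is a permutation test: the products of one configuration are shuffled many times and the observed RV is compared with the permuted values. The coordinates and function names below are hypothetical.

```python
import numpy as np

def rv_coefficient(x, y):
    """RV coefficient between two column-centred configurations."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    sx, sy = x @ x.T, y @ y.T
    return np.trace(sx @ sy) / np.sqrt(np.trace(sx @ sx) * np.trace(sy @ sy))

def rv_permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for RV (an assumed procedure, not the slides' method)."""
    rng = np.random.default_rng(seed)
    observed = rv_coefficient(x, y)
    hits = 0
    for _ in range(n_perm):
        shuffled = y[rng.permutation(len(y))]   # break the product pairing
        if rv_coefficient(x, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical coordinates of 8 products in two sessions of one judge.
session_1 = np.array([[-1.2, -0.6], [-0.9, -0.7], [0.1, 0.1], [0.6, 0.8],
                      [0.3, 0.4], [0.2, -0.1], [0.4, -0.2], [0.5, 0.3]])
noise = np.random.default_rng(1).normal(scale=0.05, size=session_1.shape)
session_2 = session_1 + noise
rv_obs, p_value = rv_permutation_pvalue(session_1, session_2)
print(round(float(rv_obs), 3), p_value)
```

With only eight products the number of distinct permutations is limited, so the smallest attainable p-value depends on `n_perm`.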
We have compared the free-text assessments given by one panel on the same products in two different sessions
We can also compare the free-text assessments given by two panels on the same products, even if they use different languages, by using an extension of correspondence analysis (reproducibility)
Eight Catalan wines are tasted by one French and one Catalan panel (labelled sorting task).
[Figure: three plots. Wines (PG05, PG06, PS05, PS06, ES05, ES06, EG05, EG06) on the plane Dim 1 (28.5 %) × Dim 2 (20.72 %); words used by the Catalan panel; words used by the French panel.]
5. Conclusive remarks
Short corpora, often composed of very short texts, can provide useful information
Provided that they are produced from stimuli that structure them
Cross-cultural studies in different languages can be easily performed
Very important economic spin-offs depend on sensory tests and on their analysis through sensometrics.
The European norm for the “extra olive oil” label relies on panel assessments
Cross‐cultural studies are increasing
The use of free‐text comments is increasing in sensory studies
A similar approach can be used for open‐ended questions collected indifferent languages