Rons, N., presentation at Context Counts: Pathways to Master Big and Little Data, STI conference, 3-5 September 2014, Leiden, the Netherlands
STI 2014, Version 04.09.2014 p. 1
Investigation of Partition Cells as a Structural Basis Suitable for
Assessments of Individual Scientists
Nadine Rons
Research Coordination Unit, Vrije Universiteit Brussel (VUB)
STI 2014 p. 2
Reference domains for specialized entities
• Context & options
• Investigated method: Partition cells (journal-based structures smaller than subject categories in global databases)
  I. Closer fit to real publication records?
  II. Level of accuracy compared to customary levels?
• Potential effect
• Remaining issues & questions
STI 2014 p. 3
To measure, or not to measure?
Citation-based figures in assessments of individual scientists?
Issues: Domain-dependent publication and citation behaviour, contributions of co-authors, career stage, data accuracy, …
Context: More emphasis in research policy on individual excellence (e.g. ERC Advanced Grants, peer review based selection)
-> Need for suitable indicators
STI 2014 p. 4
Performance compared to …
… domain related 'standards' or reference values: averages, thresholds, …, e.g.:
– Field normalized citation impact e.g. CPP/FCSm, MNCS
– Threshold for highly cited papers e.g. top x%; CSS outstandingly cited papers class
… for individual scientists
This requires a reference domain delineated more accurately than subject categories, reflecting the appropriate citation characteristics at approximately (±) specialty level
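The two field-normalized indicators named on this slide differ in where the division happens: CPP/FCSm is a ratio of averages, MNCS an average of ratios. A minimal Python sketch, assuming each paper comes with its citation count and an expected (reference domain mean) citation count; the function names and the three-paper record are illustrative assumptions, not data from this study:

```python
def cpp_fcsm(citations, expected):
    """Ratio of averages: total citations over total expected citations."""
    return sum(citations) / sum(expected)


def mncs(citations, expected):
    """Average of ratios: mean of the per-paper citations/expected ratios."""
    ratios = [c / e for c, e in zip(citations, expected)]
    return sum(ratios) / len(ratios)


# Hypothetical record of three papers (illustrative numbers only).
citations = [10, 2, 0]
expected = [5.0, 4.0, 2.0]

print(round(cpp_fcsm(citations, expected), 3))
print(round(mncs(citations, expected), 3))
```

Both indicators depend directly on the expected values, which is why the choice of reference domain matters for small publication records.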
STI 2014 p. 5
Options for approaching a specialty
• Import fine-grained classification schemes maintained by the research community (e.g. Chemical Abstracts)
• Calculate approximations of specialties using paper-based algorithms (involving e.g. bibliographic coupling, co-citation, direct citation, co-words, or a combination)
This paper:
• Use journal-based structures that are smaller than subject categories (of intermediate size between journals and entire subject categories)
STI 2014 p. 6
On specialties, journals & subject categories
• Articles on a given subject are published in a nucleus of periodicals more particularly devoted to the subject and, with lower productivity, in several other groups of journals. (Bradford, 1934)
≈
• Journals contain articles on a number of different subjects in varying proportions. The journal is too broad a unit of analysis to reveal the structure of specialties. (Small, 1974)
• Journals are assigned to WoS-categories by subjective, heuristic methods, incl. journal citation patterns. (Pudovkin & Garfield, 2002)
WANTED: Structures generating 'standards' applicable to specialties.
STI 2014 p. 7
Smaller journal-based structures
• Higher precision than subject categories + stability of a journal-based structure
• Cell: publications of interest to a specific set of subject categories (influencing citation characteristics)
A partition of a set X is a set of nonempty subsets of X (blocks, parts or cells of the partition) such that every element x in X is in exactly one of these subsets.
[Figure: Venn diagram of subject categories A and B, with X = A ∪ B, partitioned into Cell A \ B, Cell A ∩ B and Cell B \ A]
Each cell of the partition contains all publications associated to exactly the same combination of subject categories:
• A only: Cell C_A
• A and B: Cell C_{A;B}
• B only: Cell C_B
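The definition above can be sketched in a few lines of Python: articles are grouped by the exact combination of subject categories attached to them. This is a minimal illustration, assuming each article inherits the subject categories of its journal; the function name, the `;`-joined cell key and the sample articles are hypothetical.

```python
from collections import defaultdict


def partition_cells(article_categories):
    """Group article ids by the exact set of subject categories assigned to them."""
    cells = defaultdict(list)
    for article_id, categories in article_categories.items():
        # Sort so that ("B", "A") and ("A", "B") define the same cell.
        key = ";".join(sorted(set(categories)))
        cells[key].append(article_id)
    return dict(cells)


# Hypothetical articles in two overlapping subject categories A and B.
articles = {
    "a1": ["A"],
    "a2": ["A", "B"],
    "a3": ["B", "A"],
    "a4": ["B"],
}
cells = partition_cells(articles)
print(cells)
```

Because the key is the full combination, every article lands in exactly one cell, which is the defining property of a partition.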
STI 2014 p. 8
From subject categories to partition cells
JCR Edition 2011 | Science | Social Sciences
Number of articles | 1145591 | 132104
Number of subject categories | 176 | 56
  range associated to articles/journals | 1->6 | 1->5
  mean number associated to articles | 1.6 | 1.4
% of articles in subject categories
  with size in range ]10000, 50000] | 55.0% | 8.6%
  with size in range ]5000, 10000] | 26.8% | 24.5%
  with size in range ]1000, 5000] | 17.3% | 59.5%
  with size in range ]500, 1000] | 0.7% | 6.1%
  with size in range ]100, 500] | 0.2% | 1.4%
Number of partition cells | 1714 | 458
% of articles in cells
  with size in range ]10000, 50000] | 21.3% | 0.0%
  with size in range ]5000, 10000] | 17.9% | 18.7%
  with size in range ]1000, 5000] | 33.6% | 41.8%
  with size in range ]500, 1000] | 11.0% | 12.3%
  with size in range ]100, 500] | 13.4% | 17.3%
  with size in range ]0, 100] | 2.9% | 9.8%
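The size distribution above weights each unit (subject category or cell) by its number of articles and bins it into ranges that exclude the lower and include the upper bound. A minimal Python sketch of that binning, with hypothetical cell sizes; the function name is an assumption and the numbers are not the JCR data:

```python
import bisect

# Upper bounds of the size ranges ]lower, upper] used on the slide.
BOUNDS = [100, 500, 1000, 5000, 10000, 50000]
LABELS = ["]0, 100]", "]100, 500]", "]500, 1000]",
          "]1000, 5000]", "]5000, 10000]", "]10000, 50000]"]


def article_shares_by_size_range(sizes):
    """Percentage of all articles that sit in units of each size range."""
    total = sum(sizes)
    shares = dict.fromkeys(LABELS, 0.0)
    for size in sizes:
        if not 0 < size <= BOUNDS[-1]:
            raise ValueError(f"size {size} is outside ]0, {BOUNDS[-1]}]")
        # bisect_left returns the index of the first bound >= size, which
        # matches ranges that exclude the lower and include the upper bound.
        shares[LABELS[bisect.bisect_left(BOUNDS, size)]] += 100.0 * size / total
    return shares


# Hypothetical cell sizes (illustrative only).
example = article_shares_by_size_range([100, 400, 500, 4000, 5000])
print(example)
```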
STI 2014 p. 9
Partition cells — Two perspectives
I. Closer fit to publication records of individual scientists?
Sample: ERC Advanced Grants, 1st Call (2008), 2 panels:
• 'Mathematical foundations' M (21)
• 'Fundamental constituents of matter' F (14)
Distribution of articles over cells (2000-2007)
II. Level of accuracy compared to customary levels?
• Mean expected number of citations per publication
• Threshold number of citations for highly cited publications
Calculation per cell (2 domains) <-> Customary accuracy levels with calculation per publication year & per citation window
STI 2014 p. 10
I. Concentration of real publication records
Grantee | # Cells | # Articles 2000-2007 | Top share: combination of subject categories defining the cell | Secondary share: combination of subject categories defining the cell
M1 | 1 | 11 | 100% Mathematics |
M8 | 3 | 14 | 71% Physics, Mathematical | 21% Mathematics
M11 | 9 | 27 | 56% Mathematics, Applied | 19% Engineering, Multidisciplinary;Mathematics, Interdisciplinary Applications;Mechanics
M13 | 4 | 12 | 50% Statistics & Probability | 33% Physics, Mathematical
M16 | 8 | 19 | 37% Computer Science, Interdisciplinary Applications;Physics, Mathematical | 26% Mathematics, Applied
M20 | 12 | 22 | 23% Computer Science, Artificial Intelligence | 14% Mathematics, Applied
F1 | 4 | 34 | 74% Physics, Particles & Fields | 21% Physics, Multidisciplinary
F2 | 4 | 19 | 63% Astronomy & Astrophysics;Physics, Particles & Fields | 21% Physics, Multidisciplinary
F3 | 8 | 117 | 50% Optics;Physics, Atomic, Molecular & Chemical | 42% Physics, Multidisciplinary
F5 | 10 | 85 | 47% Physics, Multidisciplinary | 18% Multidisciplinary Sciences
F6 | 9 | 98 | 46% Optics | 28% Physics, Multidisciplinary
F13 | 14 | 69 | 29% Physics, Fluids & Plasmas | 25% Physics, Multidisciplinary
Data sourced from Thomson Reuters Web of Knowledge (formerly referred to as ISI Web of Science). Web of Science (WoS) accessed online 04-21.10.2013.
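The concentration figures in this table are shares of a grantee's articles per cell. A minimal Python sketch for grantee M8, assuming article counts of 10, 3 and 1 per cell; these counts are inferred from the 71%, 21% and 7% of 14 articles reported in this presentation, and the function name is hypothetical:

```python
from collections import Counter


def cell_shares(article_cells):
    """Return (cell, share in %) pairs, largest share first."""
    counts = Counter(article_cells)
    total = len(article_cells)
    return [(cell, 100.0 * n / total) for cell, n in counts.most_common()]


# One entry per article; counts per cell are inferred, not taken from WoS.
m8_articles = (
    ["Physics, Mathematical"] * 10
    + ["Mathematics"] * 3
    + ["Physics, Mathematical;Physics, Multidisciplinary;Physics, Particles & Fields"]
)
shares = cell_shares(m8_articles)
for cell, share in shares:
    print(f"{share:.0f}% {cell}")
```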
STI 2014 p. 11
II. Accuracy levels compared
A comparison in two domains
Mathematics domain: C_M, C_{M;MA}, C_{MA}; M = Mathematics, MA = Mathematics, Applied
Physics sub-domain: C_{AA}, C_{AA;PPF}, C_{PPF}; AA = Astronomy & Astrophysics, PPF = Physics, Particles & Fields
Absolute relative difference: reference values compared for successive … | Mean expected number of citations per article | Threshold number of citations for outstandingly cited articles
… Publication years | 3%-18% | 0%-31%
  Articles 2005, 2006, 2007 | 0%-9% | 1%-20%
… Citation window lengths | 29%-51% | 8%-59%
  3, 4, 5 years | 18%-37% | 19%-42%
≈
… Partition cells C defined by | 6%-42% | 3%-36%
  combinations of subject categories | 0%-28% | 0%-20%
Data sourced from Thomson Reuters Web of Knowledge (formerly referred to as ISI Web of Science). Web of Science (WoS) accessed online 10.01-30.05.2013.
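The slide does not state which value serves as the denominator of the absolute relative difference. A minimal Python sketch, assuming the difference between two successive reference values is taken relative to the earlier one; the function names and the sample values are hypothetical:

```python
def abs_relative_difference(value, reference):
    """|value - reference| / reference (choice of denominator is an assumption)."""
    return abs(value - reference) / reference


def successive_differences(values):
    """Absolute relative difference between each value and its predecessor."""
    return [abs_relative_difference(b, a) for a, b in zip(values, values[1:])]


# Hypothetical expected citation rates for three successive publication years.
rates = [5.0, 6.0, 7.5]
diffs = successive_differences(rates)
print([f"{d:.0%}" for d in diffs])
```

The same function applies unchanged to successive citation window lengths or to neighbouring partition cells, which is what makes the three rows of the comparison commensurable.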
STI 2014 p. 12
When does it matter in particular?
Subject category or cell | # Articles 2000-2007 | % Articles | Expected citation rate (citation window 5 years; range for publication years 2000-2007)
Subject category PHYSICS, MATHEMATICAL | 56006 | 100.0% | 6.4-8
(articles associated to multiple subject categories fractionally counted)
Cells in subject category PHYSICS, MATHEMATICAL defined by combinations of subject categories:
PHYSICS, FLUIDS & PLASMAS;PHYSICS, MATHEMATICAL | 18594 | 33.2% | 8.7-10.5
PHYSICS, MATHEMATICAL | 8455 | 15.1% | 4.9-6.4
PHYSICS, MATHEMATICAL;PHYSICS, MULTIDISCIPLINARY | 7612 | 13.6% | 4.8-6.1
COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS;PHYSICS, MATHEMATICAL | 4652 | 8.3% | 7.3-10.5
MATHEMATICS, APPLIED;PHYSICS, MATHEMATICAL | 4416 | 7.9% | 5.2-7.4
MATHEMATICS, INTERDISCIPLINARY APPLICATIONS;PHYSICS, MATHEMATICAL;PHYSICS, MULTIDISCIPLINARY | 3379 | 6.0% | 3.8-13.9
PHYSICS, APPLIED;PHYSICS, CONDENSED MATTER;PHYSICS, MATHEMATICAL | 2443 | 4.4% | 1.5-2.2
PHYSICS, MATHEMATICAL;PHYSICS, NUCLEAR;PHYSICS, PARTICLES & FIELDS | 1744 | 3.1% | 4-6.4
MATHEMATICS, APPLIED;PHYSICS, MATHEMATICAL;PHYSICS, MULTIDISCIPLINARY | 1547 | 2.8% | 7-8.9
… | … | … | …
Data sourced from Thomson Reuters Web of Knowledge (formerly referred to as ISI Web of Science). Web of Science (WoS) accessed online 16.07.2014.
STI 2014 p. 13
How much can it matter?
Normalized Mean Citation Rate, citation window 5 years
Grantee M8, 14 articles 2000-2007
• 71% in Cell Physics, Mathematical (4.9-6.4 citations per article)
• 21% in Cell Mathematics (2.1-3.1 citations per article)
• 7% in Cell Physics, Mathematical;Physics, Multidisciplinary;Physics, Particles & Fields (2.3-5.8 citations per article)
Mean expected number of citations
• Cell-based: 66.9
• Subject category-based: 93.5
Effect on indicator results (cell-based vs. subject category-based)
• Factor: 1.4
• Compare: CPP/FCSm significantly far below (< 0.5), below (0.5 - 0.8), around (0.8 - 1.2), above (1.2 - 1.5), and far above (> 1.5) the international impact standard of the field
Data sourced from Thomson Reuters Web of Knowledge (formerly referred to as ISI Web of Science). Web of Science (WoS) accessed online 04-09.10.2013 & 16.06-16.07.2014.
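The factor of 1.4 follows from the two expected values: with the same observed citations, dividing by 66.9 instead of 93.5 raises the normalized score by 93.5 / 66.9. A minimal Python sketch of that arithmetic; the starting score of 1.0 is hypothetical, and the treatment of values that fall exactly on a class boundary is an assumption:

```python
cell_based_expected = 66.9
category_based_expected = 93.5

# Same observed citations, smaller denominator: the score scales by this ratio.
factor = category_based_expected / cell_based_expected


def cpp_fcsm_class(value):
    """Map a CPP/FCSm value to the classes quoted on the slide."""
    if value < 0.5:
        return "far below"
    if value < 0.8:
        return "below"
    if value <= 1.2:
        return "around"
    if value <= 1.5:
        return "above"
    return "far above"


# Hypothetical subject category-based score, rescaled to the cell-based reference.
category_based_score = 1.0
cell_based_score = category_based_score * factor

print(round(factor, 1))
print(cpp_fcsm_class(category_based_score), "->", cpp_fcsm_class(cell_based_score))
```

A factor of this size is enough to move a record from one impact class to the next, which is the point of the comparison on this slide.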
STI 2014 p. 14
Conclusions
Advantages
• Stable journal-based structure
• Closer fit to specialty than possible with larger subject categories
• Possibility to differentiate between cells with different citation characteristics
• Readily available, for all disciplines
New and remaining issues
• Minority in small cells
• When does a reference set fit closely enough?
• Multidisciplinary journals and interdisciplinary research
Questions and opportunities
• Indicators & validation (<-> peer review results)
• Effects for different specialties and indicators
• Other contexts besides individual scientists
STI 2014, Version 04.09.2014 p. 15
Thank you for your attention!
Nadine Rons
Vrije Universiteit Brussel (VUB) VUB R&D Dept, Research Coordination Unit
Pleinlaan 2, B-1050 Brussels, Belgium
http://rd-ir.vub.ac.be/en_GB/people/show/id/554
http://be.linkedin.com/pub/nadine-rons/55/2a/436