Oct 2006
Research Metrics
What was proposed … what might work
Jonathan Adams
Overview
• RAE was seen as burdensome and distorting
• Treasury proposed a metrics-based QR allocation system
• The outline metric model is inadequate, unbalanced and provides no quality assurance
• A basket of metrics might nonetheless provide a workable way of reducing the peer review load
• Research is a complex process so no assessment system sufficient to purpose is going to be completely “light touch”
The background
• RAE introduced in 1986
– ABRC and UGC consensus to increase selectivity
• Format settled by 1992
• Progressive improvement in UK impact
• Dynamic change and improvement at all levels
The RAE period is linked to an increase in UK share of world citations
[Figure: UK share (%) of world citations, 1981 to 2003, on an axis running from 9 to 13. Arrows indicate RAE years.]
UK performance gain is seen across all RAE grades (data are core sciences, grade at RAE96)
[Figure: average normalised impact (world average = 1.0), 1991 to 2000, for units graded 4, 3A and 3B. Chart annotations: 16%, 12%, 17%.]
Treasury proposals
• RAE peer review produced a grade
– Weighting factor in QR allocation model
– Quality assurance
• But there were doubters
– Community said the RAE was onerous
– Peer review was opaque
– Funding appeared [too] widely distributed
• Treasury wanted transparent simplification of the allocation side
The ‘next steps’ model
• Noted correlation between QR and earned income (RC or total)
– Evidence drew attention to statistical link in work on dual support for HEFCE and UUK in 2001 & 2002
• Treasury hard-wired the model as an allocation system
– So RC income determines QR
• But …
– Statistical correlation is not a sufficient argument
– Income is not a measure of quality and should not be used as a driver for evaluation and reward
RC Income vs QR Funding (2003-04)
[Figure: scatter plot of QR funding (vertical axis, 0 to 80,000,000) against Research Council income (horizontal axis, 0 to 70,000,000). Labelled institutions: Soton, Cambridge, Oxford, Imperial, Warwick, UC London, KC London, Manchester. Regions of the plot are marked "winner" and "loser".]
QR and RC income scale together, but the residual variance would have an impact
HEPI produced additional analyses in report
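The point about residual variance can be illustrated with a minimal sketch. The figures below are invented for illustration (they are not the 2003-04 data) and the institution names are placeholders. The sketch fits an ordinary least-squares line of QR on RC income and reports each institution's residual. Assuming an allocation that followed the fitted line exactly, an institution whose current QR sits above the line would tend to lose, and one below it would tend to gain.

```python
# Hypothetical (RC income, current QR) pairs in £ millions, invented for illustration.
data = {
    "Inst A": (10.0, 12.0),
    "Inst B": (20.0, 19.0),
    "Inst C": (30.0, 36.0),
    "Inst D": (40.0, 41.0),
    "Inst E": (50.0, 47.0),
}


def fit_line(points):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residuals(data):
    """Current QR minus the QR predicted from RC income, per institution."""
    a, b = fit_line(list(data.values()))
    return {name: qr - (a + b * rc) for name, (rc, qr) in data.items()}


res = residuals(data)
for name, r in sorted(res.items(), key=lambda kv: kv[1]):
    side = "above the line" if r > 0 else "below the line"
    print(f"{name}: residual {r:+.2f} ({side})")
```

Even with a strong overall correlation, the residuals for individual institutions are what decide who gains and who loses.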
Unmodified outcomes of outline metrics model perturb current system unduly
£ millions                      Current HEFC R   Change
WINNERS
Univ Southampton                          32.6     15.7
Univ Cambridge                            73.5     13.3
Univ Leicester                            11.9      6.4
Univ Manchester                           54.3      6.0
LOSERS
Univ Oxford                               72.2     -4.8
Royal Holloway, Univ London                9.9     -5.1
Univ Arts London                           6.3     -5.7
Imperial Coll London                      66.2     -6.6
Univ Coll London                          73.7     -9.6
King's Coll London                        38.5    -11.8
A new model might produce reasonable change, but few would accept that the current QR allocations are as erroneous as these outcomes suggest
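To see how large these shifts are in relative terms, the arithmetic can be sketched as follows. The numbers are copied from the table above, reading the first numeric column as the current allocation and the second as the change under the outline model (both in £ millions). The function name `relative_change` is introduced here for illustration only.

```python
# (current allocation, change) in £ millions, copied from the table above.
table = {
    "Univ Southampton": (32.6, 15.7),
    "Univ Cambridge": (73.5, 13.3),
    "Univ Leicester": (11.9, 6.4),
    "Univ Manchester": (54.3, 6.0),
    "Univ Oxford": (72.2, -4.8),
    "Royal Holloway, Univ London": (9.9, -5.1),
    "Univ Arts London": (6.3, -5.7),
    "Imperial Coll London": (66.2, -6.6),
    "Univ Coll London": (73.7, -9.6),
    "King's Coll London": (38.5, -11.8),
}


def relative_change(current, change):
    """Change expressed as a percentage of the current allocation."""
    return 100.0 * change / current


pct = {name: relative_change(cur, chg) for name, (cur, chg) in table.items()}
for name, p in sorted(pct.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:30s} {p:+6.1f}%")
```

On this reading, several institutions would see their allocation move by roughly half or more, which is the scale of perturbation the slide objects to.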
The problem
• The Treasury model over-simplifies
• Outcomes are unpredictable
– There are confounding factors such as subject mix
– Even within subjects there are complex cost patterns
• The outcome does not inspire confidence and would affect morale
• There are no checks and balances
– Risk of perverse outcomes, drift from original model
– Drivers might affect innovation, emerging fields, new staff
• There is no quality assurance
What are we trying to achieve?
We want to lighten the peer review burden, so we need ‘indicators’ to evaluate ‘research performance’, but not simplistic mono-metrics
[Diagram: Inputs (funding, numbers) pass through the research "black box" to Outputs (publications), each measured over time. What we want to know: research quality. What we have to use: the inputs and outputs.]
[Diagram, © 2002: the research process and the information derived from it]
• Stages: Inputs; Activity; Outputs from research; Outputs from development
• Process: Research grants and contracts support staff, trainees, facilities and projects, which produces knowledge as discovery, leading to knowledge as process or product
• Derived information
– Inputs: Income; Funding agents; Collaborating organisation
– Activity: Research capacity; Skilled employees; Trained technical staff; Trainee students and researchers; People
– Outputs from research: Journal papers and reports; Co-authorship; Citation impact
– Outputs from development: Patents; Licenses; Spin-out companies; Joint ventures
• Contribution to economy: Improved networking and cooperation; Increased ability to tackle and solve industrial problems; Increased pool of trained and highly skilled personnel; Know-how; Improved collaboration; Shared information base; Recognition; Improved commercial competitiveness; Innovation; Growth
• Data sources: RAE, HESA for UK, EuroStat, OECD (inputs and activity); Thomson ISI, Evidence Ltd (outputs from research); Thomson Derwent, EuroStat (outputs from development)
Informed assessment comes from an integrated picture of research, not single metrics
Data options for metrics and indicators
• Primary data from a research phase
– Input, activity, output, impact
• Secondary data from combinations of these
– e.g. money or papers per FTE
• Three attributes for every datum
– Time, place, discipline
– This limits possible sources of valid data
• Build up a picture
– Weighted use of multiple indicators
– Balance adjusted for subject
– Balance adjusted for policy purpose
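The idea of a weighted, subject-adjusted basket can be sketched as follows. This is only one possible reading, not a proposed formula: the indicator names, the unit's values, the subject benchmarks and both sets of weights are all invented for illustration. Each indicator is divided by a benchmark for the subject, and the results are combined as a weighted mean.

```python
def per_fte(total, fte):
    """Secondary indicator: a primary total divided by staff FTE."""
    return total / fte


def basket_score(indicators, benchmarks, weights):
    """Weighted mean of indicators, each first divided by its subject benchmark."""
    total_weight = sum(weights.values())
    weighted = sum(weights[k] * indicators[k] / benchmarks[k] for k in weights)
    return weighted / total_weight


# Hypothetical unit of assessment: every number here is invented.
unit = {
    "income_per_fte": per_fte(1_200_000, 24.0),  # £ per FTE
    "papers_per_fte": per_fte(96, 24.0),
    "impact": 1.3,  # citation impact relative to world average
}
subject_benchmarks = {"income_per_fte": 40_000.0, "papers_per_fte": 5.0, "impact": 1.0}

# The balance could differ by subject or by policy purpose; these weights are assumed.
weights_a = {"income_per_fte": 0.3, "papers_per_fte": 0.3, "impact": 0.4}
weights_b = {"income_per_fte": 0.1, "papers_per_fte": 0.6, "impact": 0.3}

score_a = basket_score(unit, subject_benchmarks, weights_a)
score_b = basket_score(unit, subject_benchmarks, weights_b)
print(f"weighting A: {score_a:.3f}, weighting B: {score_b:.3f}")
```

The same unit scores differently under the two weightings, which is why the balance has to be agreed for each subject and purpose before the basket is used.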
We need assured data sourcing
• Where the data comes from
– Indicator data must emerge naturally from the process being evaluated
– Artificial PIs are just that, artificial
• Who collects and collates the data
– This affects accessibility, quality and timeliness
• HESA
– Data quality and validation
– Discipline structure
• Game playing
We need to agree discipline mapping
What is Chemistry?
[Diagram: three views of the discipline. FUNDING: Research Council, Chemistry grants committee. ACTIVITY: University, School of Chemistry. OUTPUT: ISI, Chemistry journals. Also shown: other funders, other departments, other researchers, other journals.]
We have to agree how to account for the distribution of data values, e.g. income
[Figure: frequency of units by income category, running from minimum to maximum, for two measures: income per FTE and gross income. Axis annotations: £10m per unit, £250k per FTE. Data: RAE2001 research income for units in UoA14 Biology.]
Distribution of data values - impact
[Figure: frequency of papers by impact category (normalised to world average), running from 0 to maximum, with the world average marked. UK Physics papers for 1995 = 2323.]
The variables for which we have metrics are skewed and therefore difficult to picture in a simple way
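A small sketch shows why a skewed variable is hard to summarise with one number. The ten impact values are invented for illustration. With a long right tail the mean sits well above the median, so most papers fall below the "average".

```python
from statistics import mean, median

# Hypothetical normalised impact values for ten papers (world average = 1.0).
impact = [0.0, 0.0, 0.1, 0.2, 0.4, 0.6, 0.9, 1.5, 3.0, 9.3]

avg = mean(impact)
mid = median(impact)
# Share of papers whose impact is below the arithmetic mean.
share_below_avg = sum(1 for v in impact if v < avg) / len(impact)

print(f"mean = {avg:.2f}, median = {mid:.2f}, share below mean = {share_below_avg:.0%}")
```

In this invented sample one highly cited paper pulls the mean up, so the mean describes neither the typical paper nor the exceptional one.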
Agree purpose for data usage
• Data are only indicators
– So we need some acceptable reference system
• Skewed profiles are difficult to interpret
• We need simple, transparent descriptions
– Benchmarks
– Make comparisons
– Track changes
• Use metrics to monitor performance
– Set baseline against RAE2008 outcomes
– Check thresholds to trigger fuller reassessment
Example - categorising impact data
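[Diagram: classification tree]
• All papers
– Uncited papers
– Cited papers
• Cited papers
– Papers cited less often than benchmark
– Papers cited more often than benchmark
• Papers cited more often than benchmark
– Papers cited more than benchmark, but less than four times as often
– Papers cited more than four times as often as benchmark
Impact categories (relative to benchmark): = 0; >0 to 0.125; 0.125 to 0.25; 0.25 to 0.5; 0.5 to 1; 1 to 2; 2 to 4; 4 to 8; > 8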
This grouping is the equivalent of a log 2 transformation. There is no place for zero values on a log scale.
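The grouping can be sketched in code as follows. This is a minimal illustration, not the method used to produce the charts: the function names are invented, and it assumes that a value falling exactly on a boundary goes into the higher category (so an RBI of exactly 1 counts as "1 - 2"). Uncited papers are handled separately because zero has no logarithm.

```python
import math
from collections import Counter

LABELS = [
    "RBI = 0", "RBI >0 - 0.125", "RBI 0.125 - 0.25", "RBI 0.25 - 0.5",
    "RBI 0.5 - 1", "RBI 1 - 2", "RBI 2 - 4", "RBI 4 - 8", "RBI > 8",
]


def rbi_category(rbi):
    """Return the impact category label for one paper's RBI value."""
    if rbi < 0:
        raise ValueError("RBI cannot be negative")
    if rbi == 0:
        return LABELS[0]  # uncited: no place on a log scale
    # log2 maps the doubling boundaries 0.125, 0.25, ..., 8 to -3, -2, ..., 3.
    k = math.floor(math.log2(rbi))
    # Assumed convention: a boundary value belongs to the higher category.
    # Clamp so that very small and very large values land in the end categories.
    index = min(max(k + 5, 1), 8)
    return LABELS[index]


def profile(rbis):
    """Percentage of papers in each category, in category order."""
    counts = Counter(rbi_category(r) for r in rbis)
    return {label: 100.0 * counts[label] / len(rbis) for label in LABELS}


# Hypothetical RBI values for ten papers, invented for illustration.
sample = [0, 0, 0.1, 0.3, 0.7, 1.0, 1.9, 2.5, 5.0, 12.0]
for label, share in profile(sample).items():
    print(f"{label:18s} {share:5.1f}%")
```

Each category above zero spans a doubling of relative impact, which is what makes the grouping equivalent to a base-2 logarithmic scale.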
UK ten-year profile 680,000 papers
[Figure: percentage of UK output 1995-2004 in each impact category (RBI = 0; RBI >0 - 0.125; RBI 0.125 - 0.25; RBI 0.25 - 0.5; RBI 0.5 - 1; RBI 1 - 2; RBI 2 - 4; RBI 4 - 8; RBI > 8), on an axis running from 0 to 25. Annotations: average RBI = 1.24; mode; mode (cited); median; threshold of excellence?]
Subject profiles and UK reference
[Figure: percentage of output 1995-2004 in each impact category, from RBI = 0 to RBI > 8, on an axis running from 0 to 30. The % of UK output for the subject and time period is shown as a smoothed line; the UK average is shown as a red symbol.]
HEIs – 10 year totals – 4.1
[Figure: percentage of output 1995-2004 in each impact category, from RBI = 0 to RBI > 8, on an axis running from 0 to 30. Legend: Leading research university Big civic 'Robbins' type university Former Polytechnic.]
Smoothing the lines would reveal the shape of the profile
HEIs – 10 year totals – 4.2
Absolute volume would add a further element for comparisons
[Figure: percentage of output 1995-2004 in each impact category, from RBI = 0 to RBI > 8, on an axis running from 0 to 30. Legend: Leading research university Big civic 'Robbins' type university Former Polytechnic.]
Conclusions
• We can reduce the peer review burden by increased use of metrics
– But the transition won’t be simple
• Research is a complex, expert system
• Assessment needs to produce
– Confidence among the assessed
– Quality assurance among users
– Transparent outcome for funding bodies
• Light touch is possible, but not featherweight
– Initiate a metrics basket linked to RAE2008 peer review
– Set benchmarks & thresholds, then track the basket
– Invoke panel reviews to evaluate change, but only where variance exceeds band markers across multiple metrics
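The last step can be sketched as a simple rule. This is one possible reading rather than a specified procedure: the metric names, baseline values, band widths and the "at least two metrics" threshold are all assumptions made for illustration. A metric is flagged when its relative change from the baseline exceeds its band, and a panel review is invoked only when enough metrics are flagged together.

```python
def outside_band(baseline, current, band):
    """True if the relative change from baseline exceeds the band (0.15 means 15%)."""
    return abs(current - baseline) / baseline > band


def needs_panel_review(baseline, current, bands, min_metrics=2):
    """Return (review_needed, flagged_metrics) for one unit of assessment."""
    flagged = [m for m in baseline if outside_band(baseline[m], current[m], bands[m])]
    return len(flagged) >= min_metrics, flagged


# Hypothetical baseline (set at the peer-reviewed exercise) and later values.
baseline = {"impact": 1.20, "income_per_fte": 50_000.0, "papers_per_fte": 4.0, "phd_per_fte": 0.5}
current = {"impact": 0.95, "income_per_fte": 52_000.0, "papers_per_fte": 3.2, "phd_per_fte": 0.5}
bands = {metric: 0.15 for metric in baseline}  # assumed 15% band on every metric

review, flagged = needs_panel_review(baseline, current, bands)
print(f"panel review needed: {review}; metrics outside band: {flagged}")
```

Requiring movement across several metrics keeps a single noisy indicator from triggering a review, while still catching a unit whose overall profile has shifted.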
Overview (reprise)
• RAE was seen as burdensome and distorting
• Treasury proposed a metrics-based QR allocation system
• The outline model is inadequate, unbalanced and provides no quality assurance
• A basket of metrics might nonetheless provide a workable way of reducing the peer review load
• But research is a complex process so no assessment system sufficient to purpose is going to be completely “light touch”