Cluster randomised trials with excessive cluster sizes: ethical and design implications

20 May 2016

Ottawa

Cluster randomised trials: in pursuit of 80% or 90% power – not all that it’s cracked up to be

Karla Hemming

University of Birmingham

Sandra Eldridge and Gordon Forbes (Queen Mary’s University) and Monica Taljaard (Ottawa)

Cluster randomised trials: methods to minimise numbers of

participants needed in trials

The cluster randomised controlled trial (CRT):

– Common study design in which clusters are the unit of

randomisation.

– Gaining in popularity as commissioners are increasingly

funding evaluations of service and policy interventions.

Introduction

Huge diversity in types of CRTs

Intervention might be delivered at the level of the

cluster:

– Evaluation of a new prescribing system.

Intervention might be targeted at the individual:

– Evaluation of a new vaccination or new diagnostic test.

Diversity in type of intervention

Guthrie B, Treweek S, Petrie D, Barnett K, Ritchie LD, Robertson C, Bennie M. Protocol for the Effective Feedback to Improve Primary Care Prescribing

Safety (EFIPPS) study: a cluster randomised controlled trial using ePrescribing data. BMJ Open. 2012 Dec 13;2(6).

Palmer AC, Schulze KJ, Khatry SK, De Luca LM, West KP Jr. Maternal vitamin A supplementation increases natural antibody concentrations of

preadolescent offspring in rural Nepal. Nutrition. 2015 Jun;31(6):813-9.

Data for outcomes might be elicited using routinely

collected data:

– In which case there may be no direct contact with patients.

Or, as is more frequently the case, patients will be

recruited into the trial for elicitation of outcomes often

via data questionnaires:

– There will be direct contact with patients.

Diversity in methods of data collection

Depending on who is the target of the intervention,

the level of risk and local governance or laws:

– The patient may be fully consented into the trial.

– The patient may be consented into the trial for elicitation of outcomes (data questionnaires).

– Or the patient may not be told their data will be included within a trial, or an opt-out consent model may be used.

Diversity in what the patient is told

There are a wide range of CRTs.

Some are of relatively lower risk interventions –

evaluations of changes in policy, in which the

evaluation uses routinely collected data.

But others are of higher-risk interventions, in which participants bear some degree of burden, either from exposure to an intervention of unknown efficacy or from data collection.

Therefore…

1. Raise awareness of the perils of large cluster

sizes.

2. Promote the use of a suite of power and precision curves to establish efficient designs.

3. Demonstrate how large cluster sizes have potential

ethical ramifications.

Aim of this talk – raise awareness

The perils of large cluster sizes

In a CRT, as more individuals are recruited per cluster, the increase in power and precision starts to level off.

This doesn’t happen to the same extent in an individually randomised trial.

Relatively well known in methodological literature –

less widely appreciated in the trial literature.

The law of diminishing returns

Donner, Allan. “Some Aspects of the Design and Analysis of Cluster Randomization Trials”. Journal of the Royal Statistical Society.

Series C (Applied Statistics) 47.1 (1998): 95–113

Illustration of diminishing return in

power as cluster size increases

Illustrative example includes 12 clusters per arm to detect a moderate standardised effect size of 0.25 which needs approximately a sample size of

340 per arm under individual randomisation (significance level of 5%, 90% power). Red (1/ICC) and green (2/ICC) lines represent attempts at

identification of points of diminishing returns (see later sections). Light blue line is sample size needed under individual randomisation (per arm).
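The levelling-off described in this caption can be reproduced with a few lines of code. This is a minimal sketch using the standard design-effect approximation for a two-arm parallel CRT with equal cluster sizes and a continuous outcome. The function name `crt_power` and the ICC of 0.05 are assumptions made for illustration; the caption does not state the ICC used in the figure.

```python
from math import sqrt
from statistics import NormalDist


def crt_power(clusters_per_arm, cluster_size, effect_size, icc, alpha=0.05):
    """Approximate power of a two-arm CRT (normal approximation, equal cluster sizes)."""
    design_effect = 1 + (cluster_size - 1) * icc
    # Effective number of independent observations per arm.
    effective_n = clusters_per_arm * cluster_size / design_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect_size * sqrt(effective_n / 2) - z_alpha)


ASSUMED_ICC = 0.05  # hypothetical value, not taken from the slides

# 12 clusters per arm and a standardised effect size of 0.25, as in the caption.
for m in (10, 20, 50, 100, 200, 500, 1000):
    print(f"cluster size {m:>4}: power = {crt_power(12, m, 0.25, ASSUMED_ICC):.3f}")
```

Under these assumptions the power flattens out well below 90% however large the clusters become, because the effective sample size per arm can never exceed 12 / ICC.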

Trial may be infeasible:

– The number of clusters per arm must exceed ICC × the sample size per arm under individual randomisation (iRCT).

Very large cluster sizes required:

– If the number of clusters is close to the minimum, very large cluster sizes are needed to obtain the desired power.

Precision of treatment effect limited:

– Beyond the point of diminishing returns, increasing the cluster size will not appreciably increase the precision of the treatment effect (i.e. CIs will NOT get appreciably narrower).

Implications of point of diminishing

returns

Note: The ICC measures the degree to which observations within a cluster are correlated
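The minimum number of clusters quoted on this slide follows from the usual design-effect approximation. The sketch below assumes a continuous outcome with variance \sigma^2, k clusters per arm of equal size m, and n as the per-arm sample size under individual randomisation. These symbols are introduced here for illustration and do not appear on the slides.

```latex
\operatorname{Var}(\hat{\delta}_{\mathrm{CRT}})
  = \frac{2\sigma^{2}}{km}\,\bigl[1 + (m-1)\,\mathrm{ICC}\bigr]
  \;\xrightarrow{\;m \to \infty\;}\;
  \frac{2\sigma^{2}\,\mathrm{ICC}}{k},
\qquad
\operatorname{Var}(\hat{\delta}_{\mathrm{iRCT}}) = \frac{2\sigma^{2}}{n}.

% The CRT is at least as precise as the iRCT when
\frac{km}{1 + (m-1)\,\mathrm{ICC}} \;\ge\; n
\;\Longleftrightarrow\;
m \;\ge\; \frac{n\,(1 - \mathrm{ICC})}{k - n\,\mathrm{ICC}},
\qquad \text{which is only possible if } k > n \cdot \mathrm{ICC}.
```

The variance has a floor of 2\sigma^2 ICC / k that no cluster size can lower, which is why the confidence interval stops narrowing, and the required cluster size grows without bound as k approaches n × ICC.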

“..a useful rule of thumb for continuous outcomes is

that power does not increase appreciably once the

number of subjects per cluster exceeds 1/ICC”

Identification of point of diminishing

returns – current guidelines

Campbell MJ, Donner A, Klar N. Developments in cluster randomized trials and Statistics in Medicine. Stat Med. 2007 Jan 15;26(1):2-19.
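To see what the 1/ICC rule of thumb implies, the information contributed by a single cluster of size m can be written as an effective sample size E(m). This is a sketch under the usual design-effect approximation with equal cluster sizes. E(m) is notation introduced here, not taken from the cited paper.

```latex
E(m) = \frac{m}{1 + (m-1)\,\mathrm{ICC}},
\qquad
\lim_{m \to \infty} E(m) = \frac{1}{\mathrm{ICC}},
\qquad
\frac{E(1/\mathrm{ICC})}{E(\infty)} = \frac{1}{2 - \mathrm{ICC}} \approx \frac{1}{2},
\qquad
\frac{E(2/\mathrm{ICC})}{E(\infty)} = \frac{2}{3 - \mathrm{ICC}} \approx \frac{2}{3}.
```

On this approximation a cluster of size 1/ICC carries about half, and one of size 2/ICC about two thirds, of the most information a single cluster can ever provide. How that translates into power depends on how close the number of clusters is to the minimum.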

When ICC is high:

– Guidelines typically underestimate the point of diminishing returns.

When ICC is low:

– Guidelines typically overestimate the point of diminishing returns.

Current guidelines don’t work well…

Current guidelines to identify the point of diminishing returns don’t work that well


Methods to ensure an efficient use of data

in a CRT

1) Identify if the trial is feasible:

– If only a limited number of clusters is available, is this more than the minimum number required?

2) Identify a reasonable cluster size:

– Identify cluster sizes at which all observations make some non-negligible contribution to precision.

3) Identify if the trial could be made more efficient:

– Identify whether a small increase in the number of clusters would achieve a similar level of precision or power with a smaller cluster size.

New proposal

Study design:

– 90% power;

– 5% significance;

– Small effect size (0.05);

– Low ICC (0.008);

– 9,000 per arm under individual randomisation.

Proposal using a worked example

Big data trial – once-in-a-lifetime opportunity –

small effect size, high power, clinical outcome, low ICC

Identify minimum number of clusters

Minimum number of clusters: 73 (per arm)

Identification of reasonable cluster sizes

Cluster sizes up to a maximum of about 2,000 seem reasonable

Power curve setting clusters to minimum

(73 per arm)

Power doesn’t increase much above a cluster size of 2,000.

90% power is achievable with cluster sizes above 7,227.

Power curve setting clusters to slightly

above minimum (74 per arm)

90% power is achievable with a cluster size of about 4,000.

Sample size needed per arm under individual randomisation is 9,000.

Minimum number of clusters is 73 per arm:

– Equates to a cluster size of about 9,000.

Increase the number of clusters to 74 per arm:

– Equates to a cluster size of about 9,000/2=4,500.

BUT!! Precision hardly increases above about 2,000

Identifying whether an increase in the number of clusters would improve efficiency

Approximate results only
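The approximate figures on this slide can be sketched in code. This is a minimal illustration using the design-effect approximation with equal cluster sizes. The function names are hypothetical, and the input is the rounded 9,000 per arm from the slide, so the output should be read as approximate in the same way as the slide's own figures.

```python
from math import ceil, floor


def min_clusters_per_arm(n_individual, icc):
    """Smallest whole number of clusters per arm strictly greater than n * ICC."""
    return floor(n_individual * icc) + 1


def required_cluster_size(n_individual, icc, clusters_per_arm):
    """Cluster size giving the same effective sample size as n_individual per arm."""
    slack = clusters_per_arm - n_individual * icc
    if slack <= 0:
        raise ValueError("too few clusters: the target cannot be reached at any cluster size")
    return ceil(n_individual * (1 - icc) / slack)


N_IRCT = 9000  # per arm under individual randomisation (rounded figure from the slide)
ICC = 0.008

k_min = min_clusters_per_arm(N_IRCT, ICC)
for k in (k_min, k_min + 1):
    m = required_cluster_size(N_IRCT, ICC, k)
    print(f"{k} clusters per arm -> cluster size of about {m:,}")
```

Because the denominator `clusters_per_arm - n_individual * icc` is close to zero near the minimum, the required cluster size is very sensitive to small changes in the inputs, and one extra cluster per arm roughly halves it.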

Study design:

– 80% power;

– 5% significance;

– Large effect size (0.25);

– High ICC (0.1);

– Sample size per arm under individual randomisation 253.

Proposal using a worked example –

large effect size

Run-of-the-mill trial:

moderate effect size, moderate power, process outcome, high ICC

Identify minimum number of clusters

Minimum number of clusters: 26 (per arm)

Identification of reasonable cluster sizes

Cluster sizes up to a maximum of about 200 seem reasonable

Cluster sizes up to about 100 look reasonable

(note new guidelines 800)

Power curve setting clusters to minimum

(26 per arm)


80% power is achievable with a cluster size of 257 (approx N under iRCT)

Power curve setting clusters to slightly

above minimum (27 per arm)


80% power is achievable with a cluster size of 121 (approx 50% of N under iRCT)

Sample size needed per arm under individual randomisation is 253.

Minimum number of clusters is 26 per arm:

– Equates to a cluster size of about 250.

Increase the number of clusters to 27 per arm:

– Equates to a cluster size of about 250/2=125.

BUT!! Precision hardly increases above about 200

Identifying whether an increase in the number of clusters would improve efficiency

Approximate results
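The two designs quoted in this worked example (26 clusters of 257 and 27 clusters of 121 per arm) can be checked against the target power. This is a sketch using the normal approximation with equal cluster sizes. The function name `crt_power` is an assumption for illustration, and the third design, which doubles the cluster size, is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def crt_power(clusters_per_arm, cluster_size, effect_size, icc, alpha=0.05):
    """Approximate power of a two-arm CRT (normal approximation, equal cluster sizes)."""
    effective_n = clusters_per_arm * cluster_size / (1 + (cluster_size - 1) * icc)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect_size * sqrt(effective_n / 2) - z_alpha)


EFFECT_SIZE, ICC = 0.25, 0.1  # design values from the worked example

designs = [
    (26, 257),  # minimum number of clusters, from the slide
    (27, 121),  # one extra cluster per arm, from the slide
    (26, 514),  # hypothetical: double the cluster size at the minimum
]
for k, m in designs:
    print(f"{k} clusters x {m:>3} per cluster: power = {crt_power(k, m, EFFECT_SIZE, ICC):.3f}")
```

Under these assumptions both slide designs sit at about 80% power, while doubling the cluster size at 26 clusters per arm adds less than one percentage point.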

Ethical ramifications

Evaluations of policy changes or service delivery

interventions.

Target is health care provider; not individual.

Often considered low risk interventions.

If data are routinely available – data might be

considered “free” (once linkage established).

Lower risk settings

Uncontentious Issues

Striving for large cluster

sizes, to obtain desired

power might be less

contentious.

Risks to the individual and

costs to the funder are

both low.

Implications in lower risk settings

Contentious Issues

Delay timeliness with

which study results are

available.

Example 1: Evaluation of a policy (low risk)

intervention with very large cluster sizes

Example 1 – summary of study

Trial Findings:

ICC 0.28;

Average cluster size

1,400;

OR 0.88 (95% CI: 0.62-

1.27);

Results imprecise, despite

very large sample size;

Study ran for 14 months.

Trial Design:

30 clusters (15 per arm);

Binary outcome;

Powered to detect a

change from 23% to 44%;

80% power and 5%

significance

TSS is about 300 under

individual randomisation.

Example 1: Was this an efficient trial

design?

The primary outcome was available from routinely

collected data.

A similar level of power would have been achievable had the trial run for 1 month (TSS of 140) rather than 14 months (TSS 1,400).
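A rough calculation illustrates why the extra 13 months added so little. This is a sketch under the design-effect approximation with equal cluster sizes. It reads the slide's 1,400 and 140 as average cluster sizes, which is an assumption, and it applies the reported ICC of 0.28 directly although the outcome is binary, so the numbers are indicative only.

```python
def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    design_effect = 1 + (cluster_size - 1) * icc
    return n_clusters * cluster_size / design_effect


N_CLUSTERS, ICC = 30, 0.28  # values reported for the trial

full = effective_sample_size(N_CLUSTERS, 1400, ICC)  # 14 months of accrual
short = effective_sample_size(N_CLUSTERS, 140, ICC)  # 1 month of accrual
ceiling = N_CLUSTERS / ICC                           # limit as cluster size grows

print(f"14 months: effective n = {full:.1f}")
print(f" 1 month : effective n = {short:.1f}")
print(f"ceiling  : effective n = {ceiling:.1f}")
```

On these assumptions a tenfold increase in cluster size raises the effective sample size by under 2%, because both designs are already close to the ceiling of 30 / ICC.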

Some outcomes were not routinely collected.



Many interventions evaluated in CRTs target individuals (60%).

About 12% of CRTs evaluate medicinal products.

Many CRTs recruit individuals to elicit outcomes

(data questionnaires).

Higher risk settings

McRae A, Taljaard M, Weijer C, Bennett C, Skea Z, Boruch R, Brehaut J, Eccles M, Grimshaw J, Donner A. Reporting of patient consent in healthcare cluster

randomised trials is associated with the type of study interventions and publication characteristics. J Med Ethics. 2013 Feb;39(2):119-24.

Giraudeau B, Caille A, Le Gouge A, Ravaud P. Participant informed consent in cluster randomized trials: review. PLoS One. 2012;7(7)

Uncontentious Issues:

Trial costs - elicitation of

many outcomes.

Balance between cluster

and individual costs.

Implications in higher risk settings

Contentious Issues:

Patient burden -

completion of data

questionnaires.

Patient risk - Intervention

will be of unknown

efficacy.

Example 2: Evaluation of individually targeted

(higher risk) intervention with large clusters

Example 2 – summary of study

Trial Findings:

ICC not reported

(estimated from their data

by KH to be about 0.02);

Average cluster size

4,000;

IRR 1.00 (95% CI: 0.75-

1.34);

Results imprecise, despite

very large sample size.

Trial Design:

15 clusters (7 / 8 per arm);

Rate outcome;

Powered to detect an IRR

of 0.6;

80% power and 5%

significance;

TSS is about 2,500

(person years) under

individual randomisation.


Example 2: Was this an efficient trial design?

Intervention consisted of mass screening and

treatment for tuberculosis.

Whether outcomes were routinely collected is unclear, but it seems unlikely.

Only included LARGE mines (clusters). Had they

included more small mines, a more conclusive result

might have been obtained.
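The point about smaller mines can be illustrated by holding the total number of participants fixed and spreading them over more clusters. This is a sketch only: it uses the design-effect approximation with equal cluster sizes, treats the ICC of about 0.02 estimated by KH as applying to this calculation even though the trial outcome is a rate, and the alternative designs are hypothetical.

```python
def effective_sample_size(n_clusters, cluster_size, icc):
    """Effective number of independent observations under the design effect."""
    return n_clusters * cluster_size / (1 + (cluster_size - 1) * icc)


ICC = 0.02         # estimated from the trial data by KH, per the slide
TOTAL = 15 * 4000  # 15 clusters with an average size of 4,000, as run

# Hypothetical alternatives: same total, split over more, smaller clusters.
for n_clusters in (15, 30, 60, 120):
    size = TOTAL // n_clusters
    ess = effective_sample_size(n_clusters, size, ICC)
    print(f"{n_clusters:>4} clusters x {size:>5}: effective n = {ess:,.0f}")
```

With the total held constant, each doubling of the number of clusters roughly doubles the effective sample size in this setting, which is the sense in which more, smaller clusters might have given a more conclusive result.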



In CRTs there are diminishing returns in power when increasing the cluster size.

Increasing the cluster size by a large amount might give only a very small increase in power.

Implications are greater for higher-risk interventions and for data that are not routinely collected.

Precision curves can identify worthwhile contributions of large cluster sizes.

Summary
