sct2013 boston,randomizationmetricsposter,d6.2

•  Quantitative metrics clearly distinguished advantages and disadvantages between methods
•  Results confirmed that there are consistent trade-offs between efficiency and unpredictability over all methods
•  Simulations for studies & their expected populations can:
   •  Take some of the guesswork out of selecting a particular randomization method
   •  Allow better risk management for potential biases and impacts on analysis
•  Measures of randomness & predictability were complementary:
   •  Potential Selection Bias (Predictability) is customized to what a participating clinician might know
   •  Syntropy (Randomness) provides a universal randomness measure

Randomization methods generally are designed to be both unpredictable and balanced between treatment allocations overall and within strata. However, when planning studies, little consideration is given to measuring these characteristics, nor are they examined jointly, and published comparisons between methods often use incompatible metrics and simulation assumptions. Furthermore, for purposes of real-world planning, such simulations often make unrealistic assumptions (e.g., equal sized strata), and summary statistics give limited information.

In order to better reflect real-world study performance, we carried out a series of simulations with 2 treatment arms, and stratification factors that are unequally populated (e.g., 1:2, 1:2:3, or a power law distribution).

To measure predictability, we modified the potential selection bias (Efron, Blackwell-Hodges), in which an observer guesses the next treatment to be the one that previously occurred least in the stratum containing the subject (i.e., limiting the observer’s knowledge to individual strata, such as site). This reflects a game theory model of randomization pitting observers against the statistician, and is easy to calculate and interpret.

To measure imbalances, we calculated efficiency loss using Atkinson’s method because: the main impact of imbalances on the outcomes of a study is a loss of statistical power; even if treatments are balanced overall, imbalances within small strata can have a disproportionate impact on efficiency; and it is easy to interpret as lost sample size.

We applied these methods to evaluate the performance of several popular and novel randomization methods for a variety of parameters, including methods based on permuted blocks, dynamic allocation/minimization, and urn designs. Simulation results were summarized with the median and confidence intervals to estimate best & worst-case scenarios as well as expected performance.

Results showed consistent trade-offs between efficiency and unpredictability over methods and parameters, supporting no ‘best’ method to optimize both.

Randomization Metrics: Jointly assessing predictability and efficiency loss in randomization designs

Abstract

Randomization Metrics: Median and Confidence Intervals

Discussion

Atkinson, A.C. (2003). The distribution of loss in two-treatment biased-coin designs. Biostatistics 4(2), 179–193.
Blackwell, D. and Hodges, J. Jr (1957). Design for the control of selection bias. Ann Math Stat 28, 449-460.
Wikipedia contributors. "Entropy (information theory)." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 23 Apr. 2013. Web. 14 May 2013.
Lebowitsch, J., et al. (2012). “Generalized multidimensional dynamic allocation method”. Statistics in Medicine, 2012;

Bibliography

Questions:
What is the best method for randomizing a clinical trial?
What is the best method for randomizing this kind of clinical trial?
What is the best method for randomizing this design of clinical trial in this population of patients?

Objective: Develop a set of quantifiable metrics
•  To evaluate performance of covariate-adjusted randomization methods and their parameters
•  For specific study designs
•  In realistic study populations

Objectives

Dennis E. Sweitzer, Ph.D Principal Biostatistician

Simulation Strategy:
➣ Simulate 200 subjects for each of 500 simulations
➣ Assign each subject values of covariates: {Sex, Age, Site}
➣ Randomize each set of subjects by each randomization method
➣ Evaluate each randomization schedule using proposed metrics for subsets of: 25, 50, 100, 200
➣ Summarize metrics across simulations with medians and 80% Confidence Intervals
➣ Plot metrics

Simulated Study Populations:
➣ Stratify by Age and Sex
➣ Use Age & Sex for covariate-adjusted Randomization
➣ Site as an additional covariate included in Analysis model, but not in randomization

NB: Age, Sex, and Site are proxies for real-world covariates – which rarely are distributed evenly.
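As a rough illustration of the covariate-assignment step, the following minimal sketch draws one simulated study population. It assumes that Sex, Age, and Site are sampled independently with the ratios given on this poster (2:1, 3:2:1, and 1 : ½ : ⅓); the function names `draw` and `simulate_subjects` and the level labels are hypothetical, not taken from the original simulation code.

```python
import random

# Relative frequencies from the poster; level names are illustrative labels.
SEX = {"Female": 2, "Male": 1}
AGE = {"Mid. Aged": 3, "Young": 2, "Older": 1}
SITE = {"a": 1, "b": 1 / 2, "c": 1 / 3}


def draw(levels, rng):
    """Draw one level with probability proportional to its weight."""
    names = list(levels)
    weights = [levels[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def simulate_subjects(n=200, seed=0):
    """Return n subjects, each a dict of independently drawn covariates (assumption)."""
    rng = random.Random(seed)
    return [
        {"sex": draw(SEX, rng), "age": draw(AGE, rng), "site": draw(SITE, rng)}
        for _ in range(n)
    ]


subjects = simulate_subjects(n=200, seed=1)
```

Repeating this with 500 different seeds would give one population per simulation, each of which can then be randomized by every method under comparison.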

Simulation Methods

Comparing Methods and their parameters
☞ (CR) Complete Randomization – also used as comparator
☞ (PB) Stratified Permuted Block: 1:1, 2:2, 3:3, 4:4, 6:6, 8:8
   Treatments are assigned in blocks, such as for 2:2: {TTPP, TPTP, TPPT, …}
☛ Dynamic Allocation / Minimization, weighted to:
   ☞ (DAS) Stratified: Balance on strata
   ☞ (DAM) Marginal: Balance on marginal balances
   ☞ (DAE) Equal: Balance with equal weight on strata, marginal and overall balances.
   Treatments are assigned to best reduce the weighted sum of imbalances within stratification factors, e.g., randomizing a young male to PLA might worsen the imbalance within males, but improve it in the Young. The choice of weights will determine the treatment.
☛ 2nd Best Probabilities of: {0, 0.15, 0.25, 0.35, 0.5}
   This is the probability of assigning the treatment which worsens the imbalance. This parameter adds randomness to the algorithm.
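The interaction between the weights and the 2nd best probability can be sketched as follows. This is a simplified illustration of minimization on marginal imbalances only, not the implementation used for the poster: the function names, the arm labels "T" and "P", and the use of absolute differences as the imbalance measure are all assumptions.

```python
import random


def imbalance_after(arm, subject, history, weights):
    """Weighted sum of marginal |T - P| imbalances if `subject` received `arm`."""
    total = 0.0
    for factor, weight in weights.items():
        diff = 1 if arm == "T" else -1
        for prev_subject, prev_arm in history:
            # Count only earlier subjects sharing this subject's level of the factor.
            if prev_subject[factor] == subject[factor]:
                diff += 1 if prev_arm == "T" else -1
        total += weight * abs(diff)
    return total


def assign(subject, history, weights, second_best_prob, rng):
    """Pick the arm with the smaller weighted imbalance, except with
    probability `second_best_prob`, when the other arm is assigned."""
    score_t = imbalance_after("T", subject, history, weights)
    score_p = imbalance_after("P", subject, history, weights)
    if score_t == score_p:
        return rng.choice(["T", "P"])  # tie: either arm is equally good
    best, second = ("T", "P") if score_t < score_p else ("P", "T")
    return second if rng.random() < second_best_prob else best


# Two older males on P and two young females on T, then a young male arrives.
history = [
    ({"sex": "Male", "age": "Older"}, "P"),
    ({"sex": "Male", "age": "Older"}, "P"),
    ({"sex": "Female", "age": "Young"}, "T"),
    ({"sex": "Female", "age": "Young"}, "T"),
]
newcomer = {"sex": "Male", "age": "Young"}
rng = random.Random(0)
sex_heavy = assign(newcomer, history, {"sex": 2.0, "age": 1.0}, 0.0, rng)
age_heavy = assign(newcomer, history, {"sex": 1.0, "age": 2.0}, 0.0, rng)
```

With the 2nd best probability set to 0 the choice is deterministic, and the two weightings give opposite assignments for the same young male, which is the sense in which the choice of weights determines the treatment.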

Randomization Methods

Sex (2:1)              Female    Male
Age Group (3:2:1)      67%       33%
Mid. Aged     50%      33% ♀, M  17% ♂, M
Young         33%      22% ♀, Y  11% ♂, Y
Older         16%      11% ♀, O  5% ♂, O

Site ( 1 : ½ : ⅓ )     a         b         c
                       55%       27%       18%

[Figure: Types of imbalance between Pla and Test. Overall Imbalance over all subjects; Marginal Imbalances within sexes (Males, Females), ignoring age groups; Marginal Imbalances within age groups (18-35 yo, 35-65 yo, >65 yo), ignoring sex; Imbalances Within Strata: All combinations of Age x Sex.]

Potential Selection Bias, limited to the knowledge available to an observer (Blackwell-Hodges, 1957)
Observer always “guesses” to restore balance

Example: For treatment sequence “TPPP”
Initial guess ⟶ Expectation = ½
“T” ⟶ Imbalance = +1 ⟶ Guess “P” ⟶ Correct
“TP” ⟶ Imbalance = 0 ⟶ Guess either ⟶ Expectation = ½
“TPP” ⟶ Imbalance = -1 ⟶ Guess “T” ⟶ Wrong
“TPPP” ⟶ # Correct = ½ + 1 + ½ + 0 = 2
Score = # Correct - # Expected = 2 - 2 = 0

Use: Potential Selection Bias (Strata): Based only on imbalances observed within Strata
Also: Potential Selection Bias (Site): Based only on imbalances observed within sites

•  Customize this metric to reflect the information available to study personnel, typically only at their site.
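The scoring rule in the “TPPP” example can be sketched in a few lines. This is a minimal illustration for a single stratum (or site) with two arms; the function name and the arm labels are assumptions, and ties are credited with the expected ½ rather than resolved by an actual random guess.

```python
def potential_selection_bias(sequence, arms=("T", "P")):
    """Score = (# correct guesses) - (# expected by chance) for one stratum.

    The observer guesses the arm assigned least often so far; when the arms
    are tied, the guess is credited with its expectation of 1/2.
    """
    first, second = arms
    counts = {first: 0, second: 0}
    correct = 0.0
    for actual in sequence:
        if counts[first] == counts[second]:
            correct += 0.5  # no information: either guess is right half the time
        else:
            guess = first if counts[first] < counts[second] else second
            if guess == actual:
                correct += 1.0
        counts[actual] += 1
    expected = len(sequence) / 2  # pure guessing is right half the time
    return correct - expected


score = potential_selection_bias("TPPP")  # 0.0, as in the worked example
```

To restrict the observer's knowledge to a stratum or a site, the same function would be applied separately to the subsequence of assignments within each stratum or site and the scores summed.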

Measuring Predictability

For an analysis model:

$$E(Y) = z\alpha + X\beta$$

z ≣ treatment allocation
α ≣ treatment effect
X ≣ Covariates / Design Matrix
β ≣ Covariate effects

The variability of the treatment estimate is:

$$\mathrm{Var}(\hat{\alpha}) = \frac{\sigma^2}{z^t z - z^t X (X^t X)^{-1} X^t z}$$

And

$$\mathrm{LOE} = z^t X (X^t X)^{-1} X^t z$$

is the Loss of Efficiency, or loss of Statistical power (Atkinson, 2003)

LOE ≈ # Subjects “Wasted” by sub-optimal treatment assignments

☞ In a Designed Experiment, one can select z and X to minimize LOE
☞ In a Randomized Controlled Trial, one can only assign treatments z, but has very little control over X (which is the subjects arriving in the study)

Use: Model is: Treatment + Sex + Age + Sex*Age + Site

Measuring Impact on Analysis
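As an illustration of the loss of efficiency z^t X (X^t X)^{-1} X^t z, the sketch below computes it as the squared length of the projection of z onto the column space of X. It is written in plain Python with Gram-Schmidt orthogonalization to stay self-contained; the function name, the ±1 coding of the treatment vector, and the column-wise representation of X are assumptions for this example, not details taken from the poster.

```python
def loss_of_efficiency(z, columns):
    """Return z' X (X'X)^-1 X' z, where `columns` lists the columns of X.

    The columns are orthonormalized (modified Gram-Schmidt); the loss is then
    the sum of squared projections of z onto that orthonormal basis.
    Linearly dependent columns are skipped, so a redundant dummy is harmless.
    """
    basis = []
    for column in columns:
        v = [float(x) for x in column]
        for q in basis:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm_sq = sum(a * a for a in v)
        if norm_sq > 1e-12:
            norm = norm_sq ** 0.5
            basis.append([a / norm for a in v])
    return sum(sum(a * b for a, b in zip(z, q)) ** 2 for q in basis)


# Four subjects: intercept plus one binary covariate (e.g. sex).
intercept = [1, 1, 1, 1]
sex = [0, 0, 1, 1]
balanced = loss_of_efficiency([1, -1, 1, -1], [intercept, sex])    # about 0
unbalanced = loss_of_efficiency([1, 1, 1, -1], [intercept, sex])   # about 2
```

In the unbalanced case roughly two of the four subjects are "wasted": with z coded ±1, z^t z equals the number of subjects, so the variance denominator drops from 4 to about 2.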

Simulation results are summarized as Medians + 80% Confidence intervals
Meaning: 10% of simulations were beyond the upper limit, and 10% were beyond the lower limit.

Y-axis: Measures of Randomness: (Syntropy or Potential Selection Bias)

X-axis: Loss of efficiency.
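The summary described here amounts to the 10th percentile, median, and 90th percentile of a metric across simulations. A minimal sketch using Python's standard library follows; the function name `summarize` is hypothetical, and the percentile interpolation is whatever `statistics.quantiles` uses by default, which may differ slightly from the method used for the poster.

```python
import statistics


def summarize(values):
    """Median with an 80% interval: 10% of values fall below `lower`
    and 10% fall above `upper`."""
    deciles = statistics.quantiles(values, n=10)  # 9 cut points: 10%, ..., 90%
    return {
        "lower": deciles[0],
        "median": statistics.median(values),
        "upper": deciles[-1],
    }


# Example: a metric that took the values 1..100 across 100 simulations.
summary = summarize(list(range(1, 101)))
```

Applying this separately to the x-axis metric (loss of efficiency) and the y-axis metric gives the center and the extent of each method's interval on the plots.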

Note on Graphic Results

[Figure: Potential Selection Bias (y-axis) vs. Loss of Efficiency (x-axis, 0 to 6). Simulation results as 80% Confidence Intervals. Labeled methods: DA(0), Margin Balance; PB(1:1).]

Potential Selection Bias vs LOE
All methods, one Graph
☞ Clear trade off between LOE and Potential Selection Bias

Results

[Figure: Syntropy (y-axis) vs. Loss of Efficiency (x-axis, 1 to 8), N = 200. Labeled methods: DAM0, DAM15, DAM25, DAS25, DAS35, DAS50, PB3, PB4, PB6, PB8.]

Syntropy vs LOE
☞ Clear trade off between LOE and Potential Selection Bias

Potential Selection Bias vs LOE
All methods become more efficient,
Some become more predictable
DAM becomes less predictable

Results, Increasing N

[Figure: Pot.Sel.Bias.Strata (y-axis) vs. Loss of Efficiency (x-axis, 1 to 8), with points labeled n = 25, 50, 100, 200. Panels: Permuted Block, 1:1, 3:3, 8:8 + CR; DA (Strata), phi=0, 0.15, 0.35, 0.5; DA (Equal Wts), phi=0, 0.15, 0.35, 0.5; DA (Margin), phi=0, 0.35, 0.5.]

(Shannon) Entropy: From information theory:

$$H = -\sum_i p_i \log(p_i)$$

Syntropy: Rescale Entropy to [0,1] so that:

$$\mathrm{Syntropy} = 1 - \frac{H}{\max(H)} = 1 - \frac{H}{\log(2)}$$

Where: p_i = Pr{Treatment i}
0 = Max Randomness, 1 = Min Randomness

•  Essentially this is the maximum possible average predictability if an observer had complete information about the algorithm and the subjects being randomized.

•  H/log(2) is interpretable as the theoretical lower limit of the number of bits required to encode a randomization schedule.

Measuring Randomness
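The entropy and syntropy formulas can be sketched as below. The per-assignment functions follow the definitions directly; averaging syntropy over the assignments of a schedule (`mean_syntropy`) is an assumption made for this illustration, as are all function names. For two arms, `log(len(probs))` equals the log(2) used above.

```python
import math


def entropy(probs):
    """Shannon entropy H = -sum(p_i * log(p_i)), with 0 * log(0) taken as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def syntropy(probs):
    """Rescaled entropy: 0 = maximum randomness, 1 = fully determined."""
    return 1 - entropy(probs) / math.log(len(probs))


def mean_syntropy(schedule_probs):
    """Average syntropy over the assignment probabilities of a schedule
    (assumed aggregation, not confirmed by the poster)."""
    return sum(syntropy(p) for p in schedule_probs) / len(schedule_probs)


coin_flip = syntropy([0.5, 0.5])    # complete randomization: about 0
forced = syntropy([1.0, 0.0])       # e.g. last slot of a permuted block: 1
biased = syntropy([0.85, 0.15])     # e.g. a 2nd best probability of 0.15
```

A schedule in which every assignment is a fair coin flip scores 0, one in which every assignment is forced scores 1, and biased-coin style assignments fall in between.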
