CLINICAL TRIALS: DESIGN, CONDUCT, AND ANALYSIS


Monographs in Epidemiology and Biostatistics
edited by Abraham M. Lilienfeld

1 THE EPIDEMIOLOGY OF DEMENTIA
James A. Mortimer and Leonard M. Schuman
1981

2 CASE CONTROL STUDIES
Design, Conduct, Analysis
James J. Schlesselman
1982

3 EPIDEMIOLOGY OF MUSCULOSKELETAL DISORDERS
Jennifer L. Kelsey
1982

4 URBANIZATION AND CANCER MORTALITY
The United States Experience, 1950-1975
Michael R. Greenberg
1983

5 AN INTRODUCTION TO EPIDEMIOLOGIC METHODS
Harold A. Kahn
1983

6 THE LEUKEMIAS
Epidemiologic Aspects
Martha S. Linet
1984

7 SCREENING IN CHRONIC DISEASE
Alan S. Morrison
1985

8 CLINICAL TRIALS
Design, Conduct, and Analysis
Curtis L. Meinert
1986

9 VACCINATING AGAINST BRAIN DYSFUNCTION SYNDROMES
The Campaign Against Rubella and Measles
Ernest M. Gruenberg
1985

10 OBSERVATIONAL EPIDEMIOLOGIC STUDIES
Jennifer L. Kelsey, W. Douglas Thompson, Alfred S. Evans
1986

Monographs in Epidemiology and Biostatistics
Volume 8

CLINICAL TRIALS
Design, Conduct, and Analysis

Curtis L. Meinert, Ph.D.
Professor of Epidemiology and Biostatistics
School of Hygiene and Public Health
The Johns Hopkins University

In collaboration with

Susan Tonascia, M.Sc.
Research Associate
School of Hygiene and Public Health
The Johns Hopkins University

New York Oxford
OXFORD UNIVERSITY PRESS
1986


Oxford University Press
Oxford New York Toronto
Delhi Bombay Calcutta Madras Karachi
Petaling Jaya Singapore Hong Kong Tokyo
Nairobi Dar es Salaam Cape Town
Melbourne Auckland

and associated companies in
Beirut Berlin Ibadan Nicosia

Copyright © 1986 by Oxford University Press, Inc.

Published by Oxford University Press, Inc.,
200 Madison Avenue, New York, New York 10016

Oxford is a registered trademark of Oxford University Press

All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise,
without the prior permission of Oxford University Press.

Library of Congress Cataloging-in-Publication Data
Meinert, Curtis L.
Clinical trials.
(Monographs in epidemiology and biostatistics; v. 8)
Bibliography: p.
Includes indexes.
1. Clinical trials. I. Tonascia, Susan. II. Title.
III. Series. [DNLM: 1. Clinical Trials. W1 MO567LT v.8 / QV 771 M514c]
R853.C55M45 1986
610'.72
85-11530
ISBN 0-19-503568-2

Printing (last digit):

9 8 7 6 5 4 3 2 1

Printed in the United States of America
on acid-free paper

To

Susie, Julie, Nancy, and Jill

my wife and daughters for their help,
encouragement, forbearance, and understanding

Preface
And further, by these, my son, be admonished: Of making many books there is no end; and
much study is a weariness of the flesh.
Ecclesiastes 12:12

This book consists of seven parts:

Part I: Introduction and Current Status (7 chapters)
Part II: Design Principles and Practices (5 chapters)
Part III: Execution (4 chapters)
Part IV: Data Analysis and Interpretation (4 chapters)
Part V: Management and Administration (3 chapters)
Part VI: Reporting Procedures (3 chapters)
Part VII: Appendixes (9 in number)

It is intended as a general reference for practitioners of clinical trials. The main focus is on trials involving uncrossed treatments and a clinical event as the outcome measure. It is not concerned with trials designed to assess bioavailability or with trials involving crossover designs. However, this is not to say that it is of no value for researchers with such interests, since some of the design and operating principles and practices described herein extend to such trials as well. Parts of this book, such as the chapters concerned with sample size calculation, randomization, forms design, quality assurance, and reporting procedures, apply to most kinds of trials.

This book deals with single-center as well as multicenter trials, as defined in Chapter 4. No distinction is made between the two types in most of the chapters, because the design and operating practices are largely the same for both. There are only two chapters, 5 and 23, that deal exclusively with multicenter trials, and even they have some relevance to single-center trials.

Appendix A contains a glossary of terms and acronyms used in this book and serves as a starting point for a dictionary of terms for clinical trials in general. Appendix B contains operational information for 14 of the trials referenced. Tabulations based on the information contained in this appendix appear in various chapters of the book. Appendix I contains a combined bibliography of references cited in the various chapters and appendixes (except B and C). References in the combined bibliography have been arranged alphabetically by first author and then chronologically. The reference lists in Table B-3, Appendix B, for the studies sketched, are in chronological order. Journal abbreviations used in the reference listings throughout correspond to those used by the National Library of Medicine in Index Medicus and MEDLINE. The other five appendixes relate to specific chapters in the book.
The impetus for this book emerged from a long-standing involvement in clinical trials, beginning with the University Group Diabetes Program in 1961. The urge to develop a general text concerned with the design and conduct of clinical trials led to development of an initial draft in the spring of 1972. The emphasis in that and subsequent drafts during the next two years focused exclusively on a few large-scale multicenter trials. Work continued, but at a decelerating rate, until it came to a virtual halt by 1975, primarily because of other work commitments. The work lay dormant until late 1978 when, while still at the University of Maryland School of Medicine, I was persuaded to start anew by the late Abraham Lilienfeld. The revised outline involved 8 chapters. It gradually expanded to the current size.

Writing proceeded slowly until my move to the Department of Epidemiology of The Johns Hopkins University School of Hygiene and Public Health in late 1979, where I was faced with the challenge of developing a course on the design and conduct of clinical trials. That teaching effort and Susan Tonascia’s participation in that activity helped me to organize my thoughts and to collect the materials needed for this book. I am indebted to her for her help.

Baltimore, Maryland
November 1985

C.L.M.

Acknowledgments

I wish to begin by expressing my thanks to two individuals who stimulated and focused my interest in clinical trials. My sincere thanks to Jacob Bearman for his confidence in me when I had none and to Chris Klimt for his role in developing and promoting my career and for giving me the chance to learn about clinical trials by doing. I am indebted as well to the late Abraham Lilienfeld for his words of encouragement and editorial help during this writing effort.

One other individual, James Tonascia, deserves special thanks. I have benefited immensely from his teaching and counsel. His ideas and input are reflected in various places throughout this book. In fact, his notes, developed for an advanced course in epidemiology at Johns Hopkins, provided the starting point for much of what is contained in Chapter 9 and for some of Chapters 10 and 18 as well. In addition, he has spent hours reviewing various parts of this book, including all of Chapter 9 and parts of Chapter 10 and the Glossary.

Various others helped on specific parts of the book. I wish to thank each of them for their contribution. They include:

• Helen Abbey, for review of Chapter 9
• Barbara Andreassen, for help on Chapter 12 and art work on figures in the book
• Steven F. Bingham, for review of the VACSP #43 sketch in Appendix B
• Thomas Blaszkowski, for review of glossary terms and for referencing help concerning administration of government grants and contracts
• Robert Bradley, for referencing help on Chapter 7
• Lloyd Fisher, for review of the Coronary Artery Surgery Study sketch in Appendix B and for information concerning procedures used in CASS
• Lawrence Friedman, for review of glossary terms and for review of the Aspirin Myocardial Infarction Study sketch in Appendix B
• Curt Furberg, for review of glossary terms
• Barbara S. Hawkins, for review of glossary terms, for review of the Macular Photocoagulation Study sketch in Appendix B, and for information concerning procedures used in the MPS
• C. Morton Hawkins, for review of the Hypertension Detection and Follow-Up Program sketch in Appendix B
• Charles Hennekens, for review of the Physicians’ Health Study sketch in Appendix B
• Fred Heydrick, for critical review of and editorial help on Chapter 21
• Genell L. Knatterud, for review of the University Group Diabetes Program sketch in Appendix B and for information concerning procedures used in the UGDP, DRS, and ETDRS
• William F. Krol, for review of the Persantine Aspirin Reinfarction Study sketch in Appendix B and for information concerning procedures used in PARIS and AMIS
• John M. Lachin, for review of the National Cooperative Gallstone Study sketch and for information concerning procedures used in the NCGS
• Paul L. Canner, for review of the Coronary Drug Project sketch in Appendix B and for information concerning procedures used in the CDP
• John M. Long, for review of the Program on the Surgical Control of Hyperlipidemia sketch in Appendix B
• Jeffrey A. Cutler, for review of the Multiple Risk Factor Intervention Trial sketch in Appendix B
• Medical librarians of the Johns Hopkins University, especially Katherine Branch and Karen Higgins, for referencing help throughout the book
• Marie Diener, for help on Chapter 7
• Maureen Maguire, for help on Chapter 9
• Lisa Mele, for help on Chapter 7
• Larry Moulton, for help on Chapter 9
• Ronald Prineas, for review of the Hypertension Prevention Trial sketch in Appendix B
• Michael Terrin, for help in carrying out the SMOG analysis on consent statements contained in Appendix E
• Robert Weiss, for review of the International Reflux Study in Children sketch in Appendix B

Thanks also go to Sheila Booker, Janet Hiller, Mary Hurt, Joan Jefferys, Jeanette Lautenschlager, Teresa Lee, Mark Van Natta, and Deborah Zeiler for help in producing this manuscript. Sheila Booker deserves a special note of thanks since it is she who did the bulk of the typing, always with great efficiency, accuracy, and good grace, even when working with copy in my worst hand that was almost impossible to decipher, even by me!

A thank you also to colleagues in the School of Hygiene and Public Health for their encouragement and for providing an environment and support structure conducive to a writing effort of this sort. And last I wish to express my gratitude to Jeffrey W. House and Joan Bossert, editors at Oxford University Press in New York. I am especially indebted to Joan Bossert for her exquisite editorial eye and ear and for her professionalism, patience, and persistence in dealing with a stubborn Midwesterner!

Contents

PART I. INTRODUCTION AND CURRENT STATUS

Chapter 1. Introduction
1.1 Definition
1.2 History of clinical trials
1.3 Terminology conventions
1.4 Focus

Chapter 2. Clinical trials: A state-of-the-art assessment
2.1 Existing inventories
2.2 Trials as seen through the published literature
2.3 Small sample size: A common design flaw
2.4 Future needs

Chapter 3. The activities of a clinical trial
3.1 Stages of a clinical trial
3.2 Division of responsibilities
3.3 Common impediments to the orderly performance of activities
3.3.1 Separation of responsibilities in government-initiated trials
3.3.2 Structural deficiencies
3.3.3 Overlap of activities from stage to stage
3.3.4 Inadequate time for planning, development, and implementation
3.3.5 Inadequate funding
3.4 Approaches to ensure orderly transition of activities
3.4.1 Phased initiation of data intake
3.4.2 An adequate organizational structure
3.4.3 Opportunities for design modifications in sponsor-initiated trials
3.4.4 Certification as a management tool
3.4.5 Realistic timetables
3.4.6 Ongoing planning and priority assessment
3.4.7 Minimal overlap of activities

Chapter 4. Single-center versus multicenter trials
4.1 Definition
4.2 National Institutes of Health (NIH) count of single-center and multicenter trials
4.3 Design characteristics of single-center versus multicenter trials
4.4 The pros and cons of single-center versus multicenter trials
4.5 Initiation of single-center versus multicenter trials
4.6 Investigator incentives for single-center versus multicenter trials
4.7 Timing of single-center versus multicenter trials
4.8 Cost of single-center versus multicenter trials

Chapter 5. Coordinating and other resource centers in multicenter trials
5.1 Introduction
5.2 Coordinating centers
5.2.1 General activities
5.2.2 Location
5.2.3 Staffing
5.2.4 Equipment
5.2.5 Relative cost
5.2.6 Internal allocation of funds
5.3 Central laboratories
5.4 Reading centers
5.5 Project offices
5.6 Other resource centers

Chapter 6. Cost and related issues
6.1 Government expenditures for clinical trials
6.2 Who should finance clinical trials?
6.3 Factors that influence the cost of a trial
6.3.1 Design
6.3.2 Planning
6.3.3 Multipurpose studies
6.3.4 Ancillary studies
6.3.5 Equating the data collection needs of the trial with those for patient care
6.3.6 Undisciplined data collection philosophy
6.4 Cost control procedures
6.4.1 General cost control procedures
6.4.2 Method of funding
6.4.3 Cost reviews
6.4.4 Periodic priority assessments
6.4.5 Review and funding for ancillary studies
6.4.6 Justification of data items
6.4.7 Use of low-technology procedures
6.5 Need for better cost data

Chapter 7. Impact of clinical trials on the practice of medicine
7.1 Introduction
7.2 Factors influencing treatment acceptance
7.2.1 Prior opinion and previous experience with a treatment
7.2.2 Clinical relevance of the outcome measure
7.2.3 Degree to which test treatment simulates real-world treatment
7.2.4 Consistency of findings with previous results
7.2.5 Direction of results
7.2.6 Importance of the treatment
7.2.7 Cost and payment schedule
7.2.8 Treatment facilities and resources
7.2.9 Design and operating features of the trial
7.2.10 Study population
7.2.11 Method of presentation
7.2.12 Counterforces
7.3 Impact assessment
7.4 The University Group Diabetes Program: A case study
7.5 Ways to increase the impact of clinical trials

PART II. DESIGN PRINCIPLES AND PRACTICES

Chapter 8. Essential design features of a controlled clinical trial
8.1 Introduction
8.2 Choice of the test and control treatments
8.3 Principles in the selection of the outcome measure
8.4 Principles in establishing comparable study groups
8.5 Principles of masking and bias control

Chapter 9. Sample size and power estimates
9.1 Sequential versus fixed sample size designs
9.2 Sample size and power calculations as planning guides
9.3 Specifications for sample size calculations
9.3.1 Number of treatment groups
9.3.2 Outcome measure
9.3.3 Follow-up period
9.3.4 Alternative treatment hypothesis
9.3.5 Detectable treatment difference
9.3.5.1 Binary outcome measures
9.3.5.2 Continuous outcome measures
9.3.6 Error protection
9.3.7 Choice of allocation ratio
9.3.8 Losses to follow-up
9.3.9 Losses due to treatment noncompliance
9.3.10 Treatment lag time
9.3.11 Stratification for control of baseline risk factors
9.3.12 Degree of type I and II error protection for multiple comparisons
9.3.13 Degree of type I and II error protection for multiple looks for safety monitoring
9.3.14 Degree of type I and II error protection for multiple outcomes
9.4 Sample size formulas
9.4.1 Binary outcome measures
9.4.1.1 Fisher’s exact test
9.4.1.2 Chi-square approximation
9.4.1.3 Inverse sine transform approximation
9.4.1.4 Poisson approximation
9.4.2 Continuous outcome measures
9.4.2.1 Normal approximation for two independent means
9.4.2.2 Normal approximation for mean changes from baseline
9.5 Power formulas
9.5.1 Binary outcome measures
9.5.1.1 Fisher’s exact test
9.5.1.2 Chi-square approximation
9.5.1.3 Inverse sine transform approximation
9.5.1.4 Poisson approximation
9.5.2 Continuous outcome measures
9.5.2.1 Normal approximation for comparison of two independent means
9.5.2.2 Normal approximation for mean changes from baseline
9.6 Sample size and power calculation illustrations
9.6.1 Illustration 1: Sample size calculation using chi-square and inverse sine transform approximation
9.6.2 Illustration 2: Sample size calculation using Poisson approximation
9.6.3 Illustration 3: Sample size calculation using Coronary Drug Project design specifications
9.6.4 Illustration 4: Sample size calculation for blood pressure change
9.6.5 Illustration 5: Sample size calculation using Fisher’s exact test
9.6.6 Illustration 6: Power calculation based on chi-square and inverse sine transform approximation
9.6.7 Illustration 7: Power for design specifications given in Illustration 2 for 1500 patients per treatment group
9.6.8 Illustration 8: Power for design specifications given in Illustration 4 for 150 patients per treatment group
9.7 Posterior sample size and power assessments

Chapter 10. Randomization and the mechanics of treatment masking
10.1 Introduction
10.2 Adaptive randomization
10.3 Fixed randomization
10.3.1 Allocation ratio
10.3.2 Stratification
10.3.3 Block size
10.4 Construction of the randomization schedule
10.5 Mechanics of masking treatment assignments
10.6 Documentation of the randomization scheme
10.7 Administration of the randomization process
10.8 Illustrations
10.8.1 Illustration 1: Restricted randomization using a table of random permutations
10.8.2 Illustration 2: Unblocked allocations using a table of random numbers
10.8.3 Illustration 3: Blocked allocations using the Moses-Oakford algorithm and a table of random numbers
10.8.4 Illustration 4: Stratified and blocked allocations using the Moses-Oakford algorithm and a table of random numbers
10.8.5 Illustration 5: Sample allocation schedule for the Macular Photocoagulation Study using pseudo-random numbers

10.8.6 Illustration 6: Double-masked allocation schedule using the Moses-Oakford algorithm and a table of random numbers
10.8.7 Illustration 7: Sample CDP double-masked allocation schedule

Chapter 11. The study plan
11.1 Introduction
11.2 Design factors and details to be addressed in the study plan
11.3 Objective and specific aims
11.4 The treatment plan
11.5 Composition of the study population
11.6 The plan for patient enrollment and follow-up
11.7 The plan for close-out of patient follow-up

Chapter 12. Data collection considerations
12.1 Introduction
12.2 Factors influencing the clinic visit schedule
12.2.1 Introduction
12.2.2 Baseline clinic visit schedule
12.2.3 Follow-up clinic visit schedule
12.2.4 Visit time limits
12.3 Data requirements by type of visit
12.3.1 General considerations
12.3.2 Data needed at baseline visits
12.3.3 Data needed at follow-up visits
12.4 Considerations affecting item construction
12.4.1 Implicit versus explicit item form
12.4.2 Interviewer-completed versus patient-completed items
12.4.3 Questioning strategy
12.4.4 Single versus multiple use forms
12.4.5 Format and layout
12.5 Item construction
12.5.1 General
12.5.2 Language and terminology
12.5.3 Use of items from other studies
12.5.4 Closed- versus open-form items
12.5.5 Response checklist
12.5.6 Unknown, don't know, and uncertain as response options
12.5.7 Measurement and calculation items
12.5.8 Instruction items
12.5.9 Time and date items
12.5.10 Birthdate and age items
12.5.11 Identifying items
12.5.12 Tracer items
12.5.13 Reminder and documentation items
12.6 Layout and format considerations
12.6.1 Page layout
12.6.2 Paper size and weight
12.6.3 Type style and form reproduction
12.6.4 Location of instructional material
12.6.5 Form color coding
12.6.6 Form assembly
12.6.7 Arrangement of items on forms
12.6.8 Format
12.6.8.1 Items designed for unformatted written replies
12.6.8.2 Items requiring formatted written replies
12.6.8.3 Items answered by check marks
12.6.9 Location of form and patient identifiers
12.6.10 Format considerations for data entry
12.7 Flow and storage of completed data forms

PART III. EXECUTION

Chapter 13. Preparatory steps in executing the study plan
13.1 Essential approvals and clearances
13.1.1 IRB and other approvals
13.1.2 IND and IDE submissions
13.1.3 OMB clearance
13.2 Approval maintenance
13.2.1 IRB
13.2.2 FDA
13.2.3 Other approvals
13.3 Developing study handbooks and manuals of operations
13.4 Testing the data collection procedures
13.5 Developing and testing the data management system
13.6 Training and certification
13.7 Phased approach to data collection

Chapter 14. Patient recruitment and enrollment
14.1 Recruitment goals
14.2 Methods of patient recruitment
14.3 Troubleshooting
14.4 The patient shake-down process
14.5 The ethics of recruitment
14.6 Patient consent
14.6.1 General guidelines
14.6.2 The consent process
14.6.3 Documentation of the consent
14.6.4 What constitutes an informed consent?
14.6.5 Maintenance of consents
14.7 Randomization and initiation of treatment
14.8 Zelen consent procedure

Chapter 15. Patient follow-up, closeout, and post-trial follow-up
15.1 Introduction
15.2 Maintenance of investigator and patient interest during follow-up
15.2.1 Investigator interest
15.2.2 Patient interest
15.3 Losses to follow-up
15.4 Close-out of patient follow-up
15.5 Termination stage
15.6 Post-trial patient follow-up

Chapter 16. Quality assurance
16.1 Introduction
16.2 Ongoing data intake: An essential prerequisite for quality assurance
16.3 Data editing
16.4 Replication as a quality control measure
16.5 Monitoring for secular trends
16.6 Data integrity and assurance procedures
16.7 Performance monitoring reports
16.8 Other quality control procedures
16.8.1 Site visits
16.8.2 Quality control committees and centers
16.8.3 Data audits

PART IV. DATA ANALYSIS AND INTERPRETATION

Chapter 17. The analysis database
17.1 Introduction
17.2 Choice of computing facility
17.3 Organization of programming resources
17.4 Operational requirements for database maintenance
17.5 Data security precautions
17.6 Filing and storing the original study records
17.7 Preparation of analysis tapes

Chapter 18. Data analysis requirements and procedures
18.1 Basic analysis requirements
18.2 Basic analytic methods
18.2.1 Simple comparisons of proportions
18.2.2 Lifetable analyses
18.2.3 Other descriptive methods
18.3 Adjustment procedures
18.3.1 Subgrouping
18.3.2 Multiple regression
18.4 Comment on significance estimation
Chapter 19. Questions concerning the design, analysis, and interpretation of clinical trials
19.1 Introduction
19.2 Questions concerning the study design
19.3 Questions concerning the source of study patients
19.4 Questions concerning randomization
19.5 Questions concerning masking
19.6 Questions concerning the comparability of the treatment groups
19.7 Questions concerning treatment administration
19.8 Questions concerning patient follow-up
19.9 Questions concerning the outcome measure
19.10 Questions concerning data integrity
19.11 Questions concerning data analysis
19.12 Questions concerning conclusions

Chapter 20. Interim data analyses for treatment monitoring
20.1 Introduction
20.2 Procedural issues
20.3 Treatment monitoring reports
20.4 Special statistical problems
20.4.1 The multiple looks problem
20.4.2 The multiple outcomes problem
20.4.3 The multiple comparisons problem
20.5 Data dredging as an analysis technique
20.6 The pros and cons of stopping rules in monitoring trials
20.7 Steps in terminating a treatment

PART V. MANAGEMENT AND ADMINISTRATION

Chapter 21. Funding the trial
21.1 Introduction
21.2 NIH grant proposals
21.2.1 Deadlines and review process
21.2.2 Application outline
21.2.3 Content suggestions
21.3 NIH requests for contract proposals
21.3.1 Deadlines and review process
21.3.2 Factors to consider when deciding whether or not to respond
21.3.3 The response
21.4 The study budget
21.4.1 Grants
21.4.2 Contracts
21.5 Budget breakdown
21.5.1 Personnel
21.5.2 Consultants
21.5.3 Equipment
21.5.4 Supplies
21.5.5 Travel
21.5.6 Patient care costs
21.5.7 Alterations and renovations
21.5.8 Consortium/contractual costs
21.5.9 Other expenses
21.5.10 Budget justification
21.6 Preparation and submission of the funding proposal
21.7 Negotiations and award
21.8 Grant and contract administration
21.9 Special funding issues
21.9.1 Direct versus indirect funding for multicenter trials
21.9.2 Work unit payment schedules

Chapter 22. Essential management functions and responsibilities
22.1 Management requirements
22.2 Management deficiencies
22.2.1 Failure to delegate authority with responsibility
22.2.2 Inadequate provisions for personnel backup
22.2.3 Ill-defined decision-making structure
22.2.4 Inadequate funding
22.2.5 Lack of performance standards
22.2.6 Failure to separate essential activities
22.2.7 Ill-defined communication structure
22.3 Patient safety monitoring: An essential function
22.4 Advisory-review functions
22.5 Committee procedures
22.6 Preferred separation of responsibilities and functions
22.6.1 Separation of treatment administration and data collection personnel in unmasked trials
22.6.2 Separation of personnel responsible for patient care and safety monitoring
22.6.3 Separation of investigative and advisory-review roles
22.6.4 Separation of sponsor and investigative roles
22.6.5 Separation of data collection and data processing functions
22.6.6 Separation of centers in multicenter trials
22.7 Special management issues
22.7.1 Disclosure requirements for potential conflicts of interest
22.7.2 Level of compensation for committee members outside the trial
22.7.3 Review and approval of proposed ancillary studies
22.7.4 Publication and internal editorial review procedures
22.7.5 Publicity and information access policy issues

Chapter 23. Committee structures of multicenter trials
23.1 Introduction
23.2 Study chairman
23.3 Steering committee
23.4 Executive committee
23.5 Other subcommittees of the steering committee
23.6 Treatment effects monitoring and advisory-review committees
23.7 Committee-sponsor interaction
23.8 Center-to-center communications

PART VI. REPORTING PROCEDURES

Chapter 24. Study publication and information policies
24.1 Information constraints
24.2 Publication questions
24.2.1 When to publish?
24.2.2 Presentation or publication?
24.2.3 Where to publish?
24.2.4 What to publish?
24.2.5 Journal supplements versus regular issues
24.3 Authorship and internal review procedures
24.3.1 Introduction
24.3.2 Individual versus corporate authorship
24.3.3 Writing responsibilities
24.3.4 Credit rosters
24.3.5 Internal review procedures
24.4 Information access policy issues
24.4.1 Access to study data during the trial by outside parties
24.4.2 Access to study data at the conclusion of the trial
24.4.3 Access to study forms and manuals
24.4.4 Inquiries from the press
24.4.5 Special analyses in response to criticisms
24.4.6 Outside audits

Chapter 25. Preparation of the study publication
25.1 Introduction
25.2 Preparatory steps
25.3 Content suggestions
25.3.1 Title section
25.3.2 Abstract section
25.3.3 Introductory section
25.3.4 Methods section
25.3.5 Results section
25.3.6 Discussion section
25.3.7 Conclusion section
25.3.8 Reference section
25.3.9 Appendix section

25.4 Internal review and submission
25.5 Acceptance and publication

Chapter 26. Locating and reading published reports
26.1 Introduction
26.2 Bibliography development
26.3 Questions and factors to consider when reading a report from a clinical trial
26.4 Valid and invalid criticisms
26.5 Desirable characteristics of a critic

PART VII. APPENDIXES

Appendix A. Glossary
A.1 Preface
A.2 Glossary

Appendix B. Sketches of selected trials
B.1 Introduction
B.2 Methods
B.3 Results

Appendix C. Year 1980 clinical trial publications
C.1 Papers reviewed
C.2 Papers excluded

Appendix D. Activities by stage of trial

Appendix E. Sample consent statements
E.1 Consent statement for the Macular Photocoagulation Study (MPS): Senile Macular Degeneration Study
E.2 Consent statement for the Persantine Aspirin Reinfarction Study (PARIS)
E.3 Consent statement for the Hypertension Prevention Trial (HPT)

Appendix F. Data items and forms illustrations
F.1 Item numbering
F.2 Items that indicate presence or absence of a finding or condition
F.3 Unnecessary words
F.4 Double negatives
F.5 Compound questions
F.6 Comparative evaluations
F.7 Inverted meaning of a yes reply
F.8 Presence versus absence of a condition
F.9 Time references
F.10 Direction of response
F.11 Leading questions
F.12 Vertical versus horizontal response lists
F.13 Unit specifications
F.14 Precision specifications
F.15 Calculation items
F.16 Instruction items
F.17 Age and birthdate items
F.18 Reminder and documentation items
F.19 Full-page versus two-column layout
F.20 Layout for SKIP items
F.21 Instructional information
F.22 Unformatted responses
F.23 Formatted responses
F.24 Layout for check positions
F.25 Field designations and precoded responses

Appendix G. Sample manual of operations, handbook, and monitoring report
G.1 Introduction
G.2 Table of contents of the National Cooperative Gallstone Study Clinic Manual of Operations (July 1975 version)
G.3 Listing of pages in the Hypertension Prevention Trial Handbook (April 7, 1983 version)
G.4 Sample tables from Macular Photocoagulation Study Treatment Monitoring Report (January 31, 1982 Report)
G.5 Listing of tables in the Final Treatment Effects Monitoring Report of the Persantine Aspirin Reinfarction Study (October 15, 1979, Database)

Appendix H. Budget summary for Hypertension Prevention Trial Data Coordinating Center

Appendix I. Combined bibliography

Index

Tables and figures

Part I. Introduction and current status

Table 1-1 Historical events in the
development of clinical
trials
Table 1-2 Frequency of selected terms
in titles published in 1980

Table 2-1 Number of trials, median
sample size, and percent
randomized by fiscal year,
as reported in NIH
Inventories of Clinical
Trials
Table 2-2 Design features of trials
reported in the 1979 NIH
Inventory of Clinical Trials
Table 2-3 Number of trials, median
sample size, and percent
randomized, as reported in
the 1979 NIH Inventory of
Clinical Trials
Table 2-4 1980 publications cited in
MEDLINE as of October
1981
Table 2-5 Literature selection process
for papers appearing under
heading clinical trials
Table 2-6 Number of journals
represented in sample of
113 papers
Table 2-7 Journal of publication for
113 papers reviewed
Table 2-8 Subject matter of 113 papers
reviewed
Table 2-9 Design characteristics of
sample of 113 trials
appearing in 1980
published literature

Table 3-1 Stages of a clinical trial

Table 4-1 NIH-sponsored single-center
and multicenter trials by
Institute, for fiscal year
1979
Table 4-2 Design features of NIH
single-center and
multicenter trials
Table 4-3 Design features of single-center
and multicenter
trials, as reflected in a 1980
sample of clinical trial
publications
Table 4-4 Funding mode for NIH
extramural trials in fiscal
year 1979
Table 4-5 NIH expenditures for trials
in fiscal year 1979 by type
of trial

Table 5-1 Type of resource center
represented in the 14 trials
sketched in Appendix B
Table 5-2 Coordinating center activities
by stage of trial, with
emphasis on data
coordination activities
Table 5-3 Percent of full-time
equivalents by category of
personnel and year of
study for the CDP
Coordinating Center
Table 5-4 General equipment
requirements of
coordinating centers
Table 5-5 Relative cost of coordinating
centers for five trials
reviewed in the
Coordinating Center
Models Project
Table 5-6 Budget allocation for
coordinating centers by
category and year of study.
Results for centers from
AMIS, CDP, CAST,
HDFP, LRC-CPPT, and
MRFIT
Table 5-7 Budget allocation of the
CDP Coordinating Center,
by category and year of
study


Table 5-8 Central versus local
laboratories in multicenter
trials
Table 5-9 Conditions under which
centralized readings may
be required
Figure 5-1 Percentage cost of the CDP
Coordinating Center,
relative to total direct
study cost
Table 6-1 Number of NIH-sponsored
trials, by institute and
fiscal year
Table 6-2 NIH expenditures for clinical
trials as a percentage of
total NIH appropriations
Table 6-3 Percent distribution of total
NIH expenditures for
clinical trials, by institute
and fiscal year
Table 6-4 Percent distribution of total
NIH projected
expenditures for clinical
trials, by institute and
fiscal year
Table 6-5 Mean and median projected
expenditures per patient-year of study for trials
listed in the 1979 inventory
Table 6-6 VA expenditures for
multicenter clinical trials,
by fiscal year

Table 7-1 Chronology of events
associated with the UGDP
Table 7-2 Criticisms of the UGDP and
comments pertaining to
them
Table 7-3 Advertising for oral
hypoglycemic agents in the
Journal of the American
Medical Association for
1969 and 1979
Table 7-4 Percentage of patient
physician visits for
diabetics by type of
prescription issued
Table 7-5 Estimated U.S. wholesale
dollar cost for oral
hypoglycemic prescriptions


Figure 7-1 Estimated total number of
hypoglycemic
prescriptions (new and
refill) for the U.S.
Figure 7-2 Estimated number of insulin
prescriptions (new and
refill) and ratio of oral
hypoglycemic Rx’s to
insulin Rx’s for the U.S.
Figure 7-3 Type of hypoglycemic
prescription on discharge
from general hospitals for
diabetes as a percentage
of total diabetic
discharges
Part II. Design principles and practices

Table 8-1 Requirements for the test and
control treatments
Table 8-2 Desired characteristics of the
primary outcome measure
Table 8-3 Requirements of a sound
treatment allocation
scheme
Table 8-4 Masking guidelines

Table 9-1 Illustration of a sample size
presentation, α = 0.01
(two-tailed), β = 0.05, and
λ = 1
Table 9-2 Illustration of a power
presentation, given a
sample size of 800,
α = 0.01 (two-tailed), and
λ = 1
Table 9-3 Design specifications
affecting sample size
considerations
Table 9-4 Sample size and power
calculation summary for
Sections 9.4 and 9.5
Table 9-5 Z values for N(0,1)
distribution for selected
error levels
Table 9-6 Values of Φ(A), the
proportion of area of a
N(0,1) distribution
lying to the left of a
designated point A, for
selected values of A

Figure 9-1 Schematic illustration of
boundaries for open
sequential design
Figure 9-2 Schematic illustration of
boundaries for closed
sequential design
Table 10-1 Stratification
considerations for
randomization
Table 10-2 Blocking considerations
Table 10-3 Moses-Oakford assignment
algorithm for block of
size k
Table 10-4 Moses-Oakford treatment
assignment worksheet
for block of size k
Table 10-5 Illustration of Moses-Oakford algorithm
Table 10-6 First 25 lines of page 17 of
The Rand Corporation’s
1 million random digits
Table 10-7 Items that should be
included in the written
documentation of the
allocation scheme
Table 10-8 Safeguards for
administration of
treatment allocation
schedules
Table 10-9 Sample CDP treatment
allocation schedule
Table 10-10 Sample CDP allocation
form and envelope
Table 10-11 Reproduction of 20 sets of
random permutations of
first 16 integers, from
page 584 of Cochran
and Cox (1957)
Table 10-12 Allocations for
Illustration 1
Table 10-13 Allocations for
Illustration 2
Table 10-14 Allocations for
Illustration 3
Table 10-15 Allocations for
Illustration 4
Table 10-16 Sample allocation schedule
from the Macular
Photocoagulation Study
for Illustration 5


Table 10-17 Allocation schedule for
double-masked drug
trial described in
Illustration 6
Figure 10-1 Stylized bottle label for
medication dispensed in
the XYZ trial

Table 11-1 Example of a factorial
treatment design for a
two-drug study
Table 11-2 Numbers of patients by
treatment group in
PARIS
Table 11-3 Major items to be included
in the treatment protocol
Table 11-4 Advantages and
disadvantages of opposing
selection strategies
Table 11-5 Primary selection criteria of
trials sketched in
Appendix B


Table 12-1 Sample appointment
schedule and permissible
time windows, as adapted
from the Coronary Drug
Project
Table 12-2 Methods for avoiding errors
of omission and
commission in the data
form construction process


Part III. Execution


Table 13-1 Information required for
IRB approval
Table 13-2 Items of information
required for IND and
IDE submissions to the
FDA
Table 13-3 Suggestions for
development of study
handbooks and manuals
of operations
Table 14-1 Methods of patient
recruitment
Table 14-2 Comments concerning the
choice of recruitment
methods


Table 14-3 General elements of an
informed consent
Table 14-4 Suggested items of
information to be
imparted in consents for
clinical trials
Table 15-1 Aids for maintaining
investigator interest
Table 15-2 Factors and approaches that
enhance patient interest
and participation
Table 15-3 Methods for relocating
dropouts
Table 15-4 Data items that may be
used in searches of the
National Death Index
Table 15-5 Study close-out
considerations
Table 15-6 Activities in the termination
stage
Figure 15-1 Lifetable cumulative
dropout rates for the
clofibrate, niacin, and
placebo treatments in
the CDP

Table 16-1 Quality assurance
procedures
Table 16-2 Types of edit checks
Table 16-3 Edit message rules
Table 16-4 Data integrity checks
Table 16-5 Performance characteristics
subject to ongoing
monitoring
Figure 16-1 MPS Coordinating Center
edit message of August 3,
1983
Figure 16-2 MPS Coordinating Center
edit message of October 4,
1983

Part IV. Data analysis and
interpretation
Table 17-1 General-use versus
dedicated computing
facilities
Table 17-2 Considerations in choosing
among computing
facilities


Table 17-3 Precautions and safeguards
for database operations
Table 18-1 Examples of analysis
ground rule violations
Table 18-2 Percentages of UGDP
patients with indicated
baseline characteristics
Table 18-3 Percentages of PARIS
patients with indicated
complaint during follow-up
Table 18-4 Hypothetical trial involving
comparison of percentage
of patients dead at
indicated time points
Table 18-5 Lifetable cumulative
mortality rates for the
placebo and tolbutamide
treatments in the UGDP,
as of October 7, 1969
Table 18-6 Log rank test for comparing
lifetables in Table 18-5
Table 18-7 Percentage distribution of
UGDP patients by level
of treatment adherence
Table 18-8 Percentage of patients dead
within specified
subgroups created using
selected baseline
characteristics
Table 18-9 Observed and adjusted
tolbutamide-placebo
difference in percent of
patients dead
Figure 18-1 Number of deaths in the
UGDP through October
7, 1969, by treatment
group
Figure 18-2 Plot of observed ESG1-placebo difference in
percent of CDP patients
dead from lung cancer
Figure 18-3 UGDP cumulative lifetable
mortality rates by year
of follow-up and by
treatment assignment
Figure 18-4 Lifetable cumulative
dropout rates for the
clofibrate, niacin, and
placebo treatments in
the CDP


Figure 18-5 CDP lifetable plot of the
DT4-placebo mortality
differences and 2.0
standard error limits for
the differences
Figure 18-6 Percent change in fasting
blood glucose levels for
cohorts of patients
followed through the
nineteenth follow-up
visit
Table 20-1 Content of treatment
monitoring reports
Table 20-2 Ground rules for data
dredging via subgroup
analyses
Figure 20-1 Ninety-five percent
mortality monitoring
bounds for the
tolbutamide-placebo
treatment comparison in
the UGDP

Part V. Management and
administration
Table 21-1 Number and percent of
NIH extramural
sponsored trials, by type
of support
Table 21-2 Grant application content
suggestions for clinical
trials
Table 21-3 Questions to be considered
when deciding on the
merits of a response to a
Request for Proposal
(RFP)
Table 21-4 Direct cost items, by budget
category
Table 21-5 Direct versus indirect
(consortium) funding for
centers in multicenter
trials
Table 21-6 Factors influencing the
choice between direct
versus indirect
(consortium) funding
Table 22-1 Classes of trials requiring
safety monitoring

Table 22-2 Guidelines for committee
operations


Table 23-1 Key organizational units
Table 23-2 Functions and
responsibilities of the
main organizational units
of multicenter trials
Table 23-3 Functioning committees of
the Coronary Drug
Project
Table 23-4 Characteristics of steering
committees and
committees responsible
for safety monitoring in
the 14 trials sketched in
Appendix B
Table 23-5 Do’s and don’t’s for
formation of the steering
committee
Table 23-6 Considerations leading to a
separate ARC and
TEMC or a combined
ARTEMC
Figure 23-1 Committee-sponsor
interaction models
Part VI. Reporting procedures

Table 24-1 Pros and cons of interim
publications not related
to a treatment protocol
change
Table 24-2 Options for initial
communication of results
Table 24-3 Long versus short papers
Table 24-4 Pros and cons of individual
versus corporate
authorship


Table 25-1 Content suggestions for the
study publication
Table 26-1 Selected printed and
computerized databases
of published literature
and work in progress
Table 26-2 Questions to consider when
assessing a published
report
Table 26-3 Universal criticisms
Table 26-4 Characteristics of a
responsible critic


Part VII. Appendixes

Table B-1 List of trials sketched
Table B-2 Abstract summaries of trials
sketched
Table B-3 Publication list of sketched
trials
Table B-4 Summary tabulations from
sketches
Table B-5 Sample sketch for the
UGDP
Table B-6 Data coordinating centers
for multicenter trials
referenced in this book

Table E-1 Content checklist for sample
consent statements

Table H-1 DCC ceiling support levels
as specified in NHLBI
Notice of Grant Award
Table H-2 Projected allocation of funds
by budget category and
year of study
Table H-3 Projected staffing patterns
by year of study, in full-time equivalents (FTEs)
Table H-4 Projected travel expenses by
year of study
Table H-5 Other DCC expenses by
year of study
Table H-6 DCC percent allocation of
funds, excluding non-DCC-related costs
Table H-7 Cost of DCC relative to
total projected HPT cost

Part I. Introduction and current status

Chapters in This Part

1. Introduction
2. Clinical trials: A state-of-the-art assessment
3. The activities of a clinical trial
4. Single center versus multicenter trials
5. Coordinating and other resource centers in multicenter trials
6. Cost and related issues
7. The impact of clinical trials on the practice of medicine
The seven chapters in this Part cover a number of background issues. The first provides a
historical sketch of clinical trials and defines the class of trials considered in this book.
Chapter 2 reviews the state of the art of clinical trials, as gleaned from published reports of
clinical trials. Chapter 3 defines the stages of activities in a “typical” trial and discusses factors
which influence these activities. Chapter 4 provides a definition of single and multicenter
trials and discusses a number of issues related to these two classes of trials. Chapter 5 focuses
on specialty centers of a multicenter trial with emphasis on coordinating centers. Chapter 6
summarizes available cost data on trials, as provided in the National Institutes of Health
Inventory of Clinical Trials, and reviews factors that influence the cost of trials. The last
chapter discusses factors that influence the way in which results from trials are viewed and
used in everyday medical practice. The University Group Diabetes Program is used as a case
study.


1. Introduction

Those who cannot remember the past are condemned to repeat it.
George Santayana

1.1 Definition
1.2 History of clinical trials
1.3 Terminology conventions
1.4 Focus
Table 1-1 Historical events in the development of clinical trials
Table 1-2 Frequency of selected terms in titles published in 1980

1.1 DEFINITION

A clinical trial is a planned experiment designed to assess the efficacy of a treatment in man
by comparing the outcomes in a group of patients treated with the test treatment with those
observed in a comparable group of patients receiving a control treatment, where patients in
both groups are enrolled, treated, and followed over the same time period. The groups may be
established through randomization or some other method of assignment. The outcome measure
may be death, a nonfatal clinical event, or a laboratory test. The period of observation may
be short or long depending on the outcome measure.
Under this definition, studies involving test and control-treated groups that are treated and
followed over different time periods, such as studies involving a historical control group, do
not qualify as clinical trials. Also excluded are comparative studies involving animals other
than man, or studies that are carried out in vitro using biological substances from man.

1.2 HISTORY OF CLINICAL TRIALS
The history of clinical trials has been traced by several persons, most notably by Bull (1959)
and more recently by Lilienfeld (1982). Table 1-1 provides a summary of some of the historical
events in the field of clinical trials.
The concepts involved in clinical trials are ancient. The Book of Daniel, Chapter 1, verses 12
through 15, contains an account of a planned experiment with both baseline and follow-up
observations.
Prove thy servants, I beseech thee, ten days: and let them give us pulse to eat, and water
to drink. Then let our countenances be looked upon before thee, and the countenance of
the children that eat of the portion of the King's meat: and as thou seest, deal with thy
servants. So he consented to them in this matter, and proved them ten days. And at the
end of ten days their countenances appeared fairer and fatter in flesh than all the
children which did eat the portion of the King's meat (American Bible Society, 1816).

Avicenna, an Arabian physician and philosopher (980-1037), in his encyclopedic Canon of
Medicine, set down seven rules to evaluate the effect of drugs on diseases. He suggested that a
remedy should be used in its natural state, with uncomplicated disease, and should be observed
in two “contrary types of disease.” His Canon also suggested that the time of action and
reproducibility of the treatment effect should be studied (Crombie, 1952).
Many of the early observations affecting choice of treatment were fortuitous and arose from
natural consequences rather than planned experiments. The famous observation of the
Renaissance surgeon, Ambroise Pare (1510-1590), during the battle to capture the castle of
Villaine in 1537, is a case in point (Packard, 1921). Normal treatment procedure for battlefield
injuries was to pour boiling oil over the wound. When Pare ran out of oil he found it necessary
to resort to an alternative treatment consisting of a digestive made of egg yolks, oil of roses,
and turpentine. Pare recognized the superiority of the treatment the next day.
I raised myself very early to visit them, when beyond my hope I found those to whom I
had applied the digestive medicament, feeling but little pain, their wounds neither
swollen nor inflamed, and having slept through the night. The others to whom I had
applied the boiling oil were feverish with much pain and swelling about their wounds.
Then I determined never again to burn thus so cruelly the poor wounded by arquebuses
(Packard, 1921).

Table 1-1 Historical events in the development of clinical trials

Date  Author            Event
1747  Lind              Experiment with untreated control group (Lind, 1753)
1799  Haygarth          Use of sham procedure (Haygarth, 1800)
1800  Waterhouse        U.S.-based smallpox trial (Waterhouse, 1800, 1802)
1863  Gull              Use of placebo treatment (Sutton, 1865)
1923  Fisher            Application of randomization to experimentation (Fisher and MacKenzie, 1923)
1931                    Special committee on clinical trials created by the Medical Research Council of Great Britain (Medical Research Council, 1931)
1931  Amberson          Random allocation of treatment to groups of patients (Amberson et al., 1931)
1937                    Start of NIH grant support with creation of the National Cancer Institute (National Institutes of Health, 1981b)
1944                    Publication of multicenter trial on treatment for common cold (Patulin Clinical Trials Committee, 1944)
1946                    Promulgation of Nuremberg Code for Human Experimentation (Curran and Shapiro, 1970)
1962  Hill              Publication of book on clinical trials (Hill, 1962)
1962  Kefauver, Harris  Amendments to the Food, Drug and Cosmetic Act of 1938 (United States Congress, 1962)
1966                    Publication of U.S. Public Health Service regulations leading to creation of Institutional Review Boards for research involving humans (Levine, 1981)
1967  Chalmers          Structure for separating the treatment monitoring and treatment administration process (Coronary Drug Project Research Group, 1973a)
1979                    Establishment of Society for Clinical Trials (Society for Clinical Trials, Inc., 1980)
1980                    First issue of Controlled Clinical Trials

An indication that lemon juice was effective in preventing scurvy was the result of a fortuitous
decision made by the East India Shipping Company in 1600. Only one of the company's four
ships that sailed February 13, 1600, that of General James Lancaster, was supplied with lemon
juice. Almost all of the sailors on board Lancaster's vessel remained free of scurvy, while most
of the men on board the other three vessels fell victim to the disease. This led shipping
company officials to conclude:

And the reason why the General's men stood better in health then [sic] the men of other
ships, was this: he brought to sea with him certaine Bottles of the Juice of Limons, which
hee gave to each one, as long as it would last, three spoonfuls every morning fasting: not
suffering them to eate any thing after it till noone. This juice worketh much the better if
the partie keepe a short Dyet, and wholly refraine salt meate, which salt meate, and long
being at the Sea is the only cause of the breeding of this Disease (Drummond and
Wilbraham, 1940).

The first planned experiments were done without a formal comparison group. The results of
the experiment, contrasted with previous experience, provided the basis for evaluation. The
early smallpox experiments are a case in point. A study carried out by Lady Mary
Wortley-Montague and Maitland in 1721 involved six inmates from Newgate prison, all assumed
to have had no previous exposure to smallpox. The inmates were recruited through a policy,
urged by Lady Wortley-Montague, in which King George I commuted the sentence of convicted
felons if they agreed to inoculation. The prisoners were inoculated by engrafting smallpox
matter from a patient with the natural disease onto both arms and the right leg. The fact that
they remained free of smallpox was taken as evidence in favor of inoculation¹ (Creighton, 1894).
Jenner (1749-1823) described a series of experiments that involved 14 persons, or thereabouts,
who had been vaccinated with cowpox (Baron, 1838). He later inoculated three of these people
with smallpox and the others with cowpox. He subsequently wrote:
After the many fruitless attempts to give the Small-pox to those who had had the Cow-pox,
it did not appear necessary, nor was it convenient to me, to inoculate the whole of those
who had been the subjects of these late trials; yet I thought it right to see the effects of
variolous matter on some of them, particularly William Summers, the first of these
patients who had been infected with matter taken from the cow. He was therefore
inoculated with variolous matter from a fresh pustule; but, as in the preceding Cases, the
system did not feel the effects of it in the smallest degree (Jenner, 1798).
Early experiments with anesthetics (ether and chloroform) in the 1840s by Long, Wells,
Morton, and Simpson involved only a few patients and no control group (Duncum, 1947). The
ability to render an individual unconscious and then to revive that individual was sufficient to
establish the usefulness of anesthetics.
None of the early evaluations of penicillin involved controls. The dramatic recoveries
achieved in treating infections, theretofore fatal, were by themselves sufficient to establish the
efficacy of the treatment (Keefer et al., 1943).
One of the first experiments designed with a concurrently treated control group involved
scurvy victims and was carried out by James Lind in 1747, while at sea on board the Salisbury.
The study consisted of six different dietary regimens as described by Lind.
On the 20th of May 1747, I took twelve patients in the scurvy, on board the Salisbury at
sea. Their cases were as similar as I could have them. They all in general had putrid
gums, the spots and lassitude, with weakness of their knees. They lay together in one
place, being a proper apartment for the sick in the fore-hold; and had one diet common
to all, viz., watergruel sweetened with sugar in the morning: fresh mutton broth often
times for dinner: at other times puddings, boiled biscuit with sugar, etc.; and for supper,
barley and raisins, rice and currants, sago and wine, or the like. Two of these were
ordered each a quart of cyder a-day. Two others took twenty-five gutts of elixir vitriol
three times a-day, upon an empty stomach; using a gargle strongly acidulated with it for
their mouths. Two others took two spoonfuls of vinegar three times a-day, upon an empty
stomach; having their gruels and their other food well acidulated with it, as also the
gargle for their mouth. Two of the worst patients, with the tendons in the ham rigid, (a
symptom none of the rest had), were put under a course of sea water. Of this they drank
half a pint every day, and sometimes more or less as it operated, by way of gentle physic.
Two others had each two oranges and one lemon given them every day. These they eat
with greediness, at different times, upon an empty stomach. They continued but six days
under this course, having consumed the quantity that could be spared. The two
remaining patients, took the bigness of a nutmeg three times a-day, of an electuary
recommended by an hospital surgeon, made of garlic, mustard-seed, rad. raphan., balsam
of Peru, and gum myrrh; using for common drink, barley-water well acidulated with
tamarinds; by a decoction of which, with the addition of cremor tartar, they were gently
purged three or four times during the course.
Those receiving a daily ration of oranges and lemons fared best.

The consequence was, that the most sudden and visible good effects were perceived from
the use of the oranges and lemons; one of those who had taken them, being at the end of
six days fit for duty (Lind, 1753).

1. The results were not as convincing as first perceived. One of the six inmates was
subsequently found to have had smallpox before inoculation and a second may have had the
disease in childhood (Creighton, 1894).

Still, in spite of these findings, Lind and others clung to the notion that the best treatment
involved placing patients stricken with scurvy in “pure dry air.” The reluctance to accept
oranges and lemons as treatment for the disease had to do, in part, with the relative expense of
acquiring such fruits as opposed to the “dry air” treatment. It was 1795 before the British Navy
supplied lemon juice for its ships at sea (Drummond and Wilbraham, 1940).


The importance of a control treatment as a means of identifying placebo effects was
recognized by Haygarth (1740-1827) in his 1799 study of Perkin’s Tractors, metallic rods used
to stroke the body of an ailing person (Haygarth, 1800). The rods were widely used at the time
for a variety of conditions, including crippling rheumatism, pain in the joints, wounds, gout,
pleurisy, and inflammatory tumors, as well as for “sedating violent cases of insanity.” Haygarth
used imitation tractors made of wood on five patients affected with chronic rheumatism.

Let their [the Tractors’] merit be impartially investigated, in order to support their fame,
if it be well founded, or to correct the public opinion, if merely formed upon delusion.
Such a trial may be accomplished in the most satisfactory manner, and ought to be
performed without any prejudice. Prepare a pair of false, exactly to resemble the true
Tractors. Let the secret be kept inviolable, not only from the patient, but every other
person. Let the efficacy of both be impartially tried; beginning always with the false
Tractors. The cases should be accurately stated, and the reports of the effects produced
by the true and false Tractors be fully given, in the words of the patients. . . .
On the 7th of January, 1799, the wooden Tractors were employed. All the five patients,
except one, assured us that their pain was relieved. . . .
The following day Haygarth used the metallic tractors on the same patients. He observed:
All the patients were in some measure, but not more relieved by the second application,
except one, who received no benefit from the former operation, and who was not a
proper subject for the experiment, having no existing pain, but only stiffness of her
ankle (Haygarth, 1800).

Sir William Gull (1816-1890), in collaboration with Henry Sutton, demonstrated the
importance of placebo treatment in assessing the natural variability of the course of disease and
the possibility of spontaneous cure. They gave mint water to 44 rheumatic fever patients and,
after close observation, concluded:
The cases show that too much importance has been attached to the use of medicines,
especially those acute cases where the tendency to a natural cure is the greatest
(Sutton, 1865).

Most of the early experiments involved arbitrary, nonsystematic schemes for assigning
patients to treatment, such as that described by Lind. More systematic approaches were needed
for trials in which patients were enrolled in a sequential fashion. Johannes Fibiger, in an
evaluation of a therapeutic serum for the treatment of diphtheria patients, used a scheme in
which “serum was injected into all those admitted on every other day” (Fibiger, 1898). Park and
co-workers, in 1928, described a scheme involving use of an experimental treatment for lobar
pneumonia on every other patient.

Patients were therefore taken alternatively for antibody treatment or control depending
only on the order of their admission to the service. It was believed that with a
sufficiently large series the distribution of cases by type would be equalized between the
treated and the untreated group (Park et al., 1928).
The concept of randomization as a device for treatment assignment was introduced by Fisher
while he was involved in agricultural experimentation (Box, 1980; Fisher and MacKenzie, 1923;
Fisher, 1926, 1973). Amberson and his co-workers, in a study of sanocrysin in the treatment of
pulmonary tuberculosis, were among the first to use the concept for treatment assignment in an
actual clinical trial.

The 24 patients were then divided into two approximately comparable groups of 12
each. The cases were individually matched, one with another, in making this
division. . . . Then, by a flip of the coin, one group became identified as group I
(sanocrysin-treated) and the other as group II (control). The members of the separate
groups were known only to the nurse in charge of the ward and to two of us. The
patients themselves were not aware of any distinctions in the treatment administered
(Amberson et al., 1931).
It was several years later before the process of randomization was used for assigning
individual patients to treatment. Diehl and co-workers (1938) described a method of randomly
assigning University of Minnesota student volunteers to treatment in a double-masked,
placebo-controlled trial involving treatment of the common cold.
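
The contrast between the alternation schemes of Fibiger and Park and the random assignment of individual patients can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the "test" and "control" labels, and the use of a fair coin flip per patient are hypothetical choices made here, not a description of the procedures used in any of the trials cited.

```python
import random


def alternate_assignment(n_patients):
    """Alternation: patients are taken in order of admission and
    every other patient receives the test treatment."""
    return ["test" if i % 2 == 0 else "control" for i in range(n_patients)]


def random_assignment(n_patients, seed=None):
    """Simple randomization: an independent fair coin flip for each patient.
    The seed is an illustrative convenience for reproducing a schedule."""
    rng = random.Random(seed)
    return [rng.choice(["test", "control"]) for _ in range(n_patients)]


if __name__ == "__main__":
    # Under alternation the next assignment is always predictable.
    print(alternate_assignment(6))
    # Under randomization it is not, although the same seed
    # reproduces the same schedule.
    print(random_assignment(6, seed=1986))
```

With alternation, anyone who knows the order of admission knows the next assignment in advance; with coin-flip assignment that is no longer so, although group sizes are then not guaranteed to be equal.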
Great Britain, under the influence of men such
as Sir Austin Bradford Hill, has been a leading
force in the development of modern-day clinical

The Medical Research Council announce that they have appointed a Therapeutic Trials
Committee, as follows, to advise and assist them in arranging for properly controlled
clinical tests of new products that seem likely, on experimental grounds, to have value in
the treatment of disease. . . . The Therapeutic Trials Committee will be prepared to
consider applications by commercial firms for the examination of new products,
submitted with the available experimental evidence of their value, and appropriate
clinical trials will be arranged in suitable cases (Medical Research Council, 1931).
The concept of multiple investigators from different sites, all following a common study
protocol in the conduct of a clinical trial, did not emerge until the late 1930s and early 1940s.
One of the first applications of this approach appeared in a 1944 publication of a trial to
evaluate patulin for treatment of the common cold (Patulin Clinical Trials Committee, 1944).

center (Francis et al., 1955). They are noteworthy because of their size. They involved tens
of thousands of volunteers.
The creation of the National Cancer Institute in 1937 signaled the start of federally sponsored
medical research in the United States and the creation of what ultimately has come to
constitute the National Institutes of Health (National Institutes of Health, 1981b). The
Institutes of this agency support by far the largest number of trials among all United States
governmental agencies. The largest and most complex multicenter trials have been carried out
by the National Heart, Lung, and Blood Institute (NHLBI). Some, such as the Multiple Risk
Factor Intervention Trial (Multiple Risk Factor Intervention Trial Research Group, 1977) and
the Hypertension Detection and Follow-Up Program (Hypertension Detection and Follow-Up
Program Cooperative Group, 1979a), have involved thousands of patients and years of
follow-up.
One of the first multicenter trials sponsored by the National Heart Institute (now the
National Heart, Lung, and Blood Institute) was a trial involving the use of ACTH, cortisone,
and aspirin as a treatment for rheumatic heart disease. The trial was initiated in 1951 and was
carried out in conjunction with the Medical Research Council of Great Britain, the American
Heart Association, and the Canadian Arthritis and Rheumatism Society (Rheumatic Fever
Working Party, 1960).

A multicenter trial involving the use of streptomycin in patients with pulmonary tuberculosis
Multicenter trials, focusing on the treatment
was published in 1948 (Medical Research Coun
of chronic noninfectious diseases, began to ap
cil, 1948). One of the first multicenter trials in
pear in the 1960s. One of the first examples in
the United States involved assessment of the
this category was the University Group Diabetes
same drug (Mount and Ferebee 1952, 1953a,
Program, started in I960 and completed in 1974
1953b). The study was initiated about the same
(University Group Diabetes Program Research
time as the British study but did not produce any
Group, l970e, 1978).
published results until 1952—-four years after the
The advent of multicenter clinical trials as a
British publication.
treatment evaluation tool has required collabo
The Veterans Administration (VA), in con
ration among various disciplines. In addition to
junction with the United States Armed Services,
medical and biostatistical expertise, a typical
carried out a series of multicenter trials between
large-scale multicenter trial requires close partic
1945 and I960 in an attempt to establish the
ipation with various other specialists. This multi
efficacy of various chemotherapeutic agents in
disciplinary approach has served to stimulate
the treatment of tuberculosis (Tucker, I960). The
communication across disciplines, as evidenced
VA provided support for various other multicen
by formation of the Society for Clinical Trials
ter trials in the 1960s under a relatively informal
in 1979 and publication of Controlled Clinical
funding structure. A more formal structure was
Trials starting in 1980.
created in 1972.
A major stimulus for the execution of clinical
The United States poliomyelitis vaccine trials,
trials in the United States arose from language
started in the autumn of 1953, sponsored by the
included in the 1962 Kefauver-Harris amend
National Foundation for Infantile Paralysis and
ments to the United States Food. Drug and Cos
done in collaboration with the Public Health Ser
metic Act of 1938. The Act set forth a series of
vice and state health departments, were multi

trials. His book Statistical Methods in Clinical
and Preventive Medicine (1962) represents an
important milestone in the field of clinical trials.
The Medical Research Council of the United
Kingdom recognized the need for clinical trials
at least as early as 1930. An announcement in a
1931 issue of lancet stated:

8

Introduction

legal requirements which had to be satisfied before a drug could be
approved by the Food and Drug Administration, FDA (Colsky, 1963; Food and
Drug Administration, 1963; Kelsey, 1963; United States Congress, 1962). A
unique feature of the amendment was language spelling out the nature of
scientific evidence required for a drug to be approved for human use, a
specification heavily dependent on what are referred to in the act as
“adequate and well-controlled investigations.”

The term “substantial evidence” means evidence consisting of adequate
and well-controlled investigations, including clinical investigations, by
experts qualified by scientific training and experience to evaluate the
effectiveness of the drug involved, on the basis of which it could fairly
and responsibly be concluded by such experts that the drug will have the
effect it purports or is represented to have under the conditions of its
use prescribed, recommended, or suggested in the labeling or proposed
labeling thereof (United States Congress, 1962).

Regulations published in the Federal Register (Food and Drug
Administration, 1969a, 1969b, 1970a, 1970b) have set forth general design
and execution standards for trials carried out as part of the FDA
Investigational New Drug Application (INDA) and New Drug Application
(NDA) processes. They were taken in large measure from testimony given by
William Beaver in a court case involving the Pharmaceutical Manufacturers
Association versus Robert H. Finch, Secretary of Health, Education and
Welfare, and Herbert L. Ley, Commissioner of Food and Drugs (Crout, 1982;
United States District Court, 1969, 1970).
The Medical Device Amendments of 1976 have extended some of the testing
requirements established for drugs to medical devices as well (United
States Congress, 1976). Certain devices cannot be marketed without
supporting evidence of safety and efficacy as obtained through controlled
trials.
The importance of safe and effective treatments for major diseases has
led Congress to earmark money for targeted areas of research. The
Coronary Drug Project (CDP) is an early example of a trial funded via
this route (Coronary Drug Project Research Group, 1973a). The emphasis on
focused research has led to increased use of research contracts in place
of grants by the NIH as funding vehicles for many of the large-scale
multicenter trials (see Chapters 5 and 21).
The long-term multicenter trial has created a new class of
organizational and analysis problems. A special task force convened by
the National Heart Institute in 1967 outlined organization guidelines
that have been used for many of the large-scale trials since then
(Greenberg, 1967).2 The analytic problems created by the need for
periodic data analyses as the trial proceeds have led to the development
of organizational structures that provide for a separation of the patient
care and treatment evaluation functions. The structures, described in
Chapter 23, emerged from concerns regarding the possibility of bias if
study physicians are permitted access to study data during the course of
the trial (Meinert, 1981). Chalmers was an early proponent of this
separation of functions in the organization of the CDP.3
Cornfield played a major role in developing a philosophy that dealt with
the problems of ongoing analyses in long-term clinical trials (Greenhouse
and Halperin, 1980; Seigel, 1982). His work on Bayesian analysis and on
the use of the likelihood principle as an analytic tool served to
de-emphasize the role of significance testing in data evaluation
(Cornfield, 1969).

1.3 TERMINOLOGY CONVENTIONS

The language of clinical trials is confusing. Language conventions have
not been established for characterizing the key design, organizational,
and operational elements of trials (Meinert, 1980a). Appendix A provides
a glossary of terms, abbreviations, and acronyms used in this book.
The term patient (see Glossary for the derivation) will be used
throughout to denote an individual enrolled in a trial. It will be used
even though it may not always be appropriate, for example, as in trials
that involve people without clinical disease. The term test treatment
will denote the treatment to be evaluated in the trial. The term control
treatment will denote the treatment used for comparison with the test
treatment. For convenience, study designs will be discussed as if they
involve a single test and control treatment, although certain trials may
involve several test treatments. The term study treatments will denote
the entire set of test and control treatments used in a trial.

2. This report, according to William Zukel of the NHLBI (personal
communication, 1982), drew heavily on organizational experience gained
from earlier multicenter studies, most notably those done by the
Committee on Lipoproteins (1956) and by the Rheumatic Fever Working Party
(1960).

3. A written communication from Thomas Chalmers to the Chairman of the
CDP Policy Board, Robert Wilkens, in 1967, led to the separation of these
functions in the CDP.

Table 1-2 Frequency of selected terms in titles published in 1980*

                                                  Titles under:
                                             MeSH of
Term used                                    clinical trials   Other MeSH
                                             No.    (%)        No.    (%)

Titles containing the term trial(s)          502   (100)       191   (100)

Titles containing the term trial(s) plus:
  Clinical                                   201    (40)        41    (22)
  Controlled                                 131    (26)        23    (12)
  Double-blind                                79    (16)        16     (8)
  Random(ized)                                74    (15)        13     (7)
  Comparative                                 22     (4)         9     (5)
  Field                                       15     (3)         4     (2)

Titles containing the term trial(s) and
none of the above terms                       99    (20)       103    (54)

*MEDLINE search, as of June 1982. Run restricted to nonreview articles in
English appearing under the check tag human.

The term trial is from the Anglo-French word trier, meaning to choose,
sort, select, or try (Klein, 1971). Thomas Bayes (1702-1761), an English
mathematician, made frequent use of the term in a nonmedical experimental
sense in an essay on probability involving repeated drops of a billiard
ball onto a surface to observe the position of its fall (Bayes, 1763).
The use of the term in a medical context is not easy to trace. However,
even a cursory search indicates it has been in use for some time. It
appears in the writings of both Haygarth and Jenner around 1800. Its use
today covers a wide variety of designs ranging from uncontrolled
observations involving the first use of a treatment in man to a formal
experiment, complete with a control treatment and randomization. The use
of the term without modifiers implies nothing about the observational
unit. It may be man or some other animal species (always man in this
book).
Trial is frequently modified by the term clinical and/or one or more
design terms (e.g., randomized, placebo, controlled, or double-blind).
Table 1-2 provides an indication of modifier usage as seen in 1980
nonreview publications in English appearing in the MEDLINE4 data file.
The results presented are for articles appearing under the check tag
human, a designation applied by indexers at the National Library of
Medicine (NLM) to identify studies involving humans.5 Tabulations
presented in the first column of the table are based on a search of all
the titles indexed under the medical subject heading (MeSH) clinical
trials (1,949). Of the 502 articles containing the term trial(s), 40%
also contained the term clinical. The term trial appeared without any of
the modifiers listed in Table 1-2 in 20% of the titles (99 out of 502).
It is worth noting that nearly three-fourths of the 1,949 articles
screened did not contain the term trial. Other more nondescript terms
such as study were used instead (see Chapter 2 and Coordinating Center
Models Project Research Group, 1979e). Unfortunately, this pattern of use
creates problems when an attempt is made to identify trials via title
searching routines.
The results in the last column in Table 1-2 concern the use of the term
trial(s) in articles appearing under MeSH headings other than clinical
trials. A number of these may very well involve studies that are
nonexperimental. Theoretically, this should be true for all articles not
classified under the MeSH clinical trials. However, some of the articles
identified appear to be germane to the field, as suggested by use of
modifiers such as clinical, controlled, double-blind, random, or
randomized.

4. Medical Literature Analysis Retrieval System On Line, a computer
database of literature citations produced by the National Library of
Medicine (Williams et al., 1979).

5. Most of the articles under the heading clinical trials appear under
this tag. However, beginning in January 1981, the heading includes
veterinary studies and hence contains studies where only the check tag
animal is used.

1.4 FOCUS

This book will focus on the class of trials that involve:

• Man
• A fixed, nonsequential sample size design
• Random allocation of individual patients to treatment, as opposed to
  some larger randomization unit such as family, hospital ward,
  community, etc.
• An uncrossed treatment design (i.e., where the treatment design
  requires patients to receive either the test or control treatment, but
  not both)
• Concurrent enrollment, treatment, and follow-up of patients in the test
  and control treatment groups
• A clinical event, such as death or some other nonfatal event (e.g., a
  myocardial infarction, recurrence of cancer, loss of vision, etc.), as
  the primary outcome measure for evaluating the test treatment

The fixed sample size design is by far the most commonly used design for
the class of trials considered. Sequential designs (see Chapter 9 for
further discussion) are not practical for comparing treatments in trials
requiring long periods of follow-up for outcome assessment.
Emphasis will be on trials that require multiple clinics in order to
enroll the required number of patients (see Appendix B for examples). The
researcher who can cope with the challenges presented by such trials is
in a good position to deal with less complicated trials carried out in a
single clinic.
Many of the principles discussed herein have applicability beyond the
setting outlined. This is true for several of the chapters, particularly
those concerned with data collection (Chapter 12) and with organization
and management practices (Chapters 22 and 23).

2. Clinical trials: A state-of-the-art assessment

One’s knowledge of Science begins when he can measure what he is
speaking about and express it in numbers.
Lord Kelvin

2.1 Existing inventories
2.2 Trials as seen through the published literature
2.3 Small sample size: A common design flaw
2.4 Future needs
Table 2-1 Number of trials, median sample size, and percent randomized by fiscal year, as reported in NIH Inventories of Clinical Trials
Table 2-2 Design features of trials reported in the 1979 NIH Inventory of Clinical Trials
Table 2-3 Number of trials, median sample size, and percent randomized, as reported in the 1979 NIH Inventory of Clinical Trials
Table 2-4 1980 publications cited in MEDLINE as of October 1981
Table 2-5 Literature selection process for papers appearing under heading clinical trials
Table 2-6 Number of journals represented in sample of 113 papers
Table 2-7 Journal of publication for 113 papers reviewed
Table 2-8 Subject matter of 113 papers reviewed
Table 2-9 Design characteristics of sample of 113 trials appearing in 1980 published literature

2.1 EXISTING INVENTORIES

Various groups have assumed responsibility for developing and
maintaining inventories of ongoing clinical trials. Some are organized
according to disease; others relate to trials sponsored by a specific
agency. An early example of the first type of inventory originated from
the National Institute of Mental Health with the creation of the
Biometric Laboratory Information Processing System (BLIPS) in the
mid-1960s. The inventory was created to provide information for ongoing
trials of psychopharmacological agents in the United States and elsewhere
(Levine et al., 1974). The National Cancer Institute (1983), via the
International Cancer Research Data Bank, maintains a worldwide file of
ongoing phase II and phase III cancer trials. The Veterans Administration
(VA) maintains a list of trials carried out under its collaborative
studies program (list available from the VA Central Office, 810 Vermont
Avenue N.W., Washington, D.C.).
The Division of Research Grants of the National Institutes of Health
(NIH) has maintained an inventory of NIH-sponsored trials for several
years (National Institutes of Health, 1975, 1980). Responsible officials
of institutes of the NIH involved in extramural or intramural research
are asked to complete inventory sheets for all ongoing studies that they
consider to satisfy the definition of a clinical trial, as specified in
the inventory. The definition used is:

A scientific research activity undertaken to define prospectively the
effect and value of prophylactic/diagnostic/therapeutic agents, devices,
regimens, procedures, etc., applied to human subjects. It is essential
that the study be prospective, and that intervention of some sort occur.
The choice of number of cases or patients will depend on the hypothesis
being tested, but must be sufficient to permit a definite result to be
anticipated. Phase I, feasibility, or pilot studies are excluded.

This definition allows inclusion of trials with only one treatment
group. One can only surmise that evaluation of the treatment is made
against some hypothetical standard control treatment or through use of
historical controls in such cases (see Chapter 1 and Glossary for
definition of clinical trial as used in this book). The broad nature of
the definition and the lack of surveillance by the Division of Research
Grants in monitoring for differences in how the definition is applied
allows for considerable variability in the reporting behavior of
institutes contributing to the inventory. It is likely that some of the
variation among institutes, within and across years, evident in tables in
this chapter and in Chapters 5 and 6, is due to differences in reporting
practices. Unfortunately, the inventory is not designed to provide data
on the nature of the differences.
The number of trials reported for the 5-year period for which inventory
data are available ranged from a low of 746 in 1977 to a high of 986 in
1979 (Table 2-1). The “typical” NIH trial, as reflected in the 1979 NIH
Inventory,1 involved between 30 and 300 patients (median sample size:
100) apportioned among the different treatment groups (Table 2-2). Most
of the trials were classified as therapeutic (81%), as compared to
prophylactic (13%) and diagnostic (6%) (see Glossary for definitions).
The majority (65%) were funded for a period of 3 years or longer.
Trials sponsored by the individual institutes vary in number and size
(Table 2-3). The National Cancer Institute (NCI) sponsored by far the
most trials (654 out of the 985 listed, for 66% of all NIH trials). The
National Heart, Lung, and Blood Institute (NHLBI) sponsored the largest
trials (median sample size: 850). This variation in size is due, in part,
to differences in the nature of the health problems addressed. The NCI
plays a major role in developing and testing chemotherapeutic agents.
Hence, many of their trials are of the phase I or II variety (see
Glossary), involving relatively small numbers of patients. The NHLBI has
concentrated on assessing the usefulness of various drugs and procedures
in the primary or secondary prevention of heart disease. Their trials, of
necessity, have had to involve large numbers of patients and long periods
of follow-up because of low underlying event rates for the outcomes of
interest.

1. There have been no inventories since 1979, but one is planned for 1984
or 1985.

Table 2-1 Number of trials, median sample size, and percent randomized by
fiscal year, as reported in NIH Inventories of Clinical Trials

Fiscal     Total number     Median          Percent
year       of trials        sample size     randomized
1975       755              127             62
1976       926              114             60
1977       746              125             62
1978       845              103             60
1979       986              100             60

Table 2-2 Design features of trials reported in the 1979 NIH Inventory of
Clinical Trials

                                              Number
Design features                               of trials     Percent

Number of treatment groups per trial
  1                                           258           26
  2                                           438           44
  ≥3                                          290           29
  Median number of treatment groups/trial: 1.48

Sample size
  Median number of patients/trial             100
  Range (20th to 80th percentile)             30 to 300
  Number of patients/trial/treatment group*   68
  Range (20th to 80th percentile)             20 to 203

Method of treatment allocation
  Random                                      589           60
  Nonrandom                                   391           40
  Method not reported                         6             0

Type of trial
  Therapeutic                                 801           81
  Prophylactic                                126           13
  Diagnostic                                  58            6

Anticipated length of trial
  <1 year                                     19            2
  1 year to <2 years                          101           10
  2 years to <3 years                         223           23
  ≥3 years                                    642           65

Total number of trials listed                 986           100

*Calculated by dividing median number of patients per trial by the median
number of treatment groups per trial.

Table 2-3 Number of trials, median sample size, and percent randomized,
as reported in the 1979 NIH Inventory of Clinical Trials

                                                 Number      Median        Percent
Institute                                        of trials   sample size   randomized
National Cancer Institute (NCI)                  654         100           59
National Eye Institute (NEI)                     26          200           85
National Heart, Lung, and Blood Institute
  (NHLBI)                                        20          850           100
National Institute of Allergy and Infectious
  Diseases (NIAID)                               120         100           53
National Institute of Arthritis, Metabolism,
  and Digestive Diseases (NIAMDD)                67          70            60
National Institute of Child Health and Human
  Development (NICHD)                            32          100           62
National Institute of Dental Research (NIDR)     26          663           65
National Institute of Neurological and
  Communicative Disorders and Stroke (NINCDS)    40          30            55

Total                                            985*        100           60

*One trial sponsored by the National Institute of General Medical
Sciences not included.
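
The per-group figures given in the footnote to Table 2-2 and the NCI share of trials quoted in the text can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only figures reported in Tables 2-2 and 2-3; the variable names are illustrative and are not part of the inventory.

```python
# Figures as reported for the 1979 NIH Inventory (Tables 2-2 and 2-3).
median_patients_per_trial = 100
median_groups_per_trial = 1.48
range_patients_per_trial = (30, 300)  # 20th to 80th percentile

# Table 2-2 footnote: patients per treatment group is obtained by dividing
# the median number of patients per trial by the median number of groups.
per_group_median = round(median_patients_per_trial / median_groups_per_trial)
per_group_range = tuple(
    round(x / median_groups_per_trial) for x in range_patients_per_trial
)

# Share of NIH trials sponsored by the NCI (654 of the 985 listed).
nci_share_percent = round(100 * 654 / 985)

print(per_group_median, per_group_range, nci_share_percent)
# prints: 68 (20, 203) 66
```

A quotient of two medians is only a rough guide to group size; it need not equal the median of the per-trial group sizes, which the inventory does not report.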

2.2 TRIALS AS SEEN THROUGH THE PUBLISHED LITERATURE

An indication of the nature of completed trials can be obtained from a
review of the published literature, as identified through Index Medicus
or MEDLINE, the computerized version of Index Medicus (Beatty, 1979;
Charen, 1977; Kenton and Scott, 1978; McCarn, 1980; Williams et al.,
1979). The introduction in 1980 of a subject heading for clinical trials
has made it possible to retrieve articles under this heading.2 The
definition used by indexers at the National Library of Medicine (the
agency responsible for entries into Index Medicus and MEDLINE) is:

Pre-planned usually controlled studies of the safety, efficacy, or
optimum dosage schedule (if appropriate) of one or more diagnostic,
therapeutic, or prophylactic drugs or technics in humans selected
according to pre-determined criteria of eligibility and observed for
pre-defined evidence of favorable and unfavorable effects (National
Library of Medicine, 1980).

This definition, as with the one used by NIH, is designed to permit
inclusion of studies with a wider number of design features, including
some without a comparison group.

2. Before 1980, trials were classified under the general heading clinical
research.

The heading included 2,409 citations bearing a 1980 publication date, as
of an October 1981 MEDLINE search.3 This number represents less than 1%
of the total 1980 MEDLINE citations (Table 2-4). The 1,796 titles
remaining after exclusion of review articles and foreign-language papers
were ordered by date of entry into the MEDLINE file (approximately
chronological by date of publication) and then sampled using a random
start and a 1 in 10 sampling fraction. A total of 67 (37%) of the 180
papers selected were

3. This run included most of the 1980 publications. Evidence from
previous years indicates that 95% of all entries for a given calendar
year are indexed and entered into the system by October of the following
year.

Subject matter of 113 papers reviewed

Table 2-4 1980 publications cited in MEDLINE as of October 1981

Total number of 1980 entries in MEDLINE                        249,150
Number of 1980 titles under heading clinical trials              2,409
Number of 1980 titles remaining after exclusion of
  review articles                                                2,317
Number of 1980 titles remaining after the exclusion of
  review and foreign-language publications                       1,796

Table 2-6 Number of journals represented in sample of
113 papers

Number
of papers

Number of journals represented in sample of
113 papers

tird-osascular
i n’ti'inicctinal
►...ho-neurological

Number of journals with:
1 of the 113 papers
2 of the 113 papers
3 or more of the 113 papers (see Table 2 7)

( *n«r

• ■ -jr\ svstem
tk .tnd joint

Source: Reference citation 321. Reprinted with permission of
Elsevier Science Publishing Co., Inc., New York.

<

'wmatology
’vuil

•nriratory

n

eliminated for reasons indicated in Table 2-5. The tabulations given in
Tables 2-6 through 2-9 are based on the 113 remaining papers. Appendix C
contains a list of all 180 papers (see also Meinert et al., 1984).
It would have been necessary to subscribe to no less than 82 different
journals in order to have access to the 113 articles reviewed. Moreover,
no combination of 4 or 5 journals accounted for a majority of the
articles. Only 17 of the 82 journals contained 2 or more of the papers
selected for review (Table 2-6). The 8 most frequently cited journals
accounted for a little more than a quarter (27%) of the 113 articles
(Table 2-7).
Each paper in the sample was classified as to major subject area (Table
2-8). General design characteristics of the trials represented in the
sample are summarized in Table 2-9. The typical trial, as seen through
published literature, is carried out in a single clinic and involves
about 25 patients per treatment group followed over a relatively short
time, usually less than 3 months.
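
The selection procedure described for the 1,796 ordered titles (a random start followed by every tenth title) is a systematic sample. A minimal sketch in Python is given below, assuming the titles are already in order of entry into the file; the function name, the seed, and the integer stand-ins for titles are illustrative assumptions and not details of the original review.

```python
import random


def systematic_sample(items, interval=10, seed=None):
    """Take every `interval`-th item after a random start within the first interval."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start: index 0 .. interval - 1
    return items[start::interval]


# Integer stand-ins for the 1,796 titles, ordered by date of entry.
titles = list(range(1, 1797))
sample = systematic_sample(titles, interval=10, seed=1)
print(len(sample))
```

With 1,796 titles and a 1 in 10 fraction, a scheme of this kind yields 179 or 180 titles depending on the start, which agrees with the 180 papers reported as selected.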

Table 2-7
Table 2-5 Literature selection process
appearing under heading clinical trials

for

Total number of English, nonreview. 1980
publications

Journal

Number of papers selected in sample

180

Number of papers excluded after initial review
No comparison group
Editorial or letter
Review or methodological paper
Other reasons*

67

Number reviewed

113

*Includes 1 paper that could not be located, 8 position or
philosophical papers, and 2 others not classified as clinical trials
under the definition used in this book (see Chapter 1 and
Glossary).

<>/

3
3
14

’ Wil number of papers in sample

113

J Int Med Res
S Afr Med J

All other journals (74)

Pain relief
Infectious disease

2.3 SMALL SAMPLE SIZE: A COMMON DESIGN FLAW

Only 2 of the 113 trials showed any evidence of a sample size
calculation and they involved sequential designs. Of the others, 3
mentioned the statistical power (see Glossary) associated with the trial.
The virtual disregard of power considerations is consistent with other
literature reviews. None of the 83 gastrointestinal trials reviewed by
Chalmers and co-workers (1978) included any discussion of power. Only 2
of the 41 papers from breast cancer trials reviewed by

Br J Clin Pharmacol
Br J Dis Chest
Cancer

Total number of papers in sample

6
5
5
4
4
3

single-center (89 studies) or could not be classified because of lack of
information in the papers (9 studies). Slightly over half of the papers
(53%) indicated a source of funding. Acknowledgment of a contribution of
a supply item, such as drugs, was ignored in the classification, unless
there was evidence that money was also provided.
Most trials presented results for a number of outcome measures (see
Glossary). Many of the papers presented results for several different
outcomes. It was impossible in nearly all those cases to identify the
measure considered to be primary (see Glossary for definition). Most
measures were of a nonclinical nature (e.g., usually laboratory or
physiological measures). Only 3 trials used mortality as an outcome
measure.

Br Med J
Lancet
J Clin Pharmacol

15
17
24
II

10
8
7

Source: Reference citation 321. Reprinted with permission of
Elsevier Science Publishing Co., Inc., New York.
Anesthesiology; ear, nose, and throat; diabetes; contraception;
• mt. diagnostic; trauma, drugs; and weight control.

Journal of publication for 113 papers reviewed

papers

1.796

.-<i ologic
•—-th.ilmologic

Over 70% of the trials were classified as therapeutic (see Glossary for
definition). The overwhelming proportion of trials involved drug
treatments. Only 10 of the 113 studies involved some other form of
treatment. Of the 10, 5 were surgical trials, 2 involved behavior
modification, 2 involved a radiologic procedure, and 1 involved testing a
medical device. Approximately a third (31%) of the trials used crossover
designs (see Glossary).
The median length of follow-up was slightly over 2 months. There were
only 12 trials that provided for a year or more of follow-up. Over
two-thirds of the trials were reported to be double-masked; 80% of the
reports indicated use of some random method for treatment assignment.
Treatment assignment was classified in the nonrandom or unstated category
if the paper contained an explicit statement indicating use of a
nonrandom method or if there was no way to determine how assignments were
made.
Fifteen of the trials (13%) were classified as multicenter. The
remainder were classified as
14
14
13

II’

Source: Reference citation 321. Reprinted with permission of
Elsevier Science Publishing Co., Inc., New York.

4

15

Mosteller and co-workers (1980) contained a discussion of power. Power
considerations are especially important in trials where investigators
conclude in favor of the null hypothesis (Freiman and co-workers, 1978).
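
The importance of power when a trial of typical published size finds no difference can be illustrated with a rough calculation. The sketch below is not taken from any of the reviews cited; it uses a standard normal approximation for comparing two proportions with a two-sided test, and the event rates (40% versus 20%), the significance level, and the function name are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_control, p_test, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Normal approximation with unpooled variance; the small chance of
    rejecting in the wrong direction is ignored.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    se = sqrt(p_control * (1 - p_control) / n_per_group
              + p_test * (1 - p_test) / n_per_group)
    return nd.cdf(abs(p_control - p_test) / se - z_alpha)


# Hypothetical event rates: 40% under control, 20% under test treatment.
for n in (25, 100, 300):
    print(n, round(approx_power(0.40, 0.20, n), 2))
```

Under these assumed rates, a trial with about 25 patients per treatment group has well under a 50% chance of detecting a halving of the event rate, so a finding of no difference in such a trial says little about whether the treatment works.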

2.4 FUTURE NEEDS

The annals of medicine are filled with accounts of potions, drugs,
devices, and the like, that have been heralded as great advances only to
be shown as useless or even harmful later on. Bloodletting (venesection)
has been used therapeutically as well as prophylactically from
prehistoric times to the 1950s (Bryan, 1964; Holman, 1955; King, 1961).
The death of George Washington was presumably associated with
bloodletting (Donaldson and Donaldson, 1980; Knox, 1933). It fell from
favor as a treatment for hypertension, not so much because of concerns
regarding efficacy of the treatment, but rather, because of the advent of
other modes of therapy. Holman, as late as 1955, after a review of
medical texts in use at that time, wrote:

Bloodletting is still mentioned for control of arterial hypertension.
. . . Hypertensive patients not in circulatory failure have often been
observed to get symptomatic relief from venesection for varying periods
of time. . . . If the early promise of Rauwolfia and similar recently
introduced antihypertensive agents is fulfilled, this indication for
venesection is apt to be supplanted also.

Perkin’s tractors, introduced in 1795 and mentioned in Chapter 1,
continued to be used long after Haygarth’s study in 1800 showed them to
be of no value (Elliott, 1913; Haygarth, 1800). Nathan Smith, the founder
of the Yale Medical School, not only gave testimony to their efficacy but
was reported to have sold them (Haggard, 1932).
Changes in treatment philosophy are slow to occur, especially if the new
philosophy must replace an established one. Max Planck (1858-1947), a
physicist, noted that:

A new scientific truth does not triumph by convincing its opponents and
making them see the light, but rather because its opponents eventually
die, and a new generation grows up that is familiar with it (Strauss,
1968).


The promotion and use of ineffective treatments is not simply a mistake of the past, as is evident from work of the Drug Efficacy Study Implementation, DESI (Food and Drug Administration, 1972b). Of the 3,185 prescription drugs reviewed by the FDA as of June 1982, 31% were classified as ineffective.4

The adoption of treatments as established forms of therapy without adequate testing applies to nondrug forms of therapy as well. Coronary artery bypass surgery was introduced in 1964 (DeBakey and Lawrie, 1978; Garrett et al., 1973). Since that time it has become one of the most common forms of surgery performed. Only recently have trials been mounted to evaluate the efficacy of the operation (Braunwald, 1977; Coronary Artery Surgery Research Group, 1981, 1983; European Coronary Surgery Study Group, 1982b; Murphy et al., 1977).

Coronary care units, regarded as standard treatment for patients with myocardial infarction since their introduction in 1962, have never been adequately evaluated (Day, 1965; Gordis et al., 1977). The few controlled trials that have been done raise doubts concerning widespread use of such units (Christiansen et al., 1971; Hill et al., 1977, 1978; Mather et al., 1971, 1976).

The development of electronic fetal monitoring (EFM) devices in the late 1960s has led to their widespread use in delivery rooms. Their use has been accompanied by a marked rise in cesarean section rates, without any apparent improvement in neonatal outcome (Haupt, 1982; Ott, 1981). All of the randomized trials reported to date have failed to show any benefit for the EFM devices tested (Haverkamp et al., 1976, 1979; Kelso et al., 1978; Renou et al., 1976). However, those results have not had any apparent effect on the use of the devices.

Demands from the public for access to new “miracle” drugs can also influence health care practices. Public clamor for Laetrile has led state legislators in 26 states to enact laws making the drug available to the public,5 in spite of a skeptical medical profession and trials failing to indicate any merit for the treatment (Bross, 1982; Moertel et al., 1982; Relman, 1982). Lobbying by lay groups for a relaxation of proscriptions against the use of dimethyl sulfoxide (DMSO) has led to availability of the compound in 9 states5 even though there are serious doubts regarding its usefulness (National Research Council, 1973).

The need for clinical trials is not limited to the medical profession. A case in point is the widespread and often indiscriminate use of diethylcarbamazine to protect dogs against heart worms. The risks associated with the chronic use of such medications, year in and year out for the life of a dog, may be greater than the risk from the heart worm itself, especially if the animal lives in a low infestation area and spends most of its time indoors.

The clinical trial has been termed the “indispensable ordeal” by Fredrickson (1968). Indeed it is, if we are to eliminate the uncertainty that stems from lack of data needed to evaluate the merit of many of our current treatment practices.

4. Personal communication with staff of the Office of the Division of Federal and State Relations, Food and Drug Administration, 1982.

5. Personal communication with staff of the Office of the Division of Federal and State Relations, Food and Drug Administration, 1982.

Table 2-9 Design characteristics of sample of 113 trials appearing in 1980 published literature

Design characteristic                  Number of trials   Percent

Number of treatment groups
  2                                           70             62
  3                                           24             21
  ≥4                                          19             17

Type of trial
  Therapeutic                                 81             72
  Prophylactic                                21             19
  Diagnostic                                   2              2
  Uncertain                                    9              8

Treatment design
  Drug trials                                103             91
    Uncrossed treatment                       68             66
    Crossed treatment                         32             31
    Treatment structure unclear                3              3
  Other trials                                10              9

Sample size
  ≤20                                         19             17
  21-49                                       36             32
  50-99                                       21             19
  100-299                                     20             18
  ≥300                                        15             13
  Unstated                                     2              2
  Median number: 52.5 (range 4 to 3,427)
  Median number per treatment group: 26.2 (range 2 to 1,714)

Length of follow-up
  ≤1 week                                     22             19
  >1 week but ≤1 month                        20             18
  >1 month but ≤3 months                      26             23
  >3 months but ≤1 year                       19             17
  >1 year                                     12             11
  Not stated                                  14             12
  Median: 2.1 months (range <1 day to >2 years)

Method of treatment assignment
  Random                                      90             80
  Nonrandom or not stated                     23             20

Level of treatment masking
  Double-masked                               76             67
  Single-masked                                4              4
  Unmasked                                    17             15
  Not stated                                  16             14

Number of centers
  Single center*                              98             87
  Multicenter                                 15             13

Type of funding
  Public                                      24             21
  Private                                     22             19
  Public and private                          14             12
  Not stated                                  53             47

Source: Reference citation 321. Reprinted with permission of Elsevier Science Publishing Co., Inc., New York.
*This category includes 9 trials with inadequate information to make a classification.


3. The activities of a clinical trial

Field trials are indispensable. They will continue to be an ordeal. They lack glamor, they strain our resources and patience, and they protract the moment of truth to excruciating limits. Still, they are among the most challenging tests of our skills. I have no doubt that when the problem is well chosen, the study is appropriately designed, and that when all the populations concerned are made aware of the route and the goal, the reward can be commensurate with the effort. If, in major medical dilemmas, the alternative is to pay the cost of perpetual uncertainty, have we really any choice?
Donald Fredrickson (1968)

3.1 Stages of a clinical trial
3.2 Division of responsibilities
3.3 Common impediments to the orderly performance of activities
3.3.1 Separation of responsibilities in government-initiated trials
3.3.2 Structural deficiencies
3.3.3 Overlap of activities from stage to stage
3.3.4 Inadequate time for planning, development, and implementation
3.3.5 Inadequate funding
3.4 Approaches to ensure orderly transition of activities
3.4.1 Phased initiation of data intake
3.4.2 An adequate organizational structure
3.4.3 Opportunities for design modifications in sponsor-initiated trials
3.4.4 Certification as a management tool
3.4.5 Realistic timetables
3.4.6 Ongoing planning and priority assessment
3.4.7 Minimal overlap of activities
Table 3-1 Stages of a clinical trial

3.1 STAGES OF A CLINICAL TRIAL

A clinical trial progresses through a series of stages from beginning to end. The stages discussed in this book are outlined in Table 3-1, along with the event that is used to designate the end of one stage and the start of the next. The dates listed in the last column of the table are from the CDP (Coronary Drug Project Research Group, 1973a, 1976).

Appendix D provides a listing of activities by stage. The list is an adaptation of one developed as part of the Coordinating Center Models Project (CCMP; Coordinating Center Models Project Research Group, 1979d). It should be used only as a rough guide to activities in specific trials. It has been constructed assuming no overlap of activities from one stage to the next. In actual fact, as noted in Section 3.3.3, the overlap can be quite extensive.

Table 3-1 Stages of a clinical trial

Stage                                  Event marking end of stage                   Illustration using CDP*

I.   Initial design                    Initiation of funding                        March 1965
II.  Protocol development              Initiation of patient recruitment            March 1966
III. Patient recruitment               Completion of patient recruitment            October 1969
IV.  Treatment and follow-up           Initiation of patient close-out              May 1974
V.   Patient close-out                 Completion of patient close-out              August 1974
VI.  Termination                       Termination of funding for original trial    March 1979
VII. Post-trial follow-up (optional)   Termination of all follow-up                 December 1983

*... was awarded in March of 1965.

3.2 DIVISION OF RESPONSIBILITIES

Any trial involving two or more investigators, whether done at a single center or multiple centers, must provide for a division of responsibilities. Some responsibilities, such as those related to patient care or to data analysis, require specialized skills associated with a particular discipline and may automatically be assumed by persons trained in that discipline. However, many of the required functions are not uniquely associated with a specific discipline and can be performed by any one of several individuals or groups in the trial. This fact was evident in the review of the data coordinating centers carried out as part of the CCMP. All centers had the responsibility for data intake and analysis, but they showed wide variation in the number of other general support functions performed. In some trials, the center had responsibility for virtually all support functions, whereas in others responsibilities were shared with or assumed by individuals or groups outside the center (McDill, 1979).

It is useful to list required activities and the individual or group expected to perform them. This should be done early in the trial to avoid confusion as to who is doing what. These specifications are especially important in trials with multiple resource centers that have overlapping responsibilities (McDill, 1979). The specifications, once developed, should be reviewed and revised at intervals over the course of the trial to cover new responsibilities and to realign old ones.

3.3 COMMON IMPEDIMENTS TO THE ORDERLY PERFORMANCE OF ACTIVITIES

3.3.1 Separation of responsibilities in government-initiated trials

The responsibilities for planning and executing a trial rest with the investigators in the typical investigator-initiated trial. They design it, they propose the investigators to be involved in it, and they carry it out. The sponsor has only a peripheral role. The situation is different in a typical sponsor-initiated trial. In this case, the sponsor assumes major responsibility for design of the trial and for selection of the investigators to carry it out. The separation of the design and execution functions may lead to sponsor-investigator tensions that may impede progress in the trial if they are not addressed.

3.3.2 Structural deficiencies

In a survey of multicenter trials, Smith (1978) classified over half of the operational problems encountered as organizational or administrative in nature. Many of these organizational problems can be traced to ambiguities in decision-making processes for resolving key design and operational issues (e.g., when to stop patient recruitment, how long to continue patient follow-up, when to terminate a treatment because of adverse or beneficial effects). The ambiguities can cause different individuals or groups to view themselves as the “final authority” in resolving a particular issue and can cause delays and inefficiencies in the way activities are conducted.

3.3.3 Overlap of activities from stage to stage

The activities normally associated with a particular stage may continue into the next or subsequent stages. Experience during the patient recruitment stage may require re-evaluation of sample size and other criteria set down when the study was designed. New treatments may be added after the start of patient recruitment, for example, as in the UGDP (University Group Diabetes Program Research Group, 1970d).

Similarly, it is rare for patient recruitment to be completed by the time treatment and follow-up begin. In fact, it is not uncommon for all three of these processes to go on simultaneously in long-term trials. In addition, data analyses, while typically associated with the termination stage, may be necessary long before that point is reached for performance and treatment monitoring, as discussed in Chapters 16 and 20, respectively.

Overlap of activities from one stage to the next has staffing implications. A trial in which patients are still being recruited, while others are in various stages of follow-up or have already been separated from the study, requires more elaborate organization and staffing than one in which it is possible to complete one stage before the next one starts.
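
The stage boundaries in Table 3-1 can be turned into approximate stage lengths. The sketch below does this for the CDP dates, under the table's own simplifying assumption that stages do not overlap; the function names are illustrative, and the day of the month is arbitrarily set to 1 because the table gives only month and year.

```python
from datetime import date

# (stage, event marking end of stage, CDP date of that event), from Table 3-1.
# Day of month is an assumption; the table reports month and year only.
STAGES = [
    ("I. Initial design", "Initiation of funding", date(1965, 3, 1)),
    ("II. Protocol development", "Initiation of patient recruitment", date(1966, 3, 1)),
    ("III. Patient recruitment", "Completion of patient recruitment", date(1969, 10, 1)),
    ("IV. Treatment and follow-up", "Initiation of patient close-out", date(1974, 5, 1)),
    ("V. Patient close-out", "Completion of patient close-out", date(1974, 8, 1)),
    ("VI. Termination", "Termination of funding for original trial", date(1979, 3, 1)),
    ("VII. Post-trial follow-up", "Termination of all follow-up", date(1983, 12, 1)),
]

def months_between(start, end):
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def stage_durations(stages):
    """Months spent in each stage, measured from the end of the previous stage.

    Stage I is omitted because the table does not give its starting date.
    """
    durations = {}
    for (_, _, prev_end), (name, _, end) in zip(stages, stages[1:]):
        durations[name] = months_between(prev_end, end)
    return durations

for name, months in stage_durations(STAGES).items():
    print(f"{name}: {months} months")
```

Because real stages overlap, these figures describe the spacing of the marker events rather than the full period over which each kind of activity was carried out.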


3.3.4 Inadequate time for planning, development, and implementation

The time schedule for a trial, as established in the design stage, often proves to be unrealistic. Among the ten Requests for Proposals (RFPs) reviewed in the CCMP, only six made any mention of a time period for planning and protocol development (Coordinating Center Models Project Research Group, 1979b). The start-up time (i.e., time from start of funding to the enrollment of the first patient) for the trials listed in Appendix B ranged from 2 months to 3 years. The average time was just over 1 year.

Unrealistically ambitious time schedules tend to exert pressure on investigators to initiate data collection before the necessary data forms and related documents have been fully developed and tested. Doing so can lead to a chronic crisis atmosphere in the data center as staff struggle to develop better data forms and intake procedures while trying to maintain existing procedures.

3.3.5 Inadequate funding

The level of activities in a trial should be compatible with available funding. It is a mistake to embark on a trial without adequate support. The effort proposed should be scaled to match available support. Further, funds should be equitably distributed across activities within the trial. Situations should be avoided where one aspect of the trial, such as data collection, is overfunded while another, such as data intake and analysis, is underfunded. A successful trial requires balance in the amount of money available for all essential activities.

3.4 APPROACHES TO ENSURE ORDERLY TRANSITION OF ACTIVITIES

3.4.1 Phased initiation of data intake

It may be prudent to limit the number of patients to be enrolled at the outset, especially if a clinic has a large backlog of patients waiting to be enrolled. The limit may be lifted once a clinic has demonstrated proficiency in the data collection process and after the basic data forms and intake procedures have been shown to work.

One approach to phased data collection in trials with multiple clinics involves funding only a small number of clinics at the outset, with new clinics being added as the trial proceeds. This approach was used in the CDP. It started with five clinics. Additional clinics were added over a 2-year period to make up the total of 55 ultimately involved in the trial (Coronary Drug Project Research Group, 1973a).

A gradual progression to full-scale recruitment and data collection can be part of the study plan, even if all the participating clinics are identified from the outset. It may be wise in such cases to designate one or two clinics to serve as testing sites for the treatment protocol and data collection procedures before the others are brought into the study. This approach was used in the Multiple Risk Factor Intervention Trial (Sherwin et al., 1981). Another approach allows all clinics to begin recruitment at the same time, but at a reduced rate to start with. The Hypertension Prevention Trial (HPT; see Sketch 13, Appendix B) used this approach. Each of the 4 clinics in that trial was required to enroll a test cohort of 20 patients before it was allowed to start full-scale recruitment.
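
A test-cohort rule of the kind used in the HPT can be expressed as a simple enrollment gate. The following is a minimal sketch, not a description of any trial's actual software: the class, its field names, and the `procedures_validated` flag are hypothetical, and only the test cohort size of 20 comes from the HPT example above.

```python
from dataclasses import dataclass

@dataclass
class Clinic:
    name: str
    enrolled: int = 0
    test_cohort_size: int = 20          # size used in the HPT
    procedures_validated: bool = False  # hypothetical sign-off on forms and intake

    def enrollment_cap(self, full_scale_cap):
        """Largest total enrollment the clinic is currently allowed to reach."""
        test_cohort_done = self.enrolled >= self.test_cohort_size
        if test_cohort_done and self.procedures_validated:
            return full_scale_cap
        return self.test_cohort_size

    def may_enroll(self, full_scale_cap):
        """True if the clinic may enroll one more patient right now."""
        return self.enrolled < self.enrollment_cap(full_scale_cap)

clinic = Clinic("Clinic A", enrolled=20)
print(clinic.may_enroll(full_scale_cap=200))  # held at the test cohort
clinic.procedures_validated = True
print(clinic.may_enroll(full_scale_cap=200))  # limit lifted
```

The design choice worth noting is that the limit is lifted only when both conditions hold: the test cohort is complete and the procedures have been shown to work.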

3.4.2 An adequate organizational structure

Coordination of activities in a trial requires a sound organizational structure. One of the first orders of business should be its development. A sound structure takes time to develop and to reach maturity. There should be adequate time for that maturation process before the start of patient intake. As a rule, the period of time required for this process is related to the size and complexity of the trial, and it may be longer for sponsor-initiated trials than for investigator-initiated trials. A well-designed investigator-initiated trial will include details on organization in the funding application. The period of time between submission of the application and initiation of funding (see Section 21.2.1 of Chapter 21) may provide investigators with opportunities to refine the structure proposed and may even allow it to reach a degree of functional maturity because of investigator interactions required in preparing and defending the funding request. Such opportunities do not exist in the typical sponsor-initiated trial because of the way centers are selected (see Section 21.3 of Chapter 21).

3.4.3 Opportunities for design modifications in sponsor-initiated trials

The separation of responsibilities discussed in Section 3.3.1 is an inherent feature of most sponsor-initiated trials, especially those initiated by the government via RFPs. The timetable for the trial should provide investigators with adequate opportunity to consider and accept the design tenets proposed before the start of data collection. This process begins before the proposal is submitted in the typical investigator-initiated trial, but cannot begin until after the centers are selected and funded in the typical government-initiated trial.

3.4.4 Certification as a management tool

Patient recruitment should not start until the clinics and data center have demonstrated that they are properly staffed and equipped to support this activity. Some trials, such as the National Cooperative Gallstone Study (see Sketch 5, Appendix B), have required clinics to carry a minimum number of patients through key study procedures before recruitment could begin. A formal certification of clinics was required in the HPT prior to the start of recruitment.

The certification process has been extended to individuals making key measurements in some trials (e.g., see Early Treatment of Diabetic Retinopathy Research Group, 1982; Knatterud, 1981; Rand and Knatterud, 1980). The personnel certification process is useful in that it provides a landmark that must be passed before a person is cleared for data collection in a trial.
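
The "landmark" role of personnel certification can be sketched as a check applied when data arrive at the data center. Everything named below (the registry layout, staff identifiers, and procedure names) is hypothetical and serves only to show the idea: a measurement counts as made by cleared personnel only if the person was certified for that procedure on or before the date of the measurement.

```python
from datetime import date

def certify(registry, staff_id, procedure, certified_on):
    """Record that a staff member passed certification for a procedure."""
    registry.setdefault(staff_id, {})[procedure] = certified_on

def is_cleared(registry, staff_id, procedure, measurement_date):
    """True if the person was certified for the procedure by the given date."""
    certified_on = registry.get(staff_id, {}).get(procedure)
    return certified_on is not None and certified_on <= measurement_date

registry = {}
certify(registry, "tech-07", "blood pressure", date(1983, 4, 15))

# A measurement made before certification is flagged, one made after is accepted.
print(is_cleared(registry, "tech-07", "blood pressure", date(1983, 4, 1)))
print(is_cleared(registry, "tech-07", "blood pressure", date(1983, 5, 1)))
```

A data center could use a check of this kind to query forms completed by uncertified staff, rather than discovering the problem at analysis.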


3.4.5 Realistic timetables

The timetables for activities proposed in grant applications or RFPs for clinical trials should be based on realistic appraisals of times required to complete those activities. Unrealistically ambitious schedules may raise doubts regarding the feasibility of the study in the minds of those responsible for overseeing it, may lead to frustration among investigators in the trial, and may result in decisions to implement activities before the required procedures and support systems have been adequately tested and developed. The timetable constructed at the beginning of a trial should be reviewed and, when necessary, revised as the trial proceeds if it is to retain its value as a management tool and performance monitoring standard over the course of the trial.

3.4.6 Ongoing planning and priority assessment

Planning and priority assessment are continuing needs in a trial. The leadership of the trial has a responsibility for implementing an active review process in order to make certain that work schedules and goals are compatible with the needs and resources of the trial. When they are not, priorities must be revised to reflect reality.

The leadership committee of the trial should take responsibility for setting priorities for data analyses when demands for them exceed resources available in the data center for carrying them out. The failure of the leadership committee to act in this capacity will leave staff in the data center open to criticisms if the priorities they set are not acceptable to everyone in the trial.

3.4.7 Minimal overlap of activities

The mix of activities under way at any one time influences the staffing needs of centers in the trial. The greater the heterogeneity of activities, the larger the staffing needs. The goal should be to minimize the number of activities under way at any one time. Pursuing this goal requires completion of patient recruitment in the shortest possible time. This means that all clinics in a multicenter trial should be prepared to continue patient enrollment until the study recruitment goal is met, even if some clinics exceed their goals while others fall short of theirs. For example, the CDP cut off patient enrollment at all clinics at the same time, even though it used a phased approach to clinic enrollment (see Section 3.4.1). Clinics that achieved their stated recruitment goal were asked to continue enrollment in order to reduce the time needed to achieve the study-wide recruitment goal of 8,300. Allowing each clinic to cut off recruitment when it achieves its prestated goal is inefficient for the data center, especially if there is wide variability among the clinics as to when the cutoff occurs. The data center will be required to maintain treatment allocation and baseline data intake procedures as long as recruitment continues in any clinic.
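
The difference between the two cutoff policies can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each clinic recruits at a constant monthly rate; the clinic goals and rates are invented, and the function names are not from any trial.

```python
def months_with_clinic_cutoffs(goals, rates):
    """Each clinic stops at its own goal; recruitment ends with the slowest clinic."""
    return max(goal / rate for goal, rate in zip(goals, rates))

def months_with_common_cutoff(goals, rates):
    """All clinics keep enrolling until the study-wide goal is reached."""
    return sum(goals) / sum(rates)

# Hypothetical trial: three clinics, 100 patients each, unequal recruitment rates
goals = [100, 100, 100]
rates = [10.0, 5.0, 2.0]  # patients per month

print(months_with_clinic_cutoffs(goals, rates))          # 50.0 months
print(round(months_with_common_cutoff(goals, rates), 1))  # 17.6 months
```

Under these assumptions the data center would have to keep allocation and baseline intake procedures running for 50 months under clinic-specific cutoffs, compared with about 18 months under a common cutoff, at the price of unequal clinic sizes.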


Similarly, the patient close-out process is most efficient when all patients are separated from the trial at the same time, regardless of when they were enrolled. The alternative is to separate each patient after a specified period of follow-up (e.g., 2 years). However, this approach is inefficient when patient recruitment has extended over a long period of time. See Chapter 15 for discussion.

4. Single-center versus multicenter trials

It is not the fault of our doctors that the medical service of the community, as at present provided for, is a murderous absurdity. . . . To give a surgeon a pecuniary interest in cutting off your leg, is enough to make one despair of political humanity. . . . And the more appalling the mutilation, the more the mutilator is paid. He who corrects the ingrowing toe-nail receives a few shillings; he who cuts your insides out receives hundreds of guineas, except when he does it to a poor person for practice.
George Bernard Shaw


4.1 Definition
4.2 National Institutes of Health (NIH) count of single-center and multicenter trials
4.3 Design characteristics of single-center versus multicenter trials
4.4 The pros and cons of single-center versus multicenter trials
4.5 Initiation of single-center versus multicenter trials
4.6 Investigator incentives for single-center versus multicenter trials
4.7 Timing of single-center versus multicenter trials
4.8 Cost of single-center versus multicenter trials
Table 4-1 NIH-sponsored single-center and multicenter trials by institute for fiscal year 1979
Table 4-2 Design features of NIH single-center and multicenter trials
Table 4-3 Design features of single-center and multicenter trials, as reflected in a 1980 sample of clinical trial publications
Table 4-4 Funding mode for NIH extramural trials in fiscal year 1979
Table 4-5 NIH expenditures for trials in fiscal year 1979 by type of trial

4.1 DEFINITION

A center, in this book, is defined as any autonomous unit in a clinical trial that is involved in the collection, determination, classification, assessment, or analysis of data, or that provides logistical support for the trial. Included are clinical centers, data centers, coordinating centers, project offices, central laboratories, reading centers, quality control centers, and procurement and distribution centers. To qualify as a center, a unit must have a defined function to perform during one or more stages of a trial. In addition, it must be administratively distinct from other centers in the trial, and must be made up of two or more individuals who devote some portion of their time to the defined functions of the center.

A trial, to be considered as multicenter in this book, must involve:

• Two or more clinics
• A common treatment and data collection protocol
• A center to receive and process study data

All other trials will be considered single-center. This category includes:

• A single clinic, with or without satellite clinics (see Glossary) and with or without a center to receive and process study data or other resource centers (see Glossary)
• A trial involving multiple clinics, with or without satellite clinics, but not having a common study protocol, regardless of whether it has a center to receive and process study data
• A trial involving multiple clinics, with or without satellite clinics, that does not have a center to receive and process study data, even if clinics purport to follow a common study protocol
• A trial, such as the Physicians’ Health Study (PHS), that does not involve any clinical centers, even if it has multiple resource centers

The four elements of the definition are necessary with the binary language structure used to characterize the physical structure of trials. However, the fact is that most trials are characterized by the first element in the category and, hence, they are discussed from this perspective throughout this book.
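
The classification rule stated in this section can be written as a short function. This is only a sketch of the definition as given above; the class and field names are illustrative, and the example trials other than the PHS-like case are invented.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    n_clinics: int          # clinics proper; satellite clinics are not counted
    common_protocol: bool   # common treatment and data collection protocol
    has_data_center: bool   # center to receive and process study data

def classify(trial):
    """Apply the book's definition: all three conditions must hold."""
    if trial.n_clinics >= 2 and trial.common_protocol and trial.has_data_center:
        return "multicenter"
    return "single-center"

print(classify(Trial(n_clinics=5, common_protocol=True, has_data_center=True)))
print(classify(Trial(n_clinics=5, common_protocol=True, has_data_center=False)))
# A trial with no clinical centers, as in the PHS example, is single-center
print(classify(Trial(n_clinics=0, common_protocol=True, has_data_center=True)))
```

Writing the rule this way makes the binary structure explicit: failing any one of the three conditions places a trial in the single-center category.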

4.2 NATIONAL INSTITUTES OF HEALTH (NIH) COUNT OF SINGLE-CENTER AND MULTICENTER TRIALS

The 1979 NIH Inventory of Clinical Trials was the first inventory generated by that agency that distinguished between single-center and multicenter trials (National Institutes of Health, 1975, 1980). The institutes vary widely with regard to support for the two types of trials.1 For example, all of the 26 trials supported by the National Institute of Dental Research were single-center, whereas all but 1 of the 20 trials sponsored by the National Heart, Lung, and Blood Institute were multicenter (Table 4-1). The differences are due, in part, to the nature of the evaluation question faced by the various institutes (see Section 2.1).

Overall, the institutes of the NIH sponsored about as many multicenter trials, 476, as single-center trials, 510 (last line, Table 4-1). It is interesting, in view of this fact, to note the preponderance of single-center trials in published literature. Only 25% of the 306 gastrointestinal trials reviewed by Juhl and co-workers (1977) involved multiple clinics. Chalmers and co-workers (1972), in their review of cancer trials, identified only 49 as multicenter trials out of ... reviewed. Only 15 of the 113 trials published in 1980 and reviewed for this book were multicenter by the definition used in this book (Table 4-3).

1. The definition of multicenter trials used by the NIH is less stringent than the one stated above. Trials in the Inventory were classified as multicenter without the requirement of a common protocol or the presence of a center to receive and process study data.

Table 4-1 NIH-sponsored single-center and multicenter trials by institute for fiscal year 1979

                                                              Total       Single-center        Multicenter
                                                              number
Sponsoring institute                                          of trials   Number   Percent     Number   Percent

National Cancer Institute (NCI)                                  654        261      39.9        393      60.1
National Eye Institute (NEI)                                      26         18      69.2          8      30.8
National Heart, Lung, and Blood Institute (NHLBI)                 20          1       5.0         19      95.0
National Institute of Allergy and Infectious
  Disease (NIAID)                                                120        104      86.7         16      13.3
National Institute of Arthritis, Diabetes, and
  Digestive and Kidney Diseases (NIADDK)                          67         42      62.7         25      37.3
National Institute of Child Health and
  Human Development (NICHD)                                       32         29      90.6          3       9.4
National Institute of Dental Research (NIDR)                      26         26     100.0          0       0.0
National Institute of Neurological and
  Communicative Disorders and Stroke (NINCDS)                     40         29      72.5         11      27.5
National Institute of General Medical
  Sciences (NIGMS)                                                 1          0       0.0          1     100.0

Total                                                            986        510      51.7        476      48.3

4.3 DESIGN CHARACTERISTICS OF SINGLE-CENTER VERSUS MULTICENTER TRIALS

Table 4-2 provides a summary of a few of the key design features of single-center trials versus multicenter trials for NIH-sponsored trials reported in the 1979 NIH Inventory (National Institutes of Health, 1980). Table 4-3 provides a corresponding summary for the 113 trials discussed in Chapter 2.

A major difference between multicenter and single-center trials, apparent in both tables, is sample size. The typical multicenter trial has more patients than does the typical single-center trial. This difference is most apparent for the 113 papers reviewed in Chapter 2. The median number of patients enrolled per trial was 283 for the 15 multicenter trials, which contrasts with 40 for the 98 single-center trials (Table 4-3).

Table 4-2 Design features of NIH single-center and multicenter trials

                                                 Single-center          Multicenter
Feature                                          Number   Percent       Number   Percent

Total number of trials                             510     100.0          476     100.0

Number of treatment groups/trial
  1                                                159      31.2           99      20.8
  2                                                217      42.6          221      46.4
  ≥3                                               134      26.3          156      32.8
  Median number                                      2                      2

Sample size
  Median number of patients/trial                   60                    166
    Range*                                       25 to 200              52 to 362
  Number of patients/trial/treatment group†         30                     83
    Range*                                       12 to 100              26 to 181

Method of treatment allocation
  Random                                           259      50.8          334      70.2
  Nonrandom                                        251      49.2          142      29.8

Type of trial
  Therapeutic                                      369      72.5          432      90.8
  Prophylactic                                      90      17.7           36       7.6
  Diagnostic                                        50       9.8            8       1.7

Anticipated length of funding
  ≤1 year                                           10       2.0            9       1.9
  >1 year, ≤2 years                                 51      10.0           50      10.5
  >2 years, ≤3 years                               129      25.3           94      19.7
  >3 years                                         319      62.7          323      67.9

*20th to 80th percentile.
†Calculated by dividing median number of patients per trial by the median number of treatment groups per trial.

4.4 THE PROS AND CONS OF SINGLE-CENTER VERSUS MULTICENTER TRIALS


Certain features of single-center trials make them appealing. They are generally easier to mount and carry out than their multicenter counterparts. The fact that all study personnel are located in the same institution in most single-center trials obviates the need for and expense of maintaining communications and decision-making structures needed for execution of most multicenter trials. In addition, the physical proximity of study personnel may make it possible for them to work more efficiently and to achieve a higher degree of uniformity in the procedures they perform than might be expected in a multicenter trial. Further, the fact that all patients enrolled in the trial come from the same area in the typical single-center trial should produce a more homogeneous study population than might be expected of a population made up of patients from different clinics.

The main weaknesses of the single-center trial are sample size and resource limitations. One center and a few investigators will find it difficult to recruit and follow the numbers of patients needed. Compromises will have to be made in order to bring the number of patients required for study into line with reality while still providing adequate type I and II error (see Glossary) protection. The original trial, planned to focus on a single clinical event as the outcome, may have to be converted to one involving composite


efforts involved in mounting and carrying out a multicenter trial. It is much easier and less time consuming to design and carry out a short-term trial in a single clinic than it is to mount and execute one extending over a period of years and involving multiple clinics. Most investigators lack the time and wherewithal to initiate such trials. And even if they do have the resolve to carry such efforts forward, they may not have the support needed to cover developmental costs for the work. The demise of NIH planning grants has virtually precluded the acquisition of government funds for planning multicenter trials. As a result, responsibility for initiative rests in the hands of senior investigators with other sources of support and in the hands of sponsoring agencies.

Another reason for the prominence of single-center trials is that promotions in most academic institutions are based, in large measure, on the originality, number, and quality of papers produced by those considered for promotion. As a result, an investigator who carries out a number of short-term, single-center trials and who uses them to produce a series of papers as sole or senior author is more likely to be promoted than one who works on a few long-term multicenter trials and who produces relatively few papers, even if of high quality. The prospects for promotion may be further diminished if the papers produced are written under a corporate masthead (see Remington, 1979, and see also Chapter 24 for a discussion of authorship policies).


4.7 TIMING OF SINGLE-CENTER
VERSUS MULTICENTER TRIALS
Many investigations of a new or existing treatment
modality begin with uncontrolled observational
studies, followed by small-scale clinical
trials. Only after the results of these trials begin
to appear in print, and especially if they are
inconclusive or conflicting, is the need for larger
trials recognized. Even then, sponsors and the
review groups that advise them will be reluctant
to commit the money required for a multicenter
trial if they think answers can be obtained with
less effort and money.
Some evaluation questions are slow to progress
beyond the stage of uncontrolled studies;
some never progress beyond that point. Others
may be considered only in the context of multicenter
trials from the outset. A case in point is
risk factor reduction for cardiovascular disease.
There is no realistic way to address this issue
except via large-scale trials, such as MRFIT
(Multiple Risk Factor Intervention Trial Research
Group, 1982).
Three general conditions should be satisfied
before a multicenter trial is considered. First,
there should be evidence that multiple clinics are
needed to meet the sample size requirements of
the trial. A single-center trial may suffice if the
sample size requirement is modest. Second,
there should be an identifiable group of clinical
investigators who are willing and able to follow
a common treatment and data collection protocol.
Third, there should be an identifiable set of
centers with adequate support staff and facilities
to carry out the trial.

4.8 COST OF SINGLE-CENTER
VERSUS MULTICENTER TRIALS
The only database available for a comparative
analysis of cost is that provided via the 1979
NIH Inventory of Clinical Trials.2 The total dollar
cost for multicenter trials was nearly three
times that for single-center trials in 1979 (101.1
million versus 35.0 million). However, this figure
is misleading in that it is not adjusted for the
differences in sample size noted in Table 4-2 for
the two types of trials. This has been done in
Table 4-5 using median cost per patient per year
of study. When viewed in this way, the cost for
multicenter trials is actually less than that for
single-center trials, a noteworthy fact in view of
oft-expressed concerns regarding the cost of
multicenter trials.

Table 4-5  NIH expenditures for trials in fiscal year 1979 by type of trial

                                        Amount
                      Trials            (millions of dollars)   Median patient
Type of trial     Number  Percent       Dollars  Percent        cost per year*

Single-center       510     51.7          35.0     25.7            $587
Multicenter         476     48.3         101.1     74.2            $523
Total               986    100.0         136.2    100.0            $574

*The dollar cost per patient per year for a given trial was derived by dividing the total
projected expenditures for that trial by the product of the number of patients to be enrolled
(projected) and years of support (projected) required for execution of the trial. The median
dollar cost per patient per year for a given type of trial was determined by ranking the resulting
figures for individual trials from lowest to highest and then locating the dollar value
corresponding to the 50th percentile point in the resulting distribution (median value).
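
The derivation described in the footnote to Table 4-5 can be sketched in a few lines. This is only an illustration under stated assumptions: the three trials below are hypothetical (they are not taken from the NIH Inventory), and the function and variable names are invented for the example. The last line simply recomputes the ratio of total expenditures quoted in the text.

```python
from statistics import median


def cost_per_patient_year(total_projected_cost, patients_projected, years_projected):
    """Projected dollars per patient per year of study for a single trial."""
    if patients_projected <= 0 or years_projected <= 0:
        raise ValueError("patients and years of support must be positive")
    return total_projected_cost / (patients_projected * years_projected)


# Hypothetical trials: (total projected cost in dollars, patients, years of support)
hypothetical_trials = [
    (300_000, 100, 3),
    (1_200_000, 400, 5),
    (90_000, 60, 2),
]

# One figure per trial, then the 50th percentile point of the resulting distribution
per_trial = [cost_per_patient_year(*trial) for trial in hypothetical_trials]
median_cost = median(per_trial)

# Ratio of total 1979 expenditures quoted in the text (millions of dollars)
expenditure_ratio = 101.1 / 35.0
```

Note that the median is taken over trials, not over patients, so a few very large trials do not dominate the summary figure.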


i. page 12

5.2 Coordinating centers

5. Coordinating and other resource centers
in multicenter trials

4-1 Type of resource center represented in the 14
,. A cu bed m Appendix B*

«.fntrr

;I

. .-J n«unf! center*

►

Technical skills, like fire, can be an admirable servant and a dangerous master.

oltice

».»•-< center

A. Bradford Hill

..fii laboratory
r x-rement and distribution center
».i •» control center

5.1 Introduction
5.2 Coordinating centers
5.2.1 General activities
5.2.2 Location
5.2.3 Staffing
5.2.4 Equipment
5.2.5 Relative cost
5.2.6 Internal allocation of funds
5.3 Central laboratories
5.4 Reading centers
5.5 Project offices
5.6 Other resource centers
Table 5-1 Type of resource center represented
in the 14 trials sketched in Appendix B
Table 5-2 Coordinating center activities by
stage of trial, with emphasis on
data coordination activities
Table 5-3 Percent of full-time equivalents by
category of personnel and year of
study for the CDP Coordinating
Center
Table 5-4 General equipment requirements of
coordinating centers
Table 5-5 Relative cost of coordinating centers
for five trials reviewed in the Coor
dinating Center Models Project
Table 5-6 Budget allocation for coordinating
centers by category and year of
study. Results for centers from
AMIS, CDP, CAST, HDFP,
LRC-CPPT, and MRFIT
Table 5-7 Budget allocation of the CDP Coor
dinating Center, by category and
year of study
Table 5-8 Central versus local laboratories in
multicenter trials
Table 5-9 Conditions under which centralized
readings may be required

Figure 5-1 Percentage cost of the CDP Coordinating
Center, relative to total direct
study cost
5.1

14

13
12
11

6
2

< -t tr.ab were classified as multicenter except one. the
.. ,-c Health Study (PHS).
i.uh had multiple laboratories and/or reading centers,
-in table B 4. Appendix B for specifics.
.. . ,hr Uiah had both a data and treatment coordinating

5.1 INTRODUCTION

A resource center is any center involved in a
trial, other than a clinical center, that is in
charge of performing a specific set of functions
concerned with the design, conduct, or analysis
of the trial. Resource centers include (see Glossary
for definitions):
• Data centers
• Data coordinating centers
• Treatment coordinating centers
• Coordinating centers
• Project offices
• Central laboratories
• Reading centers
• Quality control centers
• Procurement and distribution centers

responsible for receiving, editing, processing,
analyzing, and storing data generated in the
trial. In fact, some studies may use multiple
centers to perform this function. The most common
approach, when this is the case, is to establish
regional data centers, with each of the
centers performing identical functions. Such
structures, while relatively uncommon for
studies done in one country, may be necessary in
international studies, especially when different
languages are involved. Both the International
Reflux Study in Children, IRSC (see Sketch 14,
Appendix B), and the International Mexiletine
Antiarrhythmic Coronary Trial, IMPACT
(Alamercery et al., 1982) had separate
data coordinating centers to service United
States and European based clinics.
The data center (or centers), at least in the
larger multicenter trials, will typically have a
number of coordination responsibilities. This
book makes a distinction between two types of
coordinating functions: those related to data
collection and those related to treatment. A data
coordinating center is defined as one that, in
addition to responsibilities for receiving, editing,
processing, analyzing, and storing data generated
in a trial, has responsibilities for coordinating
the data generation activities of the clinics
and for implementing and maintaining quality
assurance procedures related to the data generation
process. Responsibilities for coordinating
administration of treatments in the trial and
surveillance of clinic activities are vested in a
second center, a treatment coordinating center.

This chapter focuses on coordinating centers
because of their key role in the typical multicenter
trial. The coordinating center, or data coordinating
center when there are separate coordinating
centers for data collection and treatment,
will be among the first to be funded and the last
to cease operations when the trial is completed.
It may, in fact, operate after the trial is terminated
if post-trial follow-up (see Glossary) is
required.
All 14 trials sketched in Appendix B included
either a coordinating center or data coordinating
center. No other resource center was common to
all the trials (Table 5-1).

Number of trials
with center

5.2 COORDINATING CENTERS

As noted in the previous chapter, a multicenter
trial is defined herein to include a center

The unmodified term coordinating center will be
used to designate a center that fulfills both the
data and treatment coordination functions.
Use of the term coordinating center outside
this book does not always conform to these
conventions. For example, the facility designated as
the coordinating center in the National Cooperative
Gallstone Study (NCGS) was responsible
for treatment coordination and for dispersal of
funds to the other participating centers, but had
no data coordinating responsibilities. The center
with those responsibilities in the NCGS was
referred to as the Biostatistical Center (National
Cooperative Gallstone Study Group, 1981a).
5.2.1 General activities

The general activities of the coordinating center
by stage of the trial are summarized in Table 5-2
(see also Appendix D). The list is adapted from
one developed in the Coordinating Center Models
Project, CCMP (Coordinating Center Models
Project Research Group, 1979a, 1979d). The
activities listed for the first stage (the initial
design stage) and some of those for the second
stage (the protocol development stage) may be
assumed by the sponsor in sponsor-initiated
trials.
No one center will necessarily have
responsibilities for all the functions listed, especially if
there are separate centers for treatment and data
coordination. A review of coordinating centers
for the trials included in the CCMP revealed
important differences in their duties, partly
because of the differences in the roles assumed by
other units in the trial, most notably the project
office and the office of the study chairman
(McDill, 1979).
One of the major responsibilities of the
coordinating center relates to preparation and
distribution of key study documents, such as the
manual of operations and data collection forms. In
addition, the center typically serves as the
repository for completed data forms (except for studies
with distributed data entry systems), minutes of
study meetings, progress reports, performance
monitoring reports, and treatment effects
monitoring reports.
5.2.2 Location

The coordinating center, under ideal
circumstances, will be administratively and physically
distinct from the sponsor and from all other centers in

Table 5-2

• Calculate required sample size
• Outline data collection schedule, quality control
procedures, data analysis plans, and data intake
and editing procedures

• Develop organizational structure of the trial
• Prepare funding proposal for coordinating center
• Coordinate preparation of the funding application

Protocol development stage

• Develop treatment allocation procedures

I

Coordinating center

Coordinating center activities by stage of trial, with emphasis on data coordination activities

Initial design stage

K

33

Coordinating and other resource centers in mu/ticenter trials

• Develop computer programs and related procedures
for receiving, processing, editing, and analyzing
study data

• Prepare, in conjunction with the stud*
renewal or supplemental funding requnu
• Update study manuals

Patient recruitment stage

• Administer treatment allocations, including checks
for breakdowns in the assignment process
• Assume leadership role in outlining study needs for
quality assurance

• Implement editing procedures to detect data deficiencies
• Develop performance monitoring procedures and prepare
data reports to summarize performance of participating
clinics
• Develop treatment monitoring and reporting procedures
to detect evidence of adverse or beneficial
treatment effects
• Respond to requests for analyses from within the
study structure
• Site visit participating clinics
• Prepare study progress reports for submission to
sponsor

the trial. This separation insulates the center
from the direct administrative control of the
sponsor, and helps it to establish and maintain
balanced working relationships with all other
centers in the trial. This balance may be difficult
to achieve if the center is part of the sponsoring
agency or if it is physically or fiscally a part of

• Evaluate data processing procedures and mv ■. r

necessary

• Assume responsibility for location of patienu

• Serve as the payment center for general study needs,
such as study insurance, and other specialized proce
dures not provided for in the grants or contracts of
other participating centers

• Store, under adequate security, names of study pa
tients and other identifying information for future
follow-up

required for post-trial follow-up

• Carry out periodic training sessions to mamti • » r
level of proficiency at clinics in treatmem »•»! r,,
collection procedures

• Train clinic personnel in required data collection

• Provide a repository for official records of the study.
including minutes of meetings, manuals of opera
tion. etc.
• Serve as the funding center for a trial operated under
a consortium agreement, unless this function is ful
filled by some other center

• i ordinate mailings, telephone calls, or clinic visits

• Update existing data files with data collected during
post-trial follow-up
• Assume leadership role in drafting and distributing
any manuscript using post-trial follow-up results

• Prepare periodic reports on performance <*( e
and resource centers

• Prepare summary of study results for prew"’** •
participating investigators for use in clow •• ■ . »r

• Develop manuals needed in the trial, including the
treatment protocol, clinic manual of operations,
coordinating center manual of operations, etc.

rrn »hereabouts are unknown

• Prepare periodic data reports for safeis m. • • ■ ,,
committee

• Develop interface for data transmission from clinics
and other resource centers to coordinating center

• Distribute study data forms and related materials

up

• —s to locate patients whose cur• i^p!cmeni procedures

Treatment and follow-up stage

• Design and test data forms

procedures

rs- mat follow-up Mage (optional)
. (..mrtle a list of pattents eligible for post-trial follow

,

• Develop and test data collection forms for , : v
stage

• Implement clinic and personnel certification proce
dures

activities by stage of trial, with emphasis on data coordination activities (continued)

one of the clinics in the trial.
Twelve of the 14 trials sketched in Appendix B
had coordinating or data coordinating centers
located in academic institutions. This setting has
pluses and minuses. A prestigious teaching institution,
especially one with a recognized degree
program in biostatistics, epidemiology, or related
fields, provides a pool of bright and energetic
people to meet the programming and data
analysis needs of the center. In addition, the
opportunity to teach and to interact with other
faculty may help the center attract and retain
senior professional personnel.
The minuses stem from the internal bureaucracies
of any large academic institution. Most of
the coordinating centers reviewed in the CCMP
(all except one of ten centers reviewed were
located in academic institutions) complained of
difficulties in recruiting intermediate-level personnel
because of pay and promotional restrictions
imposed by their respective institutions.
Several had difficulty in purchasing computing
hardware for their own needs because of policies
aimed at discouraging dedicated facilities.
The real or perceived lack of administrative
flexibility of such settings, coupled with small
business set-asides for government-funded studies
(United States Congress, 1981), has given
impetus to coordinating centers located in private
(profit or nonprofit) business firms. The
Persantine Aspirin Reinfarction Study (PARIS)
coordinating center, located at the Maryland
Medical Research Institute, is a case in point
(Persantine Aspirin Reinfarction Study Research
Group, 1980a). The main advantage of
this setting is the administrative flexibility it
provides for personnel hiring and pay practices and

follow-up
• Take initiative for reviewing study priorities
.
proposing changes in the organizational or . -r »
ing structure of the trial
• Assume major role in writing paper on de* r
methods
Patient close-out stage
• Monitor for adherence to agreed-upon patient <
out procedures

• Develop plans for final data editing
• Design and test computer programs needed h* f
data analysis
• Develop plans for final disposition of stud* d»'»
• Coordinate logistics of patient disengagement ■"
treatment
• Assume key role in writing papers summar.r •>< ■»
suits of the trial
• Develop plans for disengagement of clinwal <»••»•»
from the trial

Termination stage
• Perform final data edit and undertake final aniSvs r
data according to plans outlined by stud* *»**•
ship
• Implement study plans for disposition of stud* •*
ords
• Assume leadership role in paper writing aciiv*’*’
• Undertake extra measures to locate patients -■* “

follow-up
• Supervise collection and disposal of unused slud*
ications
• Distribute draft manuscripts and published pare-’
participating centers
• Serve as funding center for activities in the ma! »' termination of support for clinics

for acquisition of needed computing hardware
and software. The main disadvantage stems
from the lack of stability of any operation
devoted to a specialized set of activities. That lack
may make it difficult to recruit and retain
needed personnel.

5.2.3 Staffing

Ten of the 14 trials sketched in Appendix B had
coordinating centers headed by persons with a
doctorate in biostatistics. Three centers were
headed by persons with M.D. degrees, and one
was headed by a person with a master's degree in
applied mathematics.
All coordinating centers require expertise in
the areas of biostatistics and computer
programming. Ideally, the staff should include someone
trained in medicine who is knowledgeable in the
disease under treatment as well. When this is not
possible, the director of the center should
establish a working relationship with appropriate
medical personnel located outside the center.
The relationship may be established via
collaboration with a medical department in the
director's parent institution or nearby medical
facility, or via relations with one of the clinics in the
trial.
The CCMP has provided summary staffing
data for seven of the coordinating centers
reviewed in that project (Hawkins, 1979). A
detailed staffing profile for the Coronary Drug
Project (CDP) coordinating center is provided in
Table 5-3 (see also Meinert et al., 1983). The
figures in the table were based on data contained
in annual budget requests of the CDP
coordinating center to the National Heart, Lung, and
Blood Institute (NHLBI).
The total number of full-time equivalents
(FTEs) rose from 7 in the first year to a high of
36 in the tenth year (column 3 of Table 5-3).
Programmers and master's-level statisticians
accounted for about one-quarter of the staff
during the 13-year period covered in the table

(column 5). Data processing activities were
concentrated on systems development and
programming for data intake and editing during the early
part of the trial. Reductions in these activities as
the study progressed were offset by increased
demands for data analyses.
Data coordinators, key-punch operators, and
coders accounted for one-quarter to one-third of
all coordinating center personnel through the
eleventh year (column 6 of Table 5-3). The drop
in years 12 and 13 resulted from reductions in
data intake and keying operations following
completion of patient close-out.
Secretarial, clerical, and administrative staff
constituted the largest personnel category
starting with the sixth year. Growth of this category
from 15% in the first year to more than 40% of
the FTEs in the tenth year was a reflection of
an increasing workload associated with
manuscript production and maintenance of various
reading and quality control procedures in the
study.

Table 5-3  Percent of full-time equivalents by category of personnel and year of study for the CDP
Coordinating Center

                                    Percent of full-time equivalents (FTEs)
Year of                    Total    MD or PhD       MSc in        Data coords,         Support
study*    Stage            FTEs     in statistics   statistics    key punch, coders    personnel†

1st       Protocol dev.     6.8        27.0           29.2            29.2               14.6
2nd       Recruitment      14.8        32.4           20.3            27.0               20.1
3rd       Recruitment      19.0        20.1           31.5            31.5               16.8
4th       Recruitment      24.3        19.8           28.8            32.9               18.5
5th       Follow-up        24.8        17.3           24.2            32.3               26.2
6th       Follow-up        26.4        18.6           22.7            26.5               32.2
7th       Follow-up        27.3        17.6           22.0            25.6               34.8
8th       Follow-up        30.3        15.8           26.4            23.1               34.6
9th       Follow-up        29.5        16.3           23.7            24.4               35.6
10th      Close-out        35.7        12.3           23.8            23.8               40.1
11th      Termination      22.6        14.2           23.5            24.8               37.6
12th      Termination      17.2        27.9           19.2            18.6               34.1
13th      Termination      11.5        25.2           21.7            17.4               35.7

Source: Reference citation 320. Reprinted with permission of Elsevier Science Publishing Co., Inc., New York.
*The study started in April 1965. Patient recruitment began near the end of the first year (March 1966) and
was completed during the fourth year of the study (October 1969). Close-out of follow-up occurred in 1974, during
the tenth year. The main activity thereafter had to do with analyses for paper-writing activities.
†Administrative, secretarial, and clerical personnel. Also includes a graphic artist.

5.2.4 Equipment
The equipment listed in Table 5-4 represents
items that are likely to be needed in a "typical"
coordinating center operation. The list does not
include general office furniture and equipment,
such as desks, chairs, typewriters, and dictating
and transcribing equipment. These are assumed
to be part of any office setting.
The approach a coordinating center takes to
data entry and processing may be dictated in
large measure by the equipment that exists at the
institution housing the center and the data
processing philosophy held by key people in that
institution. The factors that should be considered
in choosing between a dedicated or centralized
approach to computing are discussed in
Chapter 17.

Table 5-4  General equipment requirements of coordinating centers

• Computing facilities* for storing, editing, and analyzing
study data
• CRT work stations for use by programming and data
processing staff
• Station with high-speed printer
• Dedicated minicomputer for data storage and simple
• Computer-controlled graphics equipment
• Electronic calculators*
• Data entry equipment* (e.g., key punches, key-to-tape
units, key-to-diskette units)
• Word processing equipment*
• Photocopying equipment*
• Collation and report binding equipment
• Telecopier (for transmitting and receiving special documents)
• Mailing equipment (postage meter, scale, etc.)
• Fireproof, environment-controlled storage vaults for
data tapes and other essential study documents
• File cabinets with locks*
• Microfilming equipment and viewers

5.2.5 Relative cost

Table 5-5 provides data on the relative cost of
five coordinating centers reviewed in the CCMP
(Meinert, 1979a). The percentage annual cost of
the individual centers, relative to the total cost
of the trials for that year, ranged from 5.1 to 51.7
in the first year and from 7.3 to 16.8 for the other
years covered in the table.
Figure 5-1 is based on data from the CDP
(Meinert et al., 1983). As in Table 5-5, the figures
reported represent the proportionate cost of the
coordinating center, expressed as a percentage
of the total direct cost of the study. The majority
of the expenditures during the first year
occurred in connection with equipment purchases
and work in developing data forms and manuals
for the study. Expenditures in the clinics were
modest until the start of patient recruitment in
the second year. Support for clinics terminated
in the eleventh year. Only the coordinating
center was supported beyond that time. The
gradual increase in proportionate costs starting
with the third year and continuing through the
tenth year is a reflection of increased demands
for analyses related to treatment and
performance monitoring and for paper writing,
superimposed on continuing demands for
maintenance of established data collection, intake, and
editing procedures.
There are no accepted rules of thumb for
determining the correct allocation of funds for the
coordinating center, relative to other centers in
the trial. The amount will depend on the nature
and complexity of the data collection, editing,
and analysis procedures needed, and on the total
number of clinical centers in the trial. The
relative costs, all other things being equal, will fall as
the number of clinics increases, since many of
the developmental, programming, and analysis
costs incurred by the coordinating center are
independent of the number of clinics. Part of the
drop in relative cost, shown in Figure 5-1, is due
to the addition of new clinics during the first two
years of the CDP. There were only five clinics
funded during the first year. Twenty-three
additional clinics were funded early in the second
year. The last complement of 27 clinics was
added near the end of the second year.
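
The argument that the coordinating center's relative cost falls as clinics are added can be illustrated with a simple cost model. The sketch below is an assumption made for illustration, not a model taken from the CDP: it treats the coordinating center's cost as a fixed developmental component plus a small per-clinic component, and every dollar amount is hypothetical. Only the clinic counts (5, then 5 + 23 = 28, then 28 + 27 = 55) come from the text.

```python
def coordinating_center_share(fixed_cc_cost, cc_cost_per_clinic, clinic_cost, n_clinics):
    """Coordinating center cost as a percentage of total direct study cost.

    fixed_cc_cost      -- developmental, programming, and analysis costs that do
                          not depend on the number of clinics (hypothetical)
    cc_cost_per_clinic -- coordinating center cost that grows with each clinic
    clinic_cost        -- annual direct cost of running one clinic
    """
    cc_cost = fixed_cc_cost + cc_cost_per_clinic * n_clinics
    total_cost = cc_cost + clinic_cost * n_clinics
    return 100.0 * cc_cost / total_cost


# Clinic counts follow the CDP build-up described in the text; dollars are invented.
shares = {
    n: coordinating_center_share(200_000, 5_000, 50_000, n)
    for n in (5, 28, 55)
}
```

Under these assumed figures the share drops from roughly 47% with 5 clinics to under 15% with 55, which is the qualitative pattern the paragraph describes; the actual percentages for the CDP are those plotted in Figure 5-1.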
The funds available for the coordinating
center must be in line with the demands placed on
it. Experienced investigators and sponsors will
review the overall allocation of funds at intervals
over the course of the trial and will reallocate
funds among centers if there are gross
imbalances. The way in which this is done depends on
the funding vehicle. It is relatively easy to do
with either a consortium approach to funding or
with contracts, but not when each center has its
own grant (see Chapter 21).

Table 5-5  Relative cost of coordinating centers for five trials* reviewed in the Coordinating
Center Models Project

                           Percent of total study cost
Year        Number
of trial    of trials†     Lowest     Median     Highest

1              5             5.1        9.0        51.7
2              5             7.3        9.7        13.6
3              5             8.6       10.1        13.6
4              4             9.7       10.1        16.1
5              4             9.8       11.2        16.8
6              3            10.7       11.4        14.0

*Aspirin Myocardial Infarction Study (AMIS), Coronary Drug Project (CDP), Hypertension Detection
and Follow-Up Program (HDFP), Lipid Research Clinics Coronary Primary Prevention Trial
(LRC-CPPT), and Multiple Risk Factor Intervention Trial (MRFIT).
†AMIS reported data through the third year. MRFIT reported data through the fifth year.

allocation of the CDP Coordinating Center, by category and year of study

Figure 5-1  Percentage cost of the CDP Coordinating Center,
relative to total direct study cost.*

[Figure: percentage cost (vertical axis, 0 to 50) plotted against year of study (horizontal axis)]

*Based on direct cost expenditure data from the NHLBI, excluding
costs for the central laboratory and drug distribution center.
Total costs for all the centers combined ranged from $3.3 to $4.3
million during the third through the tenth year of the study.
Figures for the first two years and the eleventh year were $0.5,
$1.3, and $0.9 million, respectively.

Source: Reference citation 320. Reprinted with permission of
Elsevier Science Publishing Co., Inc., New York.

5.2.6 Internal allocation of funds

The allocation of funds within the coordinating
center is as important as the allocation of funds
among centers. The amount of support available
for personnel must be balanced against that
available for equipment, computing, and other
support services.

T«blf 5-7

The internal allocation of funds, as reflected
by annual budget requests submitted to the
sponsoring agency for several different centers, is
given in Table 5-6 (Meinert, 1979a). Table 5-7
provides a detailed look at the allocation of
funds within the CDP coordinating center
(Meinert et al., 1983). Ideally, the results in both tables
should be based on after-the-fact expenditure
data, but reliable data of this sort are almost
impossible to obtain.
The typical coordinating center, as reflected
by the median values recorded in Table 5-6,
budgeted somewhere between 50 and 60% of its
direct cost funds to personnel and about 20% to
computing. The latter category includes funds
for rental of data processing equipment, as well
as time charges for computer use and for
software rentals or purchases.
Funds requested for travel ranged from 1 to
6% of the annual budget. They were used to
cover travel for center staff to attend study
committee meetings, meetings of the entire investigative
group, visits to participating centers, and
scientific meetings. The "All other" categories in
Tables 5-6 and 5-7 contain cost items needed to
support general activities in the trials and
include funds for items such as study publications,
study insurance, and consultant fees and related
expenses.

Percent of direct costs devoted to:

Year
uu<h

An issue in any trial that requires laboratory
determinations is where those determinations
are to be made. In this regard it is important

i
2
3
4

6
6
6

5

5*

6

4*

6

All other
categories

Computing]

TYavel

50

6
6
4

25

60

19
19
16

61
60
60

18
18
16

3
4
4

18
18
20

62

13
20

•Budget data were available for all six centers only through the first four;yei
y”rs ”‘J*?
prepared AMIS d.d not yield data for years 5 and 6 CASS d.d not yield data for yea f>. A* .
did not prov.de a personnel budget for year 5. Hence, the med.an value for personaH for hat year
on results from only four centers. All other entr.es for that year are based on Five stud.es.
and (or
♦Includes funds for computer time, as well as for purchase or rental of data entry equ.pmen
computing hardware and software

Table 5-7  Budget allocation of the CDP Coordinating Center, by category and year of study

                             CC funds          Percent of direct costs devoted to:
Year                         requested                                             All other
of study   Stage             (direct costs)    Personnel   Computing*   Travel    categories

1st        Protocol dev.       $188,111          55.7        29.0         2.1        13.2
2nd        Recruitment          196,103          75.7         8.9         4.1        11.3
3rd        Recruitment          279,749          73.0        12.2         2.9        11.9
4th        Recruitment          316,384          65.6        22.4         2.5         9.5
5th        Follow-up            372,242          68.1        22.1         2.4         7.4
6th        Follow-up            403,991          67.6        17.6         2.2        12.6
7th        Follow-up            507,745          75.1        15.4         2.1         7.4
8th        Follow-up            432,996          73.7        12.1         1.8        12.4
9th        Follow-up            569,170          72.5        16.9         1.9         8.7
10th       Close-out            595,756          73.8        17.1         1.8         7.3
11th       Termination          498,494          67.8        20.6         2.2         9.4
12th       Termination          396,023          64.2        24.7         2.5         8.6
13th       Termination          339,736          60.2        25.3         2.4        12.1

Source: Reference citation 320. Reprinted with permission of Elsevier Science Publishing
Co., Inc., New York.
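
The percentages in Table 5-7 can be converted to approximate dollar amounts by multiplying each by the funds requested for that year. The sketch below does this for the first three years only. The figures are transcribed from the table as read from this text and should be checked against the printed table before being relied on; the function and variable names are invented for the example.

```python
# (year of study, CC funds requested in dollars,
#  % personnel, % computing, % travel, % all other categories)
cdp_rows = [
    (1, 188_111, 55.7, 29.0, 2.1, 13.2),
    (2, 196_103, 75.7, 8.9, 4.1, 11.3),
    (3, 279_749, 73.0, 12.2, 2.9, 11.9),
]

CATEGORIES = ("personnel", "computing", "travel", "all_other")


def dollars_by_category(funds, percents):
    """Split a year's requested funds into dollars per budget category."""
    return {name: funds * pct / 100.0 for name, pct in zip(CATEGORIES, percents)}


# Dollar allocation for each year, keyed by year of study
allocations = {
    year: dollars_by_category(funds, pcts) for year, funds, *pcts in cdp_rows
}

# Row totals of the percentages, as a check on the transcription
percent_totals = {year: sum(pcts) for year, _funds, *pcts in cdp_rows}
```

Checking that each row of percentages adds to about 100 is a cheap way to catch transcription errors in a table of this kind.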

5.3 CENTRAL LABORATORIES

Personnel

Protocol dev.

CC funds
requested
(direct costs)

for purchase or rental of data entry equipment and for computer
•InJudes funds for computer time as well as f
haid*ard and software.

Median percent of direct costs devoted to:
Number
of centers

Stage

1st
2nd
'rd

10th
llth
12th
I'th

Table 5-6  Budget allocation for coordinating centers by category and year of study. Results
for centers from AMIS, CDP, CAST, HDFP, LRC-CPPT, and MRFIT

Year
of study

Budget

I

to distinguish between determinations required for
routine patient care and those needed for treatment
comparisons. The former set of determinations
may be performed locally and need not
even be part of the central data file. The latter set
of determinations may be done locally or in a
central laboratory and should be part of the
central data file.
All but three of the trials listed in Appendix B
relied on central laboratories for making certain
determinations. However, many of those same
trials also relied on local laboratories for other
determinations.
It will be necessary to rely on local determinations
where it is impractical to use a central
laboratory, or where rapid feedback is required
(e.g., in determining patient eligibility or in making
treatment decisions that depend on laboratory
values). Even if this is done, however, the
determinations may be repeated at a central laboratory
in order to provide results that are free
of laboratory variation. In such cases, investigators
must decide which set of determinations is
to be used for assessment of patient eligibility
and for treatment decisions.
The general factors to be considered in deciding
whether to use a central laboratory at all are
outlined in Table 5-8. The costs and logistical
difficulties of establishing and operating a central
laboratory must be balanced against need.
Valid treatment comparisons can be made with
results obtained from local laboratories as long

Table 5-8  Central versus local laboratories in multicenter trials

Local laboratory needed or permissible when:

• Specimens cannot be preserved for shipment to a central laboratory
• Determinations are needed quickly for the acute management of patients
• Higher level of precision possible through use of a central laboratory not essential to the trial
• All participating clinics have laboratories that perform the required determinations
• Individual laboratories are all certified by the same agency and are part of an ongoing standardization and quality assurance program
• Local laboratories agree to participate in standardization and monitoring efforts required by the study
• Senior personnel of each local laboratory are sensitive to the specific needs of the study and are willing to make adjustments in their procedures
• Risk of treatment feedback bias (i.e., where the laboratory reading obtained is influenced by knowledge of a patient's treatment) is minimal, e.g., as in double-masked trials

Central laboratory needed or desired when:

• Required determinations cannot be performed at the local laboratory
• Required level of standardization is not feasible with individual laboratories
• Separation of laboratory and clinics is needed to restrict flow of laboratory results back to clinics
• Laboratory measure is subject to wide variability from laboratory to laboratory

as the treatment allocations are balanced by
clinic.
The fact that the central laboratory is remote
from the clinics has advantages and disadvan
tages. The location adds to the cost and logisti
cal difficulties involved in transport of the speci
mens. However, it also helps to ensure that the
required masks are maintained (e.g., that the
determinations are performed by personnel with
no knowledge of patient treatments).
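
The condition stated above, that treatment allocations be balanced by clinic, can be sketched with a permuted-block schedule drawn separately for each clinic. This is a minimal illustration only: the function names, the two treatment labels, the block size of 4, and the clinic sizes are assumptions made for the example and do not describe the allocation scheme of any trial cited in this book.

```python
import random
from collections import Counter


def clinic_schedule(n_patients, treatments=("A", "B"), block_size=4, seed=None):
    """Return a permuted-block allocation list for a single clinic.

    Each block holds every treatment equally often, so the clinic's
    allocations are balanced at the end of every completed block.
    """
    if block_size % len(treatments) != 0:
        raise ValueError("block_size must be a multiple of the number of treatments")
    rng = random.Random(seed)
    per_block = block_size // len(treatments)
    schedule = []
    while len(schedule) < n_patients:
        block = list(treatments) * per_block
        rng.shuffle(block)  # random order within the block
        schedule.extend(block)
    return schedule[:n_patients]


def allocate_by_clinic(clinic_sizes, seed=0):
    """Build an independent schedule per clinic (clinic is the stratum)."""
    return {
        clinic: clinic_schedule(n, seed=seed + i)
        for i, (clinic, n) in enumerate(sorted(clinic_sizes.items()))
    }


if __name__ == "__main__":
    # Hypothetical clinic sizes, chosen as multiples of the block size.
    schedules = allocate_by_clinic({"clinic_1": 24, "clinic_2": 40, "clinic_3": 12})
    for clinic, schedule in schedules.items():
        print(clinic, dict(Counter(schedule)))
```

Because every clinic contributes equally to each treatment group, a laboratory that reads systematically high or low affects all treatment groups alike, which is why the comparison remains valid.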

5.4 READING CENTERS

A reading center is a facility designed to provide the technical skills needed to read and code materials or records collected in the trial. The readings should be made by individuals who have no knowledge of the treatment assignment to ensure separation of the treatment and reading processes. They may involve extracting information from ECGs, fundus photographs of the eye, angiograms of the vascular structure of the heart, cholecystograms, chest x-rays, liver biopsies, food records, death certificates, or autopsy material. Of the 14 trials sketched in Appendix B, 12 had one or more such centers.
The conditions under which a centralized approach to reading is advantageous are outlined in Table 5-9. They are in large measure similar to those discussed for central laboratories.
Central readings for eligibility assessments pose problems when they do not agree with the local readings, if local readings are used for decisions on enrollment and randomization. Decisions must be made in such cases as to the disposition of patients where there are disagreements. Patients should be retained if the disagreements are minor. Procedures that allow investigators to exclude patients after randomization must be administered by personnel masked to treatment assignment and treatment results (see questions 38b, 39, and 50, Chapter 19).
The number of independent readings per record is a design question that should be resolved before any records are read. It is common to require two independent readings, with or without subsequent adjudication of disagreements. Duplicate readings offer a more precise basis for treatment comparisons than is possible with a single reading. However, valid comparisons can be made with just one reading per record, so long as the readings are independent of treatment assignment.

Table 5-9  Conditions under which centralized readings may be required

• Reading procedures are complex and require skills or training
• High degree of uniformity and standardization required in the readings, especially for determining eligibility for the trial and for key items of information
• Large volumes of records are to be read
• Separation of the reading and treatment processes desired

5.5 PROJECT OFFICES

The project office, as defined in this book, is located at the sponsoring agency and is designed to serve as an interface between the sponsor and the investigative group involved in the trial. The main functions assumed by staff in the project office are to:

• Represent the interests of the sponsor in the design and operation of the trial
• Perform coordinating functions assigned by the leadership committee of the study
• Perform special functions assumed or assigned to the office by the sponsor or investigative group
• Serve as members of the key leadership committees of the study
• Carry out special analyses and tabulations

The National Institutes of Health (NIH) has used different terms to designate the office fulfilling these functions. It is usually designated as the project office but may have other names, such as medical liaison office or program office.
The role of the project office will be related to the perceived importance of the trial by the sponsoring agency and the size of its financial investment. Generally, the greater the investment, the greater the involvement of the project office. The role will also be influenced by the responsibilities of the sponsor in initiating the project. Project offices tend to have a more pronounced role in sponsor-initiated trials than in investigator-initiated trials.
There should be a well-defined division of responsibilities between the project office and the coordinating center. Failure to specify a division can lead to friction between the office and the center. Any division is workable so long as the principals involved understand and accept it.
... heavily influenced by the personalities of ... in the trial. The opportunity for an active ... is encouraged by a weak study leadership structure and discouraged by a strong one.

5.6 OTHER RESOURCE CENTERS

Several trials sketched in Appendix B included a drug procurement and distribution center. The VA Cooperative Studies Program has a general pharmacy located in Albuquerque, New Mexico, which fulfills this function for all its drug trials (Hagans, 1974; Veterans Administration Cooperative Studies Program, 1982).
PARIS had a quality control center. Its duties are outlined in one of the publications from that study (Persantine Aspirin Reinfarction Study Research Group, 1980a; see also Sketch 8, Appendix B). One of the prime functions of the center was to check on the accuracy of the data entry and analysis procedures carried out by the coordinating center. It also played a role in the development of new data analysis procedures for the trial.
The HPT (Sketch 13, Appendix B) includes a treatment coordinating center. One of its duties is to compile materials used in counseling study patients to make the required diet changes.

6. Cost and related issues

A man may do research for the fun of doing it but he cannot expect to be supported for the fun of doing it.
J. Howard Brown

6.1 Government expenditures for clinical trials
6.2 Who should finance clinical trials?
6.3 Factors that influence the cost of a trial
6.3.1 Design
6.3.2 Planning
6.3.3 Multipurpose studies
6.3.4 Ancillary studies
6.3.5 Equating the data collection needs of the trial with those for patient care
6.3.6 Undisciplined data collection philosophy
6.4 Cost control procedures
6.4.1 General cost control procedures
6.4.2 Method of funding
6.4.3 Cost reviews
6.4.4 Periodic priority assessments
6.4.5 Review and funding for ancillary studies
6.4.6 Justification of data items
6.4.7 Use of low-technology procedures
6.5 Need for better cost data

Table 6-1 Number of NIH-sponsored trials, by institute and fiscal year
Table 6-2 NIH expenditures for clinical trials as a percentage of total NIH appropriations
Table 6-3 Percent distribution of total NIH expenditures for clinical trials, by institute and fiscal year
Table 6-4 Percent distribution of total NIH projected expenditures for clinical trials, by institute and fiscal year
Table 6-5 Mean and median projected expenditures per patient-year of study for trials listed in the 1979 Inventory
Table 6-6 VA expenditures for multicenter clinical trials, by fiscal year

6.1 GOVERNMENT EXPENDITURES FOR CLINICAL TRIALS

Table 6-1 gives a count of trials for the various institutes of the NIH by fiscal year (National Institutes of Health, 1975, 1980). The number of trials reported ranged from a low of 746 in fiscal year (FY) 1977 to a high of 986 in FY 1979.
Table 6-2 gives the NIH expenditures for clinical trials as a percentage of total NIH appropriations. The dollar figures given for total appropriations are from an NIH fact book (National Institutes of Health, 1981a). Expenditures for clinical trials represented from 4.1 to 5.3% of total appropriations over the 5-year period covered in the table. (See Section 2.1 for notes on how the inventories were compiled.)
Table 6-3 gives expenditures by institute and fiscal year for clinical trials as a percentage of total NIH expenditures. The relative distribution of expenditures among institutes remained fairly constant over the 5-year period covered. The National Heart, Lung, and Blood Institute (NHLBI) has had the largest expenditures for trials, even though the number of trials (Table 6-1) is small relative to some of the other institutes. This Institute plus the Cancer Institute accounted for over three-fourths of all expenditures for trials in the 5-year period covered (see Section 2.1 for comments on differences in the type of trials undertaken by the two institutes).
The total projected expenditures¹ for clinical trials are shown in Table 6-4 by FY. Results in the table are given as a percentage of the total projected expenditures for all institutes combined. The percentage distribution for FY 1979 expenditures (Table 6-3) was about the same as for FY 1979 projected expenditures (Table 6-4). This was not true for FY 1975 through FY 1978. Some of the change was due to the length of trials sponsored by the NHLBI and NCI. The average length of NCI trials listed in the 1975 Inventory was 2.47 years, contrasted with ... years in the 1979 Inventory. The corresponding figures for NHLBI trials were 3.5... and ... years, respectively.

Table 6-1  Number of NIH-sponsored trials, by institute and fiscal year

                                                                      Fiscal year (FY)
Institute                                                       1975   1976   1977   1978   1979
Cancer (NCI)                                                     405    522    418    515    654
Eye (NEI)                                                         20     21     22     28     26
Allergy and Infectious Diseases (NIAID)                          109    141     93     99    120
Arthritis, Diabetes, and Digestive and Kidney Diseases (NIADDK)   49     50     49     51     67
Child Health and Human Development (NICHD)                        41     52     53     39     32
Dental Research (NIDR)                                            44     34     36     37     26
General Medical Sciences (NIGMS)                                   2      0      0      1      1
Neurological and Communicative Disorders and Stroke (NINCDS)      59     73     51     55     40
Heart, Lung, and Blood (NHLBI)                                    26     26     24     20     20
All NIH                                                          755    926*   746    845    986

*Includes 7 trials done in the NIH Clinical Center.

Table 6-5 provides total projected expenditures per patient-year of study for FY 1979 trials. This figure, for a given trial, was derived by dividing the total projected expenditures for the trial by the product of the projected sample size and the number of years the trial was expected to run. This calculation was made for each trial listed in the Inventory. The resulting figures were ranked from lowest to highest. The figure falling at the 50th percentile constituted the median projected expenditure per patient-year of study. The mean projected expenditure per patient-year was calculated by dividing the total projected expenditures for all trials by the sum of products derived by multiplying the projected sample size and expected duration of the individual trials.
Note that both the median and mean are underestimates of the actual per patient-year expenditures since they are derived under the assumption that the full complement of patients,
as given by the projected sample size, is enrolled
as soon as the trial is funded and that it remains
under follow-up to the end of funding for the
study. Neither assumption is likely to be true.
However, more refined calculations were not pos
sible with the data provided.
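
The two summary figures described above can be sketched as follows. The function name and the three example trials are hypothetical and are not taken from the Inventory; the sketch only mirrors the arithmetic as it is described in the text, under the same simplifying assumption that every trial accrues its projected sample size times its expected duration in patient-years.

```python
from statistics import median


def per_patient_year_summary(trials):
    """Summarize projected expenditure per patient-year.

    `trials` is an iterable of tuples:
    (total_projected_expenditure, projected_sample_size, expected_years).
    Returns (median_cost, mean_cost).
    """
    trials = list(trials)
    # Patient-years assumed for each trial: sample size x expected duration.
    patient_years = [n * years for _, n, years in trials]
    # Per-trial figure: total projected expenditure / patient-years.
    per_trial = [cost / py for (cost, _, _), py in zip(trials, patient_years)]
    # Median: the per-trial figure falling at the 50th percentile.
    median_cost = median(per_trial)
    # Mean: expenditures summed over all trials divided by the
    # summed patient-years (a pooled ratio, not an average of ratios).
    mean_cost = sum(cost for cost, _, _ in trials) / sum(patient_years)
    return median_cost, mean_cost


if __name__ == "__main__":
    # Hypothetical trials: (dollars, patients, years).
    example = [(300_000, 100, 3), (1_200_000, 400, 5), (5_000_000, 2_500, 4)]
    med, mean = per_patient_year_summary(example)
    print(f"median ${med:,.0f}  mean ${mean:,.0f} per patient-year")
```

Because the mean is a pooled ratio, large trials dominate it. A mean that falls below the median, as in the example, is what one would expect when the largest trials are also the least expensive per patient-year.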
The median expenditure per patient-year for
FY 1979 trials was $574 and ranged from a low of
$70 to a high of $1,657. The mean expenditure
was $273 and ranged from $31 to $889. (See
Chapter 4 and Meinert, 1982, for discussion of
expenditures for single-center versus multicenter
trials.)
Table 6-5 also provides sample size data. The
median sample size of all 986 trials was 100
(range 30 to 850). The mean was 670 (range 99 to
2,589).
Table 6-6 provides expenditure data2 from
1970 through 1981 for Veterans Administration
2. From the Veterans Administration Cooperative Studies Program, VA Central Office, Washington, D.C., 1981.

Table 6-2  NIH expenditures for clinical trials as a percentage of total NIH appropriations

                                                               Fiscal year (FY)
                                                         1975     1976     1977     1978     1979
A. Total NIH appropriations (millions $)               $2,093   $2,302   $2,544   $2,843   $3,190
B. NIH expenditures* for clinical trials (millions $)     $88     $121     $105     $122     $136
C. Percent of total (i.e., B/A × 100)                     4.2      5.3      4.1      4.3      4.3

*Excludes general support provided to the Division of Research Resources of the NIH and to the NIH Clinical Center.

1. Previous expenditures plus projected future expenditures for trials counted in Table 6-1.

Table 6-3  Percent distribution of total NIH expenditures for clinical trials, by institute and fiscal year

                                                                      Fiscal year (FY)
Institute                                                       1975    1976    1977    1978    1979
Cancer (NCI)                                                    30.2    34.7    35.9    31.9     ...
Eye (NEI)                                                        3.5     3.9     4.4     5.3     ...
Allergy and Infectious Diseases (NIAID)                          3.5     4.1     2.8     3.1     ...
Arthritis, Diabetes, and Digestive and Kidney Diseases (NIADDK)  3.8     6.4     6.1     6.6     ...
Child Health and Human Development (NICHD)                       4.4     5.0     4.3     3.1     ...
Dental Research (NIDR)                                           2.0     1.3     2.7     2.5     ...
General Medical Sciences (NIGMS)                                 0.1     0.0     0.0     0.2     ...
Neurological and Communicative Disorders and Stroke (NINCDS)     3.9     2.4     2.6     2.5     ...
Heart, Lung, and Blood (NHLBI)                                  48.6    42.1    41.1    44.9     ...
All NIH                                                        100.0   100.0   100.0   100.0   100.0

Total NIH expenditures for clinical trials (millions $)        $87.8  $120.6* $105.3  $122.3     ...

*Includes expenditures for 7 trials done in the NIH Clinical Center.

Table 6-5  Mean and median projected expenditures* per patient-year of study for trials listed in the 1979 Inventory

                                                                                                    Projected expenditure
                                                                 Number       Sample size             per patient-year
Institute                                                       of trials    Mean     Median          Mean      Median
Cancer (NCI)                                                       654        269       100           $237      $  603
Eye (NEI)                                                           26        482       200           $706      $  350
Allergy and Infectious Diseases (NIAID)                            120      1,373       100           $ 31      $  302
Arthritis, Diabetes, and Digestive and Kidney Diseases (NIADDK)     67        180        70           $674      $1,036
Child Health and Human Development (NICHD)                          32        473       100           $383      $  483
Dental Research (NIDR)                                              26        943       663           $ 55      $   70
Neurological and Communicative Disorders and Stroke (NINCDS)        40         99        30           $889      $1,155
Heart, Lung, and Blood (NHLBI)                                      20      2,589       850           $873      $1,657
All NIH                                                            986†       670       100           $273      $  574

*Includes expenditures through FY 1979 plus projected future expenditures.
†Includes 1 trial sponsored by the National Institute of General Medical Sciences.

(VA) sponsored multicenter trials. The support
for such trials represented a little over 3% of the
total VA research and development (R and D)
budget in 1970, contrasted with slightly over
7% in 1981. The portion of VA research funds
awarded to individual centers to conduct single
center trials was not available.
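
The percentages quoted above follow directly from the dollar entries of Table 6-6. As a small worked check, using the FY 1970 and FY 1981 entries as read from that table (the helper name is an assumption made for this illustration):

```python
def percent_of_budget(trial_cost, total_budget):
    """Multicenter trial cost as a percent of the total R and D budget."""
    return 100.0 * trial_cost / total_budget


if __name__ == "__main__":
    # Entries in millions of dollars, as read from Table 6-6.
    fy1970 = percent_of_budget(1.8, 58.1)
    fy1981 = percent_of_budget(9.7, 137.5)
    print(f"FY 1970: {fy1970:.1f}%   FY 1981: {fy1981:.1f}%")
```

The results agree with "a little over 3%" and "slightly over 7%" as stated in the text.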

Table 6-4  Percent distribution of total NIH projected expenditures* for clinical trials, by institute and fiscal year

                                                                      Fiscal year (FY)
Institute                                                       1975    1976    1977    1978    1979
Cancer (NCI)                                                    20.6    23.2    22.9    24.2     ...
Eye (NEI)                                                        3.1     3.3     7.4     7.7     ...
Allergy and Infectious Diseases (NIAID)                          2.0     2.9     2.2     2.2     ...
Arthritis, Diabetes, and Digestive and Kidney Diseases (NIADDK)  5.4     6.2     5.8     6.0     ...
Child Health and Human Development (NICHD)                       3.0     3.8     3.3     2.9     ...
Dental Research (NIDR)                                           1.8     1.6     1.6     1.7     ...
General Medical Sciences (NIGMS)                                 0.0     0.0     0.0     0.0     ...
Neurological and Communicative Disorders and Stroke (NINCDS)     2.8     3.1     2.3     2.5     ...
Heart, Lung, and Blood (NHLBI)                                  61.4    55.9    54.5    52.7     ...
All NIH                                                        100.0   100.0   100.0   100.0   100.0

Total projected expenditures for clinical trials (millions $)  $641.8  $739.3† $848.6  $848.4  $1,000

*Includes expenditures through the indicated fiscal year plus projected future expenditures (see Table 6-1 for counts of trials).
†Includes expenditures for 7 trials done in the NIH Clinical Center.

Table 6-6  VA expenditures for multicenter clinical trials, by fiscal year

                                           Cost as percent
Fiscal    Total R and D    Multicenter     of total R and D
year      budget*          clinical trials*    budget
1970        $ 58.1           $1.8              3.1
1971        $ 60.9           $1.8              3.0
1972        $ 69.1           $1.8              2.6
1973        $ 78.6           $2.4              3.1
1974        $ 81.8           $4.3              5.3
1975        $ 95.4           $5.4              5.7
1976†       $101.6           $5.9              5.8
1977        $109.6           $5.8              5.3
1978        $118.0           $6.3              5.3
1979        $126.3           $8.5              6.7
1980        $137.7           $9.0              6.5
1981        $137.5           $9.7              7.1

*In millions of dollars.
†Adjusted for switch in starting date for fiscal year from July 1 to October 1.

6.2 WHO SHOULD FINANCE CLINICAL TRIALS?

Clearly, the federal government via the NIH, VA, or other agencies can provide only a fraction of the support needed to carry out clinical trials. In fact, there is concern that the present level of government funding is already too high and that the support is siphoning funds from other more basic areas of research.
In an ideal world, the drug and device industry would underwrite the costs for establishing both efficacy and long-term safety of proprietary products. Government support would be limited ... to commercial products that offer manufacturers little or no opportunity for profit. Health insurance carriers, such as Blue Cross and Blue Shield, as well as Medicare and Medicaid, would support trials designed to evaluate ... health care procedures, as well as trials aimed at assessing the cost effectiveness of different methods of health care delivery.
We are still a long way from the ideal. Drugs such as the hypoglycemic agents have been marketed without any evidence of long-term safety or efficacy in relation to the prime reason for their continued use: reduction of morbidity and premature death associated with diabetes. Most of the data on the long-term safety and efficacy of proprietary drugs used for chronic conditions, such as diabetes and heart disease, have been accumulated at government expense.
Health insurance carriers and their clients, instead of encouraging trials, have payment systems that discourage them. The general prohibition against payments for "experimental" procedures in most health insurance plans leads to a paradox in which coverage may be denied when a procedure is being tested as part of a clinical trial but not when that same procedure is used by practitioners outside the context of any trial.
The drug prescribing practices of the medical profession have an effect on the testing and licensing practices of the drug industry. It is clear that physicians prescribe drugs for purposes

other than the approved indications (Committee
on Drugs, 1978; Erickson et al., 1980; Mundy
et al., 1974). The sales spurt following approval
of cimetidine (Tagamet®) in 1977 for use with
duodenal ulcer and Zollinger-Ellison syndrome
is a case in point. The spurt was due in large
measure to use of the drug for unapproved indi
cations. A total of 2,840 patients were identified
as having received cimetidine in two Baltimore
area hospitals from July 1978 to January 1979
(Cocco and Cocco, 1981). Among this number,
only 604 (21%) had established diagnoses for the two
approved indications. A survey by Schade and
Donaldson (1981) involved 200 consecutive pa
tients admitted to the Yale University Hospital
and the West Haven Veterans Administration
Medical Center (100 patients from each of the
two institutions) who received a prescription for
cimetidine. Only 15 of the patients (7.5%) were
given the drug for an approved indication. The
authors concluded that:

Our findings strongly suggest that physi
cians now prescribe cimetidine for remark
ably diverse purposes, most of which have
not been validated.

Why should a drug company undertake the ex
pense of testing an established drug for a new
indication if it is already being used for that
indication?
The Food and Drug Administration (FDA)
approval process for a drug to be used with a
chronic condition, such as elevated blood glu
cose or lipid levels, requires the manufacturer to
show only that the proposed drug is safe and
effective (e.g., in the case of a hypoglycemic
agent, that it lowers blood glucose levels). Evi
dence of effectiveness in reducing morbidity or
mortality associated with the condition is not
required. Others, outside the drug industry, via
government funded trials such as the UGDP and
CDP, have had to gather the evidence (see Coro
nary Drug Project Research Group, 1973a; Uni
versity Group Diabetes Program Research
Group, 1970d).
Even the patent law that protects proprietary
drugs may serve to reduce incentives for indus
try-sponsored long-term trials. Protection is lim
ited to a 17-year period. Proprietary products
can be marketed by other manufacturers under
their own trade names once the period of protec
tion expires. The period for protected sales will
be less, sometimes much less, than the 17 years
after deducting time needed by the manufacturer

to test the drug and obtain approval from the FDA for marketing the drug.
There are proposals before the United States Congress to extend the period of protection, but they have not yet been acted upon. The recent legislation involving so-called orphan drugs is an example of the importance of the legislative process in facilitating the development of drugs, in this case for rare diseases that offer little opportunity for industry profit (Finkel, 19...).
Mechanisms need to be developed that will facilitate the mixture of public and private funds for conduct of worthwhile trials. Drug firms provide limited support for some government-sponsored trials, via drugs, devices, and other materials they supply free of charge. However, they will be reluctant to provide massive financial aid unless the leadership of the study is responsive to their needs in the FDA approval process. A prototype organizational structure is required. In fact, many of the necessary organizational principles have already been developed. For example, the organizational guidelines for ensuring a separation of functions in PARIS (Persantine Aspirin Reinfarction Study Research Group, 1980a) were similar to those used in AMIS (Aspirin Myocardial Infarction Study Research Group, 1980a). The latter trial was government funded; the former was privately funded.
Private health insurance companies and their clients must be encouraged to take a more positive approach toward the support of worthwhile trials. Investments of this sort could pay dividends in reduced costs for health care insurance in the future, if coverage for new procedures were denied until or unless they were shown to be of benefit via properly designed and executed trials.
The NIH, even with greatly expanded resources, cannot be expected to bear the full burden of these costs and still provide needed support for basic research. Other resources are required if the momentum developed in the 19... for planned evaluations is to be continued in the eighties and beyond.
Expenditures for health care have increased at an average rate of nearly 12% per year over the last two decades, as contrasted with ... for the gross national product for the same period (Weichert, 1981). Expenditures totaled $24... billion in 1980, with $20 billion for Medicaid and $28 billion for Medicare in FY 1979 (Department of Health and Human Services, ...). Expenditures for trials aimed at evaluating accepted health care procedures are minuscule by comparison. There is need for a more realistic ... The creation of a fund pegged at just 1% of the nation's expenditure for health care would have yielded an evaluation budget of nearly $2.5 billion in 1980. Contrast that with the $136 million expenditure in FY 1979 for NIH-sponsored clinical trials (Table 6-2).

6.3 FACTORS THAT INFLUENCE THE COST OF A TRIAL

6.3.1 Design

A trial, especially when carefully designed and executed, can be a costly undertaking. The need for cost efficiency is obvious, particularly in an era of shrinking budgets and skyrocketing costs. Factors influencing cost include:

• Patient eligibility criteria
• Number of patients required for study
• Time required to develop the study protocol and data collection forms
• Outcome variable to be used to measure success of the treatments
• Number of clinics and speciality resource centers required for the trial
• Treatment procedures to be used
• Method of patient identification and enrollment
• Complexity and frequency of data collection
• Length of follow-up
• Frequency of follow-up contacts and examinations
• Time required for final data analysis
• Time required to close out the study

The frequency of patient contacts and the amount of data collected per contact is a major cost determinant. A trial requiring treatment administration over an extended time period and an outcome measure that can be observed only via regular clinic visits will require a more elaborate follow-up examination schedule than one requiring a short period of treatment and death or some other easily diagnosed event as the outcome measure. The Physicians' Health Study, PHS (Sketch 1, Appendix B), is an example of a long-term drug trial not involving any direct patient contact. Patients (in this case physicians) are recruited via mail. Those who agree to participate receive their assigned medication (daily doses of aspirin, aspirin and beta-carotene, or ...) in the mail. Follow-up for mortality is via the National Death Index (National Center for Health Statistics, 1981) or via patients' families.
No-contact designs, such as that used in the
PHS, can be considered only under special cir
cumstances. General conditions required include
use of:

• A reliable, easily observed outcome measure
• Treatments that have few side effects or com
plications
• Entry criteria that are not dependent on clini
cal assessments
• A literate, reasonably sophisticated study pop
ulation

6.3.2 Planning

Starting a trial with an ill-conceived research
plan or inadequately tested data forms can result
in a waste of money. Serious design mistakes
may make it necessary to abort the trial. Even if
such drastic action is not needed, modifications
to the data collection procedures after the trial is
under way can be costly to implement, especially
when the formats of data that have already been
collected must be changed to render them com
patible with revised formats. A cost element that
is often underestimated is that of data processing
and analysis. Underfunding this activity can se
riously hamper the entire data collection process
(see Chapter 5 for a discussion of data center
costs).
It is not uncommon for long-term trials to
cost more than originally anticipated. This can
be illustrated with trials sponsored by the
NHLBI, although the problem is not unique by
any means to this Institute. Among the NHLBI
trials appearing in both the 1975 and 1979 NIH
inventories of clinical trials, only one reported a
lower projected cost in 1979 than in 1975. The
projected total expenditures given in 1979 were
more than double the figures given in 1975 for
three of the trials. Some of the changes undoubt
edly were due to failure to anticipate inflationary
trends over the 5-year period. However, most of
the increases were too large to be explained by
inflation.
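
One way to see why a doubling of projected cost is hard to attribute to inflation alone is to compute the constant annual rate of increase that a doubling would imply. The sketch below is illustrative only: the function name is an assumption, and the spans of 4 and 5 years are assumptions about the interval between the two estimates, not figures taken from the inventories.

```python
def implied_annual_rate(cost_ratio, years):
    """Constant annual growth rate that turns 1.0 into `cost_ratio` in `years`."""
    return cost_ratio ** (1.0 / years) - 1.0


if __name__ == "__main__":
    for years in (4, 5):  # assumed spans between the two cost estimates
        rate = implied_annual_rate(2.0, years)
        print(f"doubling in {years} years implies {rate:.1%} per year")
```

A doubling over such a span requires sustained annual increases in the range of roughly 15 to 19%; projected totals that more than doubled would require still higher rates.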
One reason for increased costs has to do with
shortfalls in patient recruitment and the actions
taken to make up for the shortfalls via more
intensive recruitment efforts and extensions of
the periods of follow-up. A paper published by
investigators in the Cooperative Studies Pro
gram of the Veterans Administration reviewed

the recruitment performance of seven multicen
ter trials supported by that program (Collins
et al., 1980). One trial was terminated due to
recruitment problems. None of the other six
trials were able to complete recruitment within
the time frame originally proposed. All six re
quired extensions for patient recruitment or had
to settle for fewer patients than originally
planned. Even with extensions, none of the trials
achieved the original sample size goal.

6.3.3 Multipurpose studies

It is not unusual for a trial to be designed to
satisfy a number of secondary objectives in addi
tion to the primary one. A common one relates
to the description of the natural history of the
disease under treatment in long-term trials, such
as the CDP (Coronary Drug Project Research
Group, 1973a). The addition of secondary objec
tives can add to the cost of the trial. The increase
will be smallest for objectives that can be
pursued with data needed for the primary objec
tive as well, and largest when added data are
needed. The decision as to whether to pursue
secondary objectives should depend on the scien
tific importance of those objectives, the suitabil
ity of the trial as a vehicle for pursuing them, the
chances of successfully achieving them, and the
costs associated with their pursuit.
6.3.4 Ancillary studies

A trial, especially a large multicenter trial,
may provide investigators with opportunities for
a number of ancillary studies (see Glossary for
definition). Some may involve added patients,
whereas others may simply require special anal
yses of existing data. However, as with pursuit of
secondary objectives, they can add to the cost
and complexity of the trial. Priorities should be
given to those studies that are needed to under
stand the action of the treatments under study
and to those concerning methodological issues
of direct importance to the trial. No study
should be undertaken that jeopardizes pursuit of
the primary objective.
6.3.5 Equating the data collection needs
of the trial with those for patient care

The data required to satisfy the research aims of
the trial may be different from those needed for
patient care. Failure to distinguish data needed
for this latter purpose from those needed for the

trial can lead to the collection of superfluous information that is a burden to collect and process.

6.3.6 Undisciplined data collection philosophy

The data collection schedule for the trial should be kept as simple as possible. Strong leadership is required to ensure the development of a focused data collection philosophy and related set of data forms. Without this leadership, the data collection scheme can be a hodgepodge of peripherally related data items designed to cater to the special interests of specific investigators in the trial.

6.4 COST CONTROL PROCEDURES

6.4.1 General cost control procedures

Cost control is the combined responsibility of the sponsor and study investigators. There is no substitute for a cost-conscious investigative group. Some of the more obvious extravagances to be avoided are:

• Use of costly state-of-the-art technology when less sophisticated technology will suffice
• Unnecessary travel at study expense
• Use of study funds for lavish office furnishings or for activities not related to the trial
• Overstaffing

"Cost saving" measures to be avoided include:

• Submission of an unrealistically low budget request in the hope of improving the prospects for funding
• Undue reliance on existing staff paid from other sources to perform essential functions in the trial
• Cutbacks on financial support for data analysis in order to increase support for data collection activities
• Reduction of the sample size requirement for the trial by switching from a single event to a composite of events or to a laboratory measure as the outcome measure
• Changing the sample size calculation to bring it in line with the number of patients available for study
• Sponsor-imposed travel restrictions in a multicenter trial that limit the ability of investigators to interact and function as a cohesive unit

6.4.2 Method of funding

The funding structure for the trial will in itself
provide some cost controls. Ceilings placed on
expenditures when awards are made, as with
NIH grant awards, encourage the conservation
of funds, provided unused funds accrued in one
year can be carried over for use in the next
year. Awards with cost-reimbursement features,
as with some NIH contracts, generally include
provisions for periodic cost reviews by the
sponsor over the life of the award (see Chapter
21 for additional discussion).
The differences between grant and contract
modes of funding are most apparent in the
budgeting process. An investigator is required to
prepare a budget for the specified number of
years before the start of the trial with a fixed-cost
contract or grant. Budgeting is done with the
realization that the funds requested may be reduced
if the budget is perceived as excessive by reviewers
of the proposal. Approved applications that
are funded are supported up to, but not above,
the approved ceiling figures set when awards are
made. An investigator who has done a poor job
of anticipating costs for the trial will have to cut
back on activities planned or seek supplemental
funds to make up for deficits.
The budget preparation process is different
for cost-reimbursement contracts. Costs can exceed
the original budget and still be recovered.
However, reliance on the cost-reimbursement
mode of funding can pose dilemmas for investigators
when preparing their initial budget requests
in conjunction with Requests for Proposals
(RFPs). Submission of a realistic budget that
includes support for activities deemed necessary
by the investigator but not mentioned in the
RFP may cause the response to be viewed as
noncompetitive. Realization of this fact may
lead him to adopt a more “pragmatic” approach
to the budgeting process (i.e., by preparing
a budget which he believes to be in the
competitive range, even if he considers it to be
too small), since the costs for “unanticipated”
but justifiable activities can be recovered later as
part of the cost-reimbursement process.
Funding is tied to the actual level of activities
with the cost-reimbursement approach. This is
more difficult to do with fixed-cost awards. One
mode of funding that combines features of the
two approaches, at least for clinics, involves
paying a designated amount for fixed costs,
plus a variable sum that depends on numbers of
patients enrolled and followed. However, a word
of warning is in order. Capitation forms of payment
can lead to questionable practices if clinic
personnel are tempted to cut corners in order to
ensure an adequate flow of patients to maintain
a desired level of funding.
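
The combined mode of funding described above amounts to a simple formula: a fixed amount plus a variable sum tied to patient counts. The following is a minimal sketch of that arithmetic only. The function name, the separate rates for enrollment and follow-up, and all dollar figures are hypothetical and are not taken from any actual award.

```python
def clinic_payment(fixed_cost, per_enrolled, per_followed, n_enrolled, n_followed):
    """Return a clinic payment: fixed amount plus a variable, per-patient sum.

    All parameter names and rates are hypothetical, for illustration only.
    """
    if n_enrolled < 0 or n_followed < 0:
        raise ValueError("patient counts must be non-negative")
    # Variable part depends on numbers of patients enrolled and followed.
    variable = per_enrolled * n_enrolled + per_followed * n_followed
    return fixed_cost + variable


# Hypothetical clinic: $40,000 fixed, $500 per patient enrolled,
# $200 per patient followed during the budget period.
payment = clinic_payment(40_000, 500, 200, n_enrolled=30, n_followed=120)
print(payment)  # 40000 + 15000 + 24000 = 79000
```

Because the variable part grows with every patient counted, the same formula makes the incentive problem noted in the text easy to see: payment rises with patient flow regardless of how carefully eligibility and follow-up procedures are observed.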

6.4.3 Cost reviews

The investigator cannot develop or maintain a
cost-conscious attitude without periodic reviews
of activities and their associated costs. Such reviews
are especially important in trials involving
two or more primary work components, such as
in CASS (Coronary Artery Surgery Study Research
Group, 1981). That study required a separation
of the coordinating center costs for the
trial and registry components of the study. The
separation was used as a management tool to
make certain that data intake and analysis priorities
were met for both components.
6.4.4 Periodic priority assessments

The usual approach is to add new data collection
and quality control procedures as they are
needed over the course of the trial, without
much thought regarding their importance in
meeting the main objectives of the trial (Meinert,
1977). Periodic revisions and prunings performed
by the leadership of the trial are necessary
if the procedures are to remain lean and
efficient.

6.4.5 Review and funding for ancillary
studies
The study leadership should develop an internal
review process for proposed ancillary studies
(see Glossary). Only those studies that do not
interfere with patient recruitment, data collection,
or other essential activities in the trial
should be approved. Studies that are too costly
to undertake without additional funding should
be reviewed subject to acquisition of funding.
Ancillary studies, by definition, are designed
to address questions that are of secondary or
peripheral importance to the main objectives of
the trial. However, since they are done by investigators
involved in the trial and are often carried
out on subgroups of study patients, they can
add to both the cost and the complexity of the
trial. They may even compromise the ability of
the investigators to pursue the main aims of the
trial. Part of the purpose of the review process is

to make certain that this does not happen and to
ensure that the investigations do not siphon
away resources needed for the trial itself. Small
amounts of support, particularly in the form of
study staff, may be derived from the trial.
Undertakings requiring added staff should be funded
and operated independently of the trial.
6.4.6 Justification of data items

The data collection requirements of the trial
should be limited to those that are directly related
to the aims of the trial and should not be
confused with other needs, such as those required
for patient care or for ancillary studies.
Every item that appears on the data forms
should be required for pursuit of one of the aims
of the trial. Items that cannot be justified in this
manner should not be made part of the official
data set of the trial.

6.4.7 Use of low-technology procedures

The cost of a trial will be influenced by the level
of technology needed for the procedures used in
the trial. Insistence on high-technology procedures
can result in a significant increase in expenses,
especially if special equipment must be
purchased and skilled personnel hired to operate
it. State-of-the-art instrumentation is generally
not essential to the success of most trials.

6.5 NEED FOR BETTER COST DATA

Reliable data on the costs of trials are difficult to
obtain. Expenditure records maintained by the
NIH are too crude to permit anything more than
a rough analysis of cost (Meinert, 1979). Cost
comparisons across governmental agencies, such
as the NIH and VA, are further complicated by
differences in funding and accounting procedures.
For example, NIH-sponsored trials usually
include salary support for senior as well as
essential support staff, whereas personnel costs in
VA-sponsored trials are generally limited to
those needed for essential support staff.
Comparisons between countries are even more
difficult to make. For example, studies done in the
United Kingdom always appear to be less expensive
than in the United States because of fundamental
differences in the way health care procedures
are paid for in the two countries.
Reliable cost data for industry-sponsored
trials are even more difficult to obtain. A
for-profit business firm is not eager to provide
detailed research expenditure data for review by
the general public or competing firms.
Nevertheless, designers of trials need to have a
better understanding of the way in which costs
accumulate and how they are influenced by factors
under the designers’ control, especially in
relation to the types and amounts of data collected.
This understanding can only be gained
through the collection of detailed cost data related
to specific data collection and analysis activities
in a variety of trials.

7 Impact of clinical trials on the practice of medicine

A new scientific truth does not triumph by convincing its opponents and making them see the
light, but rather because its opponents eventually die, and a new generation grows up that is
familiar with it.

Max Planck

7.1 Introduction
7.2 Factors influencing treatment acceptance
  7.2.1 Prior opinion and previous experience with a treatment
  7.2.2 Clinical relevance of the outcome measure
  7.2.3 Degree to which test treatment simulates real-world treatment
  7.2.4 Consistency of findings with previous results
  7.2.5 Direction of results
  7.2.6 Importance of the treatment
  7.2.7 Cost and payment schedule
  7.2.8 Treatment facilities and resources
  7.2.9 Design and operating features of the trial
  7.2.10 Study population
  7.2.11 Method of presentation
  7.2.12 Counterforces
7.3 Impact assessment
7.4 University Group Diabetes Program: A case study
7.5 Ways to increase the impact of clinical trials
Table 7-1 Chronology of events associated with the UGDP
Table 7-2 Criticisms of the UGDP and comments pertaining to them
Table 7-3 Advertising for oral hypoglycemic agents in the Journal of the American Medical Association for 1969 and 1979
Table 7-4 Percentage of patient-physician visits for diabetics by type of prescription issued
Table 7-5 Estimated U.S. wholesale dollar cost for oral hypoglycemic prescriptions
Figure 7-1 Estimated total number of hypoglycemic prescriptions (new and refill) for the U.S.
Figure 7-2 Estimated number of insulin prescriptions (new and refill) and ratio of oral hypoglycemic Rx’s to insulin Rx’s for the U.S.
Figure 7-3 Type of hypoglycemic prescription on discharge from general hospitals for diabetes as a percentage of total diabetic discharges

7.1 INTRODUCTION

There is need for a better understanding of the
way trials influence the practice of medicine.
What is their role in establishing new treatments
or in discrediting old ones? When can they be
expected to play a role and when not? Does the
design or the way in which a trial is executed
influence the way it is perceived—in the medical
community and by the lay public? Answers to
questions of this kind could promote the design
of better, more potent, trials in the future. (See
references 59 and 366 for additional discussion.)

7.2 FACTORS INFLUENCING
TREATMENT ACCEPTANCE
7.2.1 Prior opinion and previous
experience with a treatment

A treatment that has been around for a long
time, even if trials have shown it to be of no
value, will fade from favor more slowly than one
still in its infancy. Chalmers has noted the continued
use of bed rest in the treatment of acute
viral hepatitis after several trials, all of which
have failed to indicate any merit for the treatment.
Similarly, ulcer patients continue to be
placed on “sippy” diets, even though trials have
failed to show the value of such diets (Chalmers,
1974).


The time to do a trial is before the treatment is
accepted as standard practice. It will be difficult
to mount one once that has happened. For example,
it would be quite difficult to mount trials
now to evaluate the efficacy of coronary care
units (CCU) in the treatment of acute myocardial
infarction (MI) victims. The units are presumed
to be of value. Assigning patients to a
CCU or regular hospital care at random might
well be regarded as a questionable practice in
today’s climate.
7.2.2 Clinical relevance of the
outcome measure
All other things being equal, a trial with death or
some other serious morbid event as the outcome
should receive more attention than one involving
less relevant outcomes. It is distressing, in this
context, to note the number of trials that rely on
nonclinical measures, such as a laboratory test,
to evaluate a treatment (see Chapter 2).

7.2.3 Degree to which test treatment
simulates real-world treatment
Ideally, the test treatment should be used in the
exact same manner as in the real world. However,
this is not always possible. The need for
uniformity in the treatment process makes it
necessary to impose conditions on usage not
ordinarily encountered in real life. For example,
drugs may have to be given in a single fixed dose
in double-masked trials, even though they are
not used this way in practice.

7.2.4 Consistency of findings with
previous results
The judgment regarding the virtues of a treatment
should be based on a digest of all pertinent
data, not only the last report. Survey papers,
such as those produced by Chalmers and coworkers
(1972, 1977), represent examples of efforts
aimed at amalgamating information from
several trials to assess the merits of a treatment.
It is desirable to have several replications of a
trial before reaching a conclusion regarding a
treatment. Unfortunately, the world is usually
not so obliging. The high cost of some trials,
such as the Multiple Risk Factor Intervention
Trial (in excess of $100 million), makes it impractical
to consider replication. Replication in
other cases may be ruled out on ethical grounds.
For example, it would be impossible to replicate
the Veterans Administration (VA) studies on
frank hypertensives. No physician would be
prepared to have such patients assigned to a
placebo treatment (Veterans Administration
Cooperative Study Group on Antihypertensive
Agents, 1967, 1970).

7.2.5 Direction of results

The direction of the trial results will influence
the way in which they are received. It is easier to
accept a positive finding than a negative one,
especially if the finding pertains to an
“established” treatment. Physicians are trained to be
more comfortable giving a treatment than
withholding one. Patients as well usually find it more
consoling to receive a treatment than to be
denied one.

7.2.6 Importance of the treatment

The interest generated by a particular trial will
be influenced by the number of persons in the
medical community who regard the treatment as
useful. The attention accorded the UGDP findings
was much greater than that for the Coronary
Drug Project (CDP). Undoubtedly the difference
was due in part to the fact that the
treatments used in the UGDP were established
modes of therapy for the mild, noninsulin-dependent
diabetic, whereas this was not the case
for the drugs used in the CDP for patients with
prior MI.

7.2.7 Cost and payment schedule

The cost of the treatment and the opportunity
for covering those costs from third party
sources, such as insurance carriers, will play a
role in treatment “acceptance.” Use of dialysis
for end-stage renal disease is a case in point. The
big spurt in use of the treatment came with
enactment of legislation in 1972 that provided
payment for the procedure from Social Security
funds. The number of people on dialysis in the
United States jumped from 2,400 in 19?? to
nearly 27,000 by 1977 and to over 44,000 by 19??
(Burton and Hirschman, 1979a, 1979b).

7.2.8 Treatment facilities and resources

The opportunities for administering a treatment
will be limited by the nature of staff and support
facilities needed for its administration.
Transplantation is a case in point. The use of
the treatment is limited by organ availability,
number of trained transplant teams, and available
support facilities.

7.2.9 Design and operating features of the trial

Ideally, the weight given to a result should be
determined by an unbiased, objective evaluation
of the strengths and weaknesses of the trial. In
reality, the evaluation may be done carelessly and
from a preconceived point of view. Design or
operating features regarded as major weaknesses
in one trial may be overlooked or ignored in
another, depending on the direction of the results.
Evidence of such double standards can be
seen from a comparison of the criticisms directed
at the UGDP study of tolbutamide with
those directed at studies done by Keen and by
Paasikivi (Keen and Jarrett, 1970; Keen, 1971;
Paasikivi, 1970). The UGDP results were negative,
whereas the other two were considered to
be positive.

7.2.10 Study population

The degree to which the study population
approximates a real-life mix of patients may influence
the way results are received. A clinician’s
perception that patients treated in the trial were
markedly different from those he treats may lead
him to downplay or completely reject the results.

7.2.11 Method of presentation

Treatment acceptance can be influenced by the
way in which results are presented. Negative attitudes
that develop in the medical community
on account of the mode of presentation may cause
its members to reject the findings for emotional
reasons. There is some evidence that this happened
with the UGDP. The tolbutamide findings
were presented at a national meeting of the
American Diabetes Association (ADA) in June
1970. The paper containing the results first appeared
5 months after the presentation, and then
only in a specialty journal with limited circulation
(University Group Diabetes Program Research
Group, 1970e). The press coverage following
the presentation resulted in a deluge
of inquiries to practicing diabetologists around the
nation regarding the treatment. Many of them
resented having to deal with the questions before
the results were published.
The potential for ill will is not limited to trials
with negative findings, as may be seen in the
Macular Photocoagulation Study (MPS) with
presentation of results for treatment of senile
macular degeneration (Macular Photocoagulation
Study Group, 1982, 1984). The study
avoided the UGDP publication lag by mailing a
preprint of the manuscript to all practicing
ophthalmologists in the U.S. The National Eye
Institute scheduled a press conference a few days
after the mailing and just before the manuscript
appeared in print. The national TV coverage
of the results took many treating ophthalmologists
by surprise, particularly those who had not
yet received the paper or who had not read it.
The public relations problem might have been
avoided if there had not been a press conference,
but public awareness of the results was considered
to be essential because of the need for patients
to recognize the symptoms of senile macular
degeneration so as to obtain early diagnosis
and treatment.

7.2.12 Counterforces

There may be a number of counterforces working
against the acceptance of a finding. Such
forces can be expected to emerge whenever results
run contrary to established dogma, and
especially when major financial considerations
are involved. A medical specialist whose practice
depends on the treatment being questioned will
be much more reluctant to accept negative findings
than positive findings. The Committee for
the Care of the Diabetic was formed by a group
of diabetologists largely as a means of counteracting
the UGDP findings and the proposed
Food and Drug Administration (FDA) labeling
changes for the oral hypoglycemic agents (see
Section 7.4).
The drug company whose product is threatened
by the study can be expected to question
the findings and to express doubts regarding the
study. These expressions may take the form of
prepared press releases indicating that the trial
should not be regarded as definitive and making
the universal call for further research. Upjohn,
the manufacturer of Orinase® (tolbutamide), as
well as other manufacturers of hypoglycemic
agents, sent “Dear Doctor” letters to practicing
diabetologists warning of the need for caution
when interpreting the findings of the UGDP (see
Knox, 1971, and Mintz, 1970b for references to
the letters). Consultants were hired by Upjohn to
critique the study and to speak at meetings
where the findings were discussed. Company
sales personnel were provided with “informational
material” for answering questions concerning
the study. The material summarized criticisms
of the study and reminded physicians of
other work supportive of the treatment.
Another force with interests allied to the pharmaceutical
firms is that associated with the so-called
“throw-away” medical journals.1 Such
publications rely heavily on advertising from
drug manufacturers for their income (Chalmers,
1982a; Warner et al., 1978). The editorial policy
of publications such as the Medical Tribune and
the Hospital Tribune was negative, if not downright
hostile, toward the UGDP, while carrying
ads for hypoglycemic agents.

7.3 IMPACT ASSESSMENT

Changes in health care practices occur gradually
and for a variety of reasons. Methods used to
relate such changes to specific events, such as the
publication of results from a particular trial, are
at best approximate. It is always dangerous to
associate any change involving complex behaviors
with any single event. A case in point is the
growing emphasis on the diagnosis and treatment
of hypertension. Unquestionably, the emphasis
stems, at least in part, from trials supporting
the value of antihypertensive treatment. But
it is also due to massive efforts by the federal
government and the medical profession to alert
the public to the dangers of hypertension.
Communities throughout the nation have carried out
screening programs to identify hypertensives.
The National High Blood Pressure Education
Program, founded in 1972 and sponsored by the
National Heart, Lung, and Blood Institute
(NHLBI), has been aimed at educating members
of the public and the medical community to the
importance of blood pressure control (National
Heart, Lung, and Blood Institute, 1973; Szklo,
1980). Physician visits during which at least one
antihypertensive drug was prescribed increased
by about a third from 1968 to 1978 (from data
provided in the National Disease and Therapeutic
Index, IMS America Ltd.,2 Ambler, Pennsylvania).
There was a 27% decrease in mortality
rates for coronary heart disease over the same
time interval (Working Group on Arteriosclerosis,
1981).
In the light of such evidence, it is tempting to
attribute the decline to more aggressive treatment
resulting from trials and educational programs.
However, those who do so ignore the fact
that mortality due to cardiovascular disease was
already on the decline before the first VA
hypertension trials started and before widespread
public awareness of the dangers of hypertension.

1. So termed because they are distributed to practicing physicians
free of charge.
2. IMS is a private firm that specializes in the compilation of drug
utilization data for sale to various business firms and agencies.

Prescription and sales data can be used to
provide gross indications of changes in treatment
patterns. Data from IMS are used in Section
7.4 to chart changes in the use of oral
hypoglycemic agents from 1964 forward.
Other indications of change may be obtained
from other data sources, such as the Professional
Standards Review Organization (PSRO) and
from the Commission on Professional and Hospital
Activities (CPHA). The Commission is
based in Ann Arbor, Michigan, and maintains a
variety of usage statistics for member hospitals.
Payment data maintained by private health
insurance carriers and by Medicare and Medicaid
also can be helpful in tracing treatment patterns.
More direct measures of change can be obtained
from special surveys, such as the one done
by Stross and Harlan (1979) designed to assess
the awareness of primary-care physicians regarding
results from the Diabetic Retinopathy
Study, DRS (done about 18 months after the
DRS results were published). Only 28% of the
137 family physicians and 46% of the
internists surveyed (42 out of 91) were aware of
the results. A similar approach was used to assess
the level of physician familiarity with results
from the Hypertension Detection and Follow-Up
Program, HDFP (Stross and Harlan,
1981). Survey techniques also were used in a
contract issued by the NHLBI to assess physician
knowledge of findings from the CDP and
Aspirin Myocardial Infarction Study (AMIS)
(Market Facts, Inc., 1982).
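
Survey percentages of this kind rest on fairly small samples, so it can help to see how much sampling uncertainty surrounds them. The sketch below recomputes the internist figure quoted from Stross and Harlan (42 aware out of 91 surveyed) and attaches an approximate 95% Wilson score interval. The interval is an illustration added here, not a result reported by the survey authors, and the function name is an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate Wilson score interval for a proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Internists aware of the DRS results, as quoted in the text.
aware, surveyed = 42, 91
pct = 100 * aware / surveyed
low, high = wilson_interval(aware, surveyed)
print(f"{pct:.0f}% aware; interval {100 * low:.1f}% to {100 * high:.1f}%")
```

The interval spans roughly twenty percentage points, which is a reminder that such surveys give only a coarse reading of how far trial results have reached practicing physicians.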

7.4 THE UNIVERSITY GROUP
DIABETES PROGRAM:
A CASE STUDY
The UGDP was started in 1960, enrolled its first
patient in 1961, completed data collection in
1975, and published its final report in 1982.
Citations 464 through 470, 472, 473, 475, and …
(Appendix I) refer to a series of original publications
that detail the design, methods, and results
of the study. Citations 83, 95, 161, 173, …,
192-194, 261, 386, 409, 413, 419, 459, …,
471 relate to the controversy that developed
starting in mid-1970 with a UGDP data presentation
that questioned the value of tolbutamide
for use in diabetics. Table 7-1 provides a
Uhh 7-1

Chronology of events associated with the UGDP

Month, tlav

Event

June
September

First planning meeting of UGDP investigators (467)*

IQM

February

Enrollment of first patient (467)

IQ*. 2

September

Addition of phenformin to the study and recruitment of 5 additional
clinics (467)

I ehruary

Completion of patient recruitment (467. 468)

1*0

June 6

UGDP investigators vote to discontinue tolbutamide treatment (468
and UGDP meeting minutes)

IQ'f)

May 20
May 21. 22

Tolbutamide results on Dow Jones ticker tape (327)

N'll

N’O

June 14

Tolbutamide results presented at American Diabetes Association meet
ing. St. Louis (464. 465. 466)

October

Food and Drug Administration (FDA) distributes bulletin supporting
findings (179)

N'O

November

Tolbutamide results published (468)

IQ'll

November

Committee for the Care of Diabetics (CCD) formed (183)7

IQ'I

April

Feinstein criticism of UGDP published (161)

N'l

May 16

UGDP investigators vote to discontinue phenformin treatment in
UGDP (470. 472, and UGDP meeting minutes)

IQ'I

June

FDA outlines labeling changes for sulfonylureas (180)

IQ'I

August 9

UGDP preliminary report on phenformin published (470)

IQ'I

September 14

Associate Director of National Institutes of Health (NIH) asks presi
dent of International Biometrics Society to appoint a committee to
review UGDP (83)

IQ'I

September 20

Schor criticism of UGDP published (409)

IQ'I

September 20

IQ’I

October 7

Cornfield defense of UGDP published (95)
CCD petitions commissioner of the FDA to rescind proposed label
change (183 and actual petition)

IQ'2

May

FDA reaffirms position on proposed labeling change (181)

IQ'2

June 5

FDA commissioner denies October 1971 request to rescind proposed
label change (183)
CCD requests evidentiary hearing before FDA commissioner on pro
posed labeling changes (183)

|0<0

l-M)

July 13

Initiation of grant support for the coordinating center and first 7 clinics
(467)

Wall Street Journal. Washington Post, and New York Times articles on
tolbutamide results (280. 326. 408)

I >1’2

August 3

l>)'2

August 11

10’2

August 30

10'2

August

10'2

September

Seltzer criticism of UGDP published (419)

10'2

October 17

Second motion for injunction against label change filed by CCD in the
United States District Court for the District of Massachusetts (451)

10'2

Oct ober

Response to Seltzer critique published (471)

io’2

November 3

Temporary injunction order granted by Judge Murray of the United
States District Court for the District of Massachusetts (451)

10'2

November 7

Preliminary injunction against proposed label change granted by
United States District Court for the District of Massachusetts (183)

Commissioner of FDA denies CCD request for evidentiary hearing
(451)
CCD argues to have the FDA enjoined from implementing labeling
change before the United States District Court for the District of
Massachusetts (451)
Request to have the FDA enjoined from making labeling change de
nied by Judge Campbell of the United States District Court for the
District of Massachusetts (183. 451)
Biometrics Society Committee starts review of UGDP and other re
lated studies (83)

Of
I

Q

54

7.4 The university group diabetes program: A case study

Impact of clinical trials on the practice of medicine
Table 7-1

Chronology of events associated with the UGDP (conlinueth

Table 7-1
)ra'

Chronology of events associated with the UGDP (continurd)

Month, day

Event

January

Appeal of October 21. 1977, court ruling Filed by CCD in United States
Court of Appeals for the District of Columbia Circuit

Year

Month, day

Event

1973

July 31

Preliminary injunction vacated by Judge Coffin of United States Cotin
of Appeals for the First Circuit. Case sent back to FDA for lurthrr
deliberations (183. 451)

|4'K

July 7

1973

October

FDA hearing on labeling of oral agents (183)

|4'S

July 11

1974

February

FDA circulates proposed labeling revision (183)

1974

March April

FDA holds meeting on proposed label change, then postpones action
on change until report of Biometrics Committee (183)

1974

September 18, 19. 20

Testimony taken concerning use of oral hypoglycemic agents before the
United States Senate Select Committee on Small Business. Monop
oly Subcommittee (459)

i«r«

July 25

14'*

October 17

I4'S

S’ovember 14

Results of FDA audit of UGDP announced (188)

IQ'M

November 15

|4'Q

January 15

|4'Q

April 10

Commissioner of FDA orders phenformin withdrawn from market
(462)
CCD petitions the United States Supreme Court for writ of certiorari
to the United States Court of Appeals for the District of Columbia
Circuit (461)
Appeal of October 21. 1977, ruling denied*

1975

January 31

Added testimony concerning use of oral hypoglycemic agents before
the United States Senate Select Committee on Small Business. Mo
nopoly Subcommittee (460)

Preliminary report on insulin Findings published (474)
Judges Leventhal and MacKinnon of the United States Court of Ap
peals for the District of Columbia Circuit rule that public does not
have right to UGDP raw data under the FOIA. Judge Bazelon
dissents (450. 461)
CDC petitions United States Court of Appeal f6r the District of Co
lumbia Circuit for rehearing on July 11 ruling (461)

Petition for rehearing at the United States Court of Appeals for the
District of Columbia Circuit denied (461)

1975

February 10

Report of the Biometrics Committee published (83)

1975

February

UGDP final report on phenformin published (472)

1975

July 9, 10

Added testimony concerning use of oral hypoglycemic agents before
the United States Senate Select Committee on Small Business. Mo
nopoly Subcommittee (460)

1975

August

Termination of patient follow-up in UGDP (476)

14'4

May 14

1975

September 30

CCD files suit against David Mathews, Secretary of Health. Education
and Welfare, et al., for access to UGDP raw data under the Freedom
of Information Act (FOIA) in the United States District Court lor
the District of Columbia (452)
Ciba-Geigy files suit against David Mathews, Secretary of Health
Education and Welfare, et al., for access to UGDP raw data under
the FOIA in the United States District Court for the Southern
District of New York (457)
FDA announces intent to audit UGDP results (461)

14'4

October 31

14*0

March 3

IQX?

April

Expiration of NIH grant support for UGDP

|4X2

November

14*2

November

Final report on insulin results published (476)
UGDP deposits patient listings plus other information at the National
Technical Information Service for public access (476. 477. 478)

14X4

March 16

Revised label for sulfonylurea class of drugs released (192, 193. 194)

1975

October 14

1975

December

1976

February 5

1976

February 25

1976

September

1976

October

1977

March 8

1977

April 22

1977

May 6

1977

May 13

1977

July 25

1977

August

1977

October 21

1977
1977

October 23

December

United States District Court for the District of Columbia rules UGDP
raw data not subject to FOIA (453)
CCD files appeal of February 5 decision in United States Court of
Appeals for the District of Columbia Circuit (461)
FDA audit of UGDP begins
FDA Endocrinology and Metabolism Advisory Committee recommends
removal of phenformin from market (184)
United States District Court for the Southern District of New York
rejects Ciba-Geigy request for UGDP raw data (458)
Health Research Group (HRG) of Washington, D.C., petitions Secretary
of HEW to suspend phenformin from market under imminent
hazard provision of law (185)
FDA begins formal proceedings to remove phenformin from market (185)
FDA holds public hearing on petition of HRG (185)
Secretary of HEW announces decision to suspend New Drug Applications
(NDAs) for phenformin in 90 days (185)
CCD requests that United States District Court for the District of
Columbia issue an injunction against HEW order to suspend NDAs
for phenformin*
CCD request to United States District Court for the District of Columbia
for injunction against HEW order to suspend NDAs for phenformin
denied*
NDAs for phenformin suspended by Secretary of HEW under imminent
hazard provision of law (187)
UGDP announces release of data listings for individual patients (

Writ of certiorari granted
UGDP case of Forsham et al. versus Harris et al. argued before the
United States Supreme Court (462)
United States Supreme Court holds that HEW need not produce
UGDP raw data in 6 to 2 decision (462)

Numbers in parentheses refer to citations in the Combined Bibliography (Appendix I).
*Personal communications with Robert F. Bradley, Joslin Diabetes Center, Boston, who was the first chairman of the CCD.

... of UGDP-related events (see also Appendix H for a sketch of the UGDP). Table 7-2 provides a listing of the main criticisms of the study as offered by others and comments on their validity by the author (one of the investigators in the trial). Most of the attention focused on the tolbutamide results because they were the first released and because of the popularity of the drug. Table 7-2 reflects this.

The news media carried a number of articles on the tolbutamide results, beginning with a report on May 20 appearing on the Dow Jones ticker tape.3 One article in particular, suggesting that the drug caused as many as 8,000 deaths per year,4 created a good deal of patient anxiety and physician hostility toward the study even before the results were presented in June. (Incidentally, the number had escalated from 10,000 to 15,000 without benefit of any new data in news reports a few years later, e.g., as in the Philadelphia Inquirer, January 28, 1975.)

The controversy and resulting doubts about the study led to two independent audits of it. The first was undertaken by a blue-ribbon committee appointed by the International Biometrics Society and was published in 1975 (see citation 83). The second was carried out by the FDA and appeared in November, 1978 (see citation 188). Neither audit found any basis to reject the conclusions of the study.

3. The report was prepared from information in an abstract of a paper submitted to the American Diabetes Association (ADA) for its June meeting. Study investigators were surprised by the publicity before the meeting. They were not aware of the practice of the ADA to make the program, and abstracts contained therein, available to the press in advance of its meeting.
4. The article appeared in The Washington Post on May 22, 1970, and in several other papers around the country over the next several days.


Impact of clinical trials on the practice of medicine

7.4 The university group diabetes program: A case study

Table 7-2 Criticisms of the UGDP and comments pertaining to them

• Criticism: The study was not designed to detect differences in mortality (Schor, 1971).
  Comment: The main aim of the trial was to detect differences in nonfatal vascular complications of diabetes (UGDP Research Group, 1970d). However, this focus in no way precludes comparisons for mortality differences. In fact, it is not possible to interpret results for nonfatal events in the absence of data on fatal events.

• Criticism: The observed mortality difference was small and not statistically significant (Feinstein, 1971; Kilo et al., 1980).
  Comment: It is unethical to continue a trial, especially one involving an elective treatment, to produce unequivocal evidence of harm.

• Criticism: The baseline differences in the composition of the study groups are large enough to account for excess mortality in the tolbutamide treatment group (Feinstein, 1971; Kilo et al., 1980; Schor, 1971; Seltzer, 1972).
  Comment: The tolbutamide-placebo mortality difference remains after adjustment for important baseline characteristics (Cornfield, 1971).

• Criticism: The tolbutamide-treated group had a higher concentration of baseline cardiovascular risk factors than any of the other treatment groups (Feinstein, 1971; Kilo et al., 1980; Schor, 1971; Seltzer, 1972).
  Comment: Differences in the distribution of baseline characteristics, including CV risk factors, are within the range of chance. Further, the mortality excess is as great for the subgroup of patients who were free of CV risk factors as for those who were not. Finally, simultaneous adjustment for major CV risk factors did not eliminate the excess (UGDP Research Group, 1970e; Cornfield, 1971).

• Criticism: The treatment groups included patients who did not meet study eligibility criteria (Feinstein, 1971; Schor, 1971).
  Comment: Correct. However, the number of such cases was small and not differential by treatment group. Further, analyses in which ineligible patients were removed did not affect the tolbutamide-placebo mortality difference (UGDP Research Group).

• Criticism: Data from patients who received little or none of the assigned study medication should have been removed from analysis (Kilo et al., 1980; Seltzer, 1972).
  Comment: The initial analysis included all patients to avoid introduction of selection biases. This analysis approach tends to underestimate the true effect. Analyses in which noncompliant patients were not counted enhanced, rather than diminished, the mortality difference (UGDP Research Group).

• Criticism: The data analysis should have been restricted to patients with good blood glucose control (Kilo et al., 1980).
  Comment: The analysis philosophy for this variable was the same as for drug compliance. The removal of patients using a variable influenced by treatment has a good chance of rendering the treatment groups noncomparable with regard to important baseline characteristics. In any case, analyses by level of blood glucose control did not account for the mortality difference (UGDP Research Group).

• Criticism: The study failed to collect relevant clinical data (Feinstein, 1971; Seltzer, 1972).
  Comment: The criticism is unjustified. The study collected data on a number of variables needed for assessing the occurrence of various kinds of peripheral vascular events. It is always possible to identify some variable that should have been observed with the perspective of hindsight. The criticism lacks credibility, in general and especially in this case, because of the nature of the result observed. It is hard to envision other clinical observations that would offset mortality, an outcome difficult to reverse.

• Criticism: There were changes in the ECG coding procedures midway in the course of the study (Schor, 1971; Seltzer, 1972).
  Comment: Correct. However, the changes were made before investigators had noted any real difference in mortality and were, in any case, made without regard to observed treatment results (Cornfield, 1971).

• Criticism: The patients did not receive enough medication for effective control of blood glucose levels (Seltzer, 1972).
  Comment: A higher percent of tolbutamide-treated patients had blood glucose values in the range indicative of good control than did the placebo-treated patients. The percentage of patients judged to have fair or good control, based on blood glucose determinations done over the course of the study, was 74 in the tolbutamide-treated group versus 59 in the placebo-treated group (UGDP Research Group, 1971a, 1976).

• Criticism: The excess mortality can be accounted for by differences in the smoking behavior of the treatment groups (source unknown).
  Comment: The argument is not plausible. While it is true that the study did not collect baseline smoking histories, there is no reason to believe the distribution of this characteristic would be so skewed as to account for the excess (Cornfield, 1971). The study did in fact make an effort to rectify this oversight around 1972 with the collection of retrospective smoking histories. There were no major differences among the treatment groups with regard to smoking. However, the results were never published because of obvious questions involved in constructing baseline smoking histories long after patients were enrolled and then with the use of surrogate respondents for deceased patients. The oversight is understandable in view of the time the trial was designed. Cigarette smoking, while recognized at that time as a risk factor for cancer, was not widely recognized as a risk factor for coronary heart disease.

• Criticism: The observed mortality difference can be accounted for by differences in the composition of the treatment groups for unobserved baseline characteristics (Feinstein, 1971; Schor, 1971).
  Comment: This criticism can be raised for any trial. However, it lacks validity since there is no reason to assume treatment groups in a randomized trial are any less comparable for unobserved characteristics than for observed characteristics. And even if differences do exist, they will not have any effect on observed treatment differences unless the variables in question are important predictors of outcome.

• Criticism: The majority of deaths were concentrated in a few clinics (Feinstein, 1971; Seltzer, 1972).
  Comment: Differences in the number of deaths by clinic are to be expected in any multicenter trial. However, they are irrelevant to comparisons by treatment groups in the UGDP, since the number of patients assigned to treatment groups was balanced by clinic (UGDP Research Group, 1970d, 1970e).

• Criticism: The study included patients who did not meet the "usual" criteria for diabetes (Seltzer, 1972).
  Comment: There are a variety of criteria used for diagnosing diabetes, all of which are based, in part or totally, on the glucose tolerance test. The sum of the fasting, one, two, and three hour glucose tolerance test values used in the UGDP represented an attempt to make efficient use of all the information provided by the test (UGDP Research Group, 1970d).

• Criticism: The patients received a fixed dose of tolbutamide. The usual practice is to vary dosage, depending on need (Feinstein, 1971; Schor, 1971; Seltzer, 1972).
  Comment: Most patients in the real world receive the dosage used in the study (UGDP Research Group, 1972).

• Criticism: The randomization schedules were not followed (Schor, 1971).
  Comment: The Biometrics Committee reviewed the randomization procedure and found no evidence of any breakdown in the assignment process (Committee for the Assessment of Biometric Aspects of Controlled Trials of Hypoglycemic Agents, 1975).

• Criticism: There were "numerous" coding errors made at the coordinating center in transcription of data into computer readable formats (Feinstein, 1971).
  Comment: There is no evidence of any problem in this regard. The few errors noted in audits performed by the Biometrics Committee and FDA audit team were of no consequence in the findings of the trial (Committee for the Assessment of Biometric Aspects of Controlled Trials of Hypoglycemic Agents, 1975; Food and Drug Administration, 1978).

• Criticism: There were coding and classification discrepancies in the assembled data (Kolata, 1979).
  Comment: The coding and classification error rate was low and the errors that did occur were not differential by treatment group. There were no errors in the classification of patients by treatment assignment or by vital status. Hence, the argument does not provide a valid explanation of the mortality differences observed (Committee for the Assessment of Biometric Aspects of Controlled Trials of Hypoglycemic Agents, 1975; Food and Drug Administration, 1978; Prout et al., 1979).

• Criticism: The cause of death information was not accurate (Feinstein, 1971; Schor, 1971; Seltzer, 1972).
  Comment: Independent review of individual death records by the FDA audit team revealed only three classification discrepancies, only one of which affected the tolbutamide-placebo comparison (Food and Drug Administration, 1978). However, in any case, the main analyses in the study and the conclusions drawn from them relate to overall mortality.

• Criticism: The study does not prove tolbutamide is harmful (Feinstein, 1971; Schor, 1971; Seltzer, 1972).
  Comment: Correct. It would be unethical to continue a trial to establish the toxicity of an elective treatment. Proof of toxicity is not needed to terminate an elective treatment (UGDP Research Group, 1970).

The FDA started work on a revised label insert for tolbutamide shortly after the results were presented in 1970. The revised label warned of potential cardiovascular complications associated with prolonged use of the drug (Food and Drug Administration, 1972a). Doubts regarding the validity of the study and concerns regarding the implications of the proposed label change led to the formation of the Committee for the Care of the Diabetic (CCD). The committee was made up of practicing diabetologists from around the country (first headed by Robert F. Bradley of the Joslin Clinic and subsequently by Peter H. Forsham of the University of California). This committee, with legal counsel, obtained a court order on November 7, 1972 staying the use of the revised label5 (Food and Drug Administration, 1975).

A side issue of importance to the field of clinical trials (and other research fields as well, for that matter) had to do with public access to UGDP raw data. Records generated by the study and housed at the UGDP Coordinating Center in Baltimore were requested on behalf of the CCD under the Freedom of Information Act, FOIA (Morris et al., 1981; Sial!, 1982; Watson, 1981; see also Chapter 24). The request was denied by the United States District Court for the District of Columbia on February 5, 1976 (see citation 453). The decision was ultimately upheld by the United States Supreme Court in a six-to-two decision issued March 3, 1980 (see citation 462).

In spite of the controversy, or more likely because of it, the study appears to have had an effect on the treatment practices of diabetologists. It has caused both friends and foes of the study alike to re-examine the underlying rationale for treatment of the noninsulin-dependent diabetic and to consider dietary rather than pharmacological treatment of such patients (Reo- et al., 1979; West, 1980).

Sales data compiled by IMS from the National Prescription Audit6 show a drop in the use of the oral hypoglycemic agents beginning in 1974. The estimated total number of prescriptions (new as well as refills) for all oral hypoglycemic agents in the United States has declined from a high of 21 million in 1973 to 13.6 million in 1980 (Figure 7-1, Part A). The largest decrease occurred for the sulfonylurea, tolbutamide (Figure 7-1, Part B). However, it is worth noting that the decrease began before publication of the UGDP results and that it was accompanied by increases in sales of chlorpropamide and tolazamide, also members of the family of sulfonylurea compounds.

The decline of phenformin sales, beginning in the 1970s, was the result of a general concern in the medical community related to isolated cases of lactic acidosis and of a negative report from the UGDP on the treatment. The drug was for all intents and purposes removed from the market in 1977 through special powers vested in the Secretary of Health, Education and Welfare (see citations 184, 185, and 187).

5. The revised label had actually been prepared and distributed to manufacturers for use when the restraining order was issued. It contained a special warning concerning the possibility of an increased risk of cardiovascular death with the use of sulfonylurea oral hypoglycemic agents and referred specifically to the UGDP results. The label was finally revised in 1984 to include the special warning and a synopsis of the UGDP results (see citations 192, 193, and 194).
6. The National Prescription Audit is based on a national sample of pharmacies that supply monthly data to IMS on the number of new and refill prescriptions issued per month. Reporting procedures were changed in 1981 and again later in the 1980s. Data obtained after the changes are difficult to compare with those obtained before the change. Therefore, they are not included herein.

Figure 7-1 Estimated total number of hypoglycemic prescriptions (new and refill) for the U.S. (a) All oral agents combined. (b) By type of agent (chlorpropamide, phenformin, tolbutamide, tolazamide, acetohexamide). Horizontal axis: year, 1966 to 1980.
Source: Market and Prescription Data, copyright © 1964-1980, IMS America, Ltd., Ambler, Pa. (reference citation 244).

The de-emphasis on the oral hypoglycemic agents is reflected by advertising, as seen in the Journal of the American Medical Association (see Table 7-3). The only product advertised in 1979 was Pfizer's Diabinese®. In addition, advertising for the oral hypoglycemic agents represented 4.6% of the total advertising space in the journal in 1969, compared with 2.3% in 1979 (total advertising space estimated from a 25% sample of the 52 issues of the Journal published in the two time periods).

The National Therapeutic Index provides a more direct measure of physician prescribing habits. Data in this Index (IMS America, Ltd.,

Table 7-3 Advertising for oral hypoglycemic agents in the Journal of the American
Medical Association for 1969 and 1979

                                          1969                    1979
                                   Number of              Number of
Drug                                   pages   Percent        pages   Percent

DBI®                                       2         1            0         0
Diabinese®                                 0         0           36       100
Dymelor®                                  11         8            0         0
Orinase®                                  49        36            0         0
Tolinase®                                 74        54            0         0

Total for hypoglycemic agents            136       100           36       100

Total number of advertising pages       2953                   1597
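The advertising shares quoted in the text (4.6% in 1969 and 2.3% in 1979) can be checked against the page counts in Table 7-3. The short sketch below does that arithmetic. It assumes that 2953 and 1597 are the estimated total numbers of advertising pages for 1969 and 1979, respectively; the variable and function names are illustrative only.

```python
# Page counts as read from Table 7-3 (JAMA advertising, 1969 vs. 1979).
# Assumption: 2953 and 1597 are the estimated total advertising pages.
hypoglycemic_pages = {1969: 136, 1979: 36}
total_ad_pages = {1969: 2953, 1979: 1597}


def ad_share_percent(year: int) -> float:
    """Percent of advertising pages devoted to oral hypoglycemic agents."""
    return 100.0 * hypoglycemic_pages[year] / total_ad_pages[year]


for year in (1969, 1979):
    print(year, round(ad_share_percent(year), 1))
```

Rounded to one decimal place, the two shares agree with the percentages given in the text.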

1977) are obtained from participating physicians. According to data in the Index, the percentage of physician visits of diabetics that resulted in a prescription of an oral hypoglycemic agent declined from 56% in 1969 to 36% by 1976, while the percentage of visits involving insulin prescriptions increased from 29% to 34% (Table 7-4). The apparent increase in use of insulin is reflected in Figure 7-2 as well. The figure suggests an increasing use of insulin relative to the oral agents. However, this conclusion is valid only if it is reasonable to assume that participating pharmacies in the National Prescription Audit have not changed their reporting habits with regard to insulin.7
Data from the CPHA indicate a similar trend for patient discharge data from U.S. short-term,
7. Technically, insulin is not a prescription drug, although it is
usually issued by prescription and. hence, reported in the Audit.

Estimated* wholesale dollar cost (in millions)

Total number

1597

J *

nonfederal, general hospitals (Figure 7-3). A survey of 14 large teaching hospitals in 1969 and again in 1971 showed less reliance on oral agents and a sharper drop in their use than noted for general hospitals (Commission on Professional and Hospital Activities, 1972, 1976). The percentages of patients receiving a prescription for an oral hypoglycemic agent on discharge dropped from 33% in 1969 to 24% in 1971 for the 14 teaching hospitals, as contrasted with a drop from 38% to 34% for general hospitals. There was only a slight increase in the use of insulin for the time period in the teaching hospitals (61% in 1969 and 64% in 1971), as compared with a somewhat larger increase in the general hospitals (61% in 1969 and 64% in 1971).
The UGDP cost about $8.5 million to carry out. That cost is minuscule when contrasted with the amount of money spent on prescriptions for oral hypoglycemic agents (Table 7-5).


to insulin Rx’s

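Table 7-4 Percentage of patient-physician visits for diabetics by type of prescription issued

                                               Oct. 1974    Oct. 1975
                                               through      through
Type of Rx given        1969   1970   1973     Sept. 1975   Sept. 1976

No drug Rx                16     19     24        26           28
Drug Rx                   84     81     76        74           72
  Oral hypoglycemics      56     52     45        41           36
    Sulfonylureas         49     45     37        34           30
    Phenformin            10     11     12        10            8
  Insulin                 29     28     29        31           34

Total                    100    100    100       100          100

Source: Market and Prescription Data, copyright © 1964-1980, IMS America, Ltd., Ambler, Pa. (reference citation 244).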

2.3
3.9
3.5

22.4
28.2
35.1

7.1
7.9
8.4

38.1
35.3
28.7

58.0

10.5
140
15.2

29.0

62.1

24.7
21.8

65.0
65.8

26.7
28.3
26.7

34.8

104.8
112.0
109.3

25.2
I7.lt

284
31 8
30.9
26.4

34.1
31.2

t
t

28.9
38.6

47.2
58.9
54.5

114 9
119.8
109.8

110.5

Source: Market and Prescription Data, copyright © 1964-1980,
IMS America, Ltd., Ambler, Pa. (reference citation 244).

Figure 7-3

Type of hypoglycemic prescription on dis

of total diabetic discharges.

!r:
701
INSULIN R«

60
>9«6

1968

1970

1972

1974

1976

1978

1980

1 »» r Market and Prescription Data, copyright © 1964 1980.
Hs America. Ltd . Ambler. Pa. (reference citation 244)

OT
UJ

g 50
4

i

o
o

Oct. 1974 Oct. 1973
through
through
Sept. 1975 Sept. 1976

1970

Insulin

1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976
1977
1978
1979

charge from general hospitals for diabetes as a percentage

i

Percentage of patient-physician visits for diabetics by type of prescription issued

1969

Total

Tolbutamide

<

YEAR

Type of Rx given

Phenformin

*Method of estimation changed in 1973. The large increase from 1972 to 1973 is an artifact of that change.
†NDAs for drug suspended in 1977; ordered off the market in 1978.

(h) Ratio of oral hypoglycemic Rx's

-M4

Table 7-4

All oral
agents

Year

Percent

0

2953

Estimated U.S. wholesale dollar cost for oral

hypoglycemic prescriptions

Number of

Percent
1

Total number of advertising pages

Table 7-5

Figure 7-2 Estimated number of insulin prescriptions (new and refill) and ratio of oral hypoglycemic Rx's to insulin Rx's for the U.S.

1979

2

DBI®


28
72*

The estimated wholesale cost of the 21 million prescriptions for oral hypoglycemic agents written in 1973 (Figure 7-1) was $105 million. This translates into an average cost of $10 per prescription, assuming retail cost is twice wholesale cost. The drop in 1980 to 13.5 million oral hypoglycemic prescriptions represents a "savings" of $75 million, or about nine times the cost of the study.
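
The savings figure follows from a few lines of arithmetic, sketched below using only the numbers quoted in the text. The variable names are illustrative. The factor of two between retail and wholesale cost is the assumption stated in the text, not an independent estimate.

```python
# Figures quoted in the text.
wholesale_cost_1973 = 105e6   # dollars, all oral hypoglycemic prescriptions
prescriptions_1973 = 21e6     # prescriptions written in 1973
prescriptions_1980 = 13.5e6   # prescriptions written in 1980
retail_to_wholesale = 2.0     # assumption stated in the text
study_cost = 8.5e6            # approximate cost of the UGDP, dollars

wholesale_per_rx = wholesale_cost_1973 / prescriptions_1973
retail_per_rx = retail_to_wholesale * wholesale_per_rx

# "Savings": prescriptions no longer written, valued at the retail price.
savings = (prescriptions_1973 - prescriptions_1980) * retail_per_rx
savings_to_cost = savings / study_cost

print(f"retail cost per prescription: ${retail_per_rx:.2f}")
print(f"savings: ${savings / 1e6:.0f} million")
print(f"savings relative to study cost: {savings_to_cost:.1f}")
```

The ratio comes out slightly under nine, consistent with the "about nine times" in the text. The calculation takes no account of inflation or of any offsetting cost of alternative treatments such as insulin.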

36

30
8

1964 1980. IMS America. Ltd.. Ambler. Pa. (refer-

I

ORAL HYPOGLYCEMIC Rn

20

*
10-

OJ
1970

1972
YEAR

Source: Reference citation 82.

1974

I

I

7.5 WAYS TO INCREASE THE
IMPACT OF CLINICAL TRIALS

One obvious way to increase the impact of clinical trials is through improvements in their design and conduct. Continued proliferation of trials that have inadequate sample sizes, that involve clinically irrelevant outcome measures, and that are poorly executed cannot help having an adverse effect on the way clinical trials are viewed by the public.
The pharmaceutical industry needs to be encouraged to develop better structures for their trials. There needs to be a clearer separation of those responsible for execution of the trial from the sponsoring firm. The collection and analysis of data by firms with a proprietary interest in the product being tested is automatically open to question. Both industry and the public would ultimately benefit from trials that are above reproach.
There must also be efforts made to educate the public on the importance of clinical trials as an evaluation tool. The public must be taught to have a realistic appreciation of the strengths and weaknesses of the tool. Research societies, such as the Society for Clinical Trials and others, have a responsibility to assume leadership roles in this education process.
Investigators carrying out trials have, in effect, a public trust. They must take pains to avoid even the appearance of conflict of interest in the collection, analysis, or interpretation of results. A public trust cannot be established and maintained without high standards of integrity on the part of everyone involved in trials.
Editors of journals can help by establishing more stringent review criteria to make certain that the results of trials that are published have been generated and analyzed using sound methods. They should reject papers from trials with inadequate design features or standards of execution. Imposition of higher editorial standards would ultimately serve to elevate the design and execution standards of future trials.
Finally, as mentioned at the beginning of this chapter, there is a need for a better understanding of the way in which clinical trials influence the practice of medicine.

Part II. Design principles and practices

Chapters in This Part

8. Essential design features of a controlled clinical trial
9. Sample size and power estimates
10. Randomization and the mechanics of treatment masking
11. The study plan
12. Data collection considerations

The five chapters of this Part are intended to outline the primary principles and procedures to be followed in designing a trial. Chapter 8 discusses the general principles underlying selection of the study treatment, the choice of the outcome measure, and the roles of randomization and masking in data collection. Chapter 9 discusses the role of sample size and power estimates in planning a trial and details the methods for making such calculations in trials involving fixed sample size designs. Chapter 10 is devoted to a discussion of the principles and practices to be followed in administering the randomization schedule. Chapter 11 details the items that must be addressed in developing the study plan and treatment protocol for the trial. Chapter 12 outlines factors that influence the data collection schedule and contains suggestions concerning the design and content of data forms.

8. Essential design features of a controlled clinical trial

On being asked to talk on the principles of research, my first thought was to arise after the
chairman's introduction, to say, "Be careful," and to sit down.
Jerome Cornfield (1959)

8.1 Introduction
8.2 Choice of the test and control treatments
8.3 Principles in the selection of the outcome measure
8.4 Principles in establishing comparable study groups
8.5 Principles of masking and bias control
Table 8-1 Requirements for the test and control treatments
Table 8-2 Desired characteristics of the primary outcome measure
Table 8-3 Requirements of a sound treatment allocation scheme
Table 8-4 Masking guidelines

8.1 INTRODUCTION

The first question in any clinical trial is whether it is appropriate to mount the trial at all. Timing is of prime importance. The trial cannot proceed in the face of widespread doubts regarding its ethical base. Investigators must be satisfied that it is proper to expose patients to either the test or the control treatment. The ethical window for a trial may be quite narrow. Use of an agent in any trial setting may be deemed unethical if the agent is regarded as "too" experimental, yet that same agent may be accepted by the medical profession a short time later as the standard of treatment, without the benefit of any experimental evidence.
Ideally, the best time to start a trial is with the introduction of a treatment, before preconceived notions regarding its merit develop. Chalmers has argued for randomized trials the moment a treatment is introduced (Chalmers, 1975). This approach, while laudable, is not without problems. The rush to start may lead to a series of uncoordinated, small-scale efforts, none of which is adequate to answer the question of interest. Randomization of patients should not be started until there is a defined treatment protocol and a support organization to monitor the trial for evidence of adverse or beneficial treatment effects. The time involved in developing a common study protocol, writing and testing the necessary data forms, obtaining required support staff, and establishing the structure needed for proper data intake and analysis, not to mention the time needed to fund the trial, makes it difficult to start randomization with the first use of a treatment.
Once the question of timing has been resolved, the next set of issues involves basic design questions. Any controlled clinical trial requires specification of:
• A test and control treatment
• An outcome measure for evaluating the study treatments
• A bias-free method for assigning patients to the study treatments

Considerations in arriving at each of these specifications are discussed in the sections that follow.

8.2 CHOICE OF THE TEST AND
CONTROL TREATMENTS

The choice of the test and control treatments is key. The general requirements to be satisfied are outlined in Table 8-1. The test treatment must be different from the control treatment; otherwise there is no point to the trial. Further, both treatments must be justifiable on medical grounds in order to allow investigators to assign patients to either treatment.
The choice of the test treatment is straightforward in settings where there is only one viable alternative to the control treatment, or where there are practical reasons for concentrating on a particular treatment (e.g., in an industry-sponsored trial done to satisfy Food and Drug Administration requirements for licensure of a particular drug). It is not when a number of

Table 8-1 Requirements for the test and control treatments

• They must be distinguishable from one another
• They must be medically justifiable
• There must be an ethical base for use of either treatment
• Use of the treatments must be compatible with the health care needs of study patients
• Either treatment must be acceptable to study patients and to physicians administering them
• There must be a reasonable doubt regarding the efficacy of the test treatment
• There should be reason to believe that the benefits will outweigh the risks of treatment
• The method of treatment administration must be compatible with the design needs of the trial (e.g., method of administration must be the same for all the treatments in a double-masked trial) and should be as similar to real-world use as practical

alternatives exist. This was the situation faced by investigators designing the University Group Diabetes Program (UGDP). They had to choose from among several different types of hypoglycemic agents (University Group Diabetes Program Research Group, 1970d). The same was true for planners of the Coronary Drug Project (CDP) in choosing among various lipid-lowering drugs (Coronary Drug Project Research Group, 1973a).
The choice of the control treatment has implications for the size of the treatment difference that can be expected. The largest difference can be expected when the control treatment is inactive. However, this design is only feasible when it is ethical to allow patients assigned to the control treatment to remain untreated (except for use of a placebo or sham treatment). The more effective the control treatment, the more difficult it will be to establish the superiority of the test treatment.
The choice of the control treatment will be dictated by current medical practice. The usual control in a surgery trial is the best available medical therapy. Some surgery trials have used sham operations as controls (Cobb et al., 1959; Dimond et al., 1960; Perry et al., 1964). However, their use has been curtailed in recent years for ethical reasons. The control treatment in a drug trial will be a standard form of drug therapy, a placebo, or no treatment at all, depending on the nature of the disease.
Treatment cannot be withheld from control patients if it is unethical to do so. Some form of medical care must be provided if a patient has a condition that requires treatment. The nature of the treatment chosen can cause a dilemma for investigators, especially when the test treatment is a refinement of the standard treatment. Investigators in the Hypertension Detection and Follow-Up Program (HDFP) had to face this problem. It was recognized that it would be unethical to identify hypertensive patients and then leave them untreated. It was also recognized that clinic personnel could not be expected to administer two standards of care: an aggressive approach to blood pressure control for patients assigned to stepped-care and a laissez-faire approach for patients assigned to regular care. The dilemma was resolved by referring patients assigned to the control treatment back to their private physicians for treatment (Hypertension Detection and Follow-Up Program Cooperative Group, 1979b).
Some trials may involve more than one control treatment. The UGDP included both a placebo and fixed-dose insulin treatment group. The placebo treatment was used primarily for comparison with the tolbutamide and phenformin treatments, whereas both the fixed-dose insulin and placebo treatments were useful in evaluating the insulin variable treatment (University Group Diabetes Program Research Group, 1970e, 1971b, 1978, 1982).

8.3 PRINCIPLES IN THE
SELECTION OF THE
OUTCOME MEASURE
The outcome measure used for treatment comparisons
will be a clinical event (e.g., death, myocardial
infarction, significant loss of vision, recurrence
of a disease) or a surrogate outcome
measure (e.g., a score on a psychological test,
blood pressure change, serum lipid level). The
focus in this book is on trials using a clinical
event as the outcome measure.
Table 8-2 provides a list of desired characteristics
for the primary outcome measure.
The measure should be specified when the trial
is planned, before the start of data collection.
Otherwise the value of the trial may be compromised,
especially if there is reason to believe
that data collected during the trial were used to
select the measure.

Table 8-2 Desired characteristics of the primary outcome measure

• Easy to diagnose or observe
• Free of measurement or ascertainment errors
• Capable of being observed independent of treatment
assignment
• Clinically relevant
• Chosen before the start of data collection

The rate of occurrence of the outcome event
will affect the power of the study and the length
of time it is required to run (see Chapter 9).
Trials involving a laboratory measure or some
other surrogate outcome usually involve fewer
patients and take less time to complete than
those using death or some other nonfatal clinical
event as the outcome, but these economies are
purchased at the expense of medical relevancy.
The implications of a trial with a clinical event
as the outcome will, as a rule, be easier to understand
than one in which clinical relevance must
be inferred by relying on the presumed relationship
of a surrogate outcome and the clinical
condition of interest.
It is not uncommon for trials to provide data
on a number of secondary outcome measures as
well. This is almost always the case in a trial in
which mortality serves as the primary outcome.
For example, the CDP collected data on the
occurrence of myocardial infarctions and a
series of other nonfatal events in addition to
data on deaths (Coronary Drug Project Research
Group, 1973a).
Investigators may design the trial to detect a
specified treatment difference using a combination
of events. Use of composite events will increase
the expected event rates and hence may
reduce the required size of the trial (see Chapter 9).
However, the practice is ill advised because
of the potential for confusion when interpreting
results based on composite measures.

8.4 PRINCIPLES OF
ESTABLISHING COMPARABLE
STUDY GROUPS

The baseline characteristics of the test- and
control-treated groups must be more or less similar
in order to provide a valid basis for comparison.
This need was recognized by Lind in his famous
scurvy experiment. He wrote:

On the 20th of May 1747, I took twelve
patients in the scurvy, on board the Salisbury
at sea. Their cases were as similar as I
could have them. They all in general had
putrid gums, the spots and lassitude, with
weakness of their knees. They lay together
in one place, being a proper apartment for
the sick in the fore-hold; and had one diet
common to all. . . . (Lind, 1753)

The ideal experimental model for comparing
two treatments is one in which the baseline
characteristics of the two study groups are identical
in all aspects. This requires a homogeneous
group of patients who are arbitrarily assigned to
the test and control treatments. An alternative
design involves enrolling pairs of patients into
the trial, with each pair matched on all important
baseline characteristics, and where one
member of the pair is assigned to the test treatment
and the other to the control treatment.
However, matching is not practical. The number
of patients that must be screened to find suitable
matches is usually unacceptably large, to say
nothing of the time required to achieve even a
modest recruitment goal.
Usually the focus is on the recruitment of
patients one by one, with no attempt to match.
The comparability of the study groups for a few
key baseline characteristics may be assured by
first classifying patients into subgroups defined
by those characteristics, and then assigning
members of each subgroup to the test or control
treatment in the same proportion as for all other
subgroups. However, this approach, referred to
as stratification and discussed in Chapter 10, at
best can control the distribution of only a few
variables.
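
The stratified assignment idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any trial cited here: the stratum labels, the block of two test (T) and two control (C) assignments, the seed value, and the function names `blocked_schedule` and `stratified_schedules` are all hypothetical. Shuffled blocks are used simply as one way of keeping the test-control proportion the same in every subgroup.

```python
import random
from collections import Counter

def blocked_schedule(n_blocks, block=("T", "T", "C", "C"), seed=0):
    """Build an assignment list from independently shuffled blocks."""
    rng = random.Random(seed)  # fixed seed: the same sequence on every run
    schedule = []
    for _ in range(n_blocks):
        b = list(block)
        rng.shuffle(b)         # order within the block is random
        schedule.extend(b)     # each block keeps the 1:1 test-control ratio
    return schedule

def stratified_schedules(strata, n_blocks, seed=1986):
    """One schedule per stratum, each generated from its own integer seed."""
    return {s: blocked_schedule(n_blocks, seed=seed + i)
            for i, s in enumerate(strata)}

# Hypothetical strata defined by a single baseline characteristic.
schedules = stratified_schedules(["age<50", "age>=50"], n_blocks=5)
for stratum, sched in schedules.items():
    print(stratum, "".join(sched), Counter(sched))
```

Each stratum ends with equal numbers of test and control assignments, and rerunning the script regenerates the same lists. A fixed block size is a simplification made for brevity; balance at intervals known to clinic personnel can make some assignments predictable.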
The need for comparability can be partially
satisfied by appropriate patient selection. The
eligibility and exclusion criteria in most trials are
designed to reduce the variability of the study
populations by placing restrictions on the type of
patients that may be enrolled. However, the desire
for patient homogeneity and the resultant
improvement in study precision must be balanced
against reduced opportunities for generalizations
when a highly homogeneous population
is studied.
Once an eligible patient has agreed to be enrolled,
it is imperative that the treatment assignment
be made free of influence from both the
patient and clinic personnel so as to avoid selection
biases in the way study groups are formed.
The general conditions that should be satisfied
in order to have a sound allocation scheme are
outlined in Table 8-3.
Any system in which the study physician has
access to a patient’s treatment assignment before
enrollment is open to suspicion and violates the
first requirement listed in Table 8-3. This is the


main problem with allocation procedures based
on characteristics associated with patients, such
as birth dates or Social Security numbers. Odd-even
schemes, for example, in which patients
seen on odd-numbered days receive one treatment
and those seen on even-numbered days
receive the other treatment, are unsatisfactory
for the same reason. (See Wright et al., 1954, for
example.) Schemes of this sort are open to challenge
and are almost always impossible to defend.
Systematic schemes in which every other patient
is assigned to the test treatment violate the
second requirement listed in Table 8-3. Even
random allocation schemes can violate this requirement
if the assignments are balanced at
intervals known to clinic personnel (e.g., after
every second allocation in a study involving only
two study treatments). Several of the papers reviewed
in Chapter 2 described or alluded to systematic
nonrandom allocation schemes that appeared
not to meet the second requirement (e.g.,
deAlmeida et al., 1980; Marks et al., 1980; Milman
et al., 1980; Scott et al., 1980). However,
there was not sufficient information in most of
the papers to make a reliable judgment as to the
soundness of the allocation process.
The third requirement, that the sequence of
assignments be reproducible, is violated by any
scheme that does not generate the same sequence
of assignments when replicated. Coin flips are
unsatisfactory for this reason, among others.
Schemes in which individual assignments are
contained in sealed envelopes at the clinics are
preferable to schemes described above. However,
they are subject to manipulation as well if
they fail to satisfy the first requirement listed in
Table 8-3 (see Carleton et al., 1960, for example).
Precautions must be taken to make certain
that the envelopes are used in the order provided
and that their contents remain unknown to
clinic personnel until they are used.

Table 8-3 Requirements of a sound treatment allocation scheme

• Assignment remains masked to the patient, physician,
and all other clinic personnel until it is needed for
initiation of treatment
• Future assignments cannot be predicted from past
assignments
• The order of allocations is reproducible
• Methods for generation and administration of the
schedule are documented
• The process used for generation has known mathematical
properties
• The process provides a clear audit trail
• Departures from the established sequence of assignments
can be detected

The assignment process should have known
mathematical properties. A major shortcoming
of most informal methods of assignment, such as
the odd-even scheme described above, is the
absence of a mathematical base. The process should
also provide a clear audit trail and should be
constructed and administered in such a way that
departures from the established procedure can
be detected.
The accepted standard for creating treatment
groups is randomization. Unfortunately, there is
still a good deal of misunderstanding regarding
the reasons for randomizing. While the process
does provide a basis for certain types of statistical
analyses (Pitman, 1937), it is far more useful
as a method of making bias-free treatment
assignments. The term random is often misused in
medical circles by investigators who equate
haphazard and random processes (as in referring
to a random blood sugar determination when
really meaning a haphazard one, or in characterizing
a group of arbitrarily selected individuals
as a random sample). It should be reserved in
research settings for processes that satisfy the
definition stated in the Glossary.
Chapter 10 provides a discussion of methods
for administering the treatment allocation schedule.
It also contains a discussion of issues to be
considered when the randomization schedule is
constructed, including those related to stratification
and blocking.

8.5 PRINCIPLES OF MASKING
AND BIAS CONTROL
The aim of any trial should be to collect data
that are free of bias, especially treatment-related
bias (see Glossary for definition). The latter type
of bias is of particular concern since it has the
potential for obscuring a treatment difference or
creating the impression that one exists when in
fact it does not. The usual procedure used to
protect against treatment-related bias is masking.
The term masked,¹ when used throughout this
book, refers to a condition in which the treatment
assignment, or some other item of information,
is withheld from some individual or
group of individuals in the study as a means of
improving the objectivity of the treatment, data
collection, reporting, or analysis processes. It
is conventional to refer to trials as unblinded,
single-blinded, or double-blinded (unmasked,
single-masked, or double-masked in this book).
The terms serve as descriptors of the method of
treatment administration. For example, a
double-masked trial is one in which neither the
patient nor the physician responsible for treatment
is informed of the patient’s treatment assignment,
a single-masked trial is one in which the
patient is not informed of the treatment assignment
but the treating physician is, and an unmasked
trial is one in which both the patient and
the physician are informed of the treatment
assignment. Technically, the term single-masked
can be used to characterize a trial in which
either the patient or physician is unaware of the
treatment assignment; however, it usually refers
to designs in which the patient is masked and the
physician is not.

1. Used in this book instead of blinded because it is a
more apt description of the process involved. Further, the
latter term, as in double-blinded trial, leads to confusion in
some settings, such as in vision trials where the outcome is
blindness.

The logistics of masking are not simple. They
are discussed in Chapter 10 in relation to bottling
and dispensing drugs, and briefly in Chapter H
in relation to data collection.
Among the randomized trials listed in the
NIH Inventory of Clinical Trials (see Chapter 2),
the majority were unmasked (388, or 66%).
Another 12% were single-masked,
and 22% were double-masked. These results
stand in marked contrast to published reports,
as summarized in Chapter 2. Of the 113 trials
reviewed, 76 (67%) were reported to be
double-masked.
Reports of trials that are single- or double-masked
should contain information on the effectiveness
of the mask. The information is useful
in assessing the possibility of bias in the study.
However, only a few groups have addressed this
issue (e.g., Beta Blocker Heart Attack Trial Research
Group, 1981; Howard et al., 1981).
Treatment-specific side effects can reduce the
effectiveness of the masking. This was the case
with the estrogen treatments in the CDP. A total
of …% of patients assigned to the high dose of
estrogen and 45% of patients assigned to the low
dose of estrogen complained of decreased libido,
as contrasted with only 1.5% of the placebo-treated
patients (Coronary Drug Project Research
Group, 1970a).
The principle of masking is general and applies
whenever it is practical to withhold information
that, if known, may influence the way in
which data are collected or how the treatments
are administered. Table 8-4 lists suggested masking
guidelines. Masked data collection is especially
important in trials involving outcome measures
that are subject to measurement or
ascertainment errors.
A double-masked trial, as defined above, is
characterized by masked data collection. However,
even single-masked or unmasked trials may
be designed so that data collection is done in
a masked fashion via structures in which treatment
information is withheld from personnel
responsible for data collection. The structures
require one set of personnel to administer the
study treatments and another to collect the data
needed for assessing the study treatments.
Laboratory tests should be performed, recorded,
and reported by personnel who are
masked to treatment assignment, regardless of
the level of treatment masking in the rest of the
trial. The only exceptions are cases in which
treatment assignment is needed to determine the
tests to be performed. Likewise, records such as
ECGs, fundus photographs, and x-rays should
be read by individuals who are masked to treatment
assignment. The same is true of personnel
responsible for coding or classifying outcome
events.
Ideally, all keying, editing, and data analysis
activities in the data center should be performed
by personnel who are masked. This standard is
not easy to achieve because of the obvious practical
problems involved in maintaining the mask.
However, it is important when it cannot be achieved
to make certain that decisions regarding
the way in which data are keyed or used for
analysis purposes are made without regard to
treatment assignment or observed treatment
differences.
The principle of masking has been extended
to treatment monitoring committees as well.
Treatment monitoring reports presented in the
Diabetic Retinopathy Study (DRS) were
masked with regard to treatment group even
though the trial itself was unmasked. However,
in this case the masking was subsequently abandoned
because of the logistical difficulty involved
in producing the monitoring reports and
because of its limited usefulness (Knatterud,
1977).

Table 8-4 Masking guidelines

• Use a treatment allocation scheme that meets the masking
criteria listed in Table 8-3 (i.e., the treatment
assignment for a patient cannot be determined in advance
of enrollment)
• Administer treatments with the highest level of masking
feasible (e.g., double-masked if possible; single-masked
if double masking is impossible; unmasked
only if any level of masking is out of the question)
• Require, when possible, that essential data collection,
measurement, reading, and classification procedures
on individual patients be made by persons who have
no knowledge of treatment assignment or course of
treatment
• Require, when possible, that outcome measurements
that are subject to interpretation errors (e.g., measurements
requiring a subjective evaluation) be made by
personnel who are masked to treatment assignment
• Do not require masked treatment administration if
doing so requires study patients to assume measurable
risks in order to achieve or maintain the masking

9. Sample size and power estimates

A difference to be a difference must make a difference.
Source unknown

9.1 Sequential versus fixed sample size designs
9.2 Sample size and power calculations as planning guides
9.3 Specifications for sample size calculations
9.3.1 Number of treatment groups
9.3.2 Outcome measure
9.3.3 Follow-up period
9.3.4 Alternative treatment hypothesis
9.3.5 Detectable treatment difference
9.3.5.1 Binary outcome measures
9.3.5.2 Continuous outcome measures
9.3.6 Error protection
9.3.7 Choice of allocation ratio
9.3.8 Losses to follow-up
9.3.9 Losses due to treatment noncompliance
9.3.10 Treatment lag time
9.3.11 Stratification for control of baseline risk factors
9.3.12 Degree of type I and II error protection for multiple comparisons
9.3.13 Degree of type I and II error protection for multiple looks for safety monitoring
9.3.14 Degree of type I and II error protection for multiple outcomes
9.4 Sample size formulas
9.4.1 Binary outcome measures
9.4.1.1 Fisher’s exact test
9.4.1.2 Chi-square approximation
9.4.1.3 Inverse sine transform approximation
9.4.1.4 Poisson approximation
9.4.2 Continuous outcome measures
9.4.2.1 Normal approximation for two independent means
9.4.2.2 Normal approximation for mean changes from baseline
9.5 Power formulas
9.5.1 Binary outcome measures
9.5.1.1 Fisher’s exact test
9.5.1.2 Chi-square approximation
9.5.1.3 Inverse sine transform approximation
9.5.1.4 Poisson approximation
9.5.2 Continuous outcome measures
9.5.2.1 Normal approximation for comparison of two independent means
9.5.2.2 Normal approximation for mean changes from baseline
9.6 Sample size and power calculation illustrations
9.6.1 Illustration 1: Sample size calculation using chi-square and inverse sine transform approximation
9.6.2 Illustration 2: Sample size calculation using Poisson approximation
9.6.3 Illustration 3: Sample size calculation using Coronary Drug Project design specifications
9.6.4 Illustration 4: Sample size calculation for blood pressure change
9.6.5 Illustration 5: Sample size calculation using Fisher’s exact test
9.6.6 Illustration 6: Power calculation based on chi-square and inverse sine transform approximation
9.6.7 Illustration 7: Power for design specifications given in Illustration 2 for 1500 patients per treatment group
9.6.8 Illustration 8: Power for design specifications given in Illustration 4 for 150 patients per treatment group
9.7 Posterior sample size and power assessments

Table 9-1 Illustration of a sample size presentation, $\alpha = 0.01$ (two-tailed), $\beta = 0.05$, and $\lambda = 1$
Table 9-2 Illustration of a power presentation, given a sample size of 800, $\alpha = 0.01$ (two-tailed), and $\lambda = 1$
Table 9-3 Design specifications affecting sample size considerations
Table 9-4 Sample size and power calculation summary for Sections 9.4 and 9.5
Table 9-5 Z values for $N(0,1)$ distribution for selected error levels
Table 9-6 Values of $\Phi(A)$, the proportion of area of a $N(0,1)$ distribution lying to the left of a designated point $A$, for selected values of $A$

Figure 9-1 Schematic illustration of boundaries for open sequential design
Figure 9-2 Schematic illustration of boundaries for closed sequential design

9.1 SEQUENTIAL VERSUS FIXED
SAMPLE SIZE DESIGNS

This chapter deals with sample size and power
estimates for fixed sample size designs. All of the
trials sketched in Appendix B are of this type.
Strictly speaking, a fixed sample size design is
one in which the investigator specifies the required
sample size before starting the trial. The
specification may be based on a formal sample
size calculation or on practical considerations
related to cost, patient availability, or other factors.
The investigator then proceeds to enroll the
number of patients specified, unless there are
extenuating circumstances to the contrary (e.g.,
the specified number cannot be recruited as
planned or recruitment has to be stopped because
of adverse or beneficial treatment effects).
In practice, the sample size may not be set until
after the trial is started or may never be formally
set in some cases. In other cases, it may be
merely implied by other conditions, such as the
amount of time allowed for patient recruitment.
The approach is quite different with sequential
designs. A classical open sequential design
provides for continued patient enrollment until
the observed test-control treatment difference
exceeds a predefined boundary value (see
Figure 9-1). The simplest application of this design
is the enrollment of patients in pairs. One
member of each pair is assigned to the test treatment
and the other member is assigned to the
control treatment. The decision as to whether to
enroll the next pair of patients is based on outcomes
observed for patients already enrolled.
The pair of patients is enrolled if the cumulative
test-control difference for all previously enrolled
pairs of patients is still within the defined
boundaries. The pair is not enrolled if one of the two
boundaries is exceeded.
The expected sample size, given a specified
type I and II error level, is smaller for a sequential
design than for its fixed sample size counterpart
(Armitage, 1975). However, the number of
patients required in any given replication may
exceed the number required with a fixed sample
size design. In fact, there is a chance, albeit
infinitesimally small, that the treatment difference
will remain within the defined boundaries
no matter how many pairs of patients are
enrolled. This possibility is eliminated by imposing
a limit on the number of patients that may be
enrolled, as illustrated in Figure 9-2. Closed
sequential designs (so named because of the limit
placed on the number of patients that may be
enrolled) are preferred to open sequential designs
in most medical settings because they allow
the investigator to stop the trial if the study
treatments appear to be of about equal value.
The initial work on open sequential designs
was done by Wald (1947). The closed modifications
come from work by Bross (1952) and
Armitage (1957). A book by Armitage (1975)
focuses on applications of closed designs to medical
trials. (See Grant, 1962, and Snell and Armitage,
1957, for examples of the two types of
sequential designs.)

Figure 9-1 Schematic illustration of boundaries for open sequential design.
[Figure not reproduced. Horizontal axis: number of paired preferences. Labeled elements: boundary A (test treatment acceptance region), boundary B (test treatment rejection region), boundary C (region of no difference), and the region of indecision.]
Note: Trial continues until observed number of preferences (ignoring ties) crosses a boundary line. The test treatment
is considered superior to the control treatment if boundary line A is crossed, inferior to the control treatment if
boundary B is crossed, and equal to the control treatment if boundary C is crossed. The C boundary lines are deleted
in trials designed to continue until the test treatment is declared superior or inferior to the control treatment.

Figure 9-2 Schematic illustration of boundaries for closed sequential design.
[Figure not reproduced. Horizontal axis: number of paired preferences. Labeled elements: boundary A (test treatment acceptance region), boundary B (test treatment rejection region), boundary C (region of no difference), and the region of indecision.]
Note: Trial continues until observed number of preferences (ignoring ties) crosses a boundary line. The test treatment
is considered superior to the control treatment if boundary line A is crossed, inferior to the control treatment if
boundary line B is crossed, and equal to the control treatment if boundary line C is crossed.

There are sequential aspects to any trial, even
one using a fixed sample size design. Patients
in both types of designs are typically enrolled
over time. The temporal nature of the enrollment
process leads to a gradual accumulation of
outcome data for use in making treatment
comparisons. As noted above, a new comparison is
made after each pair of patients is enrolled in the
classical sequential design. The results of the
comparison are used to decide whether to stop
patient enrollment. The decision-making process
is more complicated in the typical fixed sample
size design, at least for the class of trials discussed in
this book. An investigator must not only decide
whether it is appropriate to continue patient
enrollment up to the limit set, but also whether it is
appropriate to continue the trial after enrollment
is completed. The investigator should stop the trial once
it becomes clear that the test treatment is superior
or inferior to the control treatment, regardless
of whether patients are still being enrolled
(see Chapter 20 for further discussion).
The use of sequential designs is limited to
situations in which outcome assessment can be
made shortly after patients are enrolled in the
trial. They are not practical where long periods
of follow-up are required to accumulate sufficient
outcome data to make reliable treatment
comparisons. The usual approach in such cases
is to use a fixed sample size design. This approach,
as discussed herein, utilizes a frequentist
analysis philosophy, a philosophy based on
work of Neyman and Pearson (1966) and one
that is widely used in biostatistics for analysis of
medical research. Other analysis philosophies include
those built on the likelihood principle and
on Bayes’ theorem. Plackett (1966) has reviewed
all three philosophies. The frequentist approach
is reviewed by Armitage (1963) and by Armitage
and co-workers (1969). The likelihood approach
is reviewed by Anscombe (1963). The Bayesian
approach is reviewed by Colton (1963) and Cornfield
(1966a).
Sample size and power estimates for fixed
sample size designs are discussed by a number of
authors, including Cochran and Cox (1957),
Cox and Hinkley (1974), Fleiss (1981), Lachin
(1981), Schlesselman (1982), and Snedecor and
Cochran (1967). Readers may refer to these references
or to other basic statistics texts for details
not covered in this chapter.
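
The pair-by-pair stopping rule of the sequential designs described in this section can be sketched as a short Python routine. It is a toy illustration only: the flat boundaries at plus or minus `bound`, the cap `max_pairs` standing in for the closed-design limit, and the function name `run_sequential` are assumptions made for this sketch. Published sequential plans use boundaries derived from the chosen type I and II error levels, which are not reproduced here.

```python
import random

def run_sequential(prefs, bound=10, max_pairs=None):
    """Step through paired preferences and report the boundary crossed.

    prefs: iterable of +1 (test preferred), -1 (control preferred), 0 (tie).
    bound: hypothetical flat boundary on the excess of test preferences.
    max_pairs: limit on untied pairs (closed design); None = open design.
    """
    excess = 0  # test preferences minus control preferences
    n = 0       # untied pairs observed so far
    for p in prefs:
        if p == 0:
            continue                      # ties are ignored
        n += 1
        excess += p
        if excess >= bound:
            return "A: test superior", n
        if excess <= -bound:
            return "B: test inferior", n
        if max_pairs is not None and n >= max_pairs:
            return "C: no difference", n  # closed design: enrollment limit reached
    return "no boundary crossed", n

# Simulated preferences; the 0.7 chance of preferring the test treatment is arbitrary.
rng = random.Random(7)
prefs = [1 if rng.random() < 0.7 else -1 for _ in range(200)]
print(run_sequential(prefs, bound=10, max_pairs=60))
```

With `max_pairs=None` the routine mimics an open design and can run through all available pairs without reaching a decision, which is the possibility the closed design removes.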


9.2 SAMPLE SIZE AND
POWER CALCULATIONS AS
PLANNING GUIDES
It is unwise to undertake a fixed sample size trial
without a calculation to determine the number
of patients required or the power available with
a specified sample size. With a sample size
calculation, the investigator sets out to determine the
number of patients required to detect a designated
treatment difference with specified levels
of type I and II error protection. With a power
(see Glossary) calculation, the investigator determines
the power associated with a specified treatment
difference, given a specified sample size.
Either one of these calculations may lead to
subsequent design modifications. The modifications
may include expansion from a single center
to multiple centers to increase the number of
patients available for study, changes in the patient
admission criteria to make recruitment easier,
or abandonment of the trial.
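
As a rough illustration of a sample size calculation, the sketch below uses a common normal approximation for comparing two proportions with equal allocation, $n = (z_{\alpha/2} + z_{\beta})^2 \, 2\bar{p}(1-\bar{p}) / (P_c - P_t)^2$ per group, where $\bar{p}$ is the mean of the two event rates. This formula is an assumption chosen for illustration and is not necessarily one of the formulas of Section 9.4. The function name and default arguments are likewise hypothetical; the defaults follow the error levels quoted for Table 9-1.

```python
from statistics import NormalDist

def sample_size_per_group(p_c, rel_reduction, alpha=0.01, beta=0.05):
    """Approximate patients per group to detect a relative reduction in event rate.

    p_c: control event rate; rel_reduction: (P_c - P_t) / P_c.
    alpha is two-tailed; equal allocation to test and control is assumed.
    """
    p_t = p_c * (1 - rel_reduction)       # event rate under the test treatment
    p_bar = (p_c + p_t) / 2               # average event rate
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(1 - beta)
    return z ** 2 * 2 * p_bar * (1 - p_bar) / (p_c - p_t) ** 2

for p_c in (0.10, 0.15, 0.20, 0.30):
    row = [round(sample_size_per_group(p_c, r)) for r in (0.125, 0.250)]
    print(p_c, row)
```

Under these assumptions the results appear to fall within a few patients of the entries shown in Table 9-1, which suggests the table was built from a similar approximation.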
The archives of clinical trials are cluttered
with inconsequential trials. Such trials are, in
one sense, unethical in that they require patients
to accept the risks of treatment, however small,
without any chance of benefit to them or future
patients. Small-scale preliminary investigations
may be justified when part of a larger plan, but
not as an end in their own right.
The absence of a planned approach to study
design is evident from a review of the published
literature, as discussed in Chapter 2. Few of the
trials cited there show any evidence of having
involved sample size or power calculations (see
also Freiman et al., 1978, and Mosteller et al.,
1980).
The design documents prepared when the trial
is planned should indicate the recruitment goal
for the trial and how it was determined. If the
goal was the result of a sample size calculation,
the details of that calculation should be provided.
If it was set by practical considerations,
such as cost or the presumed availability of patients,
it should be accompanied by appropriate
calculations to indicate the power that can be
expected with the proposed number of patients.

In either case, the calculations, such as shown in
Tables 9-1 and 9-2, should indicate how the
trial is affected if the control event rate used in
the sample size calculation proves to be wrong,
or how power changes as a function of sample
size.
The main thrust of the discussion in this chapter
relates to the use of sample size and power
estimates in planning the trial. However, as
noted in Section 9.7, the same methods are used
for sample size adjustments during the trial or
for posterior power calculations at the end of the
trial.
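
The sensitivity checks recommended in this section (how power responds to the control event rate and to sample size) can be sketched with the same normal approximation for two proportions. Everything here is an illustrative assumption rather than the formula of Section 9.5: the function name `power`, the reading of a sample size of 800 as 800 patients per treatment group, and the omission of the negligible contribution from the opposite tail.

```python
from statistics import NormalDist

def power(p_c, rel_reduction, n_per_group, alpha=0.01):
    """Approximate power of a two-tailed test comparing two proportions."""
    p_t = p_c * (1 - rel_reduction)                   # test-treatment event rate
    p_bar = (p_c + p_t) / 2
    se = (2 * p_bar * (1 - p_bar) / n_per_group) ** 0.5
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((p_c - p_t) / se - z_alpha)

# Power as a function of the control event rate, 800 patients per group.
for p_c in (0.10, 0.15, 0.20, 0.30):
    print(p_c, round(power(p_c, 0.125, 800), 3))

# Power as a function of sample size for a fixed design.
for n in (400, 800, 1600, 3200):
    print(n, round(power(0.10, 0.250, n), 3))
```

A display of this kind makes it plain how quickly power erodes when the control event rate turns out lower than planned, which is the point of presenting a range of rates rather than a single calculation.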

9.3 SPECIFICATIONS FOR SAMPLE
SIZE CALCULATIONS
A determination of the required sample size cannot
be undertaken until the basic design features
of the trial, such as outlined in Table 9-3, have
been set. It may take months and a good deal
of interaction between investigators, especially
between the physicians and biostatisticians, to
reach agreement on the specifications.
The subsections that follow detail the considerations
that go into setting the specifications
and provide discussion of the ways in which they
influence sample size requirements. Most of the
same points pertain to power calculations as
well.

Table 9-2 Illustration of a power presentation, given a
sample size of 800, $\alpha = 0.01$ (two-tailed), and $\lambda = 1$

Control              $(P_c - P_t)/P_c$
event rate, $P_c$    0.125    0.250    0.375
0.10                 0.043    0.209    0.432
0.15                 0.668    0.359    0.192
0.20                 0.968    0.476    0.067
0.30                 0.181    0.201    0.004

9.3.1 Number of treatment groups

The considerations involved in reaching a decision
on the type and number of study treatments
have been discussed in Chapter 8. The sample
size formulations presented in Section 9.4 are
for the case of a trial involving one test and one
control treatment. However, they can be used
for trials with any number of test treatments,
as long as investigators plan to allocate the same
number of patients to each of the test treatments.
The total sample size, $N$, for a trial with
one control treatment and uniform allocation
to the $r$ test treatments is:

$$N = r\,n_t + n_c \tag{9.1}$$

where
$r$ = number of test treatments
$n_t$ = sample size required for each of the $r$ test
treatments
and
$n_c$ = sample size for the control treatment

For the purposes of calculation it is necessary
to specify the test-control allocation ratio,

$$\lambda = n_t / n_c \tag{9.2}$$

It is simply the number of patients to be assigned
to a test treatment divided by the number to be
assigned to the control treatment. This quantity
is fixed by the study investigators and is
generally the same for each of the $r$ test treatments
in the trial (see Section 9.3.7 for factors
determining the choice of $\lambda$).
The usual approach is to calculate the sample
size requirement for $n_t$ using the formulas given
in Section 9.4 and then to derive the value of $n_c$
from Equation 9.2 by noting that $n_c = n_t/\lambda$. For
example, $r = 3$, $\lambda = 0.5$, and a calculated
value of 50 for $n_t$ yield an $n_c$ of 50/0.5 = 100.
The total sample size is 250, as derived from
Equation 9.1.
The sample size is not given by Equation 9.1 if
the trial involves more than one control treatment.
The simplest approach in this case is to
use just one of the control treatments for the
sample size calculation, ideally the one that
provides the best basis for assessing the test treatments.
The value for $n_c$, as derived from Equation
9.2, would be used for each control treatment
and the total sample size would be
$N = r\,n_t + s\,n_c$, where $s$ is the number of control
treatment groups. An alternative approach, if a
minimal level of type II error protection is desired
for comparisons of a test treatment with
any one of the control treatments, involves making
a sample size calculation for each of the
different control treatments and then using the
largest value of $N$ obtained to plan the trial.

Design specifications affecting sample size

• Number of treatment groups to be studied
• Outcome measure
• Anticipated length of patient follow-up
• Alternative treatment hypothesis
• Detectable treatment difference
• Desired type I and II error protection
• Allocation ratio
• Anticipated rate of loss to follow-up
• Anticipated treatment noncompliance rate
• Anticipated treatment lag time
• Degree of stratification for baseline risk factors
• Level of type I and II error adjustment for multiple comparisons
• Level of type I and II error adjustment for multiple looks
• Level of type I and II error adjustment for multiple outcomes

Table 9-1 Illustration of a sample size presentation: α = 0.01 (two-tailed), β = 0.05, and λ = 1

Control               (P_c − P_t)/P_c
event rate, P_c       0.125       0.250

0.10                  19,373      4,549
0.15                  12,246      2,886
0.20                   8,682      2,055
0.30                   5,119      1,223

9.3.2

Outcome measure¹

Sample size and power formulations are given in this chapter for binary as well as continuous outcome measures. However, the main emphasis is on binary outcomes (see Glossary) because of the class of trials considered in this book, as outlined in Chapter 1. Trials with binary outcomes are characterized by data collection schemes in which patients may be classified at any point after enrollment as either having or not yet having experienced the event of interest. The event may be a desired or undesired outcome depending on the trial and patients selected for study. It will be desired (positive) in trials in which patients are watched for disappearance or amelioration of some medical condition. It will be undesired (negative) when they are watched for the occurrence of death or some morbid event. All discussion and calculations in this chapter are for negative events.
For the purposes of sample size or power calculations, the investigator may decide to alter the form of the outcome measure when the underlying measure is polychotomous or continuous. The choice should be dictated by the anticipated analysis requirements at the end of the trial. The calculations should be made using the unaltered underlying measure if the aim of the trial is to assess distributional changes in the measure over the time course of the trial. They should be made using a binary event measure, constructed from a dichotomization of the underlying measure, if observed values of the measure, at or above a specified level, take on special medical or operational significance (e.g., as in the case with blood pressures over a defined level used to diagnose hypertension and signal the need to initiate treatment).
The decision as to whether to design the trial to detect a mean change in some continuous measure or a difference in some event rate can have major implications on how the trial is perceived when it is finished. It is one thing to conclude that there is a significant difference in mean diastolic blood pressure between the study treatment groups, and quite another to conclude there is a significant difference in the rate of development of hypertension in the two groups. The latter statement has far greater clinical relevance than the former.

1. See Chapter 8 for additional discussion of factors influencing the choice of the primary outcome measure.

9.3.3

Follow-up period

Sample size and power calculations require specification of the follow-up period. Generally, the longer the period, the higher the accumulated event rate and the smaller the required sample size for a given type I and II error level.
The specification used for planning purposes may be modified as the trial proceeds. For example, the follow-up period may be extended to compensate for a shortfall in patient recruitment or for a lower-than-anticipated control event rate. Or it may have to be curtailed because of funding or other problems.

9.3.4

Alternative treatment hypothesis

The calculations in this chapter are always made under the null hypothesis of no treatment effect versus a specified alternative. The alternative will be constructed to cover treatment effects of a specified size that are either beneficial or adverse (two-sided alternative), or that are only beneficial (one-sided alternative). The decision as to whether to use a one- or two-sided alternative depends on the clinical importance of a positive versus a negative treatment effect and how much is known about the safety of the test treatment when the trial is planned.
Trials of the sort considered in this book are done to establish the efficacy of a treatment, not toxicity. This fact argues for use of a one-sided alternative in the calculation, even though practice seems to favor use of two-sided alternatives. The reason has to do with the amount of prior evidence investigators have regarding treatment safety. They may prefer a two-sided alternative simply as a means of documenting their own uncertainty regarding the potential merits of the test treatment. A side benefit of the practice is that it leads to a larger sample size than the same α and β using a one-sided test. The increased sample size represents a hedge against unanticipated losses, such as those due to lack of compliance to the treatment protocol during the trial.

9.3.5

Detectable treatment difference

The experimenter is required to specify the minimum treatment difference he wishes to detect under the alternative hypothesis. The larger the difference, the smaller the required sample size. The difference chosen should be realistic. A 50% reduction in the test treatment event rate, while of unquestioned clinical relevance if achieved, is unlikely in real life. Only miracle treatments produce reductions of this size, and there are few such treatments around and even fewer that require discovery via a clinical trial. Generally, the gains with most new treatments are much more modest. Certainly a reduction in the event rate does not have to be enormous to be important. Small reductions, in the range of [...] to 10%, can have major public health implications if they apply to death or some other serious nonfatal event associated with a common disease.

9.3.5.1

Binary outcome measures

Specification of the difference, in this case, requires the experimenter to designate a value for both P_c and P_t, or for P_c and the percentage reduction in P_c to be achieved with the test treatment, where

P_c = anticipated event rate for the control-treated group
and
P_t = anticipated event rate for the test-treated group

The minimal detectable difference expressed in absolute terms is:

\Delta_A = P_c - P_t    (9.3)

Expressed in relative terms, it is:

\Delta_R = \frac{P_c - P_t}{P_c}    (9.4)

Although either form is acceptable, many investigators prefer to express the difference in relative terms using Δ_R, even though sample size and power formulas are conventionally expressed in terms of Δ_A. Equation 9.5 can be used to convert from a relative to an absolute difference:

\Delta_A = P_c \Delta_R    (9.5)

Ideally, the value chosen for P_c should be derived from follow-up studies of patients similar to those to be enrolled in the trial and who received treatment similar to that planned for the control-treated group. Unfortunately, follow-up data such as these are usually not available. Hence, the experimenter may have to rely on an educated guess for P_c. The value chosen may turn out to be higher or lower than the one actually observed in the trial. Selection of a value for P_c that is lower than the one subsequently observed means the trial was larger than it needed to be to detect a given relative difference, which is not a serious problem unless the underestimation resulted in a significant increase in the cost and time needed to carry out the trial. (The reverse is true if P_c + P_t < 1 and the experimenter is interested in detecting a prespecified absolute difference.) A more serious and common problem stems from overestimation of P_c. The sample size estimate in such cases will be smaller than needed to achieve the required error protection. Overestimation can occur even in instances in which investigators have reasonably reliable information for determining P_c, as in the Coronary Drug Project (CDP). The P_c (5-year mortality rate) used for sample size calculations was 30 per 100 population. The observed rate was only 21 per 100 population (Coronary Drug Project Research Group, 1975).
The tendency to overestimate P_c arises, at least in part, from the failure or inability of the study planners to predict the impact of the proposed patient eligibility criteria on subsequent observed event rates. The exclusion of seriously ill patients from enrollment may well yield a population with a better than expected prognosis.

9.3.5.2

Continuous outcome measures

The difference to be detected in this case is expressed as a function of means, as discussed in Section 9.4.2. The variance estimate² required, like the value of P_c used for binary outcomes, should be based on actual data if at all possible. It is wise to explore the effect of a range of variance estimates on sample size if there is no reliable way of estimating variance before the start of the trial.
9.3.6

Error protection

The choice of α and β (probabilities of type I and II error, respectively; see Glossary for definition) is arbitrary. The first instinct of an inexperienced investigator is to want a trial that precludes any possibility of either type of error, a lofty goal since an infinite number of patients is required to achieve it!
The choice of α and β should depend on the medical and practical consequences of the two kinds of errors. Relatively high error rates (e.g., α = 0.10 and β = 0.2) may be acceptable for preliminary trials that are likely to be replicated, whereas lower rates (e.g., α = 0.01 and β = 0.05) should be used if replication is unlikely.
The consequences of both a type I and II error must be considered. For example, one might choose:

α = β if both the test and control treatments are new, about equal in cost, and there are good reasons to consider them both relatively safe
α > β if there is no established control treatment and the test treatment is relatively inexpensive, easy to apply, and is not known to have any serious side effects
α < β if the control treatment is already widely used and is known to be reasonably safe and effective, whereas the test treatment is new, costly, and produces serious side effects

The most common approach is to set α < β. However, this is only reasonable if the consequences of a type II error are considered to be less than those of a type I error.

2. The need for an independent variance estimate is avoided for binary outcomes. The variance in such cases is a function of the specified event rates.

9.3.7

Choice of allocation ratio

The allocation ratio is ordinarily under the control of the experimenter and is set before patient enrollment is started, except with some forms of adaptive allocation (see Chapter 10). All of the trials sketched in Appendix B involved preset allocation ratios, except one (see item 20.a of Table B-4 in Appendix B). A uniform allocation scheme, in which the probability of assignment to one treatment group is the same as for all other treatment groups, is generally preferred (used in 11 of the 14 trials sketched in Appendix B; see item 20.e of Table B-4). Nonuniform methods of allocation are used when there is a need to concentrate more patients in certain treatment groups to satisfy secondary aims of the trial or to provide increased precision for certain of the treatment comparisons. Investigators in the Persantine Aspirin Reinfarction Study (PARIS) decided to allocate twice as many patients to the aspirin and to the aspirin-persantine treatment groups as to the placebo treatment group. They made this choice because they considered comparison of the aspirin and aspirin-persantine treatment groups more important than comparison of either of these treatment groups with the placebo treatment group (Persantine Aspirin Reinfarction Study Research Group, 1980b).
The CDP allocated 2.5 times as many patients to the placebo treatment as to any one of the five test treatments (Coronary Drug Project Research Group, 1973a).³ The allocation ratio was chosen to minimize the variance for the five test-control comparisons of 5-year mortality. A somewhat lower ratio would have been derived using an approach developed by Dunnett (1955). His method assumes the experimenter wishes to construct confidence intervals about each test-control outcome difference at the end of the trial, such that the risk of a type I error for all comparisons combined is α. This error probability condition is satisfied if √r patients are assigned to the control treatment for every patient assigned to any one of the r test treatments used in the trial. Using this method of calculation, the CDP would have required an allocation ratio of 2.24 instead of the one used.

3. The actual ratio, as derived by methods described in the 1973 CDP publication, was 2.45. It was rounded up to 2.5 to simplify construction of the allocation schedules needed in the trial.

9.3.8

Losses to follow-up

Losses to follow-up are concentrated in dropouts (throughout this book, patients enrolled in the trial who are no longer able or willing to return to the clinic for regular follow-up examinations) who can no longer be followed for the event of interest. A patient who drops out is automatically lost to outcome follow-up unless the event used for outcome assessment can be reliably observed and reported outside the clinic setting.
The loss to follow-up rate used in the sample size calculation is estimated by the experimenter. As with other variables, such as P_c, the value used should be based on relevant experience if at all possible. The value chosen may be zero or very small in cases in which it is possible to continue follow-up of patients for the outcome of interest even if they refuse to return to the clinic for regular follow-up examinations (e.g., as with patients under follow-up for mortality or some other event that can be reliably observed and recorded outside the clinic setting). It may have to be set quite high if long periods of clinic surveillance are required for outcome measurement.
Clearly, any loss of outcome data, regardless of how it occurs, will reduce the statistical precision of the trial, and may introduce bias as well if the losses are differential by treatment group. Hence, the sample size estimate, as given by formulas in Section 9.4, must be increased to compensate for the anticipated loss. This is normally done by multiplying the sample size by the quantity 1/(1 − d), where d is the anticipated loss rate. For example, a d of 10/100 would mean that for every 100 patients enrolled, 10 could not be classified as to presence or absence of the outcome of interest because of the lack of follow-up data. It would require multiplying the calculated sample size by a factor of 1/0.9 = 1.11 to compensate for the losses.
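The loss adjustment is a single multiplication, sketched below. The function names are hypothetical, and rounding the inflated size up to a whole patient is an assumption.

```python
import math


def loss_multiplier(d: float) -> float:
    """Return 1 / (1 - d), the inflation factor for an anticipated loss rate d."""
    if not 0 <= d < 1:
        raise ValueError("loss rate d must satisfy 0 <= d < 1")
    return 1.0 / (1.0 - d)


def inflate_for_losses(n: float, d: float) -> int:
    """Inflate a calculated sample size n to compensate for losses at rate d."""
    return math.ceil(n * loss_multiplier(d))


print(round(loss_multiplier(0.10), 2))  # 1.11, as in the example in the text
print(inflate_for_losses(1000, 0.10))   # 1112
```

The multiplier grows quickly as d rises: a loss rate of 0.30 already requires about 43% more patients.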

9.3.9 Losses due to treatment
noncompliance
The sample size must be increased to compensate for loss of precision due to treatment noncompliance as well. Treatment compliance is rarely an all-or-none phenomenon. The level of compliance achieved may range from low to high depending on the patient. Perfect compliance may be difficult, if not impossible, to achieve, especially in drug trials where the patient is required to take the assigned medication over long periods of time.
There are two aspects to the determination of compliance. One has to do with the amount of exposure the patient has to the assigned treatment, and the other has to do with the amount of exposure to the other study treatments. Underexposure to the assigned treatment may arise from:

• Patient unwillingness to accept the assigned treatment
• Physician unwillingness to administer it
• Patient or physician unwillingness to use the full treatment dosage

Overexposure to the assigned treatment may arise from:

• A mistake by the study physician or patient in the assigned treatment (e.g., as in the case in which a patient takes twice as many pills as required)
• Administration of the same treatment outside the study clinic by the patient's private physician
• Patient self-treatment with medications obtained outside the study clinic (e.g., as with a patient in a myocardial infarction trial who is assigned to aspirin therapy and who takes his medication but who uses his own supply of aspirin for headaches and other ailments as needed)

Exposure to one of the other study treatments may arise in various ways. Examples include:

• A patient who takes a drug outside the trial that is similar to one of the test treatments (e.g., as with a patient assigned to the control treatment in an aspirin trial who uses an over-the-counter cold remedy containing aspirin)
• A patient who demands and receives, midway in the course of the trial, another study treatment in place of, or in addition to, the assigned treatment
• A physician who unwittingly switches a patient from the assigned treatment to another study treatment through a mix-up in prescriptions
• A physician who elects, for medical reasons, to administer another study treatment to a patient in the trial, in addition to, or in place of, the patient's assigned treatment

Any departure from the study treatment protocol, regardless of the nature of the departure, reduces the chances of finding a treatment difference. For example, a patient assigned to the test treatment who refuses the treatment may, in effect, expose himself to the control treatment (e.g., as in the case where the control treatment involves no treatment at all). This reduces the chance of finding a treatment difference, even if the adherence of patients actually assigned to the control treatment is excellent. Conversely, so does the exposure of control-treated patients to the test treatment (e.g., as in a coronary bypass surgery trial where a sizable number of the control-treated patients receive bypass surgery), even if the compliance of patients assigned to the test treatment is excellent.
Loss of precision due to noncompliance is not necessarily related to patient follow-up status. Dropouts, to be sure, automatically become noncompliant if the treatments to which they were assigned are stopped when they drop out. However, as noted above, patients who do not drop out can become noncompliant as well. Further, being a dropout does not necessarily imply a state of noncompliance if the treatment process, as specified by the study protocol, was completed before dropout and the patient is not exposed to any of the study treatments after dropout.
The loss of precision due to noncompliance is compensated for in the same way as losses to follow-up, as discussed in Section 9.3.8. The value for d will be based on the amount of noncompliance anticipated and its role in reducing the precision of the trial. Trials with losses from follow-up and noncompliance will require a composite multiplier to account for both kinds of losses. For example, the CDP used a combined d of 0.30. In actual fact, the losses were due almost exclusively to noncompliance, since it was possible to follow virtually every patient for mortality, the primary outcome measure.
9.3.10

Treatment lag time

Most calculations are made as if there is no treatment lag (i.e., the full effect of the treatment is realized as soon as it is applied). That convention is followed in this chapter. The approach is reasonable with some forms of treatment (e.g., most types of surgery and certain drug treatments), but not for others (e.g., with a drug given to dissolve atherosclerotic plaques). The decision of investigators in the Anturane Reinfarction Trial to ignore deaths that occurred within seven days after the initiation of treatment was based on a presumed treatment lag for Anturane (Anturane Reinfarction Trial Research Group, 1978, 1980; Temple and Pledger, 1980). One reason for ignoring lag times has to do with the mathematical difficulties involved in taking account of them in sample size calculations. Further, there is often no reliable way to estimate lag times.
The impact of lag time on sample size is illustrated below for a trial involving 5 years of follow-up for each patient enrolled, one test and one control treatment, λ = 1, P_c = 0.30, Δ_R = 0.30, α = 0.01 (one-tailed), β = 0.05, and d = 0. The sample sizes recorded are derived from tables developed by Halperin and co-workers (1968). The required sample size, given the above specifications, is 1,474 if the full effect of the treatment is realized as soon as it is administered. It is about twice this size if it takes 2.5 years for the treatment to reach full effectiveness and nearly 20 times as large if the lag time is 10 years.

Lag time        Sample size: n_t + n_c      Sample size ratio*

0                      1,474                      1.00
2 months               1,530                      1.04
6 months               1,656                      1.12

1 year                 1,870                      1.27
2 years                2,444                      1.66
2.5 years              2,828                      1.92

3 years                3,266                      2.22
4 years                4,492                      3.05
5 years                6,536                      4.43

7.5 years             12,136                      8.23
10 years              29,428                     19.96

*Ratio of sample size for indicated lag time relative to size for 0 lag time.
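The ratio column in the lag-time table is simply each total sample size divided by the size for zero lag. The short sketch below recomputes it from the tabulated sizes; the dictionary and function names are hypothetical.

```python
# Total sample sizes (n_t + n_c) by lag time, as read from the table above
LAG_SAMPLE_SIZES = {
    "0": 1474, "2 months": 1530, "6 months": 1656,
    "1 year": 1870, "2 years": 2444, "2.5 years": 2828,
    "3 years": 3266, "4 years": 4492, "5 years": 6536,
    "7.5 years": 12136, "10 years": 29428,
}


def sample_size_ratios(sizes: dict, baseline: str = "0") -> dict:
    """Ratio of each sample size to the size for the baseline lag time."""
    base = sizes[baseline]
    return {lag: round(n / base, 2) for lag, n in sizes.items()}


ratios = sample_size_ratios(LAG_SAMPLE_SIZES)
print(ratios["2.5 years"])  # 1.92, about twice the zero-lag size
print(ratios["10 years"])   # 19.96, nearly 20 times the zero-lag size
```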

9.3.11 Stratification for control of baseline risk factors

The sample size is influenced by the amount of stratification done to control for baseline variation. The issues involved in selection of stratification variables are discussed in Chapter 10. Technically, the sample size calculation should take account of the stratification planned. However, in actuality, most calculations are made ignoring stratification. Doing so can lead to an overestimate of the required sample size if the variables used for stratification represent important risk factors and if the calculated sample size is small (see Section 10.3.2 of Chapter 10).

9.3.12 Degree of type I and II error protection for multiple comparisons

The experimenter must also decide whether the error protection specified is to be for a single treatment comparison or for multiple treatment comparisons (see Glossary and Section 20.4 of Chapter 20). Section 9.3.7 alludes to methods of sample size calculation in which the investigator is interested in r test-control treatment comparisons. However, the need for making multiple comparisons is not limited to such cases. It can be just as great when r = 1 (i.e., where the trial involves only two treatment groups) if the investigator wishes to design the trial to provide a specified level of error protection for treatment comparisons within designated subgroups of patients. One approach in this setting is to calculate sample size estimates for each subgroup of interest. A drawback with it is that it leads to a series of recruitment quotas, one for each subgroup (see Section 14.1 of Chapter 14). An alternative and generally preferable approach is to ignore subgroups in making the sample size calculation and then to estimate the power provided for subgroups of interest. The total sample size may be increased (e.g., by making a new calculation using smaller values for α and β) if the power is considered to be inadequate for one or more of the subgroups of interest.

9.3.13 Degree of type I and II error protection for multiple looks for safety monitoring

The experimenter may plan to look at outcome data at various time points over the course of the trial in conjunction with the safety monitoring process (see Glossary and Chapter [...]). Carrying out multiple looks will alter the type I and type II error levels (see Dupont, 19[...]). Ideally, sample size should be estimated with the need for safety monitoring in mind from the start. However, calculations are routinely made ignoring the need, in part because of difficulties in making the necessary adjustments. The practice is followed here. However, designers of trials should recognize that the error protection provided in such cases will be less than the levels used in making the calculations.

9.3.14 Degree of type I and II error protection for multiple outcomes

A trial, even though planned to focus on a primary outcome, will generate data for a number of secondary outcomes as well (see Glossary for definitions of primary and secondary outcome measures). The usual approach is to base the sample size calculations on the primary outcome of interest and to accept whatever power that calculation yields for the comparisons involving secondary outcome measures.
The only compensation made may be in the choice of α and β. The investigator may choose smaller values than normally used as a means of increasing the precision for the primary as well as secondary comparisons. He may be forced to make calculations for each outcome and then to use the largest size for planning the trial if he is unwilling to designate any of the measures as primary.
9.4

SAMPLE SIZE FORMULAS

Table 9-4 provides a summary of the calculations discussed in this Section and Section 9.5. Tables 9-5 and 9-6 are included for use in making sample size and power calculations. Other more extensive tables of the two functions may be found in many texts on statistics.
The method of analysis implied in the sample size calculation should be identical to that used when the results of the trial are analyzed. However, this is not always possible, as already noted with regard to the need for safety monitoring and the use of secondary outcomes in the analysis process. Technically, there are as many methods of sample size calculation as there are methods of data analysis. The methods presented in this Section are the most common ones.
The methods presented assume that the primary comparison will entail a simple comparison of proportions constructed at the end of the trial (or after a specified period of patient follow-up). Strictly speaking, they are not appropriate if the treatment groups are to be compared using life-table methods. The log rank test is the test of choice in such cases (Gail, 1985). It will yield smaller sample sizes than are obtained with the tests covered herein (i.e., it is more efficient). The difference is small for trials involving rapid patient accrual and low event rates. It is largest for trials involving slow accrual and high event rates.
All of the formulations in this chapter are for one-tailed tests. However, they may be used for two-tailed tests by using α/2 wherever α appears in the formulas cited. Strictly speaking, this substitution should be used only when the allocation ratio, λ, equals 1, since there are disagreements among statisticians as to the validity of the substitution when λ ≠ 1. However, the common practice is to use the substitution even if λ ≠ 1.

Table 9-4 Sample size and power calculation summary for Sections 9.4 and 9.5

Test                                           Sample size        Power              Assumptions                                     Applicability

Binary outcome measure
Fisher's exact test                            Section 9.4.1.1    Section 9.5.1.1    Independent observations                        Applicable over entire event rate range from 0 to 1
Chi-square approximation                       Eqs 9.6, 9.7       Eqs 9.16, 9.17     Independent observations                        P_c and P_t >0.2 but <0.8; n_c P_c, n_c Q_c, n_t P_t, and n_t Q_t all >15
Inverse sine transform approximation           Eqs 9.8, 9.9       Eqs 9.18, 9.19     Independent observations                        P_c and P_t >0.05 but <0.95; n_c P_c, n_c Q_c, n_t P_t, and n_t Q_t all >15
Poisson approximation                          Eqs 9.10, 9.11     Eqs 9.20, 9.21     Independent observations                        Low event rates (e.g., P_c and P_t <0.05; n_c P_c and n_t P_t >10)

Continuous outcome measure
Normal approximation for 2 independent means   Eqs 9.12, 9.13     Eqs 9.22, 9.23     Independent observations; common variance;      n_c and n_t >30
                                                                                     normality
Normal approximation for mean change           Eqs 9.14, 9.15     Eqs 9.24, 9.25     Independent observations between patients;      n_c and n_t >30
                                                                                     common variance; normality

Table 9-5 Z values for N(0, 1) distribution for selected error levels

Error level     One-tailed     Two-tailed

0.500           0.000          0.674
0.400           0.253          0.842
0.300           0.524          1.036

0.200           0.842          1.282
0.100           1.282          1.645
0.050           1.645          1.960

0.025           1.960          2.241
0.010           2.326          2.576
0.005           2.576          2.807
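The Z values needed by the formulas, including the α/2 substitution for two-tailed tests, can be computed rather than read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `z_value` is hypothetical.

```python
from statistics import NormalDist


def z_value(error_level: float, two_tailed: bool = False) -> float:
    """Point on a N(0, 1) curve with the given error level in the upper tail.

    For a two-tailed test the error level is halved, which is the alpha/2
    substitution described in the text.
    """
    tail_area = error_level / 2 if two_tailed else error_level
    return NormalDist().inv_cdf(1 - tail_area)


print(round(z_value(0.05), 3))                   # 1.645 (one-tailed)
print(round(z_value(0.05, two_tailed=True), 3))  # 1.96 (two-tailed)
print(round(z_value(0.01), 3))                   # 2.326 (one-tailed)
```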
9.4.1

Binary outcome measures

9.4.1.1

Fisher's exact test

Fisher's exact test is the test of choice for comparing simple counts or proportions based on binary data (Gart, 1971). The test, unlike others considered in this section, works for samples of any size. It yields an exact p-value for the observed difference and, hence, the name.
Closed form sample size formulas for the test are not available. Required sample sizes must be read from tables (Casagrande et al., 1978; Gail and Gart, 1973; Haseman, 1978) or calculated using computer programs.

9.4.1.2

Chi-square approximation

The standard 2 × 2 chi-square test (without continuity correction) can be used in place of Fisher's exact test if there are 15 or more patients represented in each of the 4 cells of the table (i.e., there are at least 15 patients in each of the 2 treatment groups who have experienced the event and at least 15 others in each of the 2 treatment groups who have not). This rule is somewhat more stringent than the one proposed by Cochran (1954). He proposed a total sample size of 40 and a cell frequency of > 5. Indications are that, even under these border conditions, the test provides a good approximation to the exact test.
The test can be used for sample size estimation if the event rates in both treatment groups are at or between 0.2 and 0.8 and provided the resulting estimates satisfy the above cell condition. Fisher's exact test or one of the other tests discussed in this section should be used if the conditions are not satisfied. The formulas for uniform and nonuniform allocation, derived from the 2 × 2 chi-square test, are as follows:

Uniform allocation (λ = 1)

n_c = \frac{\left( Z_\alpha \sqrt{2 \bar{P} \bar{Q}} + Z_\beta \sqrt{P_c Q_c + P_t Q_t} \right)^2}{\Delta_A^2}    (9.6)

n_t = n_c
N = (r + 1) n_c

Nonuniform allocation (λ ≠ 1)

n_c = \frac{\left( Z_\alpha \sqrt{\bar{P} \bar{Q} (\lambda + 1)/\lambda} + Z_\beta \sqrt{P_c Q_c + P_t Q_t/\lambda} \right)^2}{\Delta_A^2}    (9.7)

n_t = \lambda n_c
N = r n_t + n_c

where

λ = n_t/n_c, the ratio of the number of patients assigned to a test-treated group to the number assigned to the control-treated group
n_c = required sample size for the control-treated group
n_t = required sample size for one of the test-treated groups
N = total sample size required in all groups combined
α = type I error probability
β = type II error probability
Z_α = point on the abscissa of a N(0, 1) curve (i.e., a normal distribution with mean 0 and variance 1) to the right of which is found 100(α)% of the total area under that curve
Z_β = point on the abscissa of a N(0, 1) distribution to the right of which is found 100(β)% of the total area under that curve
P_c = assumed event rate (expressed as a proportion) for the outcome of interest in the control-treated group
P_t = assumed event rate (expressed as a proportion) for the outcome of interest in the test-treated group
Q_c = 1 − P_c
Q_t = 1 − P_t
Δ_A = P_c − P_t
P̄ = (P_c + λP_t)/(1 + λ), a weighted average of the 2 event rates
Q̄ = 1 − P̄

9.4.1.3 Inverse sine transform approximation

The inverse sine transform (denoted by sin⁻¹ and expressed in radians) is also used as an approximation to Fisher's exact test (Cochran and Cox, 1957). It has the virtue of providing an approximation to the exact test over a wider range of P values than is the case with the ordinary 2 × 2 chi-square test (0.05 to 0.95 compared with 0.2 to 0.8), given the same cell size conditions as specified in Section 9.4.1.2.

Uniform allocation (λ = 1)

n_c = \frac{(Z_\alpha + Z_\beta)^2}{2 \left( \sin^{-1}\sqrt{P_c} - \sin^{-1}\sqrt{P_t} \right)^2}    (9.8)

n_t = n_c
N = (r + 1) n_c

Nonuniform allocation (λ ≠ 1)

n_c = \frac{(Z_\alpha + Z_\beta)^2 (\lambda + 1)/\lambda}{4 \left( \sin^{-1}\sqrt{P_c} - \sin^{-1}\sqrt{P_t} \right)^2}    (9.9)

n_t = \lambda n_c
N = r n_t + n_c

The definitions for P_c, P_t, Z_α, Z_β, and λ are the same as for Equations 9.6 and 9.7.

Table 9-6 Values of Φ(A), the proportion of area of a N(0, 1) distribution lying to the left of a designated point A, for selected values of A

A          Φ(A)          A         Φ(A)

-3.00      0.0013        3.00      0.9987
-2.50      0.0062        2.50      0.9938
-2.00      0.0228        2.00      0.9772

-1.50      0.0668        1.50      0.9332
-1.00      0.1587        1.00      0.8413
-0.75      0.2266        0.75      0.7734

-0.50      0.3085        0.50      0.6915
-0.40      0.3446        0.40      0.6554
-0.30      0.3821        0.30      0.6179

-0.20      0.4207        0.20      0.5793
-0.10      0.4602        0.10      0.5398
 0.00      0.5000        0.00      0.5000

9.4.1.4

Poisson approximation

The Poisson approximation can be used for comparison of proportions that lie below the lower limit (i.e., 0.05) specified for the inverse sine transform, provided n_c P_c and n_t P_t are both > 10 (Gail, 1974). The same approximation may be used for P values lying above the upper limit (i.e., 0.95) for the transform by using a complementary event (i.e., by using 1 − P_c and 1 − P_t in place of P_c and P_t in the formula).

Uniform allocation (λ = 1)

n_c = \frac{(Z_\alpha + Z_\beta)^2 (P_c + P_t)}{(P_c - P_t)^2}    (9.10)

n_t = n_c
N = (r + 1) n_c

Nonuniform allocation (λ ≠ 1)

n_c = \frac{(Z_\alpha + Z_\beta)^2 (P_c + P_t/\lambda)}{(P_c - P_t)^2}    (9.11)

n_t = \lambda n_c
N = r n_t + n_c

Continuous outcome metsures

The methods described above may be used for trials involving a continuous outcome measure if the investigator plans to base the primary analysis on a comparison of proportions using a binary categorization of the measure. The methods described in this section should be used if the primary outcome is continuous or near continuous. Conversion of continuous data to binary form for analysis purposes is unwise unless a binary categorization is considered to provide the most relevant treatment of the data. Any categorization reduces the amount of information provided by the data and, if used as a basis for sample size calculations, can be expected to yield an overestimate of the required sample size.

Equations 9.12 and 9.13 are derived using a statistical test for comparison of means observed after a specified period of follow-up. Equations 9.14 and 9.15 are derived using a statistical test for mean change from baseline to some specified period of follow-up. Both sets of equations are based on the normal approximation to the t-statistic. The approximation underestimates sample size if the estimated number of patients per treatment group is <30. Other formulations, such as those discussed by Lachin (1981) and Cochran and Cox (1957), can be used in such cases.

9.4.2.1 Normal approximation for two independent means

Uniform allocation ($\lambda = 1$)

\[ n_c = \frac{2 (Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_c - \mu_t)^2} \tag{9.12} \]

\[ n_t = n_c, \qquad N = (r + 1)\,n_c \]

Nonuniform allocation ($\lambda \neq 1$)

\[ n_c = \frac{(Z_\alpha + Z_\beta)^2 \sigma^2 (\lambda + 1)/\lambda}{(\mu_c - \mu_t)^2} \tag{9.13} \]

\[ n_t = \lambda n_c, \qquad N = r\,n_t + n_c \]

where

$\mu_c$ = true mean of the outcome measure for control-treated patients
$\mu_t$ = true mean of the outcome measure for test-treated patients
$\sigma^2$ = variance of the outcome measure for a single individual (assumed to be the same for all patients in both treatment groups)

and where observed expressions of the outcome measure are assumed to be independent of one another and to be normally distributed. See Section 9.4.1.2 for notation.

9.4.2.2 Normal approximation for mean changes from baseline

Uniform allocation ($\lambda = 1$)

\[ n_c = \frac{2 (Z_\alpha + Z_\beta)^2 \sigma_d^2}{(\mu_{dc} - \mu_{dt})^2} \tag{9.14} \]

\[ n_t = n_c, \qquad N = (r + 1)\,n_c \]

Nonuniform allocation ($\lambda \neq 1$)

\[ n_c = \frac{(Z_\alpha + Z_\beta)^2 \sigma_d^2 (\lambda + 1)/\lambda}{(\mu_{dc} - \mu_{dt})^2} \tag{9.15} \]

\[ n_t = \lambda n_c, \qquad N = r\,n_t + n_c \]

where

$\mu_{dc}$ = $\mu_{1c} - \mu_{0c}$ is the true value of the difference in the outcome measure at follow-up and baseline for the control treatment
$\mu_{dt}$ = $\mu_{1t} - \mu_{0t}$ is the corresponding value for the test treatment
$\mu_{0c}$ = true baseline mean (observed just before the initiation of treatment) for the outcome measure for patients assigned to the control treatment
$\mu_{1c}$ = true follow-up mean (observed after a specified period of follow-up) for the outcome measure for patients assigned to the control treatment
$\mu_{0t}$ and $\mu_{1t}$ are the corresponding means for patients assigned to the test treatment
$\sigma_d^2$ = $2(1 - \rho)\sigma^2$
$\sigma^2$ = variance of the outcome measure on a single individual (assumed to be the same for all patients in both treatment groups) at either baseline or follow-up
$\rho$ = correlation coefficient between baseline and follow-up outcome measures on a single individual

and where the baseline and follow-up measurements made on different patients are assumed to be independent of one another and normally distributed.

9.5 POWER FORMULAS

Sometimes the number of patients available for study is fixed by practical considerations. In these cases it is useful to calculate the power that can be expected with the available sample size. The power functions for the chi-square approximation and inverse sine transforms are discussed by Lachin (1981). The formulations for the Poisson approximation are based on work by Gail (1974). The power function for Fisher's exact test involves a complicated summation formula that is not practical for routine use.

All of the power formulations given involve use of normal approximations in which

\[ \text{Power} = 1 - \beta = 1 - \Phi(A) \]

where

$\Phi(A)$ = proportion of area of a N(0,1) distribution that is to the left of a point $A$

All other notation is as defined in Section 9.4.

9.5.1 Binary outcome measures

9.5.1.1 Fisher's exact test

Power estimates must be computed or read from tables of the power functions (Casagrande et al., 1978).

9.5.1.2 Chi-square approximation

Uniform allocation ($\lambda = 1$)

Power = $1 - \Phi(A)$, where

\[ A = \frac{Z_\alpha \sqrt{2\bar{P}\bar{Q}/n_c} - |P_c - P_t|}{\sqrt{(P_c Q_c + P_t Q_t)/n_c}} \tag{9.16} \]

Nonuniform allocation ($\lambda \neq 1$)

Power = $1 - \Phi(A)$, where

\[ A = \frac{Z_\alpha \sqrt{\bar{P}\bar{Q}/n_c + \bar{P}\bar{Q}/n_t} - |P_c - P_t|}{\sqrt{P_c Q_c/n_c + P_t Q_t/n_t}} \tag{9.17} \]

9.5.1.3 Inverse sine transform approximation

Uniform allocation ($\lambda = 1$)

Power = $1 - \Phi(A)$, where

\[ A = Z_\alpha - \frac{2 \left| \sin^{-1}\sqrt{P_c} - \sin^{-1}\sqrt{P_t} \right|}{\sqrt{2/n_c}} \tag{9.18} \]

Nonuniform allocation ($\lambda \neq 1$)

Power = $1 - \Phi(A)$, where

\[ A = Z_\alpha - \frac{2 \left| \sin^{-1}\sqrt{P_c} - \sin^{-1}\sqrt{P_t} \right|}{\sqrt{1/n_c + 1/n_t}} \tag{9.19} \]

9.5.1.4 Poisson approximation

Uniform allocation ($\lambda = 1$)

Power = $1 - \Phi(A)$, where

\[ A = Z_\alpha - \frac{|P_c - P_t|}{\sqrt{(P_c + P_t)/n_c}} \tag{9.20} \]

Nonuniform allocation ($\lambda \neq 1$)

Power = $1 - \Phi(A)$, where

\[ A = Z_\alpha - \frac{|P_c - P_t|}{\sqrt{(P_c + P_t/\lambda)/n_c}} \tag{9.21} \]

9.5.2 Continuous outcome measures

9.5.2.1 Normal approximation for comparison of two independent means

Uniform allocation ($\lambda = 1$)

Power = $1 - \Phi(A)$, where

\[ A = Z_\alpha - \frac{|\mu_c - \mu_t|}{\sqrt{2\sigma^2/n_c}} \tag{9.22} \]

Nonuniform allocation ($\lambda \neq 1$)

Power = $1 - \Phi(A)$, where

\[ A = Z_\alpha - \frac{|\mu_c - \mu_t|}{\sqrt{(n_c + n_t)\sigma^2/(n_c n_t)}} \tag{9.23} \]

9.5.2.2 Normal approximation for mean changes from baseline

Uniform allocation ($\lambda = 1$)

Power = $1 - \Phi(A)$, where

\[ A = Z_\alpha - \frac{|\mu_{dc} - \mu_{dt}|}{\sqrt{2\sigma_d^2/n_c}} \tag{9.24} \]

Nonuniform allocation ($\lambda \neq 1$)

Power = $1 - \Phi(A)$, where

\[ A = Z_\alpha - \frac{|\mu_{dc} - \mu_{dt}|}{\sqrt{(n_t + n_c)\sigma_d^2/(n_t n_c)}} \tag{9.25} \]

9.6 SAMPLE SIZE AND POWER CALCULATION ILLUSTRATIONS

The examples that follow are designed to illustrate sample size and power calculations using the formulas provided in Sections 9.4 and 9.5. Values reported for $n_c$ in Illustrations 1 through 5 were rounded up to the next higher integer regardless of the size of the decimal fraction yielded by the calculation.

9.6.1 Illustration 1: Sample size calculation using chi-square and inverse sine transform approximation

a. Design specifications

• Number of treatment groups (see Section 9.3.1): 2 (i.e., one control and one test treatment)
• Outcome measure (see Section 9.3.2): death
• Follow-up period (see Section 9.3.3): 5 years
• Alternative treatment hypothesis (see Section 9.3.4): one-sided
• Detectable treatment difference in binary outcome (see Section 9.3.5):
    $P_c$ = 0.40 (5-year control treatment mortality rate)
    $P_c - P_t$ = 0.10
• Error protection (see Section 9.3.6): $\alpha$ = 0.05, $\beta$ = 0.05
• Allocation ratio (see Section 9.3.7): 1:1 (i.e., $\lambda$ = 1, equal numbers in test and control groups)
• Losses to follow-up (see Section 9.3.8): 0%
• Losses due to dropouts and noncompliance (see Section 9.3.9): 20%
• Treatment lag time (see Section 9.3.10): 0

b. Method of calculation

Equations 9.6 and 9.8

c. Results

Chi-square approximation (Equation 9.6, Section 9.4.1.2)

\[ n_c = \frac{\left( 1.645\sqrt{2(0.35)(0.65)} + 1.645\sqrt{0.40 \cdot 0.60 + 0.30 \cdot 0.70} \right)^2}{0.10^2} = 490 \]

$n_c$ = (1/0.8) × 490 = 613 (adjusted for 20% loss)
$n_t$ = 613
$N$ = $n_c + n_t$ = 1226

Inverse sine approximation (Equation 9.8, Section 9.4.1.3)

\[ n_c = \frac{(1.645 + 1.645)^2}{2 \left( \sin^{-1}\sqrt{0.40} - \sin^{-1}\sqrt{0.30} \right)^2} = 491 \]

$n_c$ = (1/0.8) × 491 = 614 (adjusted for 20% loss)
$n_t$ = 614
$N$ = $n_c + n_t$ = 1228

9.6.2 Illustration 2: Sample size calculation using Poisson approximation

a. Design specifications

Same as for Illustration 1 except:

• Detectable treatment difference (see Section 9.3.5):
    $P_c$ = 0.04
    $\Delta$ = $P_c - P_t$ = 0.016

b. Method of calculation

Equation 9.10, Section 9.4.1.4

c. Results

\[ n_c = \frac{(1.645 + 1.645)^2 (0.040 + 0.024)}{(0.040 - 0.024)^2} = 2707 \]

$n_c$ = (1/0.8) × 2707 = 3384 (adjusted for 20% loss)
$N$ = 3384 + 3384 = 6768

9.6.3 Illustration 3: Sample size calculation using Coronary Drug Project design specifications

a. Design specifications (Coronary Drug Project Research Group, 1973a)

• Number of treatment groups (see Section 9.3.1): 6 (i.e., 1 control and 5 test treatments)
• Outcome measure (see Section 9.3.2): death
• Follow-up period (see Section 9.3.3): minimum of 5 years
• Alternative treatment hypothesis in binary outcome (see Section 9.3.4): one-sided
• Detectable treatment difference in binary outcome (see Section 9.3.5):
    $P_c$ = 0.30 (5-year control treatment mortality rate)
    $P_c - P_t$ = 0.075
• Error protection (see Section 9.3.6): $\alpha$ = 0.01, $\beta$ = 0.05
• Allocation ratio (see Section 9.3.7): 1:1:1:1:1:2.5 (i.e., $\lambda$ = 1/2.5, for a control group that is 2.5 times as large as any of the five test-treated groups)
• Losses to follow-up (see Section 9.3.8): 0%
• Losses due to dropouts and noncompliance (see Section 9.3.9): 30% after 5 years of follow-up
• Treatment lag time (see Section 9.3.10): 0

b. Method of calculation

Equation 9.7, Section 9.4.1.2

c. Results

\[ n_c = \frac{\left( 2.326\sqrt{0.279 \cdot 0.721 \cdot (0.400 + 1)/0.400} + 1.645\sqrt{0.300 \cdot 0.700 + 0.225 \cdot 0.775/0.400} \right)^2}{0.075^2} = 1906 \]

$n_t$ = (1/2.5) × 1906 = 762
$n_c$ = (1/0.7) × 1906 = 2723 (adjusted for 30% loss)
$n_t$ = (1/0.7) × 762 = 1089 (adjusted for 30% loss)
$N$ = $5n_t + n_c$ = 5(1089) + 2723 = 8168

The calculations shown above yield results quite similar to those in the Coronary Drug Project using a different method. The total number of patients derived via that method, after adjustment for losses, was 5(1117) + 2793 = 8378.

9.6.4 Illustration 4: Sample size calculation for blood pressure change

a. Design specifications

• Number of treatment groups (see Section 9.3.1): 2 (i.e., 1 control and 1 test treatment)
• Outcome measure (see Section 9.3.2): blood pressure change after 3 years of treatment
• Follow-up period (see Section 9.3.3): 3 years
• Alternative treatment hypothesis in mean change from baseline (see Section 9.3.4): two-sided
• Detectable treatment difference in continuous outcome measure (see Section 9.3.5):
    $\Delta_d$ = $\mu_{dc} - \mu_{dt}$ = 4 mm Hg (expected difference in mean change from baseline)
    $\sigma^2$ = 100 mm Hg² (variance of a single blood-pressure measurement)
    $\rho$ = 0.3 (correlation between a baseline blood-pressure measure and the measure after 3 years of follow-up, both taken on the same individual)
    $\sigma_d^2$ = $2(1 - \rho)\sigma^2$ = 2(0.70)100 mm Hg² = 140 mm Hg²
• Error protection (see Section 9.3.6): $\alpha$ = 0.05, $\beta$ = 0.05
• Allocation ratio (see Section 9.3.7): 1:1 (i.e., $\lambda$ = 1)
• Losses to follow-up due to dropouts and noncompliance (see Sections 9.3.8 and 9.3.9): 30%
• Treatment lag time (see Section 9.3.10): 0

b. Method of calculation

Equation 9.14, Section 9.4.2.2

c. Results

\[ n_c = \frac{2 (1.960 + 1.645)^2 (140\ \text{mm Hg}^2)}{(4\ \text{mm Hg})^2} = 228 \]

$n_c$ = (1/0.7) × 228 = 326 (adjusted for 30% loss)
$n_t$ = 326
$N$ = 326 + 326 = 652

9.6.5 Illustration 5: Sample size calculation using Fisher's exact test

a. Design specifications

• Number of treatment groups (see Section 9.3.1): 2 (i.e., 1 control and 1 test treatment)
• Outcome measure (see Section 9.3.2): death
• Follow-up period (see Section 9.3.3): 2 years
• Alternative treatment hypothesis in binary outcome (see Section 9.3.4): one-sided
• Detectable treatment difference in binary outcome (see Section 9.3.5):
    $P_c$ = 0.5
    $P_t$ = 0.1
    $\Delta$ = 0.4
• Error protection (see Section 9.3.6): $\alpha$ = 0.05, $\beta$ = 0.10
• Allocation ratio (see Section 9.3.7): 1:1 (i.e., $\lambda$ = 1, equal numbers in the test and control groups)
• Losses to follow-up (see Section 9.3.8): 0%
• Losses due to dropouts and noncompliance (see Section 9.3.9): 0%
• Treatment lag time (see Section 9.3.10): 0

b. Method of calculation

Use tables produced by Haseman (1978) or Casagrande and co-workers (1978) and compare the result with that obtained using the chi-square and inverse sine transform approximations.

c. Results

$N$ = 50 (25 in each group) from sample size tables in Haseman (1978) or Casagrande et al. (1978)
$N$ = 42 (21 in each group) from chi-square approximation (Equation 9.6, Section 9.4.1.2)
$N$ = 40 (20 in each group) from inverse sine transform approximation (Equation 9.8, Section 9.4.1.3)

Note that $n_t P_t$ = 4 is below the limit specified for use with the chi-square and inverse sine transform approximations and that they underestimate the required sample size.

9.6.6 Illustration 6: Power calculation based on chi-square and inverse sine transform approximation

a. Design specifications

• Number of treatment groups (see Section 9.3.1): 2 (i.e., 1 control and 1 test treatment)
• Outcome measure (see Section 9.3.2): death
• Follow-up period (see Section 9.3.3): 5 years
• Alternative treatment hypothesis (see Section 9.3.4): two-sided
• Detectable treatment difference in binary outcome measure (see Section 9.3.5):
    $P_c$ = 0.40
    $P_t$ = 0.30
    $\Delta$ = $P_c - P_t$ = 0.4 - 0.3 = 0.1
• Error protection (see Section 9.3.6): $\alpha$ = 0.05, $\beta$ to be determined
• Allocation ratio (see Section 9.3.7): 2:1 (i.e., $\lambda$ = 2, twice as many patients in the test-treated group as in the control-treated group), with:
    $n_c$ = 300
    $n_t$ = 600
    $N$ = $n_t + n_c$ = 900
• Losses to follow-up (see Section 9.3.8): 0%
• Losses due to dropouts and noncompliance (see Section 9.3.9): 0%
• Treatment lag time (see Section 9.3.10): 0

b. Method of calculation

Equations 9.17 and 9.19

c. Results

Chi-square approximation:

\[ A = \frac{1.960\sqrt{0.333 \cdot 0.667\,(1/300 + 1/600)} - |0.400 - 0.300|}{\sqrt{0.400 \cdot 0.600/300 + 0.300 \cdot 0.700/600}} = -1.0242 \]

Power = $1 - \Phi(-1.0242)$ = 1 - 0.15 = 0.85

Inverse sine transform approximation:

\[ A = 1.96 - \frac{2 \left| \sin^{-1}\sqrt{0.40} - \sin^{-1}\sqrt{0.30} \right|}{\sqrt{1/300 + 1/600}} = -1.012 \]

Power = $1 - \Phi(-1.012)$ = 1 - 0.16 = 0.84

9.6.7 Illustration 7: Power for design specifications given in Illustration 2 for 1500 patients per treatment group

a. Design specifications

As given in Illustration 2 except:

• $\beta$ to be determined for indicated sample size
• $n_c$ and $n_t$ = 1500 (effective sample size, i.e., after reduction for 20% loss due to dropout and noncompliance)

b. Method of calculation

Equation 9.20

c. Results

\[ A = 1.645 - \frac{|0.040 - 0.024|}{\sqrt{(0.040 + 0.024)/1500}} = -0.8045 \]

Power = $1 - \Phi(-0.8045)$ = 1 - 0.21 = 0.79

9.6.8 Illustration 8: Power for design specifications given in Illustration 4 for 150 patients per treatment group

a. Design specifications

As given in Illustration 4 except:

• $\beta$ to be determined for indicated sample size
• $n_c$ and $n_t$ = 150 (effective sample size, i.e., after reduction for 30% loss due to dropout and noncompliance)

b. Method of calculation

Equation 9.24

c. Results

\[ A = 1.96 - \frac{4\ \text{mm Hg}}{\sqrt{2(140\ \text{mm Hg}^2)/150}} = -0.9677 \]

Power = $1 - \Phi(-0.9677)$ = 1 - 0.17 = 0.83
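The sample size illustrations above can be checked with a short script. The following is a minimal sketch in Python, not part of the original text: the function names are hypothetical, and the normal deviates are passed in as the rounded values used in the illustrations (1.645, 1.960, 2.326) rather than computed, so that the rounded-up results agree with those reported. Small differences from the text can still arise where intermediate quantities were rounded by hand, as in Illustration 3.

```python
import math

def n_chi_square(pc, pt, z_alpha, z_beta, lam=1.0):
    """Control-group size from Equations 9.6 (lam = 1) and 9.7 (lam != 1)."""
    p_bar = (pc + lam * pt) / (1 + lam)          # weighted average event rate
    q_bar = 1 - p_bar
    term_a = z_alpha * math.sqrt(p_bar * q_bar * (lam + 1) / lam)
    term_b = z_beta * math.sqrt(pc * (1 - pc) + pt * (1 - pt) / lam)
    return (term_a + term_b) ** 2 / (pc - pt) ** 2

def n_arcsine(pc, pt, z_alpha, z_beta, lam=1.0):
    """Control-group size from Equations 9.8 and 9.9 (angles in radians)."""
    diff = math.asin(math.sqrt(pc)) - math.asin(math.sqrt(pt))
    return (z_alpha + z_beta) ** 2 * (lam + 1) / lam / (4 * diff ** 2)

def n_poisson(pc, pt, z_alpha, z_beta, lam=1.0):
    """Control-group size from Equations 9.10 and 9.11."""
    return (z_alpha + z_beta) ** 2 * (pc + pt / lam) / (pc - pt) ** 2

def n_means(delta, variance, z_alpha, z_beta, lam=1.0):
    """Control-group size from Equations 9.12-9.15.

    Pass sigma^2 for a comparison of follow-up means, or
    sigma_d^2 = 2(1 - rho) sigma^2 for mean change from baseline.
    """
    return (z_alpha + z_beta) ** 2 * variance * (lam + 1) / lam / delta ** 2

def inflate_for_losses(n, loss_fraction):
    """Adjust a rounded-up group size for expected losses, rounding up again."""
    return math.ceil(n / (1 - loss_fraction))

# Illustration 1 (text reports 490 and 491 before adjustment for losses)
n1_chi = math.ceil(n_chi_square(0.40, 0.30, 1.645, 1.645))
n1_arc = math.ceil(n_arcsine(0.40, 0.30, 1.645, 1.645))
# Illustration 2 (text reports 2707)
n2 = math.ceil(n_poisson(0.040, 0.024, 1.645, 1.645))
# Illustration 4: sigma_d^2 = 2(1 - 0.3)(100) = 140 (text reports 228)
n4 = math.ceil(n_means(4, 2 * (1 - 0.3) * 100, 1.960, 1.645))

print(n1_chi, inflate_for_losses(n1_chi, 0.20))
print(n1_arc, inflate_for_losses(n1_arc, 0.20))
print(n2, inflate_for_losses(n2, 0.20))
print(n4, inflate_for_losses(n4, 0.30))
```

Keeping the unrounded result separate from the rounding and loss-adjustment steps mirrors the order used in the illustrations: compute, round up, then divide by one minus the expected loss fraction and round up again.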

9.7 POSTERIOR SAMPLE SIZE AND POWER ASSESSMENTS

The calculations made when the trial is planned will provide the recruitment goal. However, the goal may have to be changed during the trial. For example, it may have to be raised if the event rate for the control-treated group in the early stages of recruitment is lower than expected or there is more loss of precision due to noncompliance and dropout than originally envisioned. The period of follow-up may have to be extended as well. Extension of follow-up may be the only option available if recruitment has been completed when the shortfall in required error protection is first recognized.

There are occasions where an overestimate of $P_c$ in the planning stage may be offset by lower than expected dropout and noncompliance rates during the trial. For example, this was the case in the CDP. The actual five-year mortality in the placebo-treated group was lower than expected, but so were the dropout and noncompliance rates (Coronary Drug Project Research Group, 1973a, 1975).

Power calculations should be made at the end of the trial using the observed sample size and actual losses due to noncompliance and dropouts. Such calculations should be a part of any finished report where the observed treatment effect is small and the authors, therefore, conclude in favor of the null hypothesis of no difference among treatment groups. The calculations, as noted by Freiman and co-workers (1978), are useful to readers when trying to decide whether or not to accept the author's conclusion. A reader may be inclined to accept the conclusion if the estimated power of the study was large enough to detect an important difference, but not otherwise (see also Mosteller et al., 1980).
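A power calculation of the kind recommended here, using the sample sizes actually achieved, can be sketched as follows. This is an illustrative Python sketch rather than part of the original text: the function names are hypothetical, the formulas are the nonuniform-allocation forms of Section 9.5 (which reduce to the uniform forms when the two group sizes are equal), and $\Phi$ is taken from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def power_from_a(a):
    """Power = 1 - Phi(A), with Phi the N(0,1) cumulative distribution."""
    return 1 - NormalDist().cdf(a)

def a_chi_square(pc, pt, nc, nt, z_alpha):
    """A for the chi-square approximation (Equations 9.16 and 9.17)."""
    lam = nt / nc
    p_bar = (pc + lam * pt) / (1 + lam)
    num = z_alpha * math.sqrt(p_bar * (1 - p_bar) * (1 / nc + 1 / nt)) - abs(pc - pt)
    den = math.sqrt(pc * (1 - pc) / nc + pt * (1 - pt) / nt)
    return num / den

def a_arcsine(pc, pt, nc, nt, z_alpha):
    """A for the inverse sine transform approximation (Equations 9.18 and 9.19)."""
    diff = abs(math.asin(math.sqrt(pc)) - math.asin(math.sqrt(pt)))
    return z_alpha - 2 * diff / math.sqrt(1 / nc + 1 / nt)

def a_poisson(pc, pt, nc, nt, z_alpha):
    """A for the Poisson approximation (Equations 9.20 and 9.21)."""
    lam = nt / nc
    return z_alpha - abs(pc - pt) / math.sqrt((pc + pt / lam) / nc)

def a_means(delta, variance, nc, nt, z_alpha):
    """A for comparisons of means (Equations 9.22-9.25).

    Pass sigma^2 for follow-up means or sigma_d^2 for change from baseline.
    """
    return z_alpha - abs(delta) / math.sqrt((nc + nt) * variance / (nc * nt))

# Illustration 6: two-sided alpha = 0.05, nc = 300, nt = 600
print(round(power_from_a(a_chi_square(0.40, 0.30, 300, 600, 1.960)), 2))
print(round(power_from_a(a_arcsine(0.40, 0.30, 300, 600, 1.960)), 2))
# Illustration 7: one-sided alpha = 0.05, 1500 patients per group
print(round(power_from_a(a_poisson(0.040, 0.024, 1500, 1500, 1.645)), 2))
# Illustration 8: two-sided alpha = 0.05, 150 patients per group, sigma_d^2 = 140
print(round(power_from_a(a_means(4, 140, 150, 150, 1.960)), 2))
```

Evaluated without intermediate rounding, the chi-square value of A for Illustration 6 comes out slightly different from the -1.0242 reported in the text; the difference is in the third decimal place, presumably because intermediate values were rounded by hand, and the resulting power agrees to two decimal places.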

10. Randomization and the mechanics of treatment masking

Chance favours only those who know how to court her.
Charles Nicolle

10.1 Introduction
10.2 Adaptive randomization
10.3 Fixed randomization
10.3.1 Allocation ratio
10.3.2 Stratification
10.3.3 Block size
10.4 Construction of the randomization schedule
10.5 Mechanics of masking treatment assignments
10.6 Documentation of the randomization scheme
10.7 Administration of the randomization process
10.8 Illustrations
10.8.1 Illustration 1: Restricted randomization using a table of random permutations
10.8.2 Illustration 2: Unblocked allocations using a table of random numbers
10.8.3 Illustration 3: Blocked allocations using the Moses-Oakford algorithm and a table of random numbers
10.8.4 Illustration 4: Stratified and blocked allocations using the Moses-Oakford algorithm and a table of random numbers
10.8.5 Illustration 5: Sample allocation schedule for the Macular Photocoagulation Study using pseudo-random numbers
10.8.6 Illustration 6: Double-masked allocation schedule using the Moses-Oakford algorithm and a table of random numbers
10.8.7 Illustration 7: Sample CDP double-masked allocation schedule
Table 10-1 Stratification considerations for randomization
Table 10-2 Blocking considerations
Table 10-3 Moses-Oakford assignment algorithm for block of size k
Table 10-4 Moses-Oakford treatment assignment worksheet for block of size k
Table 10-5 Illustration of Moses-Oakford algorithm
Table 10-6 First 25 lines of page 17 of The Rand Corporation's 1 million random digits
Table 10-7 Items that should be included in the written documentation of the allocation scheme
Table 10-8 Safeguards for administration of treatment allocation schedules
Table 10-9 Sample CDP treatment allocation schedule
Table 10-10 Sample CDP allocation form and envelope
Table 10-11 Reproduction of 20 sets of random permutations of first 16 integers from page 584 of Cochran and Cox (1957)
Table 10-12 Allocations for Illustration 1
Table 10-13 Allocations for Illustration 2
Table 10-14 Allocations for Illustration 3
Table 10-15 Allocations for Illustration 4
Table 10-16 Sample allocation schedule for the Macular Photocoagulation Study for Illustration 5
Table 10-17 Allocation schedule for double-masked drug trial described in Illustration 6
Figure 10-1 Stylized bottle label for medications dispensed in the [illegible]
10.1 INTRODUCTION

A valid trial requires a method for assigning patients to a test or control treatment that is free of selection bias. The best method for ensuring bias-free selection is via a bona fide randomization scheme, as discussed in Section 8.4 of Chapter 8. Nonrandom methods may be used, but they all suffer from defects that can be avoided with randomization. Hence, randomization is the only method of assignment discussed in this chapter.

Two general designs exist for randomizing assignments to treatment: adaptive randomization and fixed randomization. With fixed randomization schemes, the assignment probabilities remain fixed over the course of the trial. In adaptive randomization schemes (also referred to as dynamic randomization, but not in this book), assignment probabilities for the treatments change as a function of the distribution of previous assignments, observed baseline characteristics, or observed outcomes.

The emphasis in this chapter is on fixed randomization. Only a brief overview of adaptive randomization is provided (Section 10.2). Fixed randomization is easier to manage than adaptive randomization. Assignment schedules can be generated before the start of patient recruitment. This is not possible with most adaptive schemes. Assignments must be generated as needed. Further, the generation process is usually complicated enough so that it has to be done on a computer to keep track of previous assignments and any other data used in the adaptation process. All of the trials listed in Appendix B, except one, the National Cooperative Gallstone Study, used fixed allocation schemes. None of the 113 reports of trials reviewed in Chapter 2 gave any indication of having used adaptive randomization. However, this count may be somewhat deceptive in that many of the reports lacked the details needed to reach a definitive judgment regarding the method of treatment assignment used.

10.2 ADAPTIVE RANDOMIZATION

There are three general types of adaptive randomization:

• Those in which the assignment probabilities are modified as a function of observed departures from the desired allocation ratio (number adaptive)
• Those in which the assignment probabilities are modified as a function of differences in the observed distribution of baseline characteristics among the treatment groups (baseline adaptive)
• Those in which the assignment probabilities are modified as a function of observed outcomes in the treatment groups (outcome adaptive)

The biased coin randomization procedure, proposed by Efron (1971), is an example of a number adaptive scheme. It is an alternative to blocking in a fixed randomization design (see Section 10.3.3). Patients are assigned to the treatment groups with preset probabilities so long as the difference in the number of patients assigned to the treatment groups remains within a specified range. The probability of assignment to a test treatment is increased or decreased, relative to that for the control treatment, when the range is exceeded.

Baseline adaptive randomization is designed to make certain that the treatment groups are balanced with regard to important baseline characteristics that may affect the outcome measure. In this approach, the assignment probabilities are a function of observed differences in the baseline composition of patients already enrolled (Begg and Iglewicz, 1980; Freedman and White, 1976; Friedman et al., 1982; Pocock, 1983; Pocock and Simon, 1975; Simon, 1977). The main advantage of the technique is the opportunity it provides for balancing the composition of treatment groups on several different baseline characteristics without stratification (see Section 10.3.2). The main disadvantage is in its administrative complexities. The technique cannot be managed without a computer.

The play-the-winner scheme, proposed by Zelen (1969), is an example of outcome adaptive randomization. The simplest version is one involving only one test and one control treatment, where the first patient enrolled has the same probability of being assigned to either treatment, and thereafter the assignment received by each patient is a function of the outcome observed and the treatment assignment of the preceding patient. The assignment will be the same as for the preceding patient if the outcome observed for that patient was favorable. The assignment will be to the other treatment if the outcome was unfavorable. Hence, the name, play-the-winner.

The main difficulty with the scheme, at least with simple versions such as the one described, is that it allows an investigator to predict the next assignment, thereby introducing the possibility of bias into the patient selection process. A second limitation is the need to determine the outcome for the last patient enrolled before the next one can be enrolled.

The play-the-winner algorithm has been modified to incorporate outcome information from multiple patients (Wei and Durham, 1978). This modification eliminates dependence on the last outcome observed and therefore makes it more difficult for an investigator to predict the next assignment. However, even modified in this way, the scheme has limited utility. The ability to identify a "winning" treatment and to have that knowledge influence treatment assignments during the patient recruitment process is minimal in most trials requiring long-term follow-up for outcome assessment.
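The two-treatment versions of the number adaptive and outcome adaptive schemes described above can be sketched in a few lines of code. The following Python sketch is illustrative only and is not taken from the sources cited: the function names, the `tolerance` parameter (the allowed imbalance before the coin is biased), and the default bias of 2/3 are assumptions made for the example, and the play-the-winner function implements only the simplest version described in the text.

```python
import random

def biased_coin_assignment(n_test, n_control, rng, tolerance=0, bias=2 / 3):
    """Number adaptive assignment for one test and one control treatment.

    While the imbalance |n_test - n_control| is within `tolerance`, both
    treatments are equally likely. Once it is exceeded, the treatment with
    fewer patients is chosen with probability `bias` (assumed value).
    """
    diff = n_test - n_control
    if abs(diff) <= tolerance:
        p_test = 0.5
    elif diff > 0:
        p_test = 1 - bias      # test group is ahead: favor control
    else:
        p_test = bias          # control group is ahead: favor test
    return "test" if rng.random() < p_test else "control"

def play_the_winner(prev_assignment, prev_outcome_favorable, rng):
    """Simplest play-the-winner rule for one test and one control treatment."""
    if prev_assignment is None:                 # first patient: even chance
        return rng.choice(["test", "control"])
    if prev_outcome_favorable:                  # stay with the "winner"
        return prev_assignment
    return "control" if prev_assignment == "test" else "test"

# Example: 20 biased coin assignments with a tolerated imbalance of 2
rng = random.Random(1986)
counts = {"test": 0, "control": 0}
for _ in range(20):
    arm = biased_coin_assignment(counts["test"], counts["control"], rng, tolerance=2)
    counts[arm] += 1
print(counts)
```

The play-the-winner function makes the predictability problem noted above concrete: apart from the first patient, its result depends only on the preceding assignment and outcome, so anyone who knows both can state the next assignment with certainty.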

10.3

FIXED RANDOMIZATION

Fixed randomization schemes require specifica
tion of the:

The considerations involved in making these
specifications are outlined in the subsections
that follow.

103.1

Allocation ratio

The number of allocations made to any one of
the study treatments is a function of the assign
ment probabilities—assumed to be set in ad
vance of patient recruitment and to be held fixed
over the course of recruitment in fixed alloca
tion schemes. The only changes that occur are
due to major design modifications, such as oc
curred in the University Group Diabetes Pro
gram (UGDP) with the addition of a fifth treat
ment (phenformin) some 18 months after the
start of patient recruitment (University Group
Diabetes Program, l970d).
The allocation of patients to the study treat
ments can be uniform or nonuniform. A design
will be characterized as uniform if the assign
ment probabilities for the t test treatments and
control treatment are equal, i.e.,

PI=P2= •• = P, = •..= P,+|

(10 I)

where

Pi, i = I, •• , t, denote assignment pro
babilities for the t test treatments
and
P.

denotes the assignment probability
for the control treatment

and where
r+i

X p,= i
1=1

Table 1©-l

Stratification considerations for randomiza

tion

• Only variables that are observed and recorded before

ie.3.2

Stratification

Mraiification1 during patient enrollment in..ohcs the placement of patients into defined
,!rJia for randomization. It is done to reduce or
< mir.ate variation in the outcome measure due
•p the stratification variable(s) (see Table 10-1
■or points concerning stratification during ranj..miration). A variable is said to be controlled
when patients are assigned to treatment in such a
as to ensure that it has the same distribu• .m m all treatment groups. Separate allocation
khcdules are required for the various levels or
eates assumed by the variables to be controlled.
Vocations to each stratum are made using the
umc allocation ratio as for all other strata. A
Khcmc requiring control of sex would require a
separate allocation stratum for males and for
•rmalcs. A scheme requiring control of sex and
»r. the latter classified at three levels (e.g., <45,
4' through 55, and >55), would require six (i.e.,
2 ’) allocation strata, one for each age level and
sex combination. In general terms, 5 stratifican.»n variables with /, levels for the ith variable
•ill produce a total of // • 7; • • • lj " ls.r ls alloiition strata.
I he term stratification, as used throughout
this chapter, refers to a process that takes place
m coniunction with randomization, and that is
based on data collected prior to randomization.
Mratification that is done in conjunction with
data analyses, as discussed in Chapter 18, is
it'erred to as post-stratification. Both forms of
Gratification may be used in the same trial, but
n«'< on the same variable.
The main arguments for stratification involve
• combination of philosophic and statistical con’•dcrations. Ideally, the goal in any trial is to
carry out the comparison of the study treatments
" groups of patients that are identical with regird to all entry characteristics that influence the
outcome measure. The best way to achieve the
ri3l is via matching for all variables of concern.
However, it is impractical for reasons discussed
'fl ( hapter 8. The best that can be done is to
Gratify the study groups on a few variables and
then to randomize within those strata.
Clearly, there is a practical limit to the numof variables that can be realistically con
trolled via stratification. The number of strata

where r, is the expected number of assignments
to the rth test treatment and where all values of are expressed as integers, reduced so as to have
no multiplier in common other than I (e g . m
allocation ratio involving I assignment to the
test treatment for every 2 assignments to the
control treatment would be expressed as a rat <
of 1:2). Expressed this way.

Srr=l

• Block size

i'i’tal.

ri:r2:--:rl:--:r, + |

/+l

• Allocation ratio
• Allocation strata

rm (Coronary Drug Project Research Group,

It will be characterized as nonuniform if there N
at least one probability value in Equation in |
that differs from the other values in the equ>
tion.
The entire allocation scheme for the tnal can
be expressed as a ratio of t + I numbers.

(in ?>
B

where B is the minimum block size (see Glo««n
and Section I0.3.3). For example, the mimmur
block size in a 2-treatment trial with an alloca
tion ratio of I: I is 2. It is 4 if the allocation ratio
is I ■ 3. It is 5 for a 3-treatment trial with an
allocation ratio of 2:2:1.
All t + I values of r are equal to I in uniform
fixed allocation designs. At least one value of»
will be greater than I in nonuniform fixed allo
cation designs.
The most common allocation design is one
involving uniform allocation. All of the trials
sketched in Appendix B, except three, were of
this type (see line 20e, Table B-4, Appendix Rt
Uniform allocation should be used, except where
there are valid reasons to allocate a dispropor
tionately larger number of patients to one treat
ment than to another. The reasons may have to
do with the cost of one treatment versus another,
the way of administering one versus another, or
the presumed safety or efficacy of one verin
another (see Persantine Aspirin Reinfarction
Trial Research Group, 1980a, for example of
nonuniform allocation). Other reasons relate to
statistical considerations, as discussed in Section
9.3.12, where the study involves multiple text
treatments, each of which is to be contraMed
with the same control treatment. A third xet of
reasons relate to secondary research aim' that
are best pursued via use of nonuniform alloc*
tion. One of the reasons why the Coronan Drug
Project (CDP) enrolled more patients in the
placebo-treated group than in any of the ten
treated groups had to do with a secon an

' b* to be confuted with poststratification (see Glossary).

I

randomization may be used for stratification in the treatment assignment process.

• Increased statistical efficiency resulting from stratification is minimal for trials involving >50 patients per treatment group.
• It is impractical to control for more than a few sources of variation via stratification at the time of randomization (i.e., generally no more than two or three).
• Use of a large number of allocation strata may allow for fairly large chance departures from the desired allocation ratio if there are only a small number of patients per stratum.
• Any gain in statistical efficiency resulting from stratification using a given variable will be a function of the relationship of that variable to the outcome measure. The gain will be small to nil if the relationship is weak or nonexistent. It will be greatest for variables that are highly predictive of outcome.
• Stratification on any patient characteristic complicates the randomization process; it may prolong the time needed to clear a patient for enrollment if stratification depends on readings or determinations made outside the clinic.
• Variables used for stratification should be easy to observe and reasonably free of measurement error.
• Variables that are subject to major sources of error due to differing interpretations should not be used for stratification. They are of limited use for variance control and the errors made may open the study to criticism when the results are published.
• It is unreasonable to expect that all important sources of baseline variation can be controlled via stratification during randomization. Analysis procedures involving post-stratification and multiple regression will be required to adjust treatment comparisons for baseline differences not controlled via stratification.
• Use of any stratification scheme that involves calculations or complicated interpretations should be avoided, especially in self-administered randomization schemes where the calculations or interpretations are not checked before treatment assignments are issued.
• Clinic should be used for stratification in multicenter trials. This form of stratification will control for differences in the study population due to environmental, social, demographic, and other factors related to clinic.

quickly reaches unmanageable limits when a number of different variables are used. As a result, the choice of variables must be judicious and by definition must be limited to variables that are independent of the treatment assignment. In addition, the choice should be limited to variables that are not subject to large observational or recording errors so as to minimize classification errors made in the stratification process.
The gain in statistical precision from stratification is inconsequential once the number of patients per treatment group reaches 50 or more. The greatest gains are for small trials involving 20 or fewer patients per treatment group (Grizzle, 1982; Meier, 1981).
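The growth in the number of allocation strata can be made concrete with a short calculation. The sketch below is only an illustration: the stratification variables, their numbers of levels, and the trial size are hypothetical and are not taken from any of the trials discussed here.

```python
from math import prod

def number_of_strata(levels_per_variable):
    """Return the number of allocation strata formed by crossing all variables."""
    return prod(levels_per_variable.values())

# Hypothetical stratification variables and their numbers of levels.
levels = {"clinic": 12, "sex": 2, "age group": 3, "risk group": 2}
strata = number_of_strata(levels)

# Average number of patients per stratum and treatment group in a
# hypothetical two-group trial of 600 patients.
patients, treatment_groups = 600, 2
per_cell = patients / (strata * treatment_groups)
print(strata, round(per_cell, 2))
```

With these assumed values, four variables already yield 144 strata and only about two patients per stratum and treatment group, which is the situation in which blocks within strata are rarely filled.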
Clinical trial researchers are divided over the wisdom of stratification at the time of randomization. Those in favor of the process presume that even if it does not increase statistical precision it is unlikely to reduce it. Therefore, why not stratify? Those who question use of the process argue that the statistical gain, at best, is likely to be small. This fact, coupled with the practical complexities involved in administering the process, serves as the main argument against stratified randomization (see Brown, 1980, for pro arguments; Meier, 1981, and Peto and co-workers, 1976, for con arguments). The diversity of opinion is reflected in the trials sketched in Appendix B. Six of the trials did not stratify on any patient characteristic. The other eight used sex, age at entry, and/or some indicator of disease state for stratification (see item 20.b, Table B-4, Appendix B).
The goal in stratification is to reduce the variance associated with treatment comparisons through control of variables that affect outcome. Clearly, there will be no reduction, and hence no gain in statistical precision, if the variables are unrelated to outcome. The more restrictive the patient selection criteria, the less the need for any stratification. The relationship of a variable (e.g., age) to an outcome (e.g., death), even if quite striking when assessed over a broad range of unselected patients, may be modest over the range represented by patients enrolled into the trial.
The CDP provides graphic evidence of the futility of identifying factors that predict mortality, the outcome of interest in that study and several of the others sketched in Appendix B. A multiple linear regression model, using 40 different baseline characteristics as predictors for mortality, accounted for only 10.6% of the observed variance associated with mortality (Coronary Drug Project Research Group, 1974). Risk group, defined by number and severity of previous myocardial infarctions and the only variable used for stratification other than clinic, had little predictive value. It ranked 26 in the list of 40 variables in terms of predictive value. The five most important predictors, in order of importance in a stepwise regression procedure, were: ECG ST segment depression, cardiomegaly (as read from chest x-rays), New York Heart Association functional class, ventricular conduction defects (as read from ECG), and history of use of diuretics. They accounted for over two-thirds of the total variance explained by the model.
Stratification using patient characteristics should not be undertaken lightly. It will complicate the randomization process since assignments cannot be made until all data needed for stratification are in hand. This may delay, sometimes by weeks, the enrollment of a patient if needed data come from laboratory determinations or readings made outside the clinic. Variables that require a series of complex and error-prone classifications in order to be converted into values suitable for use in stratification should be avoided. The same is true for variables requiring subjective interpretations. A high error rate in the classification of patients by strata can negate the effect of stratification and may open the study to criticism when the results are published.
Clinic is a natural stratification variable in multicenter trials. All of the 13 multicenter trials sketched in Appendix B (see item 20.b, Table B-4) used this form of stratification. The cautions expressed above with regard to use of patient characteristics for stratification do not apply to clinic. Use of separate allocation schedules by clinic, with each schedule having the same allocation ratio, ensures comparability of the treatment groups with regard to the mix of patients coming from the various clinics in the trial. This assurance is important since clinic populations can differ widely with regard to a host of characteristics, even if the study has fairly rigid entry criteria. Patients will come from different geographic areas and, hence, will have different environmental exposures and perhaps demographic characteristics as well. Further, there may be subtle differences in treatment patterns from clinic to clinic, even if the study has a well-defined treatment plan. In addition, there are practical reasons for the stratification, especially in masked drug trials in which clinics receive the drugs they are to use in coded bottles from a central supply point. It is much easier for the supplier to estimate the drug needs of individual clinics if the allocation ratio is fixed across clinics than when it is not.
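A minimal sketch of stratification by clinic follows, assuming a hypothetical trial with three clinics, a 1:1 allocation ratio, and blocks of size four. The clinic labels, the seed, and the function name are illustrative assumptions only. Because every clinic uses a separate schedule built with the same ratio, the treatment mix, and hence the drug supply each clinic needs, is known before enrollment starts.

```python
import random
from collections import Counter

def clinic_schedule(n_blocks, block, rng):
    """Build one clinic's schedule from independently shuffled blocks."""
    schedule = []
    for _ in range(n_blocks):
        permuted = list(block)
        rng.shuffle(permuted)
        schedule.extend(permuted)
    return schedule

# Hypothetical design: 3 clinics, 1:1 allocation, blocks of size 4.
block = ["test", "test", "control", "control"]
rng = random.Random(1986)  # arbitrary seed, recorded so the schedules can be regenerated
schedules = {clinic: clinic_schedule(5, block, rng) for clinic in ("A", "B", "C")}

# Each clinic ends with the same treatment mix, so its drug needs are known in advance.
supply = {clinic: Counter(s) for clinic, s in schedules.items()}
print(supply)
```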
Clinic variation in outcome event rates can be seen from inspection of the UGDP results. The number of deaths recorded ranged from a high of ’1 out of the 90 patients enrolled in the Cincinnati clinic to a low of 1 out of the 87 patients enrolled in the Baltimore clinic when the first results from the trial were published (University Group Diabetes Program Research Group, 1970). Four of the 12 clinics accounted for a little over 70% of all deaths reported. Critics of the study cited clinic variation in mortality as one of the explanations for the tolbutamide results (see Chapter 7). However, in doing so they failed to recognize that the variation was unlikely to be treatment-related because of the stratification by clinic in the randomization process.
Normally, the question of who treats within a clinic is ignored in the randomization process. None of the trials sketched in Appendix B controlled for this source of variation. Physician-to-physician variation in treatment practice may be small in masked drug trials, but may not be in unmasked trials, especially those involving surgical procedures. It may be appropriate in such cases to control for anticipated variation by stratifying on treating physician.
Statistical considerations are only one reason for stratification. It is sometimes done simply as a means to protect the study from criticism when it is finished. Indeed, it is easier to answer criticisms concerning the comparability of the study groups if the criticisms focus on variables that have been stratified. However, defensive stratification can backfire if the variables selected are viewed by critics as "inappropriate" or if they are able to make cogent arguments suggesting that other "more important" variables were left uncontrolled.
Stratification is also used to control for a variable known or suspected to interact with treatment (see Glossary for definition of treatment interaction). Stratification of this sort should be considered for any variable that, depending on its level, has the potential of ameliorating or enhancing a treatment effect. The experimenter, via stratification, is able to compare treatment effects across strata and thereby estimate the size of the interaction effect. In actual fact, however, most interactions, unless they are pronounced, are difficult to detect. The typical trial, because of its small size, provides little statistical power for their detection.
Extreme cases of interactions in which the treatment has a positive effect when the interacting variable assumes one state and the opposite effect when it assumes another state should not be controlled via stratification. They should be dealt with by constructing more restrictive selection criteria so only patients who react positively to the treatment are enrolled.
10.3.3 Block size

The investigator must decide whether to constrain the randomization process so as to ensure balance in the number of allocations made to the various treatment groups in a stratum at various points over the course of patient enrollment. Unconstrained randomization may lead to imbalances in the baseline characteristics of the treatment groups if there are, quite by chance, long unbroken runs of assignments to the same treatment and if the type of patients enrolled changes over time. Table 10-2 lists considerations involved in blocking.
The desired allocation ratio in a stratum could be achieved with a single blocking constraint if the exact number of patients to be enrolled in the stratum were known in advance. However, this approach is not recommended. First, there are few situations in which it is practical to recruit to a set limit within a stratum. Hence, failure to achieve the desired recruitment goal could mean that the study closes far from the desired allocation ratio. Second, the approach may allow too much room for variation around the desired allocation ratio over the course of patient enrollment. For example, the constraint in a trial involving two treatments, a 1:1 allocation ratio, and a single block of 100 patients does not take effect until 50 assignments have been made to one of the two treatment groups. Hence, in theory it is possible that the results of the trial could be completely confounded with time of enrollment if the first 50 patients are assigned to the same treatment. A third reason has to do with the need for interim analyses over the course of the trial, as discussed in Chapter 20. These analyses are easier to interpret if large departures from the desired allocation ratio have been avoided. Certainly, blocking is recommended any time recruitment extends over a long period of time.

Table 10-2  Blocking considerations

Blocking should be considered if:
• Patient enrollment is likely to continue over an extended period of time, or if the demographic or clinical characteristics of the study population can be expected to change over the course of enrollment
• There are practical or statistical reasons why it is important to satisfy the specified allocation ratio at various points during the enrollment process

Block size considerations:
• The smallest possible block size is the sum of integers defined by the allocation ratio (see Equation 10.2)
• The block sizes used for construction of an allocation schedule should not be divulged until it is appropriate to do so, and never before patient enrollment is completed
• The larger the block, the greater the chance of departure from the specified allocation ratio
• Variable block sizes are preferable to fixed blocks, especially in unmasked trials
• Use of a large number of allocation strata may lead to a large departure from the specified allocation ratio, unless small block sizes are used within each stratum
The usual approach to blocking in fixed allocation schemes is to use a sequence of blocks of the same size or of differing sizes, each of which is constructed using the same allocation ratio. All of the 14 trials sketched in Appendix B, except two, the National Cooperative Gallstone Study (NCGS) and the Veterans Administration Cooperative Studies Program Number 43 (VACSP No. 43), used this approach.
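The effect of block size on departures from the allocation ratio during enrollment can be sketched as follows. This is an illustration under assumed values (two treatments labeled T and C, 100 patients, an arbitrary seed), not a description of the procedure used in any of the trials cited. It compares a sequence of blocks of size four with a single block of 100 by tracking the largest difference in group sizes observed at any point in the sequence.

```python
import random

def blocked_schedule(n_blocks, ratio, multiple, rng):
    """Permuted-block schedule: each block holds `multiple` copies of the ratio."""
    block = [arm for arm, r in ratio.items() for _ in range(r * multiple)]
    schedule = []
    for _ in range(n_blocks):
        permuted = block[:]
        rng.shuffle(permuted)
        schedule.extend(permuted)
    return schedule

def max_running_imbalance(schedule, arm_a, arm_b):
    """Largest |count(arm_a) - count(arm_b)| seen at any point in enrollment."""
    diff, worst = 0, 0
    for arm in schedule:
        diff += (arm == arm_a) - (arm == arm_b)
        worst = max(worst, abs(diff))
    return worst

rng = random.Random(10)          # arbitrary seed for reproducibility
ratio = {"T": 1, "C": 1}         # 1:1 allocation; smallest block size is 1 + 1 = 2
small = blocked_schedule(25, ratio, 2, rng)    # 25 blocks of size 4
single = blocked_schedule(1, ratio, 50, rng)   # one block of size 100
print(max_running_imbalance(small, "T", "C"), max_running_imbalance(single, "T", "C"))
```

With blocks of size four the difference can never exceed two, whereas with a single block of 100 it can in principle reach 50, the extreme case in which the first 50 patients all receive the same treatment.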
The blocking arrangement used should not be revealed to clinic personnel until it is appropriate to do so (after patient recruitment is completed in unmasked trials and after the trial is completed in double-masked trials). Further, the scheme used should be designed to minimize the chance of clinic personnel discovering the blocking scheme. Discovery of the scheme can lead to selection biases if the information is used to predict future assignments and if the predictions influence decisions on enrollment. The probability of making correct predictions is highest with simple blocking schemes involving small blocks of uniform size. For example, it is 0.5 in designs involving two treatment groups and an unmasked treatment assignment scheme using blocks of size two. The chance of discovering the blocking pattern is minimal with large blocks, even if blocks of uniform size are used, especially if treatments are administered in double-masked fashion as in the CDP (see Section 10.5 and Coronary Drug Project Research Group, 1973a).
The preferred approach, particularly in unmasked trials, involves a mix of different block sizes with the order specified. One arrangement is to have the blocks filled in order according to size. This arrangement may be considered if blocks of several different sizes are used and if the largest block represents a sizable fraction of the total number of assignments anticipated in a stratum. The arrangement reduces the amount of variation around the specified allocation ratio as recruitment proceeds, a desirable feature if the designers wish to have an observed allocation ratio that is near the specified one when recruitment is finished. An alternative approach involves a random order of blocks according to size. It is preferred to the ordering described above when only two or three different block sizes are used and when each stratum contains several blocks of each size. The random ordering eliminates any chance of clinic personnel discovering the blocking pattern.
The usefulness of blocking can be reduced by the use of too many allocation strata. There can be large departures from the desired allocation ratio if none of the blocks in the individual strata are filled by the time patient recruitment is completed. Use of small block sizes will help guard against this problem, but their use may increase the chances of predicting future assignments, as discussed above.
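A random ordering of blocks of different sizes might be sketched as below. The function name, the permitted block sizes (two, four, and six), the target of 40 assignments, and the seed are all assumptions made for illustration. The schedule is built block by block, so the final block may carry it past the target; any assignments beyond the number of patients enrolled are simply never issued.

```python
import random

def variable_block_schedule(n_patients, ratio, multiples, rng):
    """Fill a stratum with permuted blocks whose sizes are chosen at random.

    ratio     -- allocation ratio, e.g. {"T": 1, "C": 1}
    multiples -- allowed block sizes expressed as multiples of sum(ratio.values())
    """
    schedule, sizes = [], []
    while len(schedule) < n_patients:
        m = rng.choice(multiples)
        block = [arm for arm, r in ratio.items() for _ in range(r * m)]
        rng.shuffle(block)
        schedule.extend(block)
        sizes.append(len(block))
    return schedule, sizes

rng = random.Random(3)  # arbitrary seed for reproducibility
schedule, sizes = variable_block_schedule(40, {"T": 1, "C": 1}, [1, 2, 3], rng)
print(sizes)
```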

10.4 CONSTRUCTION OF THE RANDOMIZATION SCHEDULE

The randomization schedule can be constructed once the design specifications outlined in Section 10.3 have been set. Construction may be done using output from:

• A published list of random numbers, e.g., as provided by The Rand Corporation (1955)
• Published random permutations of a set of numbers, e.g., those appearing in Cochran and Cox (1957) and Fisher and Yates (1963)
• A computer-based pseudo-random number generator

Methods such as coin flipping, where the order of assignment cannot be replicated, are unacceptable (see Chapter 8).
Most computer statistical packages include pseudo-random² number generators. These may be used for construction of the allocation schedule, but with some caution. Output from some of the generators involves serial correlations (e.g., see Hauck, 1982). While the defect is not of great concern in the allocation process, it is best to use a generator that has been tested for the defect and found to be free of it.
An algorithm is needed to translate output obtained from the randomizing device into treatment assignments. The translation is straightforward for schemes based on tables of random permutations, as in Illustration 1 in Section 10.8. It is more complicated for schemes using output from tables of random numbers or from pseudo-random number generators. The method described in Table 10-3 is based on an algorithm proposed by Moses and Oakford (1963) and can be implemented using the worksheet displayed in Table 10-4. Use of the algorithm is illustrated in Table 10-5 for a random sequence of numbers selected from Table 10-6.

2. So termed because the numbers they generate are not the result of a random process, but have properties similar to those generated via a random process.

Table 10-3  Moses-Oakford assignment algorithm for block of size k

Step / Illustration (see Table 10-5)

1. Specify number of treatment groups, t + 1.
    Illustration: t + 1 = 4
2. Specify treatment allocation ratio, r_1:r_2: ... :r_{t+1}, such that r_1 + r_2 + ... + r_{t+1} = B (see Equation 10.2).
    Illustration: r_1 = 1, r_2 = 1, r_3 = 1, r_4 = 1
3. Specify block size k such that it is ≥ B and is divisible by B.
    Illustration: k = 8
4. Specify treatment symbols or codes.
    Illustration: C = Control, T1 = Test treatment 1, T2 = Test treatment 2, T3 = Test treatment 3
5. Set down an arbitrary sequence of treatment symbols in column 2 of worksheet (Table 10-4), such that the allocation ratio specified in step 2 is satisfied.
6. Generate a random number,* N_1, such that it is ≥ 1 but ≤ k; record value in column 5, line k, of worksheet.
    Illustration: N_1 = 1, record on line 8, col. 5
7. Take treatment symbol on line N_1, column 2, and record on line k, column 4.
    Illustration: C, from line 1, col. 2, record on line 8, col. 4
8. Cross out symbol on line N_1, column 2. Record symbol given on line k, column 2, on line N_1, column 3 (skip if N_1 = k).
    Illustration: Cross out C, line 1, col. 2, add T3 to line 1, col. 3
9. Generate a new random number N_2 such that it is ≥ 1 but ≤ k − 1 and record in column 5, line k − 1.
    Illustration: N_2 = 4, record on line 7, col. 5
10. Take treatment symbol on line N_2 from column 2, or from column 3 if any appear in column 3, and record on line k − 1, column 4.
    Illustration: T1 from line 4, col. 2, record on line 7, col. 4
11. Cross out the symbol appearing in columns 2 or 3, line N_2. Record symbol given on line k − 1, columns 2 or 3, on line N_2, column 3 (skip if N_2 = k − 1).
    Illustration: Cross out T1, line 4, col. 2, add T3 to line 4, col. 3
12. Repeat steps 9, 10, and 11, reducing the upper limit of permissible random numbers by 1 for each repetition,** until all but the last assignment has been made.
    Illustration: As outlined above
13. Complete the scheme by recording in column 4 the unused treatment symbol appearing on line 1, columns 2 or 3.
    Illustration: Take T3 from line 1, col. 3, record on line 1, col. 4

*From page 17 of The Rand Corporation's 1 million random digits (1955), as reproduced in Table 10-6.
**The algorithm is written to allow the user to work from the bottom up on the worksheet illustrated in Table 10-4. It can be written to allow working from the top down, but this arrangement complicates keeping track of the permissible range for the next random number to be selected. When the algorithm works as outlined, the limit for the next number to be selected is given by the line number of the next line on the sheet to be filled.

10.5 MECHANICS OF MASKING TREATMENT ASSIGNMENTS

Masked administration of treatment (see Chapter 8 for discussion of the rationale for masking) is feasible only in cases in which it is possible to administer all study treatments in an identical fashion and in which clinic personnel do not need to know the identity of the treatment being administered in order to care for the patient receiving it. Most applications of masked treatment administration arise in the context of drug trials. Masking is accomplished by bottling, packaging, labeling, and dispensing the test and control drug in an identical fashion. Tablets may have to be formulated using a taste-masking substance, such as quassin as in the CDP Aspirin Study (Coronary Drug Project Research Group, 1976), to obscure telltale tastes. Another alternative is to use an enteric coating on the tablets, provided the coating does not reduce the bioavailability of the drug. Generally, masking the identity of a drug is easier to accomplish if the drug is contained in capsules than if it is contained in tablets. The capsules help to obscure taste differences that may be present when tablets are used.
There can be subtle differences in sheen, color, or texture of tablets as well. For example, there was a slight difference in the sheen of tolbutamide tablets as contrasted with the corresponding placebo tablets in the UGDP. However, the difference was apparent only in indirect light, and then only in side-by-side comparisons of the two kinds of tablets. Such differences are avoided with opaque capsules.
Trials involving multiple test treatments should be designed with the goal of using a single placebo unless it is not possible or practical to do so. The goal cannot be achieved if the study medications are dispensed in different forms, as in the case of the UGDP. Two kinds of placebo pills were required, one to match tolbutamide tablets and the other to match phenformin capsules.
Use of a common placebo imposes the same pill schedule on all patients, regardless of treatment assignment. For example, the CDP required all patients to take nine capsules per day in order to deliver the required dosage of nicotinic acid. Several of the medications could have been delivered via a smaller number of capsules (Coronary Drug Project Research Group, 1973a). However, this would have required a different placebo for those drugs.
The way in which medications are bottled and labeled is important. There is no value in going to great lengths to develop matching tablets or capsules if the test and control medications arrive at the clinic in different sized or colored bottles. The differences do not have to be great to destroy the mask. Subtle variations in the way the bottles are capped or labeled may be enough to do the job. The best approach is to have all medications bottled and labeled at the same facility, under strictly controlled conditions. The VA Cooperative Studies Program has established a central pharmacy to supply its trials with needed study medications (Hagans, 1974). Various other trials, such as the CDP, have contracted with a single facility to supply drugs to the study clinics (Coronary Drug Project Research Group, 1973a).
In a typical drug trial, clinics will dispense drugs by bottle number. The treatment assignment issued by the data center will indicate the bottle number to be used. The simplest bottle numbering scheme is one in which all bottles containing a given drug bear the same number or letter designation. The trouble with such schemes is that all patients on a drug are unmasked as soon as any one patient on the drug is unmasked. Use of a unique bottle number for every patient in a clinic avoids this problem, but such schemes complicate the logistics of supplying clinics with needed drugs. A compromise between these two extremes was used in the CDP. Each clinic was supplied with sets of bottles, labeled from 1 through 30, as discussed in Illustration 7 of Section 10.8.7. This meant that clinics had somewhere between 5 and 8 patients on the same bottle number by the time recruitment was finished.

Table 10-4  Moses-Oakford treatment assignment worksheet for block of size k

Treatment codes: ______    Block size: ______    Allocation ratio: ______
Random numbers*  Source: ______  Page: ______  Column: ______  Row: ______  Start: ______  End: ______

(1) Order of     Treatment assignments                          (5) Random
    assignment   (2) Initial   (3) Replacements   (4) Final         number
    1
    2
    3
    4
    5
    6
    7
    8

*Reading rule: ______

Table 10-5  Illustration of Moses-Oakford algorithm

[Completed worksheet in the format of Table 10-4 for a block of size 8 with treatment codes C, T1, T2, and T3 and a 1:1:1:1 allocation ratio. The handwritten entries did not survive reproduction. Entries recoverable from the illustration column of Table 10-3: line 8, final assignment C, random number 1; line 7, final assignment T1, random number 4; line 1, final assignment T3.]
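The bottom-up procedure of Table 10-3 can be sketched in Python as follows. This is a minimal sketch rather than a reproduction of the authors' worksheet: the function and variable names are invented for illustration, the starting sequence C, C, T1, T1, T2, T2, T3, T3 is an assumption consistent with the illustration entries in Table 10-3, and the seed is hypothetical. The `draw` argument stands in for whatever randomizing device is used (a table of random digits or a pseudo-random number generator).

```python
import random
from collections import Counter

def moses_oakford_block(initial, draw):
    """Permute one block by working from the bottom line of the worksheet up.

    initial -- arbitrary starting sequence satisfying the allocation ratio
               (column 2 of the worksheet)
    draw    -- function returning a random integer N with 1 <= N <= upper
    """
    work = list(initial)            # columns 2 and 3: symbols still available, by line
    k = len(work)
    final = [None] * k              # column 4: final assignments, by line
    for line in range(k, 1, -1):    # lines k, k - 1, ..., 2
        n = draw(line)              # column 5: random number for this line
        final[line - 1] = work[n - 1]
        work[n - 1] = work[line - 1]  # replacement written on line N (no change if N == line)
    final[0] = work[0]              # last unused symbol goes on line 1
    return final

# Assumed starting sequence: k = 8, four treatments, 1:1:1:1 allocation.
initial = ["C", "C", "T1", "T1", "T2", "T2", "T3", "T3"]

rng = random.Random(17)  # hypothetical seed; it would be recorded with the schedule
block = moses_oakford_block(initial, lambda upper: rng.randint(1, upper))
print(block, Counter(block))
```

Supplying the draws 1 and 4 for lines 8 and 7 reproduces the first two illustration steps of Table 10-3 (C on line 8, T1 on line 7). Whatever numbers are drawn, the block always contains each treatment in the specified ratio.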
Table 10-6  First 25 lines of page 17 of The Rand Corporation's 1 million random digits

Row      Column number
number   1-5    6-10   11-15  16-20  21-25  26-30  31-35  36-40  41-45  46-50

 1       00397  56753  53158  71872  68153  09298  20961  49656  33407  95683
 2       14328  44708  72952  27048  67887  28741  46752  88177  95894  40086
 3       88534  87112  68614  83073  88794  96799  67588  75049  84603  83140
 4       97347  87316  73087  77135  71883  98643  03808  08848  14133  60447
 5       01366  72976  01868  51667  63279  60040  88264  79152  03474  61366

 6       20523  21584  93712  83654  89761  90154  96345  37539  32556  74254
 7       70603  97122  44978  78028  08943  13778  11080  34271  68266  8537?
 8       48410  94516  15427  75323  71685  70774  50342  33771  03678  42321
 9       69788  41758  55004  30992  17402  63523  42328  87171  24751  15084
10       33884  83655  88345  69602  52606  57886  18034  03381  75796  35901

11       77480  28683  68324  66035  07223  14926  16128  13645  90370  31949
12       11057  98849  29499  21565  30786  83292  92392  37104  36899  49906
13       79368  43710  80365  88735  75275  21664  57965  19002  00301  12658
14       94385  01717  96191  50404  80166  93965  24688  27839  10812  31715
15       92127  42588  93307  80834  11317  26583  25769  98227  14887  58462

16       29148  68662  26872  72927  79021  51622  29521  33355  45701  45996
17       33782  93424  16530  96086  17329  74020  11501  46660  05583  22277
18       77653  55430  84644  00448  86828  58855  67451  95264  67386  82«24
19       52611  60012  88620  72894  94716  22262  99813  69592  63464  33163
20       91857  47904  22209  78590  68615  52952  31441  41313  18550  72685

21       68825  04795  53971  14592  39634  23682  76630  02731  81481  86542
22       23727  54291  56045  61635  32186  90355  73416  63532  24340  18886
23       84832  30654  48543  18339  65024  91197  64624  74648  09660  27897
24       49771  11123  08732  49393  12911  72416  17834  18878  62754  85072
25       23727  56577  51257  83291  12329  16203  91681  68138  79959  43609

Source: Reference citation 387. Reprinted with permission of The Rand Corporation (New York: The Free Press, 1955). Copyright © 1955 and 1983 by The Rand Corporation.

However, it also meant that they could get by with a much smaller inventory of drugs than would have been required with individually numbered bottles.
Most prepackaged medications in masked trials will be supplied to clinics with a two-part label, as illustrated in Figure 10-1. One part of the label will be affixed to the package and dispensed with it. It should bear the name of the study, the bottle or container number, instructions for taking the medication, and the name of the physician or clinic responsible for dispensing the medication. The other part of the label is loosely affixed to the container. Its prime purpose is to indicate container contents, either on the face of the label (for single-masked trials), or by breaking a seal (for double-masked trials). It is required for interstate shipment of drugs under federal law; it is illegal to ship drugs across state lines without it. It is detached when the medication is dispensed and is ordinarily retained at the clinic to allow clinic personnel to unmask a medication in an emergency.

Figure 10-1  Stylized bottle label for medications dispensed in the XYZ trial.

A. Affixed portion of bottle label
    The XYZ Trial
    Bottle number 42
    Rx: Take one capsule each morning
    Date 3 7 <95
    For: Harry L. Green
    John Smith, M.D.
    Phone: 555 1701

B. Detachable portion of bottle label
    The XYZ Trial
    Bottle number 42
    In case of emergency open this label for bottle contents
    Call: (301) 955 8M89
    XYZ Coordinating Center

10.6 DOCUMENTATION OF THE RANDOMIZATION SCHEME

There should be a written description of the scheme used to generate the allocation schedule. It should be written when the randomization schedule is produced and should be checked for clarity and accuracy before it is filed for future reference. Table 10-7 provides an outline of the items to be covered in the writeup. The details should be sufficient to allow a person from outside the study to reproduce the schedule with the information provided.
The documentation may be needed to defend the study years after the completion of randomization. The UGDP serves as a case in point. The Committee for the Assessment of Biometric Aspects of Controlled Trials of Hypoglycemic Agents (1975), appointed to review the study years after the completion of patient enrollment, was especially interested in the randomization process used.

Table 10-7  Items that should be included in the written documentation of the allocation scheme

A. For procedures using published lists of random numbers
• Reference citation to the published numbers
• Section of the table or list used (indicate enough detail to allow regeneration of the schedule)
• Reading instructions indicating the order in which numbers are read, including a description of any modular arithmetic used to convert numbers outside the usable range to usable values
• Specifications of the construction process, such as those listed for illustrations in Section 10.8
• Worksheets or computer program used to generate the assignment list
• Copy of the assignment list

B. For procedures using computer-based pseudo-random number generators
• Reference citation to the pseudo-random number generator
• Program listing of the pseudo-random number generator
• Seed used to start the generation process
• First and last numbers generated with the seed
• Specifications for the construction process, such as those listed for illustrations in Section 10.8
• Computer programs used to generate the assignment list
• Copy of the assignment list

10.7 ADMINISTRATION OF THE RANDOMIZATION PROCESS

An allocation scheme, no matter how carefully constructed, will be useless as a means of protecting against patient selection bias if it is not followed. Departures from the schedule to accommodate the desire of a patient or his physician, no matter how well motivated, are never justified. They can invalidate the results of the entire trial if they are numerous and if there are reasons to believe they are treatment related. A carefully executed trial will include various safeguards to make certain the assignment schedule is followed, as listed in Table 10-8.
The preferred system is one in which allocations are issued from a central point on a per-patient basis. The main advantage with such systems, as opposed to systems with no central control (e.g., as in systems with envelopes placed in the clinic to be used in the order provided), is the audit trail provided and the opportunity to proscribe release of an assignment until a patient has been shown to be eligible for enrollment via the data provided, the required baseline data have been collected, and his consent to participate has been obtained. The CDP used a

centrally administered mail-based assignment scheme (Coronary Drug Project Research Group, 1973a). The Coronary Artery Surgery Study (CASS) used a centrally administered telephone-based assignment scheme (Coronary Artery Surgery Study Research Group, 1981). Either scheme is preferable to one that is self-administered. Self-administered systems are subject to the abuses noted in Section 8.4.

Table 10-8  Safeguards for administration of treatment allocation schedules

• Avoid the use of any assignment scheme that has a high degree of predictability (e.g., use of small blocks as discussed in Section 10.3.3)
• Keep each treatment assignment masked to the patient, physician, and person issuing the assignment until the patient has been accepted into the study and is ready to start treatment
• Vest responsibility for issuing assignments in an individual or group located outside the clinic
• Withhold disclosure of an assignment until the patient is judged eligible for enrollment, has given his consent to be enrolled, and all essential baseline data have been obtained
• Make certain that the assignment process establishes a clear audit trail that indicates who requested the assignment and when it was issued
Table 10-9 contains a facsimile of an alloca
tion schedule from the CDP, as used in the Coor
dinating Center for making assignments. The
allocation process required the clinic to initiate
the request. This was done by sending the forms
completed for a patient’s two prerandomization
visits to the Coordinating Center. An allocation
was not released by the Center if essential items
of information were missing from the forms, if
an eligibility stop condition (see Section 12.5.8)
had been checked, or if the clinic did not indicate

that a signed consent had been obtained (rrthe patient indicating his willingness to be en
rolled into the trial.
Once all essential conditions were met. a tree
ment assignment form was prepared (Pan K
Table 10-10). The bottle assignment recorded rthe form was taken from the first topmost cmr*'
line of the allocation schedule for the clinic av
stratum to which the patient belonged (the th •:
line in the sample schedule in Table 10 9) isID number and the name of the patient wrr
entered on the line. After entry of the requi-r:
data on the treatment assignment form, it **
placed in an opaque envelope (Part B. Table l<
10), which was then sealed and placed in a larrenvelope for mailing to the clinic. The inner evelope was retained in sealed condition at ihr
clinic until the patient returned for his final ba*
line examination and was judged ready to uatreatment. A patient was not considered enroll

n the trial until the clinic opened the treatment
^liKation envelope. Once this was done the
rjnent was counted as a member of the treat
ment group to which he had been ass.gned. Asvenments issued for patients who failed to re•urn lor their last baseline visit, or who withdrew
the.r consent at that visit, were not counted,
provided they were returned to the Coordinating
(enter in sealed condition. The ID numbers and
-jmes of such patients were deleted from the
j K-ation schedule on receipt of the sealed enve■pcs at the Coordinating Center. The assign-

Part A.

ments in question were not reissued. The small
amount of imbalance introduced in this way was
not considered serious enough to justify the ef
fort involved in reissuing the assignments.
The allocation schedule used by personnel in
the CDP Coordinating Center revealed the con
tents of the bottles assigned (see Table 10-9).
The presence of this information violates one of
the masking safeguards listed in Table 10-8.
However, there is no evidence that this informa
tion had any effect on the assignment process.
The mail system described was made possible

Sample CDP allocation form and envelope

Ttble 10-10

COP treatment al location form
treatment

wa have received your

al location

for

Mr .
Table 10-9

Sample CDP treatment allocation schedule

Order of
assignment
within block

Bottle
number to
be assigned

I
2
3
4
5
6
7
8
9
10
II
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30

29
14
26
2

27
19
15
16
13

25
10
4
24
23
9

30
17
20
II

Bottle
contents

Patient
ID number

CPIB
NICA
PLBO
ESG2
ESC I

(56-001
(56-002

NICA
DT4
CPIB
PLBO
PLBO

ESGI
ESG2
PLBO
PLBO
DT4

6

ESG2
DT4
DT4
PLBO
CPIB

5
28
22
18
7

PLBO
ESGI
CPIB
ESGI
ESG2

8
I
12
21
3

NICA
PLBO
PI HO
PLBO
NICA

(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(

)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)
)

( JAMEI
( ASJON

(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(
(

Is

Identified

This person should race Ive
by the following number:

Patient name
or name code

)
)
)
)
)
I

)
)
I
)
)
)
)
)
)
)
)
)
I
)
)
I
)
I
I

)
I
)
)
I

Sourer: Reference citation 104. Adapted with permission of the American Heart Association. Inc.. Dallas.
Texas

I dent(tying number

• hose

The sealed tear off

portion

of the label on each bottle should

The patient’s name, treating
bo removed prior to dispensing.
physician, date and prescription number should be recorded on
the
label
prior to filing with the
the tear off portion of
patient’s prescription record.

Th« treatment should be Initiated at Initial visit 3 and should
be admlnls+ared on the fol IoeI ng scheduIe :
I

capsule three times a day after meals from Initial
visit 3 through Initial visit 4.

2

capsules three times a day after meals
visit 4 through Initial visit 5.

5

NOTE:

103

from

Initial

capsules three times a day a’+er meals a'ter Initial
visit 5 throughout th« remainder o’ the study on the
above named person unless clinically contraindicated
If the date on which the treatment allocation envelope
has been opened Is more than four months after the date
of Initial Visit I (which, as Indicated on Form 01, Is
______________ _) , this allocation must be returned unused
to the CDP Coordinating Center and this patient must
Start anew with Initial Visit 1.

Date of Allocation
CDP Coordinating Center
2'201
Balti more, Mary I and

I

104

Randomization and the mechanics of treatment masking

Part B:

10.8 Illustrations

Treatment allocation envelope

pouter (IBM S/23 DataMaster). The assignlynt was released via the computer, but only if
data forms entered by the clinic met the edit
•rMs necessary for assignment.
M ins trials, especially single-center Inals, which
jinnot arrange for a centrally administered allo.„n scheme, must rely on self-administered
Khemes managed at the clinic. The usual ap~.>.Kh in such cases is to place the assignments
- kc.tied envelopes arranged in a predetermined
-dcr with personnel instructed to use the enve■pck in order of arrangement, as indicated by
-. ’mhers appearing on the faces of the envexs Strict ground rules should be established
• indicate when envelopes are to be opened and
■ > ensure that patients are counted in the trial
•xr this has happened. Persons authorized to
•uw an allocation envelope should be required
•» check the prerandomization data form for
-umuk data and for exclusion conditions before
•‘•c envelope is opened. Documents completed in
•sr allocation process should identify the patient
• f whom the assignment was intended and the
• tic the envelope was opened. The time infor-ition is important when checks are made to
.‘-termine if envelopes are used in the order
-dicatcd.
I here is, of course, no method of allocation
•hat is completely foolproof. It is important for
,K ' reason to perform periodic checks for breakd-'wns in the assignment process, regardless of
n is administered. It is dangerous to assume
•hat the rules for allocation, no matter how exrlxitly outlined, will always be followed. The
checking that is carried out should be performed
s an individual or group of individuals not
J rrctly involved in the assignment process. For
rumple, such checks in CASS were made by an
ntcrnal review team during visits to the CASS
(■wdinating Center. A similar function can be
performed by the statistician or some other indi'•dual in the case of small-scale single-center
'r als using self-administered allocation schemes.

’’

CORONARY

DRUG PROJECT

TrajtiMnt A I I ocat I on

Mr.
1.0.

I

DO NOT OPEN untlI

Instructed to do so In Form 02

No.

tat Initial visit 3).

I
. 1

Bi
Sourer Coronary Drug Projecl Research Group

because of the time separation between initiation
of the request and treatment—generally about a
month. Telephone assignments were allowed
only when there was not adequate time to com
plete the mail circuit and then only if Coordinat
ing Center personnel were satisfied that the pa
tient in question was eligible for enrollment, that
the clinic had completed the necessary forms,
and that they had obtained his consent for en
rollment.
The scheme described above cannot be used in
cases where clinic personnel have to have the
assignment as soon as the patient agrees to en
roll. A system for making telephone allocations,
such as used in CASS, has to be used in such
cases, unless the study is willing to rely on a
noncentral self-administered scheme (not recom
mended). The procedure in CASS required
Coordinating Center personnel to carry out a
series of telephone-administered checks with the
requesting party before an assignment could be
released. They included:

• Checks for eligibility
• Checks on the disease classification (needed
for proper stratification)
• Checks to determine if the patient had signed
the study consent statement and had indi-

cated his willingness to accept either surr
eal or medical treatment.
• Checks to make certain a date for surgen h»!
been set (for use if the patient was asMgnr.*
to surgery)

CASS Coordinating Center personnel respom,
ble for issuing assignments were masked »itfc
regard to assignments until the telephone mtr
view was completed. This was done to protec
against premature disclosure of assignments dur
ing the interview process.
The telephone assignment process used m
CASS could be managed during the norm*
working hours of the Coordinating Center Th-*
may not be possible in studies involving climo
scattered across a large number of time rono
Extended hours of phone coverage will
needed in such cases. Twenty-four-hour pho’*
coverage will be needed when the trial insohn
emergency treatments that must he initiated »
soon as possible.
The advent of low-cost, stand-alone minicom
puters makes it possible to control the asMgnment process without any contact with the
coordinating center, as in the Hypertension Pre...............................
\ invention Trial (HPT). A clinic
in that stud*
itiated a request for assignment via an on-sitc
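The release conditions described for the CDP mail system (complete forms, no eligibility stop condition, consent indicated) together with the audit-trail safeguard of Table 10-8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class, function, and field names are hypothetical, "clinic 56" and the patient IDs are made up for the example, and the three schedule lines merely mimic the layout of Table 10-9. It is not a description of the software or clerical procedure the CDP actually used.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AssignmentRequest:
    patient_id: str
    requested_by: str                # clinic making the request (hypothetical field)
    forms_complete: bool             # no essential items missing from the forms
    eligibility_stop_checked: bool   # an eligibility stop condition was checked
    consent_indicated: bool          # clinic indicated that a signed consent was obtained


def release_blockers(req):
    """Return the reasons an assignment must be withheld (empty list if none)."""
    reasons = []
    if not req.forms_complete:
        reasons.append("essential items missing")
    if req.eligibility_stop_checked:
        reasons.append("eligibility stop condition checked")
    if not req.consent_indicated:
        reasons.append("signed consent not indicated")
    return reasons


def issue_assignment(req, schedule, audit_log):
    """Release the first empty schedule line, or withhold; log either outcome."""
    stamp = datetime.now(timezone.utc).isoformat()
    blockers = release_blockers(req)
    if blockers:
        audit_log.append((stamp, req.requested_by, req.patient_id, "withheld", blockers))
        return None
    line = next((ln for ln in schedule if ln["patient_id"] is None), None)
    if line is None:
        raise RuntimeError("allocation schedule exhausted for this stratum")
    line["patient_id"] = req.patient_id
    audit_log.append((stamp, req.requested_by, req.patient_id, "issued", line["bottle"]))
    return line["bottle"]


# Three schedule lines laid out like Table 10-9; the first two are already used.
schedule = [
    {"bottle": 29, "patient_id": "56-001"},
    {"bottle": 14, "patient_id": "56-002"},
    {"bottle": 26, "patient_id": None},
]
audit_log = []
ok = AssignmentRequest("56-003", "clinic 56", True, False, True)
no_consent = AssignmentRequest("56-004", "clinic 56", True, False, False)
print(issue_assignment(no_consent, schedule, audit_log))  # withheld, so None
print(issue_assignment(ok, schedule, audit_log))          # bottle on the third line
```

Every request leaves an entry in the log whether or not an assignment is released, which is the property a later review of the assignment process would rely on.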

10.8 ILLUSTRATIONS

The illustrations in this section are designed to acquaint the reader with various techniques for constructing allocation schedules. The first 5 illustrations are for unmasked trials. Illustrations 6 and 7 are for masked trials. Illustration 1 involves use of random permutations of a set of numbers for constructing the randomization schedule. All the remaining illustrations, except Illustration 5, involve use of random number tables. Illustration 5 involves use of a pseudo-random number generator.
10.8.1 Illustration 1: Restricted randomization using a table of random permutations

a. Specifications
• Treatment groups: 3
• Allocation ratio: 1:1:2
• Blocking constraints:
  - Number of blocks: 3
  - Block sizes: k1 = 12, k2 = 4, k3 = 4
• Treatment masking: None
• Stratification variables: None
• Random permutation source: Cochran and Cox (1957). See Table 10-11.

b. Approach

Step 1 Establish treatment notation. Let:
T1 denote test treatment 1
T2 denote test treatment 2
C denote control treatment
Step 2 Establish treatment coding rule. Assign:
C for integers 1 through k/2
T1 for integers 1 + (k/2) through 3k/4
T2 for integers 1 + (3k/4) through k
Step 3 Select a random start in table of random permutations. Set 7, Table 10-11, in this example.
Step 4 Establish reading rules. Read from left to right, i.e., use set 7 for first block, set 8 for second block, and set 9 for third block. Skip numbers in a permutation set that exceed the indicated block size.
Step 5 Record the assignment sequence. See third column of Table 10-12.

c. Comment

Note that the allocation ratio of 1:1:2 is satisfied in each of the three blocks.
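The coding rule of Step 2 and the skipping rule of Step 4 can be checked with a short Python sketch. The function names are hypothetical, and the input values are the block 1 entries listed in the second column of Table 10-12 (the values of 12 or less from permutation set 7); the sketch assumes block sizes divisible by 4, as in this illustration.

```python
from collections import Counter


def code_treatment(value, k):
    """Step 2 coding rule for a block of size k (k assumed divisible by 4)."""
    if not 1 <= value <= k:
        raise ValueError("value lies outside the block")
    if value <= k // 2:
        return "C"       # integers 1 through k/2
    if value <= 3 * k // 4:
        return "T1"      # integers 1 + k/2 through 3k/4
    return "T2"          # integers 1 + 3k/4 through k


def block_assignments(permutation, k):
    """Step 4: skip values exceeding the block size, then code the rest in order."""
    usable = [v for v in permutation if v <= k]
    return [code_treatment(v, k) for v in usable]


# Block 1 values as listed in Table 10-12 (block size 12).
block1_values = [4, 10, 5, 6, 3, 12, 11, 9, 2, 1, 7, 8]
block1 = block_assignments(block1_values, 12)
print(block1)
print(Counter(block1))
```

The counts come out as 6 C, 3 T1, and 3 T2, that is, the 1:1:2 ratio noted in the comment.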

Table 10-11 Reproduction of 20 sets of random permutations of first 16 integers, from page 584 of Cochran and Cox (1957)

Permutation set (each set occupies a column in the original and is listed here as a row, read from the top of the column down)

Set 1:    9 11 14  4  6  2  5 16  8  1 13 15  7 10  3 12
Set 2:   16  3 14 13  6 10 15  5 12  8  9  1  4  2  7 11
Set 6:   11 13 15  7 10  3  4  9  8  5  2  6  1 16 14 12
Set 7:    4 10  5  6 16  3 14 12 13 15 11  9  2  1  7  8
Set 8:   16  1 14  2  7 10 13  6 15  9  3  4 11 12  5  8
Set 9:   11  4 14 16  2  5  6  3 13  9  8 10 12  7  1 15
Set 10:  10 13 11  1 12  6  4  7  9  3  8  5  2 15 14 16
Set 11:   2 11  1 14  6  5 12  3  4 10 15  8 16  9 13  7
Set 12:   5  8 14  9 12 16  4  7  3 11  1 13 15 10  2  6

[Sets 3 through 5 and 13 through 20 are not recoverable from the source.]

Source: Reprinted with permission of John Wiley & Sons, Inc., New York (copyright © 1957).

Table 10-12 Allocations for Illustration 1

Order of      Value from      Treatment
assignment    Table 10-11*    assignment

Block 1
   1               4              C
   2              10              T2
   3               5              C
   4               6              C
   5               3              C
   6              12              T2
   7              11              T2
   8               9              T1
   9               2              C
  10               1              C
  11               7              T1
  12               8              T1
Block 2
  13               1              C
  14               2              C
  15               3              T1
  16               4              T2
Block 3
  17               4              T2
  18               2              C
  19               3              T1
  20               1              C

*Starting point: Permutation set number 7, Table 10-11.

10.8.2 Illustration 2: Unblocked allocations using a table of random numbers

a. Specifications

• Treatment groups: 2
• Allocation ratio: 1:1
• Blocking constraints: None
• Treatment masking: None
• Stratification variables: None
• Random number source: Rand Corporation (1955)

b. Approach

Step 1 Establish treatment codes. Let:
C denote control treatment
T denote test treatment
Step 2 Select an arbitrary starting point from table of random numbers. Suggested method:
i. Arbitrarily open book to some page and place the point of a pencil on the page without looking. Use the three digits to the immediate right and nearest the point to designate the starting page (17 in example).
ii. Repeat the process described in step i to select a starting column (22 in example) and row (3 in example) for the page selected (page 17 in the example, see Table 10-6).
Step 3 Define order in which numbers are to be used. Read from left to right and down by row, beginning at the point designated in step 2. Use single integers.
Step 4 Establish correspondence between numbers selected and treatment assignments. For this illustration use odd integers (1, 3, 5, 7, 9) to designate assignment to the control treatment (C) and even integers (0, 2, 4, 6, 8) to designate assignment to the test treatment (T).
Step 5 Record the treatment assignment sequence (see third column of Table 10-13).

c. Comment

Note that the sequence for the first 20 assignments provided 9 T assignments and 11 C assignments for an observed allocation ratio of 1:1.2 instead of the desired ratio of 1:1.

Table 10-13 Allocations for Illustration 2

Order of      Random      Treatment
assignment    number*     assignment

   1             8            T
   2             7            C
   3             9            C
   4             4            T
   5             9            C
   6             6            T
   7             7            C
   8             9            C
   9             9            C
  10             6            T
  11             7            C
  12             5            C
  13             8            T
  14             8            T
  15             7            C
  16             5            C
  17             0            T
  18             4            T
  19             9            C
  20             8            T

*Starting point: Row 3, Column 22, Table 10-6.

10.8.3 Illustration 3: Blocked allocations using the Moses-Oakford algorithm and a table of random numbers

a. Specifications

Same as for Illustration 2 except:

• Blocking constraints:
  - Number of blocks: 4
  - Block sizes: k1 = 10, k2 = 4, k3 = 2, k4 = 4

b. Approach

Step 1 Same as for Illustration 2.
Step 2 Starting point: row 9, column 42, Table 10-6.
Step 3 Reading instructions: Left to right to end of row, then down, row by row. Use pairs of integers as long as the remaining block size is ≥10. Skip 00 and pairs of integers that exceed remaining block size. Use single integers once the remaining block size is ≤9. Ignore 0. (Note: Most numbers exceeding the remaining block size could be converted to the usable range through subtraction of an appropriate multiple of the remaining block size if desired. For example, the number 53 converts to 9 by subtracting 44 if the remaining block size is 11. However, such arithmetic is tedious and subject to error if done by hand and therefore is not done in this example.)
Step 4 Set down an arbitrary order of treatments, as shown in column 2, Table 10-14.
Step 5 Establish the final order of treatment assignment (column 4, Table 10-14) using the Moses-Oakford algorithm (Table 10-3).

c. Comment

The table below gives the location of the first and last numbers used from Table 10-6 for each of the four blocks.

Block         First number         Last number
number       Column     Row       Column     Row

  1            42         9          22        10
  2            31        10          40        10
  3            50        10          50        10
  4             3        11          14        11
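The odd/even rule of Step 4 in Illustration 2 is simple enough to verify mechanically. The sketch below, with a hypothetical function name, applies it to the 20 random digits listed in Table 10-13 and tallies the result.

```python
from collections import Counter


def assign_from_digit(digit):
    """Step 4 of Illustration 2: odd digit -> control (C), even digit -> test (T)."""
    return "C" if digit % 2 == 1 else "T"


# Random digits in the order listed in Table 10-13.
digits = [8, 7, 9, 4, 9, 6, 7, 9, 9, 6, 7, 5, 8, 8, 7, 5, 0, 4, 9, 8]
assignments = [assign_from_digit(d) for d in digits]
counts = Counter(assignments)
print("".join(assignments))
print(counts["T"], "T assignments,", counts["C"], "C assignments")
```

The tally reproduces the 9 T and 11 C assignments noted in the comment, which is the kind of chance imbalance that blocking (Illustration 3) is meant to prevent.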

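For Illustration 3, the following Python sketch shows one way the blocked schedule could be produced by computer. Table 10-3 is not reproduced in this section, so the sketch assumes that the Moses-Oakford algorithm is the usual swap-based procedure: working from the last position of the block toward the first, a random integer between 1 and the number of positions remaining is drawn and the treatments in those two positions are interchanged. The function names are hypothetical, and Python's `random` module stands in for the Rand Corporation table, so the sequence will not match Table 10-14.

```python
import random


def moses_oakford(treatments, draw):
    """Permute a block; draw(m) must return a random integer from 1 through m.

    Assumed form of the algorithm: position m is swapped with the position
    named by the random number, for m = block size down to 2.
    """
    order = list(treatments)
    for remaining in range(len(order), 1, -1):
        r = draw(remaining)
        order[remaining - 1], order[r - 1] = order[r - 1], order[remaining - 1]
    return order


def blocked_schedule(block_sizes, seed=None):
    """Build a 1:1 schedule of C and T assignments, balanced within each block."""
    rng = random.Random(seed)
    schedule = []
    for k in block_sizes:
        initial = ["C", "T"] * (k // 2)      # Step 4: arbitrary initial order
        schedule.extend(moses_oakford(initial, lambda m: rng.randint(1, m)))
    return schedule


schedule = blocked_schedule([10, 4, 2, 4], seed=1986)   # block sizes of Illustration 3
print(schedule)
```

Because every block is only a rearrangement of equal numbers of C and T, the 1:1 ratio holds at the end of each block regardless of the random numbers drawn.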

Table 10-14 Allocations for Illustration 3

[Handwritten worksheet; entries not recoverable from the source.]

10.8.4 Illustration 4: Stratified and blocked allocations using the Moses-Oakford algorithm and a table of random numbers

a. Specifications
• Treatment groups: 3
• Allocation ratio: 1:1:5
• Blocking constraints: Blocks of sizes 7 or 14 arranged in random sequence
• Treatment masking: None
• Stratification variables: 1 (2 levels)
• Random number source: Rand Corporation (1955)

b. Approach

Step 1 Establish treatment codes (shown in Table 10-15).
Step 2 Starting point: row 10, column 6, Table 10-6.
Step 3 Reading rule: Read left to right using pairs of integers until remaining number of assignments to be made is ≤9, then use single integers. Ignore 00 and 0.
Step 4 Set down an arbitrary order of treatments, as shown in column 2, Table 10-15.
Step 5 Establish the final order of treatments (column 4, Table 10-15) using the Moses-Oakford algorithm (Table 10-3).

Table 10-15 Allocations for Illustration 4

[Handwritten worksheet; entries not recoverable from the source.]

c. Comment

This problem requires construction of an allocation schedule for enrollment of an unspecified number of patients per stratum (see Chapter 14 for discussion of difficulties with designs involving recruitment quotas). One approach would be to develop 2 schedules, each one with enough allocations to meet the recruitment goal of the trial. However, this is wasteful since half of the assignments would remain unused. A more efficient approach is to generate sets of worksheets arranged in the order generated as dictated by the random sequence of block sizes used (in this example, 7, 7, 14, 7, 14, etc.; see Table 10-15 for third block of size 14). They are then used in order, as needed, depending on enrollment patterns in the 2 strata. The first worksheet of block size 7 is used to make assignments for patients in the stratum represented by the first patient enrolled into the trial. For example, if the stratification variable is sex and the first patient enrolled is female, then the first worksheet is used for the first 7 females enrolled. The second sheet is not used until the eighth patient enters the same stratum, or until a patient enters who qualifies for the second stratum, in this example, a male. The stratum number is not placed on the sheet until it is used. The lines in the column labeled Patient ID number would be filled in as the individual assignments are issued. The numbers written in column 1a would depend on the number of sheets already used for allocations to the stratum in question. For example, they would run from 1 through 14 if there had been no previous assignments in the stratum, and from 8 through 21 if a block of size 7 had already been filled for the stratum.
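The worksheet approach described in this comment can be sketched as a single stream of blocks that is consumed, one sheet at a time, by whichever stratum next needs a sheet. The Python below is a rough illustration only. The treatment labels T1, T2, and C and the choice of which group receives the weight of 5 are assumptions (the specifications give the ratio 1:1:5 without naming the groups), the class and function names are hypothetical, and `random.shuffle` stands in for the Moses-Oakford step and the random number table.

```python
import random
from collections import defaultdict

# Assumed labels and weights for the 1:1:5 allocation ratio.
WEIGHTS = {"T1": 1, "T2": 1, "C": 5}
UNIT = [t for t, w in WEIGHTS.items() for _ in range(w)]   # 7 assignments


def worksheet_stream(rng):
    """Yield worksheets (blocks of size 7 or 14) in a random sequence of sizes."""
    while True:
        size = rng.choice([7, 14])
        block = UNIT * (size // len(UNIT))
        rng.shuffle(block)              # stand-in for the Moses-Oakford step
        yield block


class StratifiedAllocator:
    def __init__(self, seed=None):
        self._sheets = worksheet_stream(random.Random(seed))
        self._open = {}                      # stratum -> unused lines of its current sheet
        self.issued = defaultdict(list)      # stratum -> assignments issued so far

    def assign(self, stratum):
        """Issue the next assignment, taking a new sheet only when one is needed."""
        if not self._open.get(stratum):      # no sheet yet, or current sheet is full
            self._open[stratum] = next(self._sheets)
        treatment = self._open[stratum].pop(0)
        self.issued[stratum].append(treatment)
        return treatment

    def sheet_filled(self, stratum):
        return not self._open.get(stratum)


allocator = StratifiedAllocator(seed=7)
for sex in ["female", "female", "male", "female"]:
    print(sex, allocator.assign(sex))
```

As in the text, a sheet is not tied to a stratum until the first patient in that stratum needs it, so no assignments are generated for enrollment that never occurs.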

10.8.5 Illustration 5: Sample allocation schedule for the Macular Photocoagulation Study using pseudo-random numbers

a. Specifications
• Treatment groups: 2
• Allocation ratio: 1:1
• Blocking constraints: Blocks of sizes 6 or 8 in random sequence
• Treatment masking: None
• Stratification variables: 2 (clinic and type of eye disease, three different types)
• Random number source: Computer-based pseudo-random number generator

b. Approach

Step 1 Establish treatment codes. Let:
C denote control treatment
T denote test treatment
Step 2 Select a block size, 6 or 8, by some random or pseudo-random process (size 6 in this Illustration).
Step 3 Arrange treatment codes in arbitrary order (column 2, Table 10-16).
Step 4 Generate a sequence of 5-digit pseudo-random numbers and record in the order generated (column 3, Table 10-16).
Step 5 Link the treatment code (column 2, Table 10-16) and pseudo-random number (column 3, Table 10-16).
Step 6 Order the pseudo-random numbers with associated treatment codes (column 4, Table 10-16).
Step 7 Repeat steps 2 through 6 as necessary to generate the desired number of assignments.

c. Comment

Sheets should be used in the order needed, as discussed in Illustration 4.

Table 10-16 Sample allocation schedule from the Macular Photocoagulation Study for Illustration 5

(1)             (2)            (3)              (4)                        (5)
Order of        Initial        Pseudo-random    Ordered pseudo-random      Final assignment
assignment      assignment     number           number with                from column 4
in block                                        treatment code

  1               T              26391            (T) 07631                  T
  2               C              29126            (C) 10645                  C
  3               T              07631            (C) 22846                  C
  4               C              22846            (T) 26391                  T
  5               T              30856            (C) 29126                  C
  6               C              10645            (T) 30856                  T

10.8.6 Illustration 6: Double-masked allocation schedule using the Moses-Oakford algorithm and a table of random numbers

a. Specifications
• Treatment groups: 3
• Allocation ratio: 2:2:3
• Blocking constraints: Uniform block size of 14
• Treatment masking: Double-masked
• Stratification variables: 1 (2 levels)
• Random number source: Rand Corporation (1955)

b. Approach

The first step is to denote the bottle numbers to be used. The designation in this Illustration was made by arbitrarily selecting a random permutation of the first 16 integers (set 12, Table 10-11). The first 6 values of size 14 or less in the permutation are used to denote bottles containing the control drug, the next 4 numbers are used to designate bottles containing test drug 1 and the last 4 numbers are used to denote bottles containing test drug 2. The bottle codes and associated treatment are recorded in column 2, Table 10-17, and then rearranged as described for Illustrations 3 and 4 to yield the bottle sequence indicated in column 6. The sheet provided is for 1 block in the scheme.

c. Comment

Note that each bottle number appears only once in Table 10-17. Subsequent blocks will contain different orderings of the same bottle numbers.

Table 10-17 Allocation schedule for double-masked drug trial described in Illustration 6

[Handwritten worksheet; entries not recoverable from the source.]
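Steps 3 through 6 of Illustration 5 amount to sorting the treatment codes on their linked pseudo-random numbers. The sketch below first repeats the worked block of Table 10-16 and then shows how further blocks might be generated. The function name is hypothetical, and Python's built-in generator is an assumption standing in for whatever pseudo-random number generator the Macular Photocoagulation Study used. The numbers are held as 5-character strings so that the leading zero of 07631 is kept; strings of equal length sort in the same order as the numbers they represent.

```python
import random

# Columns 2 and 3 of Table 10-16 (block of size 6).
initial = ["T", "C", "T", "C", "T", "C"]
numbers = ["26391", "29126", "07631", "22846", "30856", "10645"]

linked = list(zip(numbers, initial))     # Step 5: link code and number
ordered = sorted(linked)                 # Step 6: order on the number
final = [code for _, code in ordered]    # column 5 of Table 10-16
print(final)


def sorted_block(block_size, rng):
    """Generate one new block by the same link-and-sort procedure."""
    codes = ["T", "C"] * (block_size // 2)                    # Step 3
    draws = [f"{rng.randrange(100000):05d}" for _ in codes]   # Step 4
    return [code for _, code in sorted(zip(draws, codes))]    # Steps 5 and 6


rng = random.Random(5)
size = rng.choice([6, 8])                # Step 2: block size chosen at random
print(size, sorted_block(size, rng))
```

The first part yields the final order T, C, C, T, C, T shown in column 5 of the table.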

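The bottle-numbering step of Illustration 6 can be written out as follows. The permutation used is set 12 of Table 10-11 as it can be read from the extracted table, so the transcription should be treated as an assumption; the variable names are hypothetical.

```python
# Permutation set 12, Table 10-11 (assumed transcription of the printed column).
SET_12 = [5, 8, 14, 9, 12, 16, 4, 7, 3, 11, 1, 13, 15, 10, 2, 6]

BLOCK_SIZE = 14
usable = [v for v in SET_12 if v <= BLOCK_SIZE]   # skip 15 and 16

bottles = {
    "control": usable[:6],        # first 6 usable values
    "test drug 1": usable[6:10],  # next 4 values
    "test drug 2": usable[10:14], # last 4 values
}
for drug, numbers in bottles.items():
    print(drug, numbers)

# Each bottle number is used once, and the block satisfies the 2:2:3 ratio.
assert sorted(usable) == list(range(1, BLOCK_SIZE + 1))
```

Under this reading, bottles 5, 8, 14, 9, 12, and 4 hold the control drug, bottles 7, 3, 11, and 1 hold test drug 1, and bottles 13, 10, 2, and 6 hold test drug 2. The order in which the 14 bottles are then issued within a block is a separate randomization, carried out as in Illustrations 3 and 4.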

10.8.7 Illustration 7: Sample CDP double-masked allocation schedule

a. Specifications
• Number of assignments: 8,341
• Clinics: 53
• Treatment groups: 6
• Allocation ratio: 2:2:2:2:2:5
• Blocking constraints: Uniform block size of 15
• Treatment masking: Double-masked
• Stratification variables: 2 (clinic and risk group, two levels per clinic, to yield a total of 53 X 2 = 106 allocation strata)
• Random number source: Rand Corporation (1955)

b. Approach

The allocation procedure is described in a CDP publication (Coronary Drug Project Research Group, 1973a). Treatment assignments were identified by a 2-digit bottle number as shown in Table 10-9. The same bottle numbers were used in all clinics. Hence, all bottles bearing a particular number always contained the same medication, regardless of clinic.

c. Comment

Note that each bottle number appears once and only once in the 30 assignments listed in Table 10-9 and that both blocks in the table satisfy the allocation ratio (i.e., contain 2 assignments to each test treatment and 5 assignments to the placebo treatment).

11. The study plan

The way to improve a treatment is to eliminate controls.
Hugo Muench

11.1 Introduction
11.2 Design factors and details to be addressed in the study plan
11.3 Objective and specific aims
11.4 The treatment plan
11.5 Composition of the study population
11.6 The plan for patient enrollment and follow-up
11.7 The plan for close-out of patient follow-up
Table 11-1 Example of a factorial treatment design for a two-drug study
Table 11-2 Numbers of patients by treatment group in PARIS
Table 11-3 Major items to be included in the treatment protocol
Table 11-4 Advantages and disadvantages of opposing selection strategies
Table 11-5 Primary selection criteria of trials sketched in Appendix B

11.1 INTRODUCTION

The basic elements of the plan for any trial will be set long before the first patient is enrolled. The nature of the test treatment and outcome measure will be specified in the funding proposal. Specifics having to do with execution of the study plan may not be addressed until the trial has been funded. The period of time between initiation of funding and enrollment of the first patient is one requiring intense effort to develop and test procedures needed for the trial. However, the planning and testing process does not end there. In fact, it is likely to continue over much of the course of the trial, particularly in long-term trials involving extended periods of patient recruitment or follow-up. The goal in such settings of maintaining the study plan unchanged once the first patient has been enrolled, while laudable, is not always practical.
The term study plan used in a broad sense refers to the design of the trial and all the organizational and operational details needed to carry it out. In this sense, various other chapters, in addition to this one, relate to the study plan, starting with the two previous chapters and including most of those that follow.

11.2 DESIGN FACTORS AND DETAILS TO BE ADDRESSED IN THE STUDY PLAN

No trial should be undertaken without:

• A concise statement of its objective(s)
• A specification of the outcome measure(s) to be used for evaluating the study treatments
• Agreement on the treatments to be tested
• A sample size calculation that indicates the required number of patients, or a calculation of the power provided with a prestated sample size
• Specification of the required length of patient follow-up
• A specified set of patient entry and exclusion criteria
• A method for randomization
• A specified baseline and follow-up examination schedule
• A set of data intake procedures, including specification of the methods for data entry, editing, and quality control
• An established organizational and decision-making structure

Agreement on the design and operating features of a trial cannot be ensured unless they have been written down and have been reviewed and accepted by investigators responsible for the trial.

11.3 OBJECTIVE AND SPECIFIC AIMS

The statement of the primary objective is by far the most important specification in the trial. It must be formulated and agreed upon before a data collection scheme can be developed. The statement should indicate the:

• Type of patients to be studied
• Class of treatments to be evaluated
• Primary outcome measure

Sample statements of objectives follow:

University Group Diabetes Program (UGDP)
Evaluation of the efficacy of hypoglycemic treatments in the prevention of vascular complications in a long-term, prospective, and cooperative clinical trial (University Group Diabetes Program Research Group, 1970d).

Coronary Drug Project (CDP)
Evaluate the efficacy of several lipid-influencing drugs in the long-term therapy of CHD in men ages 30 through 64 with evidence of previous myocardial infarction (Coronary Drug Project Research Group, 1973a).

National Cooperative Gallstone Study (NCGS)
To determine the efficacy of oral administration of a high and low dose of CDC acid in dissolving or reducing the size of cholesterol gallstones, as compared with placebo treatment (National Cooperative Gallstone Study Group, 1981a).

The statement from the CDP comes closest to satisfying the three requirements stated above. It indicates the type of patients to be treated and the class of treatments to be used. However, it is ambiguous with regard to outcome, other than to suggest that it is related to coronary heart disease (CHD). The UGDP statement indicates nothing about the study population and is ambiguous with regard to chosen outcome measure. The NCGS statement names the treatment and outcome measure, but says nothing about the study population.
It is not uncommon for a large-scale trial to have secondary objectives as well. They are illustrated for the three trials cited above.

UGDP

• To study the natural history of vascular disease in maturity onset, noninsulin dependent diabetics.
• To develop methods applicable to multicenter clinical trials.

CDP

• To obtain information on the natural history and clinical course of CHD.
• To develop more advanced technology for the design and conduct of large, long-term, collaborative clinical trials.

NCGS

• To determine whether either a high or low dose of chenodeoxycholic acid could be safely used to dissolve cholesterol gallstones.
• To determine the rate of recurrence of gallstones in those patients in which chenodeoxycholic acid feeding has successfully dissolved gallstones.

Whenever multiple objectives are stated, it is wise to rank them in order of importance. The ranking will have important design implications, especially if data requirements for the objectives differ. The investigators should state the specific aims to be pursued in conjunction with each objective. The methods and data collection requirements of the trial should then be constructed to satisfy the stated aims.
11.4

THE TREATMENT PLAN

General considerations involved in choosing the
test and control treatments were discussed m
Chapter 8. Once they have been selected, it i'
necessary for investigators to address a sene*
of practical issues concerning treatment admin
istration. One issue in drug trials concern*
whether the treatments are to be administered
using a fixed- or variable-dosage schedule. Ide
ally, the administration schedule should be a*
near that used in actual practice as feasible A
variable dosage schedule, tailored to the need* n
individual patients, should be used if the test
drug is ordinarily used in this way. A fix
dosage schedule may be used if the drug is nor
mally used in this way or if the variation m
dosages used is small.
The choice may be constrained by masking
requirements. The desire to individualize treatment
in order to achieve some desired effect (for
example, to normalize blood glucose levels in
the case of a hypoglycemic drug) may have to
be abandoned if there is to be double-masked
administration of the treatments. The manipulations
required for dosage titrations can be hazardous
to patients if they are done in a masked
fashion and may in any case render the masking
ineffective.
Another issue in drug trials has to do with the
formulation of the test treatment. Whenever feasible,
it should be used in the same form as in
normal practice. However, here again some
compromises may be necessary. For example,
investigators may choose to use capsules for
dispensing study medications, even though the test drug
is normally dispensed in tablet form, in order to
mask the taste and appearance of the study
drugs. Modification in the form or route of
administration is acceptable only if it does not
affect the bioavailability or pharmacological
action of the study drugs.
A key design decision in trials involving two
or more treatments that may be used alone or in
combination concerns whether a factorial treatment
structure should be used (see Glossary for
definition). Table 11-1 illustrates use of this
design for a two-drug study. Separate placebos for
each drug tested are necessary when the test
drugs are to be dispensed on different time schedules
or in different forms (e.g., capsules for one
test drug and tablets for the other test drug).
Patients in the cell designated AB would receive
both drug A and B, those in cell AB̄ would
receive drug A and the placebo for drug B,
and so on. A trial involving three different drugs,
each administered at a single, fixed-dose level
and suitable for use alone or in combination,
would involve eight (i.e., 2³) treatment combinations:
ABC, ABC̄, AB̄C, ĀBC, AB̄C̄, ĀBC̄,
ĀB̄C, and ĀB̄C̄.
The main advantage of a factorial treatment
structure lies in the opportunity it provides for
estimating both individual and combined treatment
effects via the same experiment. A full
factorial treatment structure (see Glossary for
definition) should be considered whenever there
is a reason to suspect additive or synergistic
treatment effects. It should not be used with
treatments that are incompatible, or where there
is no interest in some of the treatment combinations.
A partial factorial treatment structure (see
Glossary) may be considered in the latter case, as
in the Persantine Aspirin Reinfarction Study
(PARIS). Persantine was not used alone, because
of the high dose level required in the absence
of aspirin and because of previous animal
work suggesting that the combination of aspirin
and persantine had a more profound effect on
blood platelets than either drug alone (Persantine
Aspirin Reinfarction Study Group, 1980b).
The primary aim of the study was to provide a
comparison of the combination of persantine and
aspirin against aspirin alone. A secondary aim
was to measure the usefulness of this combination
against a placebo treatment. This difference
in interest is reflected by the fact that the number
of patients assigned to the placebo treatment was
only half the number assigned to either of the
other two treatment groups (Table 11-2).
The methods for administering the treatments
and ground rules under which treatments may
be altered or stopped should be set down in the
treatment protocol. Table 11-3 provides a list of
the items that should be included in this document.
The details of the protocol should be subjected
to careful review before implementation.
Lack of agreement can lead to unacceptable
variation in the data collection or treatment
process. Establishing standards for data collection
and treatment administration is important whenever
multiple investigators are involved in a
trial, whether they are located in a single clinic
or in multiple clinics.
Medical conditions that may require a study
physician to depart from the assigned treatment
should be detailed. It is also wise to outline side
effects that are a normal part of a drug's
pharmacological effect. For example, the treatment
protocol for the NCGS warned physicians

Table 11-1 Example of a factorial treatment design for a two-drug study

                      Drug B
  Drug A          B         B̄
    A             AB        AB̄
    Ā             ĀB        ĀB̄

(A bar over a letter denotes the placebo for that drug.)

Table 11-2 Numbers of patients by treatment group in PARIS

                       Persantine    Persantine placebo
  Aspirin                  810              810
  Aspirin placebo            0              406
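
The factorial structure of Table 11-1 and the partial factorial structure used in PARIS can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the "-placebo" labels are assumptions introduced here, not part of any trial's documentation. It enumerates every cell of a full factorial structure and then drops the combinations that are of no interest.

```python
from itertools import product


def factorial_cells(drugs):
    """Return all cells of a full factorial structure.

    Each cell is a tuple with one entry per drug: either the drug
    name (active drug) or '<drug>-placebo' (matching placebo).
    """
    cells = []
    for levels in product((True, False), repeat=len(drugs)):
        cell = tuple(
            drug if active else f"{drug}-placebo"
            for drug, active in zip(drugs, levels)
        )
        cells.append(cell)
    return cells


def partial_factorial(cells, excluded):
    """Drop the cells that are not to be used (hypothetical helper)."""
    return [cell for cell in cells if cell not in excluded]


two_drug = factorial_cells(["A", "B"])          # 2**2 = 4 cells
three_drug = factorial_cells(["A", "B", "C"])   # 2**3 = 8 cells

# PARIS-like structure: persantine is never given without aspirin.
paris_like = partial_factorial(
    factorial_cells(["persantine", "aspirin"]),
    excluded={("persantine", "aspirin-placebo")},
)

for cell in paris_like:
    print(cell)
```

The three cells that remain correspond to the three nonzero entries of Table 11-2.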

Table 11-3 Major items to be included in the treatment protocol

• Specification of the test and control treatments to be
tested and rationale for the choices
• Review of previous research on the safety and efficacy of
the proposed treatments
• Description of the methods for administering the test
and control treatments
• List of contraindications for the proposed treatments
• Specification of the clinical conditions that may necessitate
termination of the assigned treatment
• Specification of side effects that may require termination
of the assigned treatment, as well as those that
should not
• Methods, in the case of masked drug trials, for packaging
and dispensing drugs, including a general outline
of the conditions under which the masking may have
to be revealed to clinic personnel or to a study patient
• General scheme to be used for assigning patients to the
study treatments

against stopping a patient's treatment because of
mild diarrhea, since such problems were a recognized
side effect of chenodeoxycholic acid
therapy and were not considered to be serious
(National Cooperative Gallstone Study Group, 1981b).
The conditions under which a treatment as
signment is revealed to clinic personnel in a dou
ble-masked trial should be specified. As a rule,
there are few valid reasons for unmasking as
signments during the course of the trial, since the
assigned treatments can be terminated without
revealing their identity to patients or clinic per
sonnel. For example, provisions for unmasking
in the CDP were limited to emergencies involv
ing life-threatening uses of a medication by a
patient or a member of his family or when a
patient required emergency surgery and the sur
gical team needed to know his treatment assign
ment. Patients undergoing elective surgery sim
ply stopped taking their study medicine before
the surgery and during the recovery period.

11.5 COMPOSITION OF THE STUDY
POPULATION
The formulation of patient selection criteria for
the study represents a balance of two opposing
forces: one designed to produce a highly homogeneous
study population and the other designed
to minimize the restrictions on the study
population and hence maximize the opportunities
for patient recruitment. On the one hand,
the more homogeneous the population, the more
precise the study, and hence the smaller the
number of patients needed to detect a given
difference. On the other hand, the greater the
heterogeneity, the broader the basis for generalizing
findings at the end of the study. The advantages
and disadvantages of different selection
strategies are summarized in Table 11-4.
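
The link between homogeneity and the number of patients needed can be made concrete with a small calculation. The sketch below assumes a continuous outcome compared between two equal-sized groups and uses the usual normal-approximation formula n = 2(z₁₋α/₂ + z₁₋β)²σ²/δ² per group; the formula, the function name, and the numbers are illustrative assumptions, not taken from the trials discussed here.

```python
import math
from statistics import NormalDist


def patients_per_group(sigma, delta, alpha=0.05, power=0.90):
    """Approximate patients needed per group to detect a mean
    difference `delta` with a two-sided test at level `alpha`.

    `sigma` is the between-patient standard deviation of the
    outcome; a more homogeneous population has a smaller sigma.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)


# Same target difference, two hypothetical levels of variability.
homogeneous = patients_per_group(sigma=10, delta=5)
heterogeneous = patients_per_group(sigma=20, delta=5)
print(homogeneous, heterogeneous)
```

Under these assumptions, doubling the standard deviation roughly quadruples the required number of patients, which is the trade-off described above.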
Investigators must agree on selection and exclusion
criteria before patient recruitment starts.
Often they fail to appreciate the impact the criteria
will have on recruitment. Estimates of patient
availability made during the design stage of the
trial are likely to be unrealistically high unless
they are based on actual patient surveys using
the proposed criteria. Factors that are not likely
to influence outcome should not be used for
exclusion, since they do nothing to improve the
precision of the trial while they make patient
recruitment more difficult. Table 11-5 lists the
main selection criteria used in the trials sketched
in Appendix B.
Socioeconomic status is usually not a valid
basis for patient selection. Neither the scientific
nor the lay community is likely to look kindly on
such forms of selection. Selection on the basis of
ethnic origin, religion, or race should also be
avoided. A possible exception relates to diseases
or conditions concentrated primarily, if not
exclusively, in individuals of a particular religious,
ethnic, or racial background. However, even if
one avoids use of such factors, the study population
of a clinic may be quite homogeneous with
regard to them. The socioeconomic, ethnic, or
racial spectrum covered by a study population
will be a function of where and how it is recruited.
The racial mix of clinics in the UGDP
varied from being nearly all white to being
nearly all black (University Group Diabetes Program
Research Group, 1970e). This variation
stands in marked contrast to that observed in the
Coronary Artery Surgery Study. The population
in that study was virtually all white, a reflection,
undoubtedly, of the nature of the patients
served by the participating clinics and of the
popularity of bypass surgery in white middle
class America (Coronary Artery Surgery Study
Research Group, 1981).
Of the 14 trials listed in Table 11-5, 4 used sex
as an exclusion. The sex restriction in the CDP
was required because estrogen, one of the drugs
tested in that trial, was contraindicated for females.
The Veterans Administration Cooperative
Study Program No. 43 (VACSP 43) and the
Physicians' Health Study (PHS) excluded females
from enrollment simply because of the
small number of females contained in the populations
approached for study. The rationale for
the restriction in the Multiple Risk Factor Intervention
Trial (MRFIT) is less clear. There is no
question that, even if the trial had been open to
females, the majority of enrollees would
have been male. However, that fact alone does
not provide a sufficient rationale for the exclusion.
Valid treatment comparisons can be made
so long as the proportionate mix of males and
females is the same across study treatments.
Ten of the 14 trials used age as a selection
criterion. Generally, practical considerations figured
in the limits used. For example, this was the
case in the choice of the lower age limit for the
Hypertension Prevention Trial (HPT). The original
design called for a lower limit of 18. Ultimately,
however, the limit was raised to 25 before
the study started because problems were
anticipated in recruiting and following people
aged 18 to 25.
The use of upper age limits, especially in
studies involving adult populations, is less easy
to justify. CDP investigators arbitrarily imposed
an upper limit of 65 primarily as a means of
excluding individuals who had experienced their
first MI relatively late in life. The limit made
recruitment more difficult and in all probability
did little to improve the precision of the trial,
since there is no reason to believe the study
treatments are any more or less effective in
individuals over 65 than for those under 65.

Table 11-4 Advantages and disadvantages of opposing
selection strategies

Highly restrictive selection criteria
• Advantages
  - Provides more precise comparison of the test and
    control treatments
  - Results of the trial less likely to be affected by
    population variability
• Disadvantages
  - Increases the cost and time required for patient
    recruitment
  - Limits the generalizability of the study findings

Minimally restrictive selection criteria
• Advantages
  - Makes patient recruitment easier
  - Provides base for wider generalization of findings
• Disadvantages
  - May obscure treatment effects because of variability
    in composition of study population
  - Results of the trial may be confusing, especially if
    an observed effect appears to be associated
    with a subgroup of patients in the study and
    the subgroup is too small to yield a reliable
    treatment comparison

Table 11-5 Primary selection criteria of trials sketched in Appendix B

  Trial      Sex     Age limits on entry    Disease state
  AMIS       Both    30-69                  Prior MI
  CASS       Both    None                   Prior MI
  CDP        Males   30-64                  Prior MI
  HDFP       Both    30-69                  Diastolic blood pressure >95 mm Hg
  HPT        Both    25-49                  Diastolic blood pressure >78 but <90 mm Hg
  IRSC       Both    <10                    Grade III or IV vesicoureteral reflux
  MPS        Both    >50 for SMD,           Evidence of neovascularization for all
                     >18 for HISTO,         three conditions
                     None for INVM
  MRFIT      Males   35-57                  High risk for CHD
  NCGS       Both    None                   Radiolucent gallstones
  PARIS      Both    30-74                  Prior MI
  PHS        Males   40-75                  Absence of MI history
  POSCH      Both    30-64                  Hypercholesterolemia
  UGDP       Both    None                   Newly diagnosed diabetes
  VACSP 43   Males   None                   Evidence of gangrene of either foot

11.6 THE PLAN FOR PATIENT
ENROLLMENT AND FOLLOW-UP

The study plan should include a description of
methods to be used for patient recruitment and
an outline of the data collection schedule (see
Chapters 12 and 14). Ideally, there should be at
least two separate patient contacts before ran
domization with adequate time between the con
tacts to:
• Allow clinic staff time to consider the suit
ability of the patient for study
• Facilitate the identification of “faint of heart”
patients
• Allow a staged approach to the informed con
sent process (see Section 14.6)


The study design should provide for a land
mark that when passed marks entry of a patient
into the trial (e.g., the point at which the treat
ment assignment is divulged to clinic personnel).
A patient should be counted as part of the study
population, regardless of his subsequent course
of treatment, once the landmark has been
passed.
After enrollment, patients will be required to
return for one or more scheduled follow-up vis
its. The timing of these visits will depend on the
data collection requirements of the study. The
frequency is usually highest right after the initia
tion of treatment. The CDP required a clinic
visit of each patient at one month and again at
two months after enrollment for dosage in
creases. The next required visit was at four
months after enrollment and then every four
months thereafter (Coronary Drug Project Re
search Group, 1973a).
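
A visit schedule of this kind, with early dosage-adjustment visits followed by visits at a fixed interval, is easy to generate mechanically. The following is a minimal sketch; the function name and parameters are hypothetical, and the defaults simply mirror the CDP pattern described above (1 and 2 months, then every 4 months).

```python
def follow_up_months(total_months, early_visits=(1, 2), interval=4):
    """Return the months after enrollment at which required
    follow-up visits fall, up to and including `total_months`."""
    months = [m for m in early_visits if m <= total_months]
    months.extend(range(interval, total_months + 1, interval))
    return months


# Required visits during the first two years of follow-up.
schedule = follow_up_months(24)
print(schedule)
```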
Except in special cases, the frequency of required
data collection visits should be the same
for all patients. A difference in the visit rates can
bias the study results if it influences the rate at
which clinical events are diagnosed and reported.
This kind of bias was of concern in the
Hypertension Detection and Follow-Up Program
(HDFP) because of more frequent contacts
with patients assigned to stepped care than
with those assigned to usual care (Hypertension
Detection and Follow-Up Program Cooperative
Group, 1979a).
The possibility of bias is not eliminated by use
of identical schedules for required visits if the
rate of interim unscheduled visits between scheduled
visits is different for the study groups. A
differential rate of unscheduled observations can
still bias the way in which events are diagnosed
and reported in the trial. Most long-term trials
keep track of such contacts, if for no other reason
than to provide a means of comparing the
study groups for differences in contact rates.
The study plan should include provision for
some minimal form of follow-up for dropouts
(see Glossary for definitions). The follow-up
may be for mortality only or for other kinds of
outcomes, depending on the trial. See Chapter
15 for more details.

11.7 THE PLAN FOR CLOSE-OUT OF
PATIENT FOLLOW-UP

An important design issue concerns disengagement
of a patient from the trial when it is finished.
Two general models are used for this purpose.
One model is characterized by a common
closing date for all patients, regardless of the
date of enrollment. Another involves close-out
after a specified length of follow-up. The latter
approach requires as much time for close-out as
for enrollment, whereas close-out takes place at
the same time for all patients, regardless of when
they were enrolled, when the former approach is
used (see Section 15.4 for added discussion).
The CDP is an example of a trial using a
common close-out date. All patients were separated
from the study during June through August
of 1974 (Coronary Drug Project Research
Group, 1975). The NCGS provides an example
of close-out after a specified period of follow-up,
two years (National Cooperative Gallstone
Study Group, 1981a).

12. Data collection considerations

Investigators seem to have settled for what is measurable instead of measuring what they
would really like to know.
Edmund D. Pellegrino

12.1 Introduction
12.2 Factors influencing the clinic visit schedule
12.2.1 Introduction
12.2.2 Baseline clinic visit schedule
12.2.3 Follow-up clinic visit schedule
12.2.4 Visit time limits
12.3 Data requirements by type of visit
12.3.1 General considerations
12.3.2 Data needed at baseline visits
12.3.3 Data needed at follow-up visits
12.4 Considerations affecting item construction
12.4.1 Implicit versus explicit item form
12.4.2 Interviewer-completed versus patient-completed items
12.4.3 Questioning strategy
12.4.4 Single versus multiple use forms
12.4.5 Format and layout
12.5 Item construction
12.5.1 General
12.5.2 Language and terminology
12.5.3 Use of items from other studies
12.5.4 Closed- versus open-form items
12.5.5 Response checklists
12.5.6 Unknown, don't know, and uncertain
as response options
12.5.7 Measurement and calculation items
12.5.8 Instruction items
12.5.9 Time and date items
12.5.10 Birthdate and age items
12.5.11 Identifying items
12.5.12 Tracer items
12.5.13 Reminder and documentation items
12.6 Layout and format considerations
12.6.1 Page layout
12.6.2 Paper size and weight
12.6.3 Type style and form reproduction
12.6.4 Location of instructional material
12.6.5 Form color coding
12.6.6 Form assembly
12.6.7 Arrangement of items on forms
12.6.8 Format
12.6.8.1 Items designed for unformatted
written replies
12.6.8.2 Items requiring formatted written
replies
12.6.8.3 Items answered by check marks
12.6.9 Location of form and patient identifiers
12.6.10 Format considerations for data entry
12.7 Flow and storage of completed data forms
Table 12-1 Sample appointment schedule and
permissible time windows, as
adapted from the Coronary Drug
Project
Table 12-2 Methods for avoiding errors of omission
and commission in the data
form construction process

12.1 INTRODUCTION

Decisions regarding the data collection schedule
and related forms are among the most important
in the trial. They will determine both the amount
and quality of data generated in the trial.
There must be adequate time, once the study
is funded and before data collection starts, for
investigators to agree on the details of the data
collection process. They must be concerned first
with setting the schedule at which patients are
seen, both before and after entry into the trial,
and then with outlining the specific items of
information to be collected each time the patient
is seen. The investigators should allow adequate
time after these steps are completed for develop
ing and testing required data forms and for re
ceiving and reacting to suggestions from clinic
personnel who must use them.
The form development process should be undertaken
by personnel who are experienced in
form construction and who are familiar with
methods for data collection and data processing
in prospective studies. The development of data
forms can be facilitated by review of sample
forms used in other trials, especially those from
trials with design and operating features similar
to the one in question. Some of the desired
samples can be obtained through the published
literature (e.g., see appendixes in Coronary
Drug Project Research Group, 1973a, and Coronary
Artery Surgery Study Research Group,
1981) or via a central repository (e.g., see National
Cooperative Gallstone Study Group,
1981a, for reference to forms placed on file at the
National Technical Information Service). Others
will have to be obtained by direct request to
investigators involved in the trials of interest.
The reference list in Appendix I includes a
number of citations pertinent to data collection
and the construction of forms. Several of the
references are from interview and survey literature
but are relevant to clinical trials as well. A
classic book by Payne (1951), although focused
on opinion polling, is useful reading for anyone
involved in data collection. The Teacher's Word
Book of 30,000 Words (Thorndike and Lorge,
1944) indicates the expected level of comprehension
of words as a function of education level. It
is a useful resource, especially when forms are
being designed for use in patient interviews.
Also included are several textbooks with chapters
on forms design (Backstrom and Hursh-Cesar,
1981; Kidder, 1981; Marks, 1982; Sudman
and Bradburn, 1983), as well as a number
of journal articles. The three articles by Wright
and Haybittle (1979a,b,c) and a chapter from a
monograph from the Coronary Drug Project
(Knatterud et al., 1983) have direct relevance
to the field of clinical trials. Papers by Collen
and co-workers (1969), Helsing and Comstock
(1976), Hochstim and Renne (1971), Holland
and co-workers (1966), and Milne and Williamson
(1971) deal with data collection via questionnaires.
Other papers of interest include those by
Barker (1980), Barnard et al. (1979), Bishop
et al. (1982), Duncan (1979), Edvardsson (1980),
Finney (1981), Layne and Thompson (1981),
McFarland (1981), Romm and Hulka (1979),
Roth et al. (1980), Schriesheim (1981), Smith
(1981), and Zelnio (1980).
12.2 FACTORS INFLUENCING THE
CLINIC VISIT SCHEDULE

12.2.1 Introduction

Every clinical trial must provide for data collection
at a minimum of two time points: at or just
before randomization and the initiation of treatment
to provide baseline data, and at least once
after randomization for collection of follow-up
data. It is possible to collect all the required data
for a patient during a single clinic visit if it is
possible to collect the necessary baseline data,
issue the treatment assignment, administer the
treatment, and make the required follow-up
observations all on the same day. However, the
usual situation is one in which a patient is required
to make one, two, or even more visits to
the clinic on different days before he or she can
be enrolled and assigned to treatment. Thereafter,
the patient may need to make a series of
return visits, extending over a period of weeks,
months, or even years, to receive the assigned
treatment and for follow-up data collection.
The discussion throughout this book deals
with trials in which data collection is performed
on an outpatient basis. If any hospitalization is
required, it is assumed to be a small portion of
the total time the patient is expected to be under
study.
Patient visits that take place before the randomization
visit are herein referred to as prerandomization
visits. Enrollment into the trial occurs
at the randomization visit and is marked by
some explicit act (e.g., the opening of the treatment
allocation envelope). Thereafter, the patient
is a member of the treatment group to
which he or she was assigned.
It is conventional to consider data collected at
the prerandomization and randomization visits
as baseline data and to refer to both types of
visits as baseline visits (see Glossary). This convention
will be followed in this book. It is reasonable
if all data collected at the randomization
visit are collected before initiation of treatment.
Post-randomization visits include all visits that
take place after the randomization visit. All such
visits will be referred to as follow-up visits in this
book, whether they are done on a scheduled or
ad hoc basis.

12.2.2 Baseline clinic visit schedule

Baseline visits (prerandomization and randomization
visits) are needed to:
• Determine a patient's eligibility for enrollment
• Provide baseline data for assessing changes
occurring after the initiation of treatment
• Explain the purpose of the study to the patient
and to obtain consent for participation
in the trial
• Issue the treatment assignment

Whenever possible, it is useful to have the
patient make at least two visits to the clinic
before enrollment. The visits may be only a few
days apart, especially when there is an urgent
need to initiate treatment, or they may extend
over a period of weeks, or even months. The
repeat visits make it possible to replicate certain
key baseline measurements. A time separation
between visits may be needed as well to:

• Perform the necessary screening and diagnostic
procedures for determining patient eligibility
• Allow sufficient time for a patient to recover
from a procedure performed at one visit
and to go through the preparatory steps
required for the next clinic visit
• Provide adequate time for the informed consent process
• Allow adequate time for clinic staff to evaluate
the data collected on the patient before
enrollment

The Coronary Drug Project (CDP) required
two prerandomization visits. The first visit was
used to make an initial determination of a patient's
eligibility for enrollment into the study, to
obtain serum for lipid and other determinations,
to perform a general physical examination, and
to provide the patient with a preliminary explanation
of the study. The second visit, scheduled
approximately 1 month after the first visit, was
used to assess a prospective patient's adherence
to the prerandomization treatment schedule,¹ to
obtain additional serum for a repeat set of laboratory
determinations, and to obtain the patient's
signed consent to participate in the trial.
The randomization visit, scheduled approximately
1 month after the second prerandomization
visit, was used for a final assessment of the
patient's suitability for enrollment into the trial,
including a further assessment of his adherence
to the assigned medication schedule, and verification
that the patient was indeed willing to
be randomized. If so, the treatment allocation
envelope was opened and the assigned treatment
was initiated (Coronary Drug Project Research
Group, 1973a).
Required diagnostic and data collection procedures
should be designed to minimize patient
inconvenience and exposure to unnecessary
procedures, particularly those entailing risks to
the patient. Hence, whenever feasible, the simplest
procedures with the least risk should be
performed first so that patients who then prove
to be ineligible can be spared the inconvenience
(and risks, if any) of the more complex and
time-consuming procedures.

¹ Patients considered eligible for enrollment into the CDP at the
end of the first prerandomization visit were given a single-masked
placebo medication (three capsules per day) which they were to
take until the randomization visit.

12.2.3 Follow-up clinic visit schedule

A follow-up visit is any visit, either required or
nonrequired, to the study clinic by a patient who
has been enrolled into the trial (i.e., assigned to
treatment) that takes place after the randomization
visit. Required visits should be specified in
the study protocol and should be scheduled to
take place at specified time points after the randomization
visit. They are herein variously referred
to as scheduled follow-up visits, required
follow-up visits, or, in contexts where the meaning
is clear, simply as follow-up visits. Visits of
this class are needed to:

• Carry out procedures specified in the study
protocol, including those for treatment ad
ministration and treatment adjustment
• Evaluate the patient's response to treatment
• Assess patient and physician adherence to the
assigned treatment
• Collect information on the treatment process
and outcome and related data needed for
evaluation of the treatments

The timetable for required follow-up visits will
be dictated by various factors, including:
• Requirements for treatment administration
and for assessing adherence to treatment
• Rate of occurrence of the outcome(s) of
interest
• Patient health care needs
• Cost of a patient visit
• Patient convenience considerations

The schedule for required follow-up visits may
be designed to allow for more frequent visits
immediately after enrollment of a patient into
the trial to permit clinic personnel to initiate and
administer the assigned treatment. The interval
between visits may be increased to some maxi
mum and held constant thereafter once the in
itial treatment process is completed.
Follow-up visits that are made on an ad hoc
basis because of special problems experienced by
the study patients after enrollment into the trial
will be variously referred to as unscheduled fol
low-up visits, nonrequired follow-up visits, or
interim follow-up visits.


Investigators should construct the data collec
tion schedule so as to be able to distinguish
between required and nonrequired follow-up
visits. The data system should be designed to
yield a count of both types of visits. Differences
among the treatment groups in the number of
interim follow-up visits can lead to biases in the
diagnoses and reports of clinical events used to
evaluate the study treatments (see Section 11.6
and Question 68 of Chapter 19 for further dis
cussion).
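
One way a data system might yield these counts is sketched below. It assumes each visit record carries the patient's treatment group and a flag distinguishing required from interim visits; the record layout, group labels, and numbers are hypothetical and serve only to show the comparison of interim contact rates between groups.

```python
from collections import Counter


def count_visits(visits):
    """Count visits by (treatment group, visit type).

    `visits` is an iterable of (group, visit_type) pairs, where
    visit_type is 'required' or 'interim'.
    """
    return Counter(visits)


def interim_visits_per_patient(visits, group_sizes):
    """Interim (unscheduled) visits per enrolled patient, by group."""
    counts = count_visits(visits)
    return {
        group: counts[(group, "interim")] / size
        for group, size in group_sizes.items()
    }


# Hypothetical visit log for two treatment groups of two patients each.
visit_log = [
    ("test", "required"), ("test", "required"),
    ("test", "interim"), ("test", "interim"), ("test", "interim"),
    ("control", "required"), ("control", "required"),
    ("control", "interim"),
]
group_sizes = {"test": 2, "control": 2}

counts = count_visits(visit_log)
rates = interim_visits_per_patient(visit_log, group_sizes)
print(rates)
```

A marked difference between the groups in the interim rate would flag the possibility of the diagnostic and reporting bias described above.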

12.2.4 Visit time limits

Ideally, the entire set of scheduled baseline and
follow-up visits for a patient should be done at
precise time points relative to the time of ran
domization. However, such precision is gener
ally not possible in a free-living population, nor
is it necessary for most of the observations re
quired in the typical clinical trial. The usual
approach is to consider a visit and related data
collection as valid if the visit took place within a
defined interval on either side of the desired time
point. The permissible length of this time win
dow (see Glossary) will depend on the number of
required data collection visits and on the amount
of variation that can be tolerated in the timing of
observations.
The CDP allowed a maximum of 4 months
for completion of the three baseline examinations.
After enrollment, the patient was required
to return to the clinic 1 month after randomization
and again at 2 months after randomization
for scheduled dosage increases in his assigned
medication. Regular follow-up visits were sched
uled to take place at 4-month intervals there
after. Each of these visits had to be within 2
months of the preferred date, as dictated by the
date of randomization. Visits not carried out
within the time window were counted as missed.
The coordinating center for the study provided
clinics with computer-generated appointment
schedules that indicated the preferred date and
the permissible time window for each required
follow-up visit (Table 12-1).
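
The time-window rule can be expressed as a simple check. The sketch below copies the first three rows of Table 12-1 and computes each window's length in days (counting both end dates) and whether a given visit date would count as valid or missed. The function names are assumptions made for illustration; this is not the coordinating center's actual scheduling program.

```python
from datetime import date

# (visit, desired date, first possible date, last possible date),
# taken from the first three rows of Table 12-1.
SCHEDULE = [
    ("Dosage adj. visit 1",
     date(1966, 12, 1), date(1966, 11, 16), date(1966, 12, 16)),
    ("Dosage adj. visit 2",
     date(1966, 12, 31), date(1966, 12, 17), date(1967, 1, 15)),
    ("Follow-up visit 1",
     date(1967, 3, 2), date(1967, 1, 16), date(1967, 5, 1)),
]


def interval_length(first, last):
    """Length of the time window in days, counting both end dates."""
    return (last - first).days + 1


def classify_visit(actual, first, last):
    """A visit is valid only if it falls inside the time window."""
    return "valid" if first <= actual <= last else "missed"


lengths = [interval_length(first, last) for _, _, first, last in SCHEDULE]
print(lengths)

# A visit made on April 20, 1967 falls inside the window for
# follow-up visit 1; one made on May 2, 1967 does not.
_, _, fu1_first, fu1_last = SCHEDULE[2]
print(classify_visit(date(1967, 4, 20), fu1_first, fu1_last))
print(classify_visit(date(1967, 5, 2), fu1_first, fu1_last))
```

The computed lengths agree with the "Interval length in days" column of the table for these rows.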
12.3 DATA REQUIREMENTS BY
TYPE OF VISIT

12.3.1

• A baseline and follow-up visit schedule has
been established by the study investigators
• The purpose(s) of each visit has been outlined
• There is general agreement among the investigators
on the specific procedures to be carried
out at each visit

A key step in form construction is identification
of the specific items of information to be collected
during each clinic visit. The process required
for the step should be designed to guard
against errors of omission as well as errors of
commission (see Table 12-2 for a list of precautions).
Probably the single most common cause
of errors of omission is haste in the development
of the data forms. The process of identifying
required data items and then constructing and
testing them takes time and patience. Efforts to
shorten this process in order to get started with
patient recruitment and data collection are
usually unwise.
The desire to create forms that, in addition to
meeting the research aims of the study, provide
data needed for routine patient care is probably
the single most important contributor to errors
of commission. The fact that certain measurements
need to be made in providing routine care
for patients is not sufficient reason to justify
inclusion of them in the study data system.
Before starting form construction, the types of
data needed and the procedures for generating
them should be outlined. Once developed, the
outline should be reviewed by personnel not directly
involved in constructing the data forms
as a check against the two kinds of errors mentioned
above. Further, there should be general
agreement on the ordering of the procedures to
be performed at any given data collection visit
before the forms are constructed. The ordering
will influence the sequencing of items on the
forms.
One of the last steps in the construction process
is to carry out an item-by-item review of
each form against a list of data needs and goals,
as set down by the leadership of the study. Data
items that cannot be justified in this review
should be deleted from the final data forms. All
follow-up forms should also be checked against
each other and against the baseline set of forms
for consistency and as a safeguard against errors
of omission.

12.3.2

Table 12-1 Sample appointment schedule and permissible time windows, as
adapted from the Coronary Drug Project

Patient ID No.: 59 0021          Date of entry: Oct. 31, 1966
Patient Name: John D. Doe        Bottle number assigned: 2

The indicated visits should be done within the time windows specified and as close
to the desired date as possible. Visits not completed within the specified time
window should be skipped and will be counted as missed.

                         Desired        First possible   Last possible   Interval length
                         date           date             date            in days
  Dosage adj. visit 1    Dec. 1, 66     Nov. 16, 66      Dec. 16, 66     31
  Dosage adj. visit 2    Dec. 31, 66    Dec. 17, 66      Jan. 15, 67     30

  Follow-up visit 1      Mar. 2, 67     Jan. 16, 67      May 1, 67       106
  Follow-up visit 2      July 1, 67     May 2, 67        Aug. 31, 67     122
  Follow-up visit 3      Oct. 31, 67    Sep. 1, 67       Dec. 31, 67     122

  Follow-up visit 4      Mar. 2, 68     Jan. 1, 68       May 1, 68       122
  Follow-up visit 5      July 1, 68     May 2, 68        Aug. 31, 68     122
  Follow-up visit 6      Oct. 31, 68    Sep. 1, 68       Dec. 31, 68     122

  Follow-up visit 7      Mar. 2, 69     Jan. 1, 69       May 1, 69       121
  Follow-up visit 8      July 1, 69     May 2, 69        Aug. 31, 69     122
  Follow-up visit 9      Oct. 31, 69    Sep. 1, 69       Dec. 31, 69     122

  Follow-up visit 10     Mar. 2, 70     Jan. 1, 70       May 1, 70       121
  Follow-up visit 11     July 1, 70     May 2, 70        Aug. 31, 70     122
  Follow-up visit 12     Oct. 31, 70    Sep. 1, 70       Dec. 31, 70     122

Follow-up visit 13
Follow-up visit 14
Follow-up visit 15

Mar 2.71
July 1.71
Oct. 31.71

Jan
May
Sep.

1.71
2.71
1,71

Mav 1,71
Aug. 31.71
Dec. 31.71

121
122
122

Visit

Source: Reference citation 104. Adapted with permission of the American Heart Association.
Inc., Dallas. Texas.

• To establish patient eligibility through items
ihat indicate the presence of required eligi
bility conditions and the absence of exclu
sion conditions
• To characterize the demographic and general
health characteristics of patients eligible
for enrollment into the trial
• To establish a baseline for assessment of
changes in variables to be measured over
the course of follow-up
• For any stratification required in the random
ization process
• For post-stratification
• To aid in contacting and tracing patients
• To assess clinic performance in carrying out
the informed consent process
• To assess adherence to the study protocol
• To link baseline and follow-up records
• To address other topics unique to the study in
question

General considerations

The development of data forms cannot be started
until:

123

Data needed at baseline visits

The first step in the design of any set ofTorm*
to enumerate the types of data needed (s
tion 12.2.2). Baseline data are needed.

I

The second step is to list the specific data items
and forms needed for each visit. Some items will
appear only once in the list; others will appear
under several categories.
The need for record linkage can usually be
satisfied by use of a unique number that identi
fies the patient and type of visit performed. The
data needed for stratification will be satisfied by
collection of information necessary for making
the classifications called for in the stratification.
Variables that are to be tracked over time must
be observed during the prerandomization or ran
domization visit to provide the necessary base
line information. The same is true for variables
that are to be used in risk-factor or subgroup
analyses to be carried out later on in the trial.
Investigators must have a thorough knowledge
of the epidemiology of the disease being treated
and of the conditions likely to influence the se
lected outcome measures to make an intelligent
choice of baseline variables for use in such anal
yses.


Data collection considerations

Table 12-2 Methods for avoiding errors of omission
and commission in the data form construction process

A. Safeguards against errors of omission

• Allow adequate time for developing and testing data forms before starting
  data collection
• Solicit content advice and input from persons not directly involved in the
  development process
• Review data forms used in similar trials
• Ask persons not directly involved in the developmental process to review
  proposed data forms for deficiencies
• Test data forms under actual study conditions before use in the study

B. Safeguards against errors of commission

• Distinguish between data needed for patient care and those needed to address
  the objectives of the trial
• Make certain every data item scheduled for collection is of direct relevance
  to achieving a stated aim or objective of the trial
• Establish an appropriate set of review and approval procedures in order for
  new items to be added to existing data forms

12.3.3

Data needed at follow-up visits

Data collected during follow-up are needed to:
• Assess changes in variables that are or may be affected by treatment
• Characterize the nature of treatment over the course of follow-up
• Characterize departures from the treatment protocol and the reasons for them
• Characterize patient adherence to the assigned treatment(s)
• Characterize the nature of treatment effects observed, including side effects
  and patient complaints related to treatment or believed to be related to
  treatment
• Characterize the state of a patient’s health and quality of life
• Maintain up-to-date patient locator information
• Assess adherence of clinic staff to required procedures, as set down in the
  study protocol
• Link baseline and follow-up records obtained on the same patient
• Address other topics unique to the study in question

The same process as outlined for baseline forms should be used to construct the
follow-up forms. It should begin with an enumeration of items related to the
above categories. It is helpful to identify all the variables on the baseline
set of forms that are to be updated at one or more follow-up visits before
starting construction of the follow-up forms. Once this is done, it is necessary
to indicate the visit or visits at which specified variables are to be observed.
A series of items will be required to provide data on treatment administration.
Trials involving technically complicated treatment procedures, such as in some
surgical trials, may require an entire set of forms for characterizing the
treatment process.
The follow-up data system must also provide information on treatment compliance
and on the amount of exposure a patient has had to competing treatments. The
latter information is needed to characterize the extent of cross-treatment
contamination present in the various treatment groups when the results of the
trial are analyzed. The follow-up forms must also include items for recording
real or imagined treatment side effects reported by the study patients. A
thorough knowledge of the treatments being tested and of pertinent medical
literature is needed to formulate suitable items.
A category of major interest in some trials (e.g., cancer chemotherapy trials)
concerns the effect of treatment on a patient’s quality of life. The outcome
measure, whether it be death or some nonfatal clinical event, may be only part
of what is needed for treatment assessment. A test treatment, even if known to
prolong life, may be rejected by patients because of its noxious side effects.
Information on changes in a patient’s employment status, recreational
activities, exercise habits, ability to care for himself, etc., will be needed
if quality of life measures are to be used in evaluating the study treatments.

12.4 CONSIDERATIONS AFFECTING
ITEM CONSTRUCTION

12.4.1

Implicit versus explicit item form

A key consideration in item construction has to do with wording of the items and
whether they are stated in explicit or implicit terms. Examples of the two forms
are given below.

Explicit item form

What is your present age?      ____ Age in Years

What is your birthdate?        ____ Mo   ____ Day   ____ Yr

Implicit item form

Age ....                       ____

Birthdate                      ____ Mo   ____ Day   ____ Yr

The wording chosen will depend upon the nature of the information being
collected and on the level of sophistication of the person responsible for
completing the items. An explicit form is needed when the wording of an item can
affect the information to be obtained. Survey researchers have long recognized
the importance of standardized wording for questions when the information is
collected via an interview.
An implicit form may be satisfactory for items completed by clinic personnel.
However, even in this case, care must be taken to make certain the form is
constructed so as to avoid misinterpretation among staff responsible for
completing the forms.

12.4.2

Interviewer-completed versus patient-completed items

The data forms may be designed to be completed by clinic staff or by the
patients themselves. Most of the forms will be completed by clinic personnel in
a clinical trial. Hence, the remainder of this chapter and Appendix F is written
from this point of view. However, many of the same points outlined in Sections
12.5 and 12.6 apply to forms completed by patients as well.
Items used as a reminder to clinic personnel to collect certain information
should be distinguished from those that are to be read or presented to the
patient exactly as they appear on the form. The Hypertension Prevention Trial
(HPT) preceded all items of the latter type by codes of AAW (Ask-as-Written) or
SAW (Show-as-Written) (see examples below). Items that had a long list of
possible answers or were considered too complicated to comprehend via verbal
presentation were presented in the SAW fashion using specially prepared
flashcards. The participant selected his response from those listed on the card,
either by pointing to the proper line on the card or by reading his reply from
the card.

Example of Ask-as-Written item

(AAW) Are you presently taking vitamins or minerals regularly?

( ) Yes
( ) No

Example of Show-as-Written item

(SAW) Have you taken any of the following drugs in the last month? (Use HPT
Flashcard 04 and check as many as apply)

( ) Anacin
( ) Appedrine
( ) Bromoquinine
( ) Coryban D
( ) Dexatrim
( ) Dristan
( ) Excedrin
( ) Midol
( ) Nodoz
( ) Permathene-12
( ) Prolamine
( ) Triaminicin
( ) Vanquish

The SAW approach can be useful in the collection of sensitive information
involving personal income, sexual behavior, or the like. A patient may be more
willing to indicate his reply by pointing to the appropriate reply or by
referring to a letter or number code on a flashcard than to answer the question
verbally. Other techniques have been developed for collection of sensitive
information. A particularly interesting one involves a “random response”
technique. The technique is not discussed herein, but descriptions and
illustrations of it can be found in papers by Begin et al. (1979), Begin and
Boivin (1980), Frenette and Begin (1979), Himmelfarb and Edgell (1980), Martin
and Newman (1982), and Zdep et al. (1979).

12.4.3

Questioning strategy

The designers of the data forms must decide where general, nondirective
questions are to be used to elicit subjective information and where more
specific, directive ones are to be used. Clearly, the type and amount of
information obtained can be influenced by the questioning strategy used. For
example, the number of patients reporting gastrointestinal distress in an
aspirin study can be expected to be higher if the count is based on responses to
a specific question concerning such problems (e.g., Have you had any
gastrointestinal distress since you started treatment?), as opposed to a general
question (e.g., Have you had any problems since you started treatment?). The two
strategies may be used in tandem in situations in which it is appropriate to
begin an area of inquiry with a general question followed by one or more that
are specific and direct.

12.4.4

Single versus multiple-use forms

The organization and content of the forms will be influenced by whether they are
designed to be completed over a series of clinic visits or at a single visit.
Multivisit forms are more efficient to use in that there are fewer forms to
complete and process than is the case with single-visit forms. Further, since
there is some administrative overhead associated with the completion and
processing of any form, the fewer the forms, the lower the total overhead.
A disadvantage with multivisit forms is the time and inconvenience involved in
filing and retrieving partially completed forms. Further, their use can slow the
flow of information in the study since a form cannot be sent to the data center
until it is complete. The delay can be lengthy if the visits to be covered are
widely separated in time. Hence, if they are used at all, their use should be
limited to sets of visits that are completed over short time intervals.
Data generated at different sites, whether within or outside the clinic, even if
part of the same visit, should be recorded on separate forms. This is
particularly true for forms used to record results of procedures or measurements
that are done by personnel who are not under the direct control of the study
clinic and that cannot be provided on the day of the patient’s visit to the
clinic (e.g., as is usually the case with most laboratory determinations and
with expert readings of biopsy materials, coronary angiograms, ECGs, eye fundus
photographs, and the like). The only exceptions are those in which the data in
question flow back to the study clinic within a day or two of the patient’s
clinic visit.

12.4.5

Format and layout

Decisions need to be made regarding the general format and layout of the data
forms. Issues to be addressed include (see Section 12.6 for discussion):

• Full-page versus multicolumn layout
• Paper size, quality, and color
• Use of boxes, parentheses, or lines for recording responses to designated items
• Location of check spaces for responses
• Printed versus photocopied forms

12.5

ITEM CONSTRUCTION

This section and the next contain a series of detailed comments and suggestions
concerning item and form construction. Many of the points are supported with
illustrations contained in Appendix F.

12.5.1

General

1. Every item and item subpart should have a unique identifying number
   (Appendix F.1).
2. Items should always be constructed to require a response, regardless of
   whether a condition is present or absent. The practice of allowing a blank or
   unanswered item to indicate the absence of a condition can cause confusion.
   Once the form is completed there is no way to distinguish between items
   purposely left blank because the condition in question was not present from
   those accidentally left blank (Appendix F.2).
3. The conditions under which an item is to be skipped should be part of the
   item or should be included in the instructions for the item (Appendix F.2.4).
4. Items or sections on a form that may be skipped in certain instances should
   be preceded by items that document the legitimacy of the skip. For example, a
   form should include an item for recording the patient’s age if parts of the
   form are to be skipped for patients in a specific age range.

12.5.2

Language and terminology

5. Use simple, uncomplicated language.
6. Avoid the use of esoteric terms and abbreviations. This is especially
   important in situations where there is likely to be a turnover in the
   personnel responsible for completion of the study forms, or in multicenter
   trials where the level of staff familiarity with the study forms may vary.
7. Avoid the use of terms that may have different meanings to the different
   people involved in completing the forms.
8. Provide necessary definitions on the forms or indicate where they may be
   found.
9. Use simple sentences in the construction of items and instructional
   materials. Phraseology should be consistent with the educational level of the
   individuals responsible for completion of the forms.
10. Avoid unnecessary words (Appendix F.3).
11. Avoid the use of double negatives (Appendix F.4).
12. Avoid the use of compound questions by dividing them into a series of
    specific questions (Appendix F.5).
13. Items requiring a comparative judgment should indicate the basis for the
    comparison (Appendix F.6).
14. Language research suggests that positive terms, such as better, bigger, or
    more, are less subject to interpretation error than negative terms, such as
    worse, smaller, or less (Wright and Haybittle, 1979a) (Appendix F.6.3).
15. Items requiring an affirmative or negative response are confusing when an
    affirmative reply indicates the absence of a condition (Appendix F.7).
16. For the same reason as indicated in 15, questions concerning disease state
    or history are easier to understand if stated in a way which requires a yes
    or positive reply when the condition is present, rather than when it is
    absent (Appendix F.8).
17. The time point or interval to be used in answering an item should be
    explicitly stated in the item. A time point may be defined by a specified
    date, by some event or condition, or simply as the “present.” A time
    interval may be defined by two calendar dates or from some date to the
    present (Appendix F.9).
18. Variation in the direction of response from question to question (e.g.,
    stating some questions that require a comparative assessment in positive
    terms and others in negative terms) should be avoided (Appendix F.10).
19. Avoid leading questions (Appendix F.11).

12.5.3

Use of items from other studies

The item construction process can be facilitated by a review of existing forms
from related studies. The review may help identify data items that should be
included on the data forms as well as aid in their construction and format.
20. Assemble sets of forms from other related
studies and order by topic (e.g., smoking
history, exercise habits, disease history,
and so on).
21. Do not use an item simply because it has
been used before in other studies.
22. Do not construct an item de novo if a
suitable version of the item already exists,
has been used in other studies, and has
seemingly produced reliable information.
23. Do not modify the wording of an item
taken from another study if the item has
been shown to produce useful information
and if information generated from it is to
be compared with findings from studies in
which the item was used.
24. Do not use an entire form or section of a
form that has been copyrighted without the
written approval of the copyright holder.
25. Do not reproduce an entire form or section
of a form used in another study without
permission from the study, even if the form
is not copyrighted.

12.5.4

Closed- versus open-form items

A closed-form item is one that is completed
using a defined list of permissible responses. An
open-form item is characterized by the absence
of a defined list of permissible response options.
Closed-form examples

Indicate the highest grade completed in school:
( ) 6th grade or less
( ) 7th, 8th, or 9th grade
( ) 10th or 11th grade
( ) 12th grade
( ) 2 or 3 years of college
( ) 4 years of college
( ) 5 or more years of college

Have you had any of the following diseases or conditions diagnosed in the last
year? (check all that apply)
( ) Heart attack
( ) Stroke
( ) Congestive heart failure
( ) Emphysema
( ) Cancer
( ) None of the above


Open-form examples

What is the highest grade you have completed
in school?
Use the space below to list serious illnesses
that you have had. (Enter “none” if you have
never had a serious illness.)

26. An open-form item should be used when it is difficult to anticipate the
    different responses that may be given, or when there is a desire to avoid
    leading the respondent by indicating permissible replies.
27. An open-form item should be used to record continuous data, unless a closed
    form, with designated categories, is considered to provide adequate detail
    (see Section 12.4.2). An open form should be used even if data are to be
    subsequently tabulated into designated categories (e.g., age <25, 25-49, and
    ≥50). The opportunity to categorize in different ways is lost whenever
    continuous data are collected and recorded in categorical form.
28. Closed-form items, with a predefined list of response options, should be
    used when there is a need to structure the responses obtained (e.g., when it
    is desired to present the respondent with all possible options when
    answering a question or when it is desirable to remind him of the
    permissible response options).
29. The time required to code and process information from open-form items is
    usually greater than for closed-form items.
30. A closed-form item will do little to facilitate coding and processing if
    most of the responses fall into a general catchall category, such as the
    “other (specify)” category, included at the end of the response list.
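
Statement 27 can be illustrated with a short sketch. The ages and the helper name `categorize` are hypothetical; the first grouping follows the categories named in statement 27 (reading the last one as 50 or older), and the second is an arbitrary alternative that is available only because the raw values were kept.

```python
from bisect import bisect_right


def categorize(value, cut_points, labels):
    """Assign value to a category.

    cut_points holds the lower bound of every category except the first,
    in ascending order, so len(labels) == len(cut_points) + 1.
    """
    return labels[bisect_right(cut_points, value)]


ages = [23, 25, 49, 50, 67]  # hypothetical ages recorded in years (open form)

# Grouping named in statement 27
three_groups = [categorize(a, [25, 50], ["<25", "25-49", ">=50"]) for a in ages]

# A different grouping, chosen later at analysis time
by_decade = [
    categorize(a, [30, 40, 50, 60], ["<30", "30-39", "40-49", "50-59", ">=60"])
    for a in ages
]

print(three_groups)
print(by_decade)
```

Had the form offered only the three check boxes, the second tabulation could not be produced from the recorded data.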

12.5.5

Response checklist

A response checklist defines the permissible or acceptable responses to an item.
The simplest checklist is one for items requiring a binary response, such as yes
or no, present or absent, or the like. This list should cover all possible
responses and may be constructed to allow only one response or multiple
responses, depending on the item.

31. A response checklist is preferable to an unformatted written reply, except
    as indicated in Section 12.5.4. An item involving a long list of possible
    response options (see Section 12.4.2 for flashcard alternative) will require
    more space for layout than an item designed to elicit an unformatted written
    reply, but the information generated will be easier to process and interpret
    than is the case with an unformatted written reply.
32. Vertical checklists are easier to use and subject to less confusion with
    regard to the location of appropriate check spaces than are horizontal
    checklists (Appendix F.12).
33. A response checklist that is not exhaustive should include an “other”
    category that can be used to record responses not covered by the list.
34. There should be adequate space on the form for respondents to write out
    responses that fall into the catchall category. The space provided will
    influence the amount and legibility of the information recorded.
35. Frequent use of a catchall category for an item increases the time required
    for completion of the item and for coding and processing the information
    generated by it (assuming the written responses are to be coded and
    processed).
36. It may be wise or necessary to expand the list of permissible response
    options for an item during the trial. Any expansion should be based on a
    review of the responses provided in the catchall category and should be done
    as soon after the start of data collection as is feasible. Expansion may not
    be practical in short-term trials or in situations in which it can be
    expected to cause major coding or analysis problems.
37. A condition is more likely to be recorded as present if it appears in a
    checklist than if it does not. Hence, list expansions during the trial may
    appear to “increase” the prevalence of certain conditions. However, the
    expansion will not influence treatment comparisons unless the changes were
    implemented at different times for the various treatment groups under study.
38. It is sometimes convenient to include a summary check position at the head
    or end of a list that may be used in lieu of checking each individual entry
    for the list (see Appendix F.12.2.4 for example).
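
The review of catchall responses called for in statement 36 could be supported by a simple tabulation. The sketch below is only an illustration: the function names, the label "other", and the idea of flagging write-ins that recur at least twice are assumptions, not procedures taken from any particular trial.

```python
from collections import Counter


def catchall_share(responses, catchall="other"):
    """Fraction of all responses to an item that fell into the catchall category."""
    counts = Counter(responses)
    total = sum(counts.values())
    return counts[catchall] / total if total else 0.0


def frequent_write_ins(write_ins, min_count=2):
    """Written-in replies that recur, as candidates for new checklist entries."""
    normalized = Counter(text.strip().lower() for text in write_ins)
    return [(text, n) for text, n in normalized.most_common() if n >= min_count]


# Hypothetical responses to one checklist item
responses = ["stroke", "other", "other", "cancer"]
write_ins = ["Asthma", "asthma ", "gout"]

print(catchall_share(responses))
print(frequent_write_ins(write_ins))
```

A high catchall share, or a write-in that keeps recurring, is the kind of evidence on which a list expansion might be based.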

12.5.6 Unknown, don't know, and
uncertain as response options

39. The three options are interrelated and are to a large extent used as if they
    were interchangeable. The particular option listed will depend on the
    context of the question.
40. The operational implications are about the same. All three options imply the
    lack of information needed to answer a question.
41. Don't know or uncertain should not be listed as a response option if the aim
    of the item is to require the respondent to record his best guess even if he
    does not know or is uncertain regarding the accuracy of his reply. The form
    should have written instructions when guesses are required.

12.5.7 Measurement and calculation items

A measurement item is one that requires the respondent to record some
measurement. A calculation item is one that requires the respondent to carry out
an arithmetic calculation using other information on the form. The examples that
follow are taken from the HPT.

Height and weight measurement and calculation example

Height (shoes off):                          ____ inches
Weight (outdoor garments and shoes off):     ____ lbs
Q.I. = Wt/Ht²                                0.____ lbs/in²

Blood pressure measurement and calculation example

BP in mm Hg                  SBP       DBP

1st RZ BP
  a. Reading                 ____      ____
  b. Zero value              ____      ____
  c. a - b                   ____      ____
2nd RZ BP
  d. Reading                 ____      ____
  e. Zero value              ____      ____
  f. d - e                   ____      ____
Average RZ BP
  g. Sum (c + f)             ____      ____
  h. Avg (g ÷ 2)             ____      ____

42. The unit of measurement should be specified on the form (Appendix F.13).
43. Measurements should be made and recorded in units familiar to the personnel
    responsible for making them. Use of an unconventional unit may lead to data
    collection and recording errors (Appendix F.13.4).
44. Whenever feasible, all recordings of a specified variable should be made
    using the same unit. Use of different units may occur when different
    laboratories are used (e.g., as in a multicenter trial in which each clinic
    relies on its own laboratory for making required laboratory determinations).
45. Space should be provided on the form for the respondent to indicate the unit
    of measurement when it is not practical to specify the unit in advance
    (Appendix F.13.2.2, F.13.2.3).
46. Continuous variables, such as age, blood pressure, laboratory values, and
    the like, should not be recorded in categorical form (see statement 27).
47. The precision required for a measurement should be specified on the data
    form (Appendix F.14).
48. The amount of precision required for a measurement should not exceed the
    error involved in making the measurement (see comment regarding item F.14.1
    in Appendix F).
49. The raw data used to make any summary calculations should be recorded on the
    form.
50. Data forms should be constructed to minimize the number of arithmetic
    calculations required during a patient visit. All calculations except those
    needed to perform a patient examination or to carry out some other treatment
    or data collection function during the examination should be performed at
    the data center as part of the data entry and analysis processes.
51. Calculations needed on data forms made by clinic staff, even relatively
    simple ones, should be made using a pocket calculator or a computer.
52. Items requiring a series of arithmetic operations during completion of a
    data form should be arranged in a format that facilitates those operations.
    For example, numbers that must be added or subtracted should be arranged
    vertically and with adequate space for recording intermediate calculations
    (Appendix F.15).


53. Arrange the calculations for a given item
in a single unbroken column, if possible.
Avoid arrangements in which calculations
are started on one column of a form and
continued on the next column or page.
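
Statements 49 and 50 suggest recording the raw readings on the form and leaving summary calculations to the data center. A minimal sketch of the two HPT-style calculations shown earlier in this section follows. It assumes that the index on the form is weight divided by height squared and that each blood pressure value is a reading minus its zero value, averaged over two readings; the function names and sample values are hypothetical.

```python
def quetelet_index(weight_lbs: float, height_in: float) -> float:
    """Weight divided by height squared, in lbs/in^2 (assumed form of the index)."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return weight_lbs / height_in ** 2


def average_rz_bp(first_reading, first_zero, second_reading, second_zero):
    """Average of two zero-corrected blood pressure readings, in mm Hg."""
    c = first_reading - first_zero    # line c on the form: a - b
    f = second_reading - second_zero  # line f on the form: d - e
    g = c + f                         # line g: sum of the corrected readings
    return g / 2                      # line h: average


# Hypothetical raw values as they would be keyed from the form
print(round(quetelet_index(150, 70), 4))
print(average_rz_bp(140, 12, 136, 10))
```

Because the raw readings and zero values are stored, the data center can recompute every derived value and check any figure that clinic staff had to calculate during the visit.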

12.5.8

Instruction items

An instruction item is one that is included on a
form to instruct the individual completing the
form as to how to deal with a given question.
The two types of instruction items discussed are
STOP and SKIP items (Appendix F.16).


54. A STOP item is used to indicate conditions that, when encountered during the
    course of a patient visit, require clinic personnel to temporarily or
    permanently halt some procedure or process. The stop will be permanent
    unless the conditions that require the stop can be removed.
55. STOP items on any given form should be arranged to allow the respondent to
    terminate all work on the form as soon as a stop is checked. This requires
    an arrangement in which essential information, required on all patients, is
    obtained before any stops are allowed.
56. A common use of STOP items during the prerandomization series of clinic
    visits is to indicate conditions that exclude a patient from enrollment into
    the trial. Stops of this sort will halt further work-up of the patient.
57. It is wise to arrange prerandomization stops for procedures in ascending
    order with regard to the risk or general discomfort they entail for
    patients. The goal should be to carry out the lowest risk, least expensive,
    most productive procedures first.
58. A SKIP item may be used whenever there is an item or series of items on a
    form that can be skipped depending on the answer to the item.
59. A SKIP item should indicate the conditions under which the skip can occur
    and the item or items to be skipped.
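
When skip conditions are written down explicitly, as statement 59 requires, a completed form can be checked mechanically. The sketch below is one possible data-center edit check, not a procedure described in the text: the item numbers, the rule format, and the decision to flag answered items that should have been skipped are all assumptions.

```python
def audit_form(responses, skip_rules):
    """List problems with blank or skipped items on one completed form.

    responses  maps item number -> recorded answer (None if left blank)
    skip_rules maps item number -> (controlling item, answer that permits the skip)
    """
    problems = []
    for item, answer in responses.items():
        rule = skip_rules.get(item)
        skip_allowed = rule is not None and responses.get(rule[0]) == rule[1]
        if answer is None and not skip_allowed:
            problems.append(f"{item}: blank without a documented skip")
        elif answer is not None and skip_allowed:
            problems.append(f"{item}: answered although the skip condition holds")
    return problems


# Hypothetical form: item 5a may be skipped only when item 5 is answered "no"
responses = {"5": "no", "5a": None, "6": None}
skip_rules = {"5a": ("5", "no")}
print(audit_form(responses, skip_rules))
```

Item 5a passes because its blank is explained by the answer to item 5; item 6 is flagged because nothing on the form accounts for the blank.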

12.5.9

Time and date items

A time item is one that requires the respondent
to record the actual clock time at which some
step, procedure, or measurement was carried out.

example, insurance companies consider a
person to have attained the next year of age
one-half year beyond his last birthday an
niversary, whereas a person reporting his
age will give it as of his last birthday annixersary.

A date item is one that indicates the date
step, procedure, or measurement was earned
(see items in Section El3.1 of Appendix f fexamples).
60. Items requiring a clock time should indcate whether the time recordings are f<v
a m. or p.m. if a 12-hour recording svxter
is used. The use of a.m. and p.m. will cjw
confusion for recording 12 noon and i:
midnight, unless instructions given on the
form indicate how these times are to I*
recorded.
61. Times should not be recorded on a 24-hw
basis unless personnel responsible for rhe
recordings are thoroughly familiar with 24hour timing schemes or the readingx irt
made directly from 24-hour clocks.
62. The order to be used in recording the ditr
should be specified on the form (e g. w?
items in Section F.I3.I in Appendix h
The two most common conventions art
Month, Day, Year
Day, Month, Year
63. Failure to specify the convention to be uvd
dates if they are recorded in digital form
For example 1-9-82, could be read as Jaru
ary 9, 1982, or 1 September 1982, depend
ing on the convention used.

12.5.10 Birthdate and age items

64. The baseline data forms should include both an age and birthdate item if age is to be used either as an eligibility condition for enrollment into the trial or in subsequent data analyses.
65. Date of birth is a key piece of information in many trials. It may be needed for making accurate age calculations or for record search and linkage operations in the follow-up of dropouts for mortality via the National Death Index and other similar files. Birthdate may also be useful in linking different records for the same individual if name is not collected.
66. A patient’s reported age should be checked against his reported birthdate on entry into the trial, as illustrated in Appendix T I. This is particularly important if age is used as an eligibility condition. Discrepancies should be resolved.
67. The age that is reported may differ depending on the source it is taken from.

12.5.11 Identifying items

Every data form should contain space for recording the patient’s ID number and name (or name code). Once these two items have been entered into the data system, a cross-check should be made on all new data to be entered. Information from a form should not be added to the data system if the ID number and name (or name code) do not agree. See Section 12.6.9 for further comments.

68. It is wise to construct a name code, made up of some combination of letters from the patient’s first, middle, and last name, for use as a patient identifier. This identifier is in addition to ID number and should not be changed once it has been issued, even if the patient has a subsequent name change. The name code may be used in addition to name or in place of it depending on whether the study forms are designed to preclude collection of name.
69. Each follow-up data form should include an item for recording visit number. The number is typically checked against the patient’s appointment schedule (see Table 12-1 for example) to determine if the visit occurred within the permissible time window.
70. Patient identifiers useful for mortality follow-up include:

• Social Security number
• Date of birth
• Place of birth
• Father’s name
• Mother’s maiden name
• Patient’s maiden name for females
• Date and place of death (if applicable)
71. A unique identifier should be assigned to each member of the clinic staff involved in data collection. This number may be used in place of name or initials (or in combination with name) to identify the individual responsible for completing or reviewing a form, or a series of items on a form.
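Two of the checks described in Sections 12.5.10 and 12.5.11 (reported age against birthdate in Statement 66, and ID number against name code before new data are accepted) can be sketched as follows. The function names, the registry layout, and the sample identifiers are hypothetical and serve only to illustrate the logic.

```python
from datetime import date


def age_at(birthdate: date, on: date) -> int:
    """Age in completed years on the given date."""
    years = on.year - birthdate.year
    # Subtract one year if the birthday has not yet occurred in that year.
    if (on.month, on.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def age_agrees(reported_age: int, birthdate: date, entry_date: date) -> bool:
    """True if the reported age matches the age implied by the birthdate."""
    return reported_age == age_at(birthdate, entry_date)


def identifiers_agree(registry: dict, id_number: str, name_code: str) -> bool:
    """True only if the ID number is known and is paired with this name code.

    registry maps ID number -> name code issued at enrollment (assumed layout).
    """
    return registry.get(id_number) == name_code


# Hypothetical enrollment registry and an incoming form.
registry = {"0417": "SMJOA"}
accept_form = identifiers_agree(registry, "0417", "SMJOA")
reject_form = identifiers_agree(registry, "0417", "SMJOB")
```

A form that fails either check would be set aside for resolution with the clinic rather than added to the data system.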

12.5.12 Tracer items

A tracer item is one that is used to obtain information needed to locate a patient. In some cases the information provided by such items is used to locate and recontact a patient who has dropped out of the study to try to persuade him to return to the clinic for examination and subsequent follow-up. In other instances the items are used to facilitate the collection of mortality or morbidity data.

72. Tracer data should be collected on all patients upon entry into the trial and should be updated at periodic intervals over the course of the trial.
73. Useful patient tracer data include:

• Current address and telephone number (home and work if patient has both)
• Employer’s name, address, and telephone number
• Name, address, and telephone number of a close relative
• Name, address, and telephone number of a friend or neighbor
• Name and address of patient’s private physician
74. Other tracer items, especially for mortality follow-up, are listed in Section 12.5.11, Statement 70.
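As a rough illustration of Statement 72, a tracer record can carry the date it was last confirmed so that stale records are flagged for updating. The class name, the field selection, and the 180-day interval are assumptions made for this sketch; the text does not prescribe an interval.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class TracerRecord:
    """Hypothetical tracer record holding a few of the items listed above."""
    patient_id: str
    home_address: str
    home_phone: str
    relative_contact: str
    physician_contact: str
    last_updated: date

    def update_due(self, today: date, interval_days: int = 180) -> bool:
        """True when the record is older than the chosen update interval."""
        return today - self.last_updated > timedelta(days=interval_days)


record = TracerRecord(
    patient_id="0417",
    home_address="(address on file)",
    home_phone="(phone on file)",
    relative_contact="(relative on file)",
    physician_contact="(physician on file)",
    last_updated=date(1982, 1, 1),
)
```

A clinic could run such a check before each scheduled visit so that the update is made while the patient is present.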

12.5.13 Reminder and documentation items

A reminder item is one that is intended to remind clinic personnel to perform an indicated procedure or task (Appendix F.18.1). A documentation item is one that is used to indicate that a step or condition required in the data collection, enrollment, treatment, or follow-up process has been performed (Appendix F.18.2).
75. Reminder items are useful in trials with complicated data collection schemes, or in which there is a good chance that some of the personnel involved in data collection will be unfamiliar with details of the data collection protocol.
76. Reminder items should be used in conjunction with steps or procedures that are essential to the data collection, enrollment, treatment, and follow-up processes.
77. Key data items that are to be completed by a designated individual should be followed by documentation items for recording the date the items were completed and the name or certification number of the individual who was responsible for their completion.
78. There should be space at the end of each form to record the date the form was completed.
79. Documentation items should be included at the end of each form for recording the name or certification number of the person responsible for review of information on the form and for recording the date of the review (Appendix F.18.2).

12.6 LAYOUT AND FORMAT CONSIDERATIONS

12.6.1 Page layout

80. Choose a layout that permits use of a single page size for all forms (e.g., 8½" × 11").
81. Use a layout in which all pages within a form are oriented in the same way. That is, with pages laid out either portrait style (i.e., with lines of print running across the short axis of the page) or landscape style (i.e., with lines running across the long axis of the page).
82. If possible, use the same page orientation for all forms of a given type (e.g., all those used at the clinic for follow-up data collection).
83. Use a layout that is uncluttered and that facilitates use of the forms by both clinic and data processing personnel.
84. Choose between a full page or two-column layout (Appendix F.19).
85. Generally, two-column layouts are more space efficient than full page layouts.
86. The layout chosen should be compatible with the data entry needs of the study; clinic needs should take precedence over those for data entry if meeting both needs leads to conflicting layout requirements.
87. Avoid a layout such as that displayed in Appendix F.19.1.1, where check spaces are scattered over the page. The layout increases the time required to complete and key a form and may contribute to errors in those processes as well.
88. Use layouts such as those illustrated in Appendix F.19.1.2 and F.19.2. Standardizing the location of check positions within and across forms facilitates completion of the forms and reduces the time and errors involved in keying data from them.
89. Whenever feasible, choose a layout that facilitates entry of data directly from the form, such as illustrated in Appendix F.19.2.
90. Items should be arranged so as to minimize the number that are split across columns or pages of a form.
91. The pages of a form should be printed or typed on only one side. The reverse side of the pages may be used to print instructional material or should be left blank.
92. Page layout should be designed to help respondents identify items or sections of a form that are to be skipped under specified conditions. This may be done by setting key words or phrases in boldface type or by use of special instructions or other aids to direct the respondent to applicable items or sections (Appendix F.20).
93. The space between subparts of an item should be less than the space between items.
94. The space separating items should be uniform unless variation in spacing has operational significance.
95. Similarly, the space separating one part or section of a form from another should be the same and should be greater than the space separating individual items.
96. Right-hand justification of typed or printed text should be avoided if it results in noticeable variation in the spacing between words.

12.6.2 Paper size and weight

97. Use a good quality paper with enough gloss to avoid bleeding through from ink or felt pens.
98. Use the same size paper for all forms (see Statements 80, 81, and 82).
99. A paper size of 8½" × 11" is preferable to other sizes, especially when forms are to be photocopied and filed using standard office equipment.

12.6.3 Type style and form reproduction

100. The print or type font used should be large and crisp enough to allow for image degradation when forms are photocopied.
101. Use a print or type font at least the size of newsprint.
102. Avoid capitalization of long phrases or sentences. Text written in capital letters is more difficult to read than a mixture of upper- and lower-case letters (Wright and Haybittle, 1979b).
103. Use a different print or type font for emphasizing specific words, phrases, and headings and for distinguishing instructional material from data collection items (e.g., see items F.21.1 in Appendix F).
104. Printed forms are generally easier to read and are esthetically more pleasing than typewritten forms.
105. Consideration should be given to printing forms that are to be used in large numbers or that are difficult to photocopy because of their size or the way in which they are assembled. Forms should not be printed until they have been thoroughly tested and are no longer subject to revision. It may be less costly to photocopy forms that are used in small numbers. The same may be true for forms used in relatively large numbers if they are likely to undergo changes. Forms may be photo-reproduced from either typed or professionally printed masters.

12.6.4 Location of instructional material

106. Instructional material on the first page of the form should indicate when the form is to be used and who is responsible for completing it (Appendix F.21.2.1).
107. Instructional material relating to specific items or sections of a form should be located next to those items or sections (Appendix F.21.2.2).
108. All instructions needed for completion of a form should be included on the form. This is especially important in long-term trials in which personnel may change over the course of the trial, and in multicenter trials.
109. All instructional material should be as concise and simple as possible.
110. Instructional material should be identified by use of a special type font or in some other way (Appendix F.21).
111. Instructional material that is too extensive for inclusion next to the item or section to which it pertains should be contained in a separate booklet or should appear on the back side of the page adjacent to the one in question.
112. Key definitions needed for completion of an item should appear on the form.
113. The instructions should identify items that are to be read verbatim to the patient, as discussed in Section 12.4.2.
114. Items with a list of permissible responses that are not mutually exclusive should contain an instruction to indicate whether or not the respondent may check more than one response.
115. Items which include unknown, don’t know, or uncertain as response options should include instructional notes to indicate if any special procedures are required before these categories are checked (e.g., an instruction to remind clinic staff to check specific medical records before checking the uncertain category for a designated item).
116. The instructions should indicate the steps to be followed in performing a particular measurement or procedure. Reference to the appropriate section of the study handbook or the manual of operations should appear on the form if the measurement or procedure is too complicated to be outlined on the form.
117. There should be an instruction at the end of each form that indicates where the form is to be sent after completion and the steps to be followed in preparing the form for transmission.

12.6.5 Form color coding

Color coding is useful if there is a need to distinguish among different types of forms (e.g., prerandomization forms versus follow-up forms, or forms completed in the laboratory versus those completed in the clinic) or among different copies of the same form (e.g., white for the original, green for the first copy, and pink for the second copy).

118. The color-coding scheme should be simple, logical, and easy to remember.
119. The colors chosen should be limited to a few distinct shades.
120. A particular color should have the same meaning throughout the study (e.g., pink always identifies the second copy of an original).
121. As a rule, forms printed on pastel-colored paper are easier to read and will produce better quality photocopies than those printed on dark-colored paper. The legibility of photocopies produced from pages using the colors proposed should be checked before making the final color selection.
122. Color coding should never be used as the sole means of identifying a form or its use. Written information should appear on the form to designate its use and should be sufficient to identify a particular form if individuals are unable to distinguish among the colors.
123. It may not be practical to use multicolor forms if a clinic is responsible for maintaining its own supply of forms from photocopy masters.

12.6.6 Form assembly

124. Multipage forms may be supplied to clinics collated and bound (e.g., stapled), collated and unbound, or uncollated. The latter method of supply is preferable when the number of pages making up a form varies depending on the patient or examination. Forms that are collated should be supplied unbound if it is likely that they will have to be disassembled for completion or to make photocopies of them after completion.
125. The individual pages of a form should be sequentially numbered and should indicate the total number of pages in the form (e.g., by using the following kind of numbering scheme: page 1 of 10, page 2 of 10, etc.).
126. Paper clips or similar kinds of fasteners are not acceptable for securing the pages of completed forms. They are likely to come off as the forms are handled in copying, coding, or filing.
127. Forms may be developed with specially designed answer pages that may be detached from the main body of the form. The Lipid Research Clinics used this approach to reduce the volume of paper flowing to the coordinating center. Detachable answer pages may be used only if all information required for data entry can be recorded on the answer sheets and adequate documentation is provided on the answer sheet to identify the patient and type of examination performed.
12.6.7 Arrangement of items on forms

Thought should be given to the ordering of items within and across forms. The arrangement should be compatible with the needs of patients and clinic staff. Arrangements that are not may result in missed or poor quality data.
128. Place items calling for a particular frame of reference next to one another.
129. The nature, quality, and quantity of information obtained on a form may be influenced by the order of the items on it.
130. The number of positive responses to a list of questions will be higher for lists that are read or shown to the patient than when the list is simply used by clinic staff to record information volunteered by the patient (see Section 12.4.3).
131. The order of procedures should remain fixed over the duration of the trial, especially if there is any chance that one procedure (e.g., ingestion of iopanoic acid in order to perform cholecystograms) affects the results of another procedure (e.g., serum cholesterol determinations; see National Cooperative Gallstone Study Group, 1981a, for additional details). A fixed order does not necessarily eliminate this problem, but it does control the effect over time and across treatment groups. Further, not all variations in sequencing can be avoided if the number of procedures performed differs from examination to examination.
132. The arrangement of items within a form should be compatible with the preparation required for a particular examination (e.g., the items to be completed with the patient in a fasting state should appear before those that are to be completed after the patient has been allowed to eat or has been given a glucose load).
133. Group items into sections with headings indicating the general content of the sections. Use a different type font to facilitate identification of section headings.
134. The numbering and identification scheme used on a form should be designed to facilitate the identification of items and their subparts.
135. Use different spacing to indicate transitions from one item to another and from one section to another.
136. Devise a numbering system for identification of individual items on a form. Items should be numbered sequentially over the entire form or within sections of the form. The former system is preferable. The latter one has the advantage of allowing for addition or deletion of items in a section without disrupting the numbering system for other sections. However, the disadvantage is that both a section and item number are needed to locate a specific item on a form.
137. Items should be arranged among forms so that any given form can be completed in a single session, as discussed in Section 12.4.4.
138. The time lag between collection of a block of information and transmission of that information to the data center should be minimized. This generally requires use of different forms for recording data that are generated at different clinic visits. Different forms may be needed as well for data generated at the same visit, but by people at different locations in the clinics.
139. Data items that are considered confidential or that deal with sensitive information should appear on separate pages of a form or on a different form so that it is possible for the page or form to be stored apart from the remainder of the patient’s file.

12.6.8 Format

12.6.8.1 Items designed for unformatted written replies

Items in this class should provide space for handwritten replies without any restriction on the number of characters of information that may be provided (Appendix F.22).
140. The amount of space provided on the form will influence the quantity and quality of information supplied.
141. The space provided should be consistent with the amount of detail desired and should be large enough to prevent the respondent from having to resort to use of cryptic abbreviations or unnaturally small handwriting.
142. Designate the area where the reply is to be recorded. If lines are used, the space between them should be at least '/Z (e.g., see item F.22.2 in Appendix F).
143. An unlined space, such as shown in item F.22.3, may be preferable to use of lines, especially if responses are typed.

12.6.8.2 Items requiring formatted written replies

Items in this class require the respondent to fit the response into a designated number of character spaces. The restriction is ordinarily imposed to facilitate processing of the information.
144. The number of allowable characters per item will be dictated by the code format established when the item was developed.
145. Formatted items should indicate the number of data characters allowed or required. This may be done in the instructions accompanying such items (e.g., by asking the respondent to make certain his reply does not exceed a specified number of characters) or by using character boxes or lines, as illustrated in Appendix F.13.1 and F.23.
146. Character lines are preferable to character boxes, especially if the lines that form the boxes serve to camouflage characters contained in the boxes. The weight of the lines or color of the ink used to form the boxes should be distinctly different from the line weight or color of the characters appearing in the boxes when boxes are used.
147. Forms to be completed by hand should have character line segments that are long. The line segments may be shorter if the forms are to be completed using a typewriter.
148. The precision requirements for numeric data should be indicated in the item, as illustrated in Appendix F.14.1 or F.14.2.
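The character-count and precision requirements in Statements 144, 145, and 148 amount to a field specification that can also be checked when a reply is keyed. The sketch below is one possible representation; the class name `NumericField`, its attributes, and the sample weight item are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class NumericField:
    """Hypothetical description of one formatted numeric item."""
    name: str
    integer_digits: int  # character spaces before the decimal point
    decimal_digits: int  # required digits after the decimal point (0 = none)

    def accepts(self, reply: str) -> bool:
        """True if the reply fits the character spaces and stated precision."""
        pattern = rf"\d{{1,{self.integer_digits}}}"
        if self.decimal_digits:
            pattern += rf"\.\d{{{self.decimal_digits}}}"
        return re.fullmatch(pattern, reply) is not None


# Assumed example: body weight recorded to one decimal place, up to 3 digits.
weight = NumericField(name="weight_kg", integer_digits=3, decimal_digits=1)
```

With this specification `weight.accepts("72.5")` is true, while "72" (precision missing) and "1072.5" (too many characters) are rejected, which mirrors what character lines on the paper form communicate to the respondent.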

12.6.8.3 Items answered by check marks

149. The order of responses (e.g., yes followed by no, or vice versa) should be uniform throughout a form and across forms (Appendix F.24).
150. Inadequate space for checking the proper response (Appendix F.24.4) may lead to errors when items are completed or keyed. The separation of check spaces when arranged vertically may have to be fairly sizable if multiple copies of a form are to be made using carbon or NCR (no carbon required) paper. Variation in the registry of the copies relative to the master can render entries recorded on the copies ambiguous.
151. The space used for checking a response should be as near the items as possible. A dashed or dotted line should be used to associate the check space with the response category when the latter is widely separated from the former (see Appendix F.24.8 and F.24.9).
152. A long list of response options should be broken by a blank line after every third or fourth entry in the list to aid the eye in locating the appropriate check space (Appendix F.24.9 and F.24.10).
153. Forms requiring a check mark to indicate the appropriate reply to a question are preferable to those in which the respondent reads a list of items associated with the question and then records the code number(s) of the item(s) selected. The latter approach should be considered only when the same list of responses applies to several different questions on the form, or when the list of possible responses is inordinately long.
154. Use of lists that are not part of a form, or that are located elsewhere on it, may increase the time needed to complete the form.


12.6.9 Location of form and patient identifiers

155. Each form should bear the name of the study, the name of the form, a form number, version number, and version date.
156. The form number, version number, and version date should appear on each page of the form. The version date is useful if individual pages are revised during the study.
157. There should be space on each page for recording the patient ID number and visit number (see Section 12.5.11).
158. The space for recording patient ID number should appear in the same relative position on all forms (e.g., upper right-hand corner). A standard location helps to minimize the risk of the item being left blank when forms are completed and facilitates use of the information for filing and retrieval.

12.6.10 Format considerations for data entry

159. If possible, data forms should be designed to allow for data entry directly from the form, without intervening transcription of the data. This generally requires designation of codes and fields on the form (Appendix F.25), except where data entry is done via CRT screens that display the required fields.
160. It may be useful to reserve space on each form for office use. The space may be used to record transactions involved in the completion of the form and entry of information into the data system.
161. Coding and data entry operations should be designed to minimize the number of times a form is handled. Ideally, all information should be keyed at the same time, including any handwritten unformatted information.
162. A special code should be entered into the data system to identify items that contain data that are not keyed (e.g., uncoded handwritten replies). The code is useful if it is ever necessary to retrieve forms containing unkeyed information.
163. The location of check spaces should be standardized to facilitate the data entry process.
164. Coding conventions should be uniform across forms (e.g., use the same letter or number code to denote a yes reply).
165. The layout of a form should take account of coding and data entry requirements, but should not be dominated by them, especially if the layout complicates use of the form in the clinic.
166. The coding layout should permit data entry personnel to proceed through a form in an orderly fashion with few, if any, references to items already keyed or to items still to be keyed.
167. The form number, version number, or version date appearing on a completed form should be keyed. The information may be needed to interpret changes in the data that occur as a result of forms or coding changes.
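Statements 162, 164, and 167 can be pictured as properties of the keyed record itself. The following is a minimal sketch, assuming a record layout of our own devising: the class `KeyedForm`, the codes "1", "2", and "U", and the sample form identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed coding conventions, applied uniformly on every form (Statement 164).
YES, NO = "1", "2"
NOT_KEYED = "U"  # assumed special code for items whose data are not keyed


@dataclass
class KeyedForm:
    """Hypothetical keyed record; form identifiers are keyed with the data."""
    form_number: str
    version_number: str
    version_date: str
    patient_id: str
    visit_number: int
    items: Dict[str, str] = field(default_factory=dict)

    def key_item(self, item_number: str, value: Optional[str]) -> None:
        """Store a coded value, or NOT_KEYED when the reply is left unkeyed."""
        self.items[item_number] = NOT_KEYED if value is None else value

    def unkeyed_items(self) -> List[str]:
        """Item numbers that would require retrieving the paper form."""
        return [num for num, value in self.items.items() if value == NOT_KEYED]


form = KeyedForm("F07", "2", "1982-03-01", patient_id="0417", visit_number=3)
form.key_item("12", YES)
form.key_item("13", None)  # e.g., an uncoded handwritten reply
```

Keeping the version fields on every record lets an analyst separate a genuine change in the data from one produced by a revised form or coding scheme.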

12.7 FLOW AND STORAGE OF COMPLETED DATA FORMS

Data forms should flow to data entry for keying and storage as they are completed. (See Chapters 16, 17, and 24 for additional discussion concerning data flow, editing, and storage procedures.) Continuous unrestricted flows are preferable to those that are constrained by batching requirements (e.g., such as those imposed by requiring a clinic to forward forms for processing only at specified time intervals).

Intermediate stops as a form moves from the clinic to the data center for processing should be avoided, if at all possible. Many of the Veterans Administration multicenter trials have procedures in which forms are sent from clinics to the study chairman’s office for a preliminary review and edit and then to the data center for keying, editing, and storage. The intermediate stop delays receipt of the forms at the data center, thereby reducing the usefulness of the edits and analyses carried out by the center. Further, intermediate stops complicate communications with clinics concerning missed visits or deficient forms, since the inventorying and editing responsibilities are shared by the chairman’s office and the data center.

The requirements for form storage should be addressed early in the course of the trial, ideally before any forms have been completed. The storage plan should be designed to protect the records from any unauthorized use and against loss or destruction. Protection of the latter type may require maintenance of duplicate files, one at the clinic and the other at the data entry site. Large or important files may be microfilmed to reduce the space required for storage or as a further safeguard against loss.

Part III. Execution

Chapters in This Part
13 Preparatory steps in executing the study plan
14 Patient recruitment and enrollment
15 Patient follow-up, close-out, and post-trial follow-up
16 Quality assurance

The four chapters of this Part are concerned with execution of the trial. Chapter 13 outlines the steps required in executing the trial, with emphasis on the steps to be carried out in getting started. Chapters 14 and 15 concentrate on the recruitment, treatment, and follow-up processes. The last chapter details general procedures needed to ensure the quality of the data generated in a trial.

13. Preparatory steps in executing the study plan

The lame man who keeps the right road outstrips the runner who takes a wrong one. Nay, it is obvious that when a man runs the wrong way, the more active and swift he is the further he will go astray.
Sir Francis Bacon

13.1 Essential approvals and clearances
13.1.1 IRB and other approvals
13.1.2 IND and IDE submissions
13.1.3 OMB clearance
13.2 Approval maintenance
13.2.1 IRB
13.2.2 FDA
13.2.3 Other approvals
13.3 Developing study handbooks and manuals of operations
13.4 Testing the data collection procedures
13.5 Developing and testing the data management system
13.6 Training and certification
13.7 Phased approach to data collection
Table 13-1 Information required for IRB approval
Table 13-2 Items of information required for IND and IDE submissions to the FDA
Table 13-3 Suggestions for development of study handbooks and manuals of operations

13.1 ESSENTIAL APPROVALS AND CLEARANCES

All trials require completion of a series of steps before they can be started. The steps outlined in this chapter are in addition to those discussed in Chapters 11, 12, and 21 with regard to preparation of the study plan, data forms, and funding request.

13.1.1 IRB and other approvals¹

¹ See Section 14.6 for additional comments.

One set of approvals has to do with those provided by the institutional review boards (IRBs) of individual centers in a trial (clinics, as well as the data center and any other resource center concerned with data collection or patient care). The main function of the board is to provide assurance that the proposed research meets accepted standards of ethics and medical practice. Technically, the assurance is needed only for federally funded studies. However, most institutions require reviews for all research involving humans, regardless of the source of funding. The impetus for the boards grew out of concerns in the 1960s regarding the nature and extent of research involving humans. A memo dated February 8, 1966, from the Surgeon General of the United States Public Health Service mandated creation of the local boards as a prerequisite for continued funding. The structure for IRBs, their composition, and their domain of responsibility have subsequently been spelled out in federal regulations on protection of human subjects (Office for Protection from Research Risks, 1983).
Each board, in order to comply with current regulations, must:

• Have at least five members
• Not be made up exclusively of members of one sex or of one profession
• Include at least one member whose primary concerns are in a nonscientific area (e.g., law, ethics, theology)
• Include at least one member who is not otherwise affiliated with the institution and who is not part of the immediate family of a person who is affiliated with the institution
• Exclude any member from review of a specific proposal who has a conflict of interest (e.g., is an investigator in a study under review)

Individual IRBs have their own rules regarding time schedules for submissions, formats for proposals, and the nature and amount of materials to be supplied. Table 13-1 lists the information requirements as envisioned for a ‘typical’
IRB in relation to clinical trials. Specifics will vary from board to board.

Table 13-1 Information required for IRB approval

• Statement of study objectives and rationale
• Description of the study treatments and methods of administration
• Recap of prior evidence concerning safety and efficacy of the study treatments
• Type and source of study patients
• Primary outcome measure for assessing the study treatments
• Length of patient follow-up
• Number of patients to be enrolled and rationale for proposed sample size
• Risk-benefit analysis of trial
• Method of treatment assignment (e.g., random, physician choice, etc.)
• Summary of methods for protecting patients from needless or prolonged exposure to a harmful study treatment
• Summary of safeguards to protect patient privacy and confidentiality
• Consent statement and related material

Table 13-2 Items of information required for IND and IDE submissions to the FDA
The material submitted to the IRB should indicate the nature and extent of safety monitoring to be performed (see Chapter 20). The individual or group responsible for this function should be identified in the submission along with sufficient details to enable members of the IRB to make an informed judgment regarding the statistical credentials and expertise of the individual or group named. The submission should include a general description of the methods to be used for safety monitoring, the frequency of interim analyses for monitoring purposes, and the procedure to be followed in communicating with local investigators and the IRB regarding proposed treatment changes emanating from the monitoring. Details regarding the communication process are especially important in trials in which monitoring responsibilities are vested in an individual or group that is not under the control of the local clinical investigator, as in most multicenter trials and some single-center trials.

The National Institutes of Health (NIH) will not review a research proposal involving humans without assurance from the proposing investigator's IRB. The assurance is supplied via completion of form HHS 596 (Protection of Human Subjects Assurance/Certification/Declaration) that is signed by a responsible official of the IRB.

Proposals for clinical trials may require at least two IRB reviews before initiation of patient intake. The first will be required in conjunction with the submission of the funding proposal to the sponsor. The second will be required after the proposal is funded and before the initiation of patient intake, after the details of the study protocol and consent process have been set.

The proposing investigator is responsible for communications with his IRB. He must be prepared to address their concerns in a forthright manner and to revise consent statements in accordance with their requests. Concerns regarding the rights of patients to privacy and confidentiality, as well as safety issues, must be addressed. The entire review and clearance process may take months and may be complicated by the need to clear changes through the leadership of the study, in the case of multicenter trials (see Section 14.6.2 for added details).

Additional reviews and approvals will be needed if the trial involves use of hazardous materials, such as radioactive isotopes, or laboratory animals.
13.1.2

All nonclinical laboratory studies have been or
will be conducted in accordance with the Good
Laboratory Practice regulations of the federal
government, or that reasons why they have not
or cannot be followed will be supplied to the
FDA

• Summary of previous investigations involving the
drug

B. Investigational Device Exemption (Summarized from
reference 189, Appendix I)
• Name and address of sponsor of IDE along with
names and addresses of all other investigators to be
involved in the IDE
• Summary of prior investigations of the device

• Copies of informational material (including information on label
and labeling) about the drug to be supplied to investigators
involved in administering the drug
• Name and qualifications of each investigator to be involved in
proposed studies
• Name and qualifications of personnel responsible for monitoring
progress of proposed studies and for safety monitoring
• Description of the study plan, including details, in the case of
proposed clinical trials, regarding sample size, duration of the
study data collection, methods of treatment, as well as details
concerning the IRB responsible for reviewing the proposed work, and
details regarding informed consent
• Assurances from the IND sponsor that:
The FDA will be notified if the investigation is
discontinued and of the reasons for the action
Each investigator associated with the IND will be
notified if an NDA for the drug is approved, or
if the investigation is discontinued

If the drug is to be sold, an explanation will be supplied to the
FDA as to why sale is required and why sale should not be regarded
as commercialization of the drug

IND and IDE submissions

Most drug trials will require submission of an Investigational New
Drug Application (INDA, also referred to as an IND) to the Food and
Drug Administration (FDA) before they can be started (Food and Drug
Administration, 1981). Table 13-2, Part A, lists general items of
information required for an INDA.
An INDA is required for any drug that is not approved by the FDA for
the indication proposed. The requirement extends to established drugs
that are to be used in ways that depart from prescribed practice, as
indicated in the label insert. For example, the University Group
Diabetes Program (UGDP) needed an INDA for both tolbutamide and
phenformin even though they had been approved by the FDA as
hypoglycemic agents. Even a nonprescription drug requires an INDA if
it is used like a prescription drug. For example, one was required for
aspirin in both the Coronary Drug Project Aspirin Study (CDPA) and
Aspirin Myocardial Infarction Study (AMIS).
The FDA approval process can delay the start of the trial and lead to
alterations in its design. Investigators in the National Cooperative
Gallstone Study (NCGS) were required to carry out

A. Investigational New Drug Application (Summarized
from FDA Form 1571, 10/82, Notice of Claimed Investigational
Exemption for a New Drug)
• Details concerning the drug, including drug name, composition,
source, method of preparation, quality control procedures in
production and packaging

Clinical studies in humans will not be initiated
prior to 30 days after receipt of the Notice of
Claimed Investigational Exemption for a New
Drug by the FDA, unless otherwise indicated
by the FDA

An environmental impact statement will be provided to the FDA, if
so requested

biopsy studies of patients treated with chenodeoxycholic acid before
they were allowed to proceed with a full-scale trial of the drug
(National Cooperative Gallstone Study Group 1981a, 1981b, 1984).
Amendments to the Federal Food, Drug, and Cosmetic Act of 1938,
passed in 1976, extended the regulatory authority of the FDA to
medical devices. A medical device is defined as (Food and Drug
Administration, 1983):

Any instrument, apparatus, implement, machine, contrivance,
implant, in vitro reagent, or other similar or related article,

• Description of the methods, facilities, and controls used for the
manufacture, processing, packaging, storage, and, where appropriate,
installation of the device
• Certification that all investigators have signed an agreement to be
involved in the IDE and that no new investigators will be added
without signed agreements
• Name and address of the chairperson of each IRB associated with the
IDE request
• Details regarding price of the device if it is to be sold and an
explanation of why sale does not constitute commercialization of
the product
• An environmental impact statement when requested
• Details concerning labeling of the device
• Copies of all forms and informational materials to be provided to
patients in relation to the consent process
• Description of the study plan including:

- Statement of purpose
- Study protocol
- Risk analysis

- Description of the device
- Methods for monitoring the investigation (progress as well as
safety), including names and addresses of monitors

including component, part, or accessory, which:
• Is recognized in the official National Formulary, or the United
States Pharmacopeia, or any supplement to them;
• Is intended for use in the diagnosis of disease or other
conditions, or in the cure, mitigation, treatment, or prevention
of disease, in man or other animals; or
• Is intended to affect the structure or any function of the body
of man or other animals; and


Preparatory steps in executing the study plan

• Does not achieve any of its principal intended purposes through
chemical action within or on the body of man or other animals and
which is not dependent upon being metabolized for the achievement
of any of its principal intended purposes.


The definition covers approximately 1,700 devices that range from
blood collection tubes and tongue depressors to heart valve
replacement materials and pacemakers.
The FDA has established three classes of devices, based on the degree
of control deemed necessary for assuring the safety and efficacy of
the device (Food and Drug Administration, 1983). All three classes are
subject to the Good Manufacturing Practices Regulations. In fact, the
only controls required for Class I devices (e.g., capillary blood
collection tubes, tongue depressors, crutches, and arm slings) are via
these regulations. Added assurances for Class II devices (e.g.,
hearing aids, blood pumps, catheters, and hard contact lenses) and
Class III devices (e.g., life-support or life-sustaining devices, such
as pacemakers, intraocular lenses, and heart valve replacements, as
well as devices considered of importance in preventing impairment of
health) are provided via performance standards plus clinical trials
for Class III devices. Permission to carry out trials of Class III
devices is obtained via an Investigational Device Exemption (IDE),
granted by the FDA. Part B of Table 13-2 lists items of information
required in conjunction with an IDE application (Food and Drug
Administration, 1980).
13.1.3 OMB clearance

The Office of Management and Budget (OMB), one of the offices in the
executive branch of the United States government, has the authority to
review and approve data forms used by all branches of the federal
government, including the NIH. Technically, any data form to be
administered or distributed to ten or more people that is produced by
a governmental agency, or by a group under contract to it, requires
OMB clearance, even draft versions of data forms developed simply for
testing purposes. Forms developed under NIH grants are not subject to
the order.
The review can delay the start of data collection, especially if staff
at OMB regard certain forms or items as unnecessary or as an invasion
of a person's privacy. Usually, however, the review and approval
process is not a major stumbling block. In fact, many areas of
clinical investigations are exempt from review, and the reviews that
are required may be completed in short order if the project officer of
the sponsoring agency maintains an effective working relationship with
OMB staff and allows sufficient lead time for clearance.
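The clearance rule described above can be expressed as a small decision function. This is only an illustrative sketch: the function name and the funding labels ("agency", "contract", "grant") are assumptions made for the example, and the sketch ignores the exemptions for certain areas of clinical investigation that the text mentions.

```python
def needs_omb_clearance(respondents: int, funding: str) -> bool:
    """Return True if a data form would technically require OMB clearance.

    Assumption: `funding` is one of "agency", "contract", or "grant".
    These labels are hypothetical and are not taken from any regulation.
    """
    if funding not in {"agency", "contract", "grant"}:
        raise ValueError(f"unknown funding mechanism: {funding!r}")
    if funding == "grant":
        # Forms developed under NIH grants are not subject to the order.
        return False
    # Forms administered or distributed to ten or more people need clearance.
    return respondents >= 10
```

Note that the rule applies to draft forms as well, so a test version distributed to ten or more people under a contract would still return True.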
13.2

13.2.1

unexpected adverse events to the FDA as they occur. There is also a
requirement to provide summaries of study results as the trial
progresses. The latter reporting requirement may be satisfied by
simply supplying the FDA with copies of reports prepared for the
treatment effects monitoring committee (see Chapter 23). Both the CDP
and NCGS satisfied the majority of their FDA reporting requirements in
this way.

APPROVAL MAINTENANCE

13.2.3 Other approvals

IRB

Other approvals granted at the start of the trial, such as for use of
radioactive compounds or controlled substances, will have to be
updated as the trial proceeds. Changes to the data forms may have to
be cleared through OMB if the study is funded via a government
contract. Sponsoring agencies, such as the NIH, will require interim
progress reports to continue funding for the trial.

The approval granted by the IRB prior to the start of the trial and
for each renewal will be for a one-year period, unless otherwise
indicated. The submission accompanying a renewal request should
indicate the nature and extent of progress made since the initial
request or last renewal request, the reasons for continuing the study,
and proposed changes in the study protocol or consent procedures.
Changes must be cleared before they can be implemented. Those that
cannot wait for the annual review will require special reviews.
The IRB may require a synopsis of interim results for renewals of
trials requiring safety monitoring (see Table 22-1 in Chapter 22).
Complying with this request will pose problems in trials in which
clinical investigators are denied access to interim results for
reasons discussed in Chapter 22. The results portion of the renewal
submission will have to be prepared and submitted by nonclinical
personnel in such cases. The boards may be willing to forego looks at
interim results if they are satisfied with the safety monitoring done
in the study, as discussed in Section 13.1.1. They may have no choice
in multicenter trials if clinics are not given access to interim
results. Theoretically, they could still insist on synopses of results
for the clinic in question, but they would be of little value because
of the numbers of patients involved.
Investigators are obligated to report unexpected adverse events as
they occur. Those reports are reviewed as they are received and may
lead to immediate suspension or withdrawal of the approval until or
unless changes mandated by the IRB are made.
13.2.2

13.3 DEVELOPING STUDY
HANDBOOKS AND MANUALS
OF OPERATIONS
Any trial requires two basic sets of documents: one that describes
clinic operations and another that describes the data intake and
processing procedures in the trial. These two sets of documents may
constitute separate sections in the same handbook or manual or may be
contained in separate documents (see Appendix G).
A large multicenter trial may require several other documents in
addition to the two mentioned above. Studies with a central laboratory
will need a document that describes its methods and procedures. Other
resource centers, such as those needed for performing special reading
or coding functions, will also need documents detailing their
practices.
The groundwork needed for production of the required handbooks and
manuals is laid when the trial is planned. The work involved in
writing and maintaining these documents will start shortly after the
trial is funded and continue until it is finished. Table 13-3 contains
a list of suggestions concerning their development and maintenance.
A handbook, as used in this context, is a document that contains a
series of tables, charts, figures, and specification pages that detail
the design and operating features of the trial. A manual, as discussed
herein, is a document that

FDA

The individual (or agency) to whom the INDA
or IDE is granted is required to report

details the methods and procedures of the entire trial or some aspect
of it largely through written narrative and accompanying tables,
charts, and figures. The two kinds of documents serve somewhat
different functions and, hence, are not necessarily interchangeable.
The primary virtue of a handbook lies in its organization and in the
tabular nature of the material presented. It is designed for use as a
ready reference for study personnel. Manuals are designed to document
procedures used in the trial. They are most useful to persons who want
a detailed description of the actual procedures used.
The two kinds of documents may be developed simultaneously or in
sequence, starting with the handbook. The latter approach was used in
the Hypertension Prevention Trial (HPT). Work on the manual of
operations was delayed until the handbook was developed. The
development of the handbook simplified the task of preparing the study
manual of operations. Further, the fact that the trial had been under
way about 9 months when the work started allowed its developers to
reference existing study documents and task-specific manuals, thereby
avoiding the need for inclusion of those details in the main document.

13.4 TESTING THE DATA
COLLECTION PROCEDURES
Three general assurances should be satisfied before data collection is
initiated:

• Essential data collection and patient examination procedures have
been reviewed and approved by the study leadership
• Data forms needed for patient enrollment and for the initial phase
of treatment and follow-up have been tested and are ready for use
• Projected time requirements for developing, testing, reviewing, and
approving data collection procedures and related data forms for use
in the later stages of treatment and follow-up are consistent with
the data collection schedule of the trial
Satisfying the last condition may require a delay in the start of
patient recruitment, even though the initial data intake procedures
have been tested and approved. Once the first patient is enrolled, the
rest of the data collection schedule is lockstep. It is better to
delay the start of patient recruitment than to be forced into
postponing follow-up visits because of the lack of

Table 13-3

• Identify major topics or functions for which handbooks or manuals
are required (e.g., clinic operations, data intake and processing,
laboratory procedures, etc.)
• Develop a draft table of contents for each required handbook or
manual and submit for review and comment by the leadership group of
the trial before development

• Develop methods and procedures for data collection with input from
key study personnel, including clinicians, statisticians, clinic
coordinators, laboratory technicians, and the like


tested on real patients. The “walk-throughs” invariably identify items
or sections that need to be relocated or rewritten to eliminate
confusion or to streamline the way forms are to be completed.
Once these steps have been completed the forms are ready for field
tests involving real patients. The test conditions should be as
similar to those for the actual trial as feasible. The best approach
is one in which the entire set of study procedures is carried out.
However, this may not be possible for procedures that entail risks or
that are justified only in special circumstances.
The forms used should be in near final form, with one or two
exceptions. They should make generous use of open-ended response
categories, such as discussed in Section 12.5.4, in order to collect
information useful in constructing response checklists for the final
versions of the forms. They may also include alternative versions of
the same item in order to determine the preferred wording of the item.
The number of patients used for the test should be large enough and
heterogeneous enough to provide a reliable basis for preparation of
final versions of the forms. The number will depend on available
resources and on the complexity of the data collection scheme
proposed. The penalties for undetected deficiencies are greatest in
trials involving large numbers of patients.
The disposition of data collected in the test run should be settled
before the run is undertaken if study-eligible patients are to be used
in the test. The temptation in such cases is to reserve the option of
adding the test data to the main data file if the number of changes
mandated by the test is “small.” The best approach is to preclude this
option from the outset for several reasons. First, the desire to
preserve the option may reduce the value of the test itself if
investigators limit the changes they are willing to make simply as a
means of maintaining the option. Second, the effort involved in
merging test data into the main file may not be worth the return,
especially if the merger requires a lot of recoding and reprogramming.
Third, the absence of a stated policy can open the trial to criticism
later on if the decision on use of test data appears to have been
motivated by a desire on the part of the investigators to accentuate
or ameliorate the observed treatment effect.
The intelligibility of any material that is read or given to patients
in the trial should receive special scrutiny during the testing
process. Particular attention should be paid to the patient consent
statement and related materials. They

Suggestions for development of study handbooks and manuals of operations

A. General


• Ensure that written material contained in handbooks or manuals is
concise and devoid of complex sentences and esoteric language
• Test the adequacy of each handbook or manual by having it reviewed
by individuals who will be using it
• Release a handbook or manual for use only after it has been reviewed
and approved by the leadership of the study

B. Organization
• Each handbook or manual should have an official name and should be
easily distinguished from all other handbooks or manuals in the
study (e.g., through use of different colored binders)
• The name of the handbook or manual, date of release, version or
edition number, and the name of the individual or group responsible
for its distribution should be indicated on the title page
• Include a detailed table of contents, along with a listing of all
tables and figures in the document
• Include a subject index and glossary
• Chapters in manuals should be divided into numbered subsections; the
accompanying numbers and titles should appear in the table of
contents of the document
• Left-hand page margins should be wide enough to keep text from being
obscured or lost when pages are photocopied or bound (e.g., at least
l%" for standard 8½ x 11" pages assembled in loose-leaf notebooks or
pressure binders)
• Right-hand page margins should be wide enough to allow room for user
notes (e.g., at least [illegible] for standard 8½ x 11" pages). The
same is true of top and bottom margins
• Pages should be typed using high resolution type fonts to allow for
image degradation in photo-reproduction without a serious loss of
legibility
• Boldface type, underlining, or other methods should be used to
identify key phrases, definitions, and important procedural
statements
• Ideally, pages should be numbered sequentially from the beginning to
the end of a document, without regard to chapter or subsection.
Numbering systems that recycle by chapter or section allow page
updates without disrupting the entire numbering system. However,
such systems are not as convenient for users as are continuous
numbering systems
• Placement of page and other identifying information should appear in
a standard location on all pages (preferably upper right-hand
corner) and should not be too near the edge of the page

C. Suggested maintenance aids
• Responsibility for periodic review and revision of a manual should
be assigned to a specific individual or group
• A specific individual should be given responsibility for keeping
track of revisions made to a handbook or manual and for making
certain that all users of the handbook or manual are supplied with
updates as they are produced
• Each new version of a handbook or manual should be identified with a
revision date and should indicate the date and version number of the
document it replaces
• Large documents that are subject to frequent updates should be kept
in loose-leaf binders (facilitates page replacements and simplifies
photo-reproduction of the document)
• Individual pages that are updated and inserted in an existing
version of a document as replacements for outdated pages should
include the revision date in the top or bottom right-hand corner of
the pages

data forms or to use forms that have not been adequately tested.
The construction of the data collection instruments is one of the most
important tasks in the entire study. General rules for item
construction and forms development have already been discussed in
Chapter 12. The paragraphs that follow deal with methods for testing
the data forms.
It is probably fair to say that any item on a data form that can be
misinterpreted will be. Some of the interpretation problems can be
avoided by a careful review of all forms before any field testing is
done. The next review should involve use of the forms on a few
“practice” patients. Ideally, the forms should be completed for
persons as similar to study patients as possible, but friends,
colleagues, or spouses, instructed to behave and respond like
“typical” patients, may suffice for some of the testing.
The entire set of data forms and accompanying procedures should be
submitted to “walk-throughs,” involving the staff who will be
responsible for completing them, before they are


should be tested on sample patients and then
modified where necessary to ensure a clear and
accurate presentation of the trial.

13.5 DEVELOPING AND TESTING
THE DATA MANAGEMENT SYSTEM
Ideally, the development of computer programs needed to inventory
completed data forms, and to edit, store, and retrieve data contained
on those forms, should be started as soon as the forms have been
developed for testing. However, this ideal is rarely achieved in
reality. For one reason, even experienced investigators can
underestimate the time required to develop a functioning data
management system. Inexperienced investigators may not even recognize
the need for one until well into the trial. Other reasons have to do
with time and resource limitations. Of necessity, most of the work in
the initial phase of a trial is devoted to development of the study
protocol, data forms, and the data system. The pressures to complete
these tasks and to get started with patient recruitment make it
difficult to find the time needed to develop a working data management
system. The problem is compounded by the fact that it is not feasible
to develop a working system until data collection procedures for the
trial have been set, something that may not be done until patient
recruitment is ready to start.
It is the responsibility of the data center to make sure that
essential data management routines are available when needed. Basic
routines, such as those needed for randomization, must be available by
the time the first patient is randomized. Others, such as for
inventorying data forms, should be available as soon as forms begin
arriving at the center. The same is true for the editing routine to be
applied to completed forms. Work on programs needed for performance
and safety monitoring should begin soon thereafter (see Chapters 16
and 17).
The decision to start data intake before the data management system is
in place can jeopardize its subsequent development. A good data center
will keep this from happening by insisting on adequate lead time for
its development before the start of data collection.
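As a rough illustration of the two routines named above that must be ready when forms begin to arrive, the following minimal sketch shows a form inventory and a simple range edit. The class, function, and item names are hypothetical and are not drawn from any actual trial; a working data management system would be considerably more elaborate.

```python
from dataclasses import dataclass, field


@dataclass
class FormInventory:
    """Tracks which forms are expected and which have been received."""
    expected: set = field(default_factory=set)  # (patient_id, form_name) pairs
    received: set = field(default_factory=set)

    def log_receipt(self, patient_id: str, form_name: str) -> None:
        """Record arrival of a completed form at the data center."""
        self.received.add((patient_id, form_name))

    def missing(self) -> list:
        """Return expected forms that have not yet arrived, in sorted order."""
        return sorted(self.expected - self.received)


def edit_form(record: dict, rules: dict) -> list:
    """Return a list of edit messages for one completed form.

    `rules` maps an item name to an inclusive (low, high) range. Both the
    rule format and the message wording are assumptions for this sketch.
    """
    problems = []
    for item, (low, high) in rules.items():
        value = record.get(item)
        if value is None:
            problems.append(f"{item}: missing")
        elif not low <= value <= high:
            problems.append(f"{item}: {value} outside {low}-{high}")
    return problems
```

For example, with rules for systolic blood pressure and age, a form reporting a pressure of 400 and no age would yield two edit messages, which could be returned to the clinic for resolution.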

13.6 TRAINING AND
CERTIFICATION
As a minimum, data collection personnel should
be required to work through a sample set of data


collection forms and to familiarize themselves with study procedures
before being allowed to start data collection. Obviously, training
cannot be started until data forms are in final form and needed
documents, such as the study handbook or manual of operations, are
available. This familiarization effort may be followed by workshops
for demonstrating specific procedures and for observing personnel
performing assigned data collection tasks. The training may be part of
a formal certification process in which personnel are required to pass
proficiency tests before they are allowed to start data collection,
for example, as used in the DRS (Diabetic Retinopathy Study Research
Group, 1981). This process should be started well before the projected
start of data collection in order to avoid delays due to certification
failures.
The training and certification processes are an essential part of
quality control. They should be maintained over the course of data
intake. Existing personnel should be required to undergo refresher
training and recertification at intervals over the course of the
trial. New personnel, recruited during the course of the trial, should
be required to go through essential training and certification
procedures before starting data collection in the trial.
The need for training and certification is most apparent in
multicenter trials. Special efforts are required in such cases to make
sure that all clinics are operating under the same ground rules and
that they are adhering to established data collection procedures.
However, the need is not unique to such trials. It extends to
single-center trials as well. The opportunity for variation and
misunderstanding with regard to data practices can be as great as, and
sometimes greater than, in multicenter trials.
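A data center might enforce the certification requirement with a check of the following kind before accepting forms from a staff member. This is a sketch under stated assumptions: the function name is hypothetical, and the 365-day recertification interval is chosen only for illustration, since the text says recertification should occur "at intervals" without fixing their length.

```python
from datetime import date, timedelta
from typing import Optional


def may_collect_data(certified_on: Optional[date], today: date,
                     valid_days: int = 365) -> bool:
    """Return True if the person is certified and certification is current.

    `certified_on` is None for personnel who have never been certified.
    `valid_days` is an assumed recertification interval, not a standard.
    """
    if certified_on is None:
        return False
    return today - certified_on <= timedelta(days=valid_days)
```

New personnel would fail the check until they pass certification, and existing personnel would fail it once their certification lapses.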

13.7 PHASED APPROACH TO DATA
COLLECTION

Once the necessary testing and certification have been completed,
patient enrollment may begin. There is a temptation, once this point
is reached, to proceed as rapidly as possible. However, some initial
restraint is wise, since live study conditions can be expected to
reveal heretofore undetected defects. The larger the number of
patients already enrolled when the defects are discovered, the greater
the costs involved in correcting them.
A phased approach to data collection is especially important in
multicenter trials involving a large number of clinics. Allowing all
clinics to start data collection at the same time can swamp the data
center before staff have had a chance to develop a functional data
system. This problem can be minimized in one of two ways. One way is
to fund only a skeleton set of clinics to begin with. The full
complement of clinics can be recruited and funded once data collection
is under way in the initial set of clinics. This approach was used in
the CDP. The study started with just 5 clinical centers in 1965. A
second set of 29 clinics was added in 1966. A third set of 21 clinics
was added in 1967 to bring the total to 55 (Zukel, 1983).
The other way, when a full complement of clinics is identified from
the outset, is to authorize only one or two clinics to start data
collection. Other clinics are not phased in until essential support
systems have been developed and tested. This approach was used in the
Multiple Risk Factor Intervention Trial, MRFIT (Sherwin et al., 1981).
The sponsoring agency must have the flexibility needed to determine
when funding for data collection is to start in the individual clinics
to make this approach viable.
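The CDP phase-in figures quoted above can be tallied with a few lines of code. The function name is hypothetical; the clinic counts by year are those given in the text.

```python
from itertools import accumulate


def cumulative_clinics(added_by_year: dict) -> dict:
    """Return the running total of active clinics after each phase."""
    years = sorted(added_by_year)
    totals = accumulate(added_by_year[year] for year in years)
    return dict(zip(years, totals))


# Clinics added in each phase of the CDP, as reported in the text.
cdp_added = {1965: 5, 1966: 29, 1967: 21}
cdp_totals = cumulative_clinics(cdp_added)
```

The running totals show how small the initial load on the data center was relative to the final complement of 55 clinics.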

14. Patient recruitment and enrollment

Seek, and ye shall find.

Matthew 7, verse 7

Studies Program (Collins et al., 1980). Unfortunately, it is not easy
to assess the recruitment performance of many of the completed trials
because of the absence of details in published reports concerning the
original recruitment goal and timetable for achieving it.
Investigators may set a number of secondary recruitment goals or
quotas in addition to the main one. Some may relate to the mix of
patients within a clinic (e.g., the number of males versus females).
Others, in the case of the multicenter trials, will relate to the
numbers of patients to be enrolled per clinic. All secondary goals
should be viewed as general guidelines rather than as absolute for
practical reasons. For example, it is more efficient to allow all
clinics in a multicenter trial to recruit to a common cutoff date than
to a set number per clinic. The same is true with regard to goals or
quotas regarding the mix of patients within a clinic. Certain kinds of
patients will be harder to find than others. Insistence on a specified
mix will increase the time needed for patient recruitment.

14.1 Recruitment goals
14.2 Methods of patient recruitment
14.3 Troubleshooting
14.4 The patient shake-down process
14.5 The ethics of recruitment
14.6 Patient consent
14.6.1 General guidelines
14.6.2 The consent process
14.6.3 Documentation of the consent
14.6.4 What constitutes an informed consent?
14.6.5 Maintenance of consents
14.7 Randomization and initiation of treatment
14.8 Zelen consent procedure
Table 14-1 Methods of patient recruitment
Table 14-2 Comments concerning the choice of recruitment methods
Table 14-3 General elements of an informed consent
Table 14-4 Suggested items of information to be imparted in consents
for clinical trials

14.1 RECRUITMENT GOALS

The recruitment goal in fixed sample size designs should be set before
the trial is started. As noted in Chapter 9, it may be based on a
formal calculation or on practical considerations. It serves as a
landmark for gauging progress during patient recruitment when
accompanied by a timetable to indicate when it is to be achieved.
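One simple way to use a goal and timetable as a landmark is to compare actual enrollment with the number expected to date. The sketch below assumes a uniform recruitment rate over the recruitment period, which is an assumption made only for illustration; the function names and example figures are hypothetical.

```python
from datetime import date


def expected_enrollment(goal: int, start: date, end: date, today: date) -> float:
    """Patients expected to date, assuming a uniform recruitment rate."""
    if end <= start:
        raise ValueError("recruitment period must have positive length")
    # Clamp today to the recruitment period before measuring elapsed time.
    elapsed = (min(max(today, start), end) - start).days
    return goal * elapsed / (end - start).days


def recruitment_shortfall(goal: int, start: date, end: date,
                          today: date, enrolled: int) -> float:
    """Expected minus actual enrollment; positive values mean behind schedule."""
    return expected_enrollment(goal, start, end, today) - enrolled


# Hypothetical trial: goal of 1,000 patients over two years, 400 enrolled
# at the midpoint of the recruitment period.
shortfall = recruitment_shortfall(1000, date(1981, 1, 1), date(1983, 1, 1),
                                  date(1982, 1, 1), 400)
```

In this hypothetical case the trial is 100 patients behind schedule at the midpoint, which would be a signal to review the recruitment methods in use.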
It is not uncommon for trials to fall short of their stated goal, even
when the recruitment period is extended well beyond the date
originally set for achieving the goal. The Coronary Artery Surgery
Study (CASS) extended the recruitment time and even then enrolled
fewer patients than originally planned. The same was true for the
Program on the Surgical Control of Hyperlipidemia (POSCH). Their
recruitment experiences are similar to those outlined for trials
carried out as part of the Veterans Cooperative

14.2 METHODS OF PATIENT
RECRUITMENT

Table 14-1 lists methods of patient recruitment. The methods have been
divided into those that rely on direct patient contact and those that
do not. Each method has specific strengths and weaknesses that must be
considered when a choice is made among them (Table 14-2). Any method
of recruitment requires the support of colleagues to succeed. An
investigator should not undertake a trial without this support.
Studies relying on patient referrals can expect to experience
difficulties meeting their recruitment goal if referring physicians
are not in sympathy with the study or if they are reluctant to make
referrals for fear of “losing” their patients to the study. The
National Eye Institute distributed letters to ophthalmologists
announcing the start of the Diabetic Retinopathy Study (DRS),


Table 14-1

Table 14-2

Methods of patient recruitment

Comments concerning the choice of recruitment methods (continued)

Trials using method*

Recruitment method

• Clinic contacts

AMIS, CDP, UGDP

B. Indirect patient contact
Via referring physician

• Screenings

HDFP, MRFIT

• Direct mailings

HPT, LRC

Recruitment method
A. Direct patient contact

Comments

• Required mode of recruitment if study clinic located in tertiary
care facility. May be used as the primary method of recruitment or
as an adjunct to other methods
• Study clinic should be located in an established referral center for
the disease or condition of interest

B. Indirect patient contact


• Referring physicians

AMIS, CASS, CDP, DRS, MPS, UGDP

• Retrospective record reviews

POSCH, UGDP

• Spot radio and TV ads

AMIS. MRFIT

• Patient's primary care must be compatible with study
tenets
• Not a reliable method of recruitment if the disease or
condition is routinely treated by a primary care physi
cian
• Method works best for a disease or condition for which
there is no recognized form of therapy and when the
referring physician has no concern about “losing re
ferred patients
• It may be necessary to augment the referral process by:

•See Glossary for name corresponding to acronym.

s

'■< I

Table 14-2

Comments concerning the choice of recruitment methods

Recruitment method

- Mailing letters to referring physicians to inform
them of the study and of the type of patients
needed
- Journal articles outlining the design and purpose of
the trial
- News articles in the medical or lay press concerning
the trial
- Presentations at medical meetings to acquaint refer
ral physician with the trial

Comments

A. Direct patient contact
i;-.

-■

151

Patient recruitment and enrollment

Via primary care clinic

• Clinic must be large enough to yield the required
number of patients if it is to serve as sole source of
patients
• The study investigator should be responsible for the
primary care clinic or play a major role in its opera
tion

• Fellow colleagues in the clinic must subscribe to the
tenets of the study and be willing to follow the pre
scribed treatment

Via screening

I
Via retrospective record reviews

• Generally, only viable for relatively common diseases or
conditions. Not viable if most patients seen at the
clinic are ineligible for the study

• Not useful if newly diagnosed patients are required, or
where most patients identified by the reviews are
likely to be ineligible for enrollment (e g., because
they have received a form of treatment that disquali
fies them from consideration)

• Method of choice for identification of patients with a
disease or condition that can be diagnosed with a
simple and inexpensive test and that is not routinely
diagnosed via regular patient care channels

• May have to be used:
- When it is impractical or too costly to mount a
screening effort to identify patients

• May be used to supplement other recruitment methods
when the disease or condition of interest is rare (e g., a
certain type of hyperlipemia)
• Study clinic should have facilities to treat identified
patients or must be prepared to refer patients not
suitable for study to appropriate sources for care
Via mailings or telephone calls

• May be preferred method for rare disease or condition,
if routinely diagnosed and noted in clinic records

• Best limited to recruitment for primary prevention trials
or trials focusing on treatment of a disease or condi
tion not presently being treated by the medical com
munity
• Not recommended for recruitment of patients with a
disease or condition routinely diagnosed and treated.
Direct appeals in this case may be viewed as efforts to
“steal" patients
• Method usually used in combination with screening
procedures carried out at the clinic to determine the
eligibility of those who respond to the direct mail or
phone appeal. Screening is essential if a respondent is
not likely to know whether he has the disease or
condition of interest

Via radio or TV spot ads and the
news media

- When there is no risk-free low-cost screening proce
dure available
- If eligible patients are unlikely to be referred to the
study clinic
- If the disease or condition is so rare as to make it
impractical to consider any of the recruitment
methods outlined above
• Usually used as an adjunct to other methods of recruit

ment
• Often used to acquaint members of the lay and medical
community with the trial
_______ ______

*hich outlined the type of patients desired for
iht trial. Care was taken in the letter to note that
patients who were referred for study would re
main under the care of the referring ophthalmol
ogist for their regular eye care.

\ i 4-^

Some trials have used the news media to facilitate patient recruitment. Recruitment publicity may take the form of news stories appearing in area newspapers, may be aired on radio or television, or may consist of paid advertisements aimed at certain types of patients. Some of the clinics in the Multiple Risk Factor Intervention Trial (MRFIT) used spot television ads to inform potential study candidates of the trial. Such direct appeals are only practical in settings where patients can be expected to know they have the disease or condition of interest and are not under treatment for it (see Chapter 24 for further discussion of study information policy issues). The need to have newly diagnosed, untreated patients can be a major stumbling block to recruitment if most of the patients arriving at a clinic are already under treatment. This was one of the difficulties in recruiting patients in the University Group Diabetes Program (UGDP).

Studies may be forced to establish their own screening and referral procedures if existing sources of patients are inadequate. Various trials, such as the Hypertension Detection and Follow-Up Program (HDFP), Coronary Primary Prevention Trial (CPPT) of the Lipid Research Clinics (LRC), and MRFIT, had to develop special screening procedures to find suitable patients. The LRC had to make over 436,000 patient contacts in order to find the 3,810 ultimately enrolled into the CPPT (Lipid Research Clinics Program, 1982). The MRFIT screened over 361,000 to find the 12,866 enrolled in that study (Multiple Risk Factor Intervention Trial Research Group, 1982). The HDFP screened over 158,900 to identify the 10,940 patients enrolled in that trial (Hypertension Detection and Follow-Up Program Cooperative Group, 1979a).
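
The screening figures quoted for the CPPT, MRFIT, and HDFP imply very different yields per contact. The following is a minimal Python sketch of that arithmetic. The counts are the ones given in the text; because each contact count is quoted as "over" a figure, the computed yields are approximate upper bounds. The function name `screening_yield` and the dictionary `TRIALS` are illustrative only and do not come from any of the trials.

```python
# Screening yield for the three trials cited in the text.
# Contact counts are the "over N" figures quoted above, so the
# percent yields computed here are approximate upper bounds.
TRIALS = {
    "CPPT (LRC)": (436_000, 3_810),
    "MRFIT": (361_000, 12_866),
    "HDFP": (158_900, 10_940),
}


def screening_yield(screened: int, enrolled: int) -> tuple[float, float]:
    """Return (percent of contacts enrolled, contacts per enrolled patient)."""
    if screened <= 0 or enrolled <= 0:
        raise ValueError("counts must be positive")
    return 100.0 * enrolled / screened, screened / enrolled


if __name__ == "__main__":
    for name, (screened, enrolled) in TRIALS.items():
        pct, per_patient = screening_yield(screened, enrolled)
        print(f"{name}: {pct:.1f}% enrolled, "
              f"about {per_patient:.0f} contacts per patient")
```

On these figures the CPPT needed roughly 114 contacts per enrolled patient, compared with about 28 for MRFIT and about 15 for HDFP.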

The systematic review of hospital records can offer a useful means of patient identification if the records can be expected to contain the needed information. However, it is not useful if most of the patients are ineligible because of their disease history or treatments received. The review is fairly easy to carry out if it is restricted to the investigator’s own institution, but not if it involves other institutions as well, as in POSCH. That study relied on record searches at several hundred different hospitals. Special personnel were required to negotiate the agreements needed to make the searches (Matts et al., 1980).

14.3 TROUBLESHOOTING

The period of patient intake is crucial in the life of a trial. Special efforts are needed over the entire period to spot and correct problems that impede patient intake. Recruitment performance should be monitored closely by comparing the rate of enrollment with that required to achieve the stated recruitment goal in the time period specified. An extremely low recruitment rate may call for a relaxation of some of the selection criteria or cancellation of the entire study or of support for one or more of the clinics in it. The monitoring process may be facilitated by screening logs. The logs may help to pinpoint reasons for exclusions and, hence, may suggest ways of modifying selection criteria to increase patient yield. They may also help to characterize the ways in which the population enrolled differs from the population screened, as in CASS, information that may be useful when generalizing results of the trial (Coronary Artery Surgery Study Research Group, 1984; see also Question 9 in Chapter 19).
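
The comparison of observed and required enrollment rates described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: enrollment is taken to continue at the average rate seen so far, and the names `RecruitmentStatus` and `recruitment_status`, as well as the example numbers, are hypothetical rather than drawn from any trial in the text.

```python
from dataclasses import dataclass


@dataclass
class RecruitmentStatus:
    required_rate: float    # patients/month needed over the whole intake period
    observed_rate: float    # patients/month actually enrolled so far
    catch_up_rate: float    # patients/month needed from now on to reach the goal
    projected_total: float  # enrollment at end of period if observed rate persists


def recruitment_status(goal: int, total_months: float,
                       enrolled: int, months_elapsed: float) -> RecruitmentStatus:
    """Compare the observed enrollment rate with the rate the goal requires."""
    if not 0 < months_elapsed < total_months:
        raise ValueError("months_elapsed must lie strictly inside the intake period")
    required = goal / total_months
    observed = enrolled / months_elapsed
    remaining_months = total_months - months_elapsed
    catch_up = max(goal - enrolled, 0) / remaining_months
    projected = enrolled + observed * remaining_months
    return RecruitmentStatus(required, observed, catch_up, projected)


if __name__ == "__main__":
    # Hypothetical trial: goal of 600 patients in 24 months, 150 enrolled after 12.
    print(recruitment_status(goal=600, total_months=24, enrolled=150, months_elapsed=12))
```

In the hypothetical example the clinics would have to triple their observed rate (from 12.5 to 37.5 patients per month) to meet the goal on time, which is the kind of shortfall that might prompt a review of selection criteria or of the recruitment period.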

Study leaders should conduct formal visits to clinics for on-site inspections. The first round of visits should be as soon after the start of patient recruitment as possible. Subsequent visits may be carried out at intervals over the life of the trial (see Section 16.8.3). The visits can be helpful in identifying and correcting problems and in bolstering the morale of clinic staff (see Cassel and Ferris, 1984, for discussion of site visiting procedures in the Early Treatment Diabetic Retinopathy Study, ETDRS).

14.4 THE PATIENT SHAKE-DOWN
PROCESS

The process of evaluating a patient for entry into a trial may require several examinations. The longer the evaluation period, the easier it will be to identify uncooperative or otherwise unsuitable patients. Patients who fail to keep appointments or who do not comply with data collection requirements for baseline visits are not likely to become more compliant after enrollment.

Some drug trials (e.g., the CDP; Coronary Drug Project Research Group, 1973a) require use of a single-masked placebo during the prerandomization evaluation period to help identify noncompliant patients (see Question 37, Chapter 19). No medication, not even a placebo, should be given without explanation. Of necessity, the explanation must be less than forthright if clinic staff are to conceal its nature in the case of single-masked placebos. The evasive nature of the explanation required can strain the patient-physician relationship at a crucial point in the enrollment process.

14.5 THE ETHICS OF
RECRUITMENT

The methods used for recruitment should be devoid of any procedures that may be construed as coercive. Cash payments as inducements for enrollment or for patients to continue in a trial should be used with caution, especially if the trial involves risks. They may be necessary in trials involving healthy volunteers who will not realize any direct benefit from the trials, but not in trials involving treatment of some health condition. In those cases, the benefits derived from the care provided should serve as a sufficient inducement for enrollment.

The recruitment process should not involve any restrictions on the demographic, social, or ethnic characteristics of the patient population, except those needed for scientific reasons (e.g., restriction of age to allow concentration on a high-risk group of patients, or restriction to the sex group with the preponderance of the disease) or for practical or ethical reasons (e.g., exclusion of non-English-speaking patients because of concern regarding adequacy of the informed consent process). However, this is not to say that the study may not end up with a preponderance of one sex or ethnic group, or with patients largely from the same social class. The composition will depend on patient sources available to clinics.

The recruitment procedures used in a trial may come under scrutiny long after enrollment has been completed. The Tuskegee Syphilis Study is a case in point (Schuman et al., 1955; Tuskegee Syphilis Study Ad Hoc Advisory Panel, 1973; Vonderlehr et al., 1936). Critics of the study have suggested that the concentration on poor, uneducated blacks led to a climate of complacency in the way it was run (Brandt, 1978; Jones, 1981; Rothman, 1982).

14.6 PATIENT CONSENT¹

14.6.1 General guidelines

It is unethical to carry out any experiment that entails risks to humans without their voluntary consent. The Nuremberg Code² and all codes since then have been explicit on the need for voluntary consent (Levine and Lebacqz, 1979; Levine, 1981). However, relatively little attention was devoted to the actual consent process in medical research until the Surgeon General of the United States Public Health Service (USPHS) addressed the issue in a memo (dated February 8, 1966) to heads of institutions conducting research under Public Health Service grants. The memo ultimately led to detailed regulations, including the creation of institutional review boards (IRBs), as a means of ensuring adherence to ethical practices in the design and conduct of research on humans. Table 14-3 provides a summary of the pertinent points concerning the consent process, as contained in the most recent set of regulations. The regulations read in part:

Except as provided elsewhere in this or other subparts, no investigator may involve a human being as a subject in research covered by these regulations unless the investigator has obtained the legally effective informed consent of the subject or the subject’s legally authorized representative. An investigator shall seek such consent only under circumstances that provide the prospective subject or the representative sufficient opportunity to consider whether or not to participate and that minimize the possibility of coercion or undue influence. The information that is given to the subject or the representative shall be in language understandable to the subject or the representative. No informed consent, whether oral or written, may include any exculpatory language through which the subject or the representative is made to waive or appear to waive any of the subject's legal rights, or releases or appears to release the investigator, the sponsor, the institution or its agents from liability for negligence (Office for Protection from Research Risks, p. 9, 1983).

¹ See Section 13.1.1 for additional comments.
² The code was an outgrowth of the war crimes trials in Nuremberg following World War II. The code is reproduced in Levine, 1981.

The requirement for consent, when first introduced, led to fear that it would make recruitment of patients for studies impossible. This fear has not been justified, although the burden imposed by the regulations is unfair in one regard. An investigator is required to make certain that a patient about to enter a trial understands the nature of the risks and benefits that may accrue from the treatments to be offered. Yet that same patient, when seen by his regular physician, may be offered similar treatments without any discussion of their risks or benefits (Chalmers, 1982a).

Table 14-3 General elements of an informed consent

• A statement that the study involves research, an explanation of the research and the expected duration of the subject's participation, a description of the procedures to be followed, and identification of any procedures that are experimental
• A description of any reasonably foreseeable risks or discomforts to the subject
• A description of any benefits to the subject or to others that may reasonably be expected from the research
• A disclosure of appropriate alternative procedures or courses of treatment, if any, that might be advantageous to the subject
• A statement concerning the extent, if any, to which confidentiality of records identifying the subject will be maintained
• For research involving more than minimal risk, an explanation as to whether any compensation or medical treatments are available if injury occurs and, if so, what they consist of, or where further information may be obtained
• An explanation of whom to contact for answers to pertinent questions about the research and research subjects' rights, and whom to contact in the event of research-related injury to the subject
• A statement that participation is voluntary, refusal to participate will involve no penalty or loss of benefits to which the subject is otherwise entitled, and the subject may discontinue participation at any time without penalty or loss of benefits to which the subject is otherwise entitled
• When appropriate, one or more of the following elements of information shall also be provided to each subject:
  - A statement that the particular treatment or procedure may involve risks to the subject (or to the embryo or fetus, if the subject is or may become pregnant) that are currently unforeseeable
  - Anticipated circumstances under which the subject’s participation may be terminated by the investigator without regard to the subject’s consent
  - Any additional costs to the subject that may result from participation in the research
  - The consequences of a subject’s decision to withdraw from the research and procedures for orderly termination of participation by the subject
  - A statement that significant new findings developed during the course of the research that may relate to the subject’s willingness to continue participation will be provided to the subject
  - The approximate number of subjects involved in the study

Source: Reference citation 365.

14.6.2 The consent process

Table 14-4 provides a list (prepared by the author) of items that should be covered in the consent process. It differs from the list in Table 14-3 in that it is specific to the area of clinical trials. Appendix E contains sample consent statements from three of the trials sketched in Appendix B.

The consent process, to be valid, must be based on factual information presented in an intelligible fashion and in a setting in which the patient, or his guardian, is able to make a free choice, without fear of reprisal or prejudicial treatment. Meeting these conditions may be impossible in cases where the patient is highly vulnerable, either because of his medical condition or physical surroundings. Extra precautions are needed whenever minors, mental patients, or prisoners are approached. The class action suit for damages brought against investigators at the University of Maryland on behalf of Maryland state prisoners had to deal with questions concerning the nature of free consents obtained in prison settings (United States District Court for the District of Maryland, 1979). No damages were awarded, but the suit took years to complete.

Reservations concerning the adequacy of the consent process in institutionalized populations have all but eliminated these populations as patient sources for research studies. They have also tended to discourage trials in children. The latter trend is unfortunate. Some trials must be done in children to obtain information pertinent to their illnesses or treatments.

The consent process must be completed before the treatment assignment is issued (except with the method proposed by Zelen; see Section 14.8). No patient should be randomized who expresses a reluctance or unwillingness to accept whatever treatment is assigned. The process should include an explicit statement regarding a patient’s right to withdraw from the trial at any time after randomization. The statement may be balanced with a discussion of the effect withdrawals have on the trial and the responsibility a patient has, within limits, to continue in the trial if he decides to enroll (Levine and Lebacqz, 1979).

It is best to avoid exact time specifications regarding the anticipated length of follow-up in long-term trials. The time, even if seemingly fixed at the outset, may have to be extended later for reasons unanticipated at the outset. Similarly, promises as to when the study treatment will be offered to patients assigned to the control treatment should be avoided if there is any chance of having to renege on them later, as was the case in the NCGS (National Cooperative Gallstone Study Group, 1981a).

Table 14-4 Suggested items of information to be imparted in consents for clinical trials

General descriptive and design information
• Description of the disease or condition being studied and how the patient qualifies for the study
• Type of patients being studied and the number to be enrolled
• Anticipated length of follow-up
• Description of data collection schedule and procedures

Treatment information
• List of the treatments to be studied and rationale for their choice
• Treatment alternatives available outside the study
• Nature of the control treatment
• Method of treatment administration
• Method of assigning patients to treatment
• Level of treatment masking
• Nature of information regarding treatment results that will be made available to patients during and at the conclusion of the trial

Risk-benefit information
• Description of the risks and benefits that may accrue to a patient from participation in the trial
• Enumeration of the potential risks and benefits associated with the study treatments, as well as an enumeration of common side effects
• Description of any special procedures that will be performed, including an enumeration of the risks and benefits associated with those procedures, and the time points at which they are to be performed

Patient responsibilities and safeguards
• Outline of responsibilities of patients enrolled in the trial, including discussion of the importance of continued follow-up
• Outline of what is expected of the patient in following the examination schedule and in carrying out special procedures between visits
• Outline of safeguards to prevent continued exposure of a patient to a harmful study treatment or denial of a beneficial one
• Outline of safeguards for protecting a patient's right to privacy and confidentiality of information
• Indication of a patient's right to withdraw from the trial at any time after enrollment without penalty or loss of benefits to which he is otherwise entitled
• Statement of the policy of the investigator’s institution on compensation for, or treatment of, study-related injuries
• Statement of the patient’s right to have questions answered regarding the trial and indication of items of information that will not be disclosed (e.g., the treatment assignment in a double-masked trial)
• Statement of the length of time personal identifiers will be retained after the close of the trial, where such information will be retained, and the reasons for keeping it (e.g., for use in contacting or recalling the patient after close of the trial). Statement should also indicate ways in which the information may be used (e.g., to access the National Death Index or other information sources for determining mortality status after the close of the trial)

Most clinical trials involve the collection and storage of personal information, such as name and address, on study patients (see Section 15.3 for uses of the information in tracing patients). Some investigators engaged in epidemiological studies have indicated the exact date at which such information will be purged from patient files. The commitment is unwise in long-term clinical trials for two reasons. First, it may be impossible to meet because of unexpected delays in the conduct of the trial. Second, and more important, there may be a need to contact patients after the trial is completed, especially if any of the study treatments appear to be producing late and unexpected adverse effects.

The mechanics of obtaining the informed consent must be individualized to the population to be studied. Information may be presented in various ways so long as there is adequate opportunity for a patient (or his guardian) to have all questions regarding the study answered before he is asked to make a decision on enrollment. Hard sells are to be avoided. First, because they represent subtle forms of coercion. Second, because they can lead to enrollment of uncooperative patients.

Whenever feasible, it is wise to carry out the consent process in two stages with a time separation of a day or more between the first and second stages. Many trials lend themselves to this approach, especially those that require multiple visits to establish a patient’s eligibility for enrollment. Exceptions are cases in which treatment must be started on the spot.

The first stage should be designed to acquaint the patient with the study and its requirements. It should involve a conversation with the patient in a setting that is conducive to a two-way exchange. The information imparted should be supplemented with written material, including a copy of the consent statement for the patient to take home to review at his leisure. The second stage should be used to answer questions raised by the patient and to review what would be required of him if he agrees to enroll. The consent statement should be signed at the end of this stage.

Both stages should allow ample opportunity for the patient to question clinic personnel regarding the study and his role in it. A patient should not be asked to sign the consent statement if he has any doubts about enrolling or if the clinic staff believes he does not understand what his participation would involve. The patient should be asked to reaffirm his willingness to accept whatever treatment is assigned before he signs the statement.

The time point at which the consent process is initiated is important. If it is initiated too early in the recruitment process, a good deal of time may be wasted explaining the trial to individuals who are subsequently found to be ineligible for enrollment on medical grounds. However, delaying the start of the process until the eligibility assessment is complete may not allow enough time for an orderly two-stage consent, especially if there is any urgency to start treatment once eligibility has been established.

The treatment assignment should be issued on the same day the consent is signed. The treatment should be initiated as soon thereafter as feasible, preferably on the day of assignment. A large time gap between consent and initiation of treatment will tend to increase the patient's anxiety regarding treatment and may increase the chance of his withdrawing before treatment is started.

The consent statement used in multicenter trials should be standardized to the extent possible. Some variation in language may be unavoidable because of local IRB wording requirements. However, the amount can be minimized by providing clinics with a prototype statement that covers the items listed in Table 14-4. Individual clinics may not reduce or abridge information contained in the statement, but may add to it if required to do so by local IRBs.

14.6.3 Documentation of the consent

Federal regulations require that:

Informed consent shall be documented by use of a written consent form approved by the IRB and signed by the subject or the subject's legally authorized representative. A copy shall be given to the person signing the form (Office for the Protection from Research Risks, p. 10, 1983).

The IRB must approve the consent statement and will want to review all information (written as well as verbal) presented to patients in conjunction with the consent process. The statement presented for signature may contain a written description of all pertinent information needed in the consent process, or may refer to materials presented orally or in an accompanying document, such as in a patient information booklet. The patient’s signature should be witnessed by a third party, regardless of how the presentation is made. The patient should be given a copy of the consent form after it has been signed. The original should be kept in the patient’s file.

The responsibility for obtaining informed consent goes beyond the simple mechanics of presenting and signing documents. It is the responsibility of all those connected with the study to ensure that the process is carried out in a responsible manner. This responsibility extends beyond the clinics in multicenter trials. The approved statements should be collected by the coordinating center for review and storage. The review should be done by the study leadership and should be aimed at making certain that the statements meet study standards. In addition, the center should set up procedures to withhold treatment assignments until signed consents have been obtained.
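
One way a coordinating center might enforce the rule of withholding assignments until signed consents are on file is a simple check at the point of release. The Python sketch below is an illustration only, not a description of any trial's actual system; the names `PatientRecord`, `AssignmentWithheld`, and `release_assignment` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientRecord:
    patient_id: str
    eligible: bool
    # Date on the signed consent form, if the center has received one.
    consent_signed_on: Optional[date] = None


class AssignmentWithheld(Exception):
    """Raised when a treatment assignment may not yet be released."""


def release_assignment(record: PatientRecord, assignment: str, today: date) -> str:
    """Return the assignment only if eligibility and signed consent are on file."""
    if not record.eligible:
        raise AssignmentWithheld(f"{record.patient_id}: eligibility not established")
    if record.consent_signed_on is None:
        raise AssignmentWithheld(f"{record.patient_id}: no signed consent on file")
    if record.consent_signed_on > today:
        raise AssignmentWithheld(f"{record.patient_id}: consent date is in the future")
    return assignment
```

The check is deliberately placed in the release step, so that a clinic cannot learn an assignment for a patient whose consent has not been documented.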

Clinic site visits (see Section 16.8.3) should include checks on the consent process. This can be done via a walk-through for a hypothetical patient or by witnessing the process being carried out with an actual patient. The visiting team may also talk to patients who have gone through the process to learn what they know about the trial. The Beta Blocker Heart Attack Trial (BHAT) assessed the quality of the consent process by interviewing a sample of patients (Howard et al., 1981).

14.6.4 What constitutes an informed
consent?
The question of what constitutes an informed consent is complex. It depends on the information to be conveyed and on how it is perceived by the patient. The formal nature of the doctor-patient relationship, coupled with the patient's anxieties regarding his condition, can be major blocks to meaningful communication. Studies of the consent process suggest that patients may fail to comprehend much of what they are told (Howard et al., 1981).

Consent materials should be simply written. It is important for design concepts, such as randomization, placebos, and masking, to be explained in lay terms. Some investigators have chosen to exclude patients who do not comprehend fundamental aspects of the study design. The Hypertension Prevention Trial (HPT) required patients to correctly answer a series of questions on the trial before they could be enrolled. A vaccine study research group at the University of Maryland requires volunteers to pass a test on the trial prior to enrollment (Levine, 1976; Woodward, 1979).

The failure to cover important items of information in the consent statement can cause a dilemma later on. A case in point is the failure to specify the nature of follow-up that will be carried out on individuals who drop out after enrollment. It is common in a long-term trial to employ special procedures to obtain up-to-date mortality data on all study patients, including dropouts, at the time of final data analysis (see Chapter 15). Normally, these procedures are carried out unobtrusively. Nevertheless, the preferred approach is to make the patient aware of the ways in which his personal identifying information may be used for tracing and mortality follow-up before he is enrolled. A patient who is uncomfortable with what is proposed should not be enrolled.

14.6.5 Maintenance of consents

Consents given at the time of enrollment may have to be updated to remain valid. Patients should be informed of any decision or action that is likely to affect their willingness to continue in the trial, such as a decision to stop a study treatment in another group of patients because of an adverse effect or to add a data collection procedure that is inconvenient, uncomfortable, or risky. The CDP informed all study patients of the decision to terminate use of the high-dose estrogen treatment, even though less than one-quarter of them were on that treatment.

Changes in the federal regulations regarding the informed consent process during a trial may require addendums to the consents. For example, investigators in the UGDP were required to obtain signed consent statements from patients after recruitment had been completed. Undocumented oral consents, obtained at the start of the trial, were not considered sufficient after the February 8, 1966, memo from the Surgeon General of the USPHS. More recently, addendums have been required to inform patients of local policy on compensation for and care of study-related injuries.

14.7 RANDOMIZATION AND
INITIATION OF TREATMENT
Patients judged eligible and who are willing to participate in the trial are ready for enrollment. The point at which the treatment assignment is disclosed to the treating physician should be used to mark formal entry of a patient into the trial. Once enrolled, a patient should be counted as part of the study population (see Chapter 18).

The randomization procedure should be set up to make certain that assignments remain masked until they are needed for initiation of treatment (see Chapters 8 and 10). As already noted in Section 14.6.2, treatment should be started as soon after enrollment as practical, ideally on the day of randomization.

14.8 ZELEN CONSENT PROCEDURE
The usual approach is to obtain a patient’s con
sent before he is randomized. The sequence is
reversed in a modification proposed by Zelen
(1979). In that method, eligible patients are ran
domized before consent is obtained. Those as
signed to the control (standard) treatment are
given that treatment without discussion of the
alternative treatment(s) under evaluation. Only
patients assigned to the test treatment(s) are
given an opportunity to refuse the treatment
assignment. Patients who refuse are given the
control treatment.
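
The sequence just described can be sketched in a few lines of Python. This is only an illustration of the order of events (randomize first, seek consent only in the test arm, give refusers the control treatment). The names `ZelenResult`, `zelen_enroll`, and `p_test`, and the use of a simple coin-flip allocation, are assumptions made for the sketch and are not taken from Zelen (1979).

```python
import random
from dataclasses import dataclass


@dataclass
class ZelenResult:
    assigned: str         # arm drawn at randomization ("test" or "control")
    consent_sought: bool  # True only for patients assigned to the test arm
    received: str         # treatment actually given


def zelen_enroll(accepts_test: bool, rng: random.Random,
                 p_test: float = 0.5) -> ZelenResult:
    """Randomize an eligible patient first, then seek consent if needed."""
    assigned = "test" if rng.random() < p_test else "control"
    if assigned == "control":
        # Control patients get the standard treatment without discussion
        # of the alternative treatment(s) under evaluation.
        return ZelenResult("control", False, "control")
    # Only test-arm patients may refuse; refusers get the control treatment.
    received = "test" if accepts_test else "control"
    return ZelenResult("test", True, received)
```

In this sketch a refuser keeps the original assignment in `assigned` while `received` records the control treatment, so the size of the gap between the two fields shows how many refusals occurred after randomization.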
The appeal of the approach lies in the fact that
only patients assigned to test treatments are pre
sented with information on treatment alterna
tives. The others are spared the anxiety that may
be aroused by such discussions. In practice,
however, most IRBs are reluctant to accept
the approach, except under very special circum
stances (such as in a trial involving a high-risk
treatment on patients with a poor prognosis for
life), and then only where cogent arguments can
be made in its favor.
The approach has a number of limitations. Of
necessity, it is limited to unmasked trials since
the treating physician must know the assignment
to identify patients with whom choices are to be

discussed. In addition, refusals after randomization,
if sizable, will make it difficult to reach any
conclusion from the trial. Further, the procedure
can lead to subtle forms of coercion. Patients
assigned to the test treatment may be coaxed by
study personnel to accept the assignment simply

as a means of avoiding the data analysis and
interpretation problems that can arise if there
are a lot of treatment refusals. Finally, the
method is unfair in that only patients assigned to
test treatments are allowed a choice.

15. Patient follow-up, close-out, and post-trial follow-up

There are only two classes of mankind in the world—doctors and patients. . . you
[doctors] have been, and always will be exposed to the contempt of the gifted amateur—
the gentleman who knows by intuition everything that it has taken you years to learn.
Rudyard Kipling


15.1 Introduction
15.2 Maintenance of investigator and patient interest during follow-up
15.2.1 Investigator interest
15.2.2 Patient interest
15.3 Losses to follow-up
15.4 Close-out of patient follow-up
15.5 Termination stage
15.6 Post-trial patient follow-up
Table 15-1 Aids for maintaining investigator interest
Table 15-2 Factors and approaches that enhance patient interest and participation
Table 15-3 Methods for relocating dropouts
Table 15-4 Data items that may be used in searches of the National Death Index
Table 15-5 Study close-out considerations
Table 15-6 Activities in the termination stage
Figure 15-1 Lifetable cumulative dropout rates for the clofibrate, niacin, and placebo treatments in the CDP

15.1 INTRODUCTION
Before approaching the subject matter of this
chapter it is necessary to provide working definitions
of three different processes. They are:

Patient follow-up
A process involving periodic contact with the
patient after enrollment into the trial for the
purpose of administering the assigned treatment,
observing the effects of treatment, modifying
the course of treatment, and collecting
data to evaluate the treatment.

Patient close-out
A process carried out to separate a patient
from the trial, involving cessation of treatment
and termination of regular follow-up.

Patient post-trial follow-up
A process that involves patient follow-up after
completion of the close-out stage of the trial
and that is designed to yield information on
the primary or a secondary outcome measure.

This chapter deals with the steps involved in
carrying out these three processes (see also Appendix D).

15.2 MAINTENANCE OF INVESTIGATOR
AND PATIENT INTEREST
DURING FOLLOW-UP
The follow-up process requires a dedicated and
committed staff to schedule and carry out the
required examinations and a willing patient population.
Both are needed if the trial is to succeed.

15.2.1 Investigator interest
Investigator commitment to the trial and interest
in its activities must be high throughout if it is to
succeed. Interest will be easy to maintain in a
short-term trial, where the initial enthusiasm
that usually accompanies the start of any new
activity is enough to carry it through to completion.
However, even in such cases spirits can sag
before the data analyses are done and the final
paper has been written. They can sag long before
that point in long-term trials. Table 15-1 lists
some of the aids that can be used to maintain
investigator interest. The list is written with
long-term multicenter trials in mind. However,
morale problems are not unique to multicenter
trials. They can be just as great in single-center
trials.
Periodic meetings of study personnel are essential
in maintaining a cohesive investigative
group. They are needed before the start of the
trial to outline the treatment and data collection
procedures for the trial, and they are an essential
part of the quality assurance process once it is
under way. Meetings should include clinic coordinators,
technicians, and other support staff
important to the trial, as well as senior personnel.

Table 15-1 Aids for maintaining investigator interest
• Periodic meetings of all study personnel
• Distribution of periodic progress reports on patient recruitment
  and follow-up, data collection, and other
  performance characteristics of the trial for review by
  all members of the investigative group
• Periodic newsletters distributed to study personnel designed
  to inform them of study progress, protocol
  changes, and so forth
• Investigator participation in the analysis of results and
  in writing or presenting papers concerning the trial
• Preparation of reports and papers during the course of
  the trial summarizing the design, organizational, and
  operating features of the trial
• Execution of ancillary studies
• Certificates of appreciation from the sponsor, and
  signed by key study leaders, to staff reaching important
  milestones (e.g., their five-year anniversary with
  the study)

The long-term multicenter trial will require a
variety of other ways to maintain investigator
interest. The chance for investigators to engage
in ancillary studies (see Glossary for definition
and Section 22.7.3 for discussion of manage
ment issues related to such studies) can help
maintain their interest and general commitment
to the trial. The opportunity to carry out anal
yses on data collected during the trial can also
help morale. In reality, the opportunities for
such analyses may be limited in settings in which
there is a desire to mask clinic staff to treatment
results, as discussed in Chapter 22. However,
this policy does not preclude access to data unre
lated to treatment outcome. The Coronary Drug
Project (CDP) allowed access to baseline data
for all the treatment groups as well as follow-up
data for the placebo-treated group of patients.
The follow-up data were used to generate several
papers on the natural history of coronary heart
disease (see Table B-3 of Appendix B for list).
Access to adherence or process measures by
treatment group is also acceptable. Staff in
the Multiple Risk Factor Intervention Trial
(MRFIT) were provided with data indicating the
level of risk reduction achieved as the study
progressed. These summaries included data on
clinic performance in terms of achieving stated
treatment goals and were used by study leaders
to assess the intervention procedures.

15.2.2 Patient interest
A patient’s interest in the trial and willingness to
continue in it can be expected to diminish with
time. The longer the period of follow-up, the
greater the need for measures to counteract waning
interest and participation levels. Table 15-2
lists factors and approaches that can help sustain
patient interest in the trial. However, by all odds
the most important factor is the attitude of clinic
staff. Uninterested or discourteous staff will lead
to an uninterested patient population.

Table 15-2 Factors and approaches that enhance patient interest and participation
• Clinic staff who treat patients with courtesy and dignity
  and who take an interest in meeting their needs
• Clinic located in pleasant physical surroundings and in a
  secure environment
• Convenient access to parking for patients who drive,
  and to other modes of transportation for those who
  do not
• Payment of parking and travel fees incurred by study
  patients
• Payment of clinic registration fees and costs for procedures
  required in the trial
• Special clinics in which patients are able to avoid the
  confusion and turmoil of a regular out-patient clinic
• Scheduled appointments designed to minimize waiting
  time
• Clinic hours designed for patient convenience
• Written or telephone contacts between clinic visits
• Remembering patients on special occasions, such as
  Christmas, birthday anniversaries, etc.
• Establishment of identity with the study through proper
  indoctrination and explanation of study procedures
  during the enrollment process; through procedures
  such as use of special ID cards to identify the patient
  as a participant in the study, and by awarding certificates
  to recognize their contribution to the trial

15.3 LOSSES TO FOLLOW-UP
A loss to follow-up occurs whenever an item of
information required as part of a scheduled fol
low-up examination is not obtained in the per
missible time window (see Glossary for defini
tion). The loss may be due to:
• Failure of the clinic staff to complete an item
on an otherwise properly completed data
form
• Failure of the patient to agree to certain proce
dures during an examination
• Failure of the patient to return to the clinic
for an examination within the time window
specified for it
Losses due to missed examinations or to exami
nations that are not done within the specified
time window, and hence are counted as missed,
are more worrisome than the losses resulting
from failure to complete specific items or proce
dures during an examination. Further, an occa
sional missed examination for a patient has dif
ferent implications than does a sequence of
missed examinations. The longer the sequence,
the greater the uncertainty regarding the out
come status of the patient.
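
The distinction drawn above can be made concrete with a small sketch. It counts an examination as done only if it falls inside the permissible time window and reports the longest run of consecutive missed examinations. The function names and the symmetric window (a fixed number of days on either side of the scheduled date) are assumptions made for illustration; a given trial may define its windows differently.

```python
from datetime import date
from typing import List, Optional


def exam_status(scheduled: date, performed: Optional[date],
                half_window_days: int) -> str:
    """Return "done" if the exam fell inside the window, else "missed"."""
    if performed is None:
        return "missed"
    if abs((performed - scheduled).days) <= half_window_days:
        return "done"
    # Performed, but outside the window: counted as missed.
    return "missed"


def longest_missed_run(statuses: List[str]) -> int:
    """Length of the longest sequence of consecutive missed examinations."""
    longest = current = 0
    for status in statuses:
        current = current + 1 if status == "missed" else 0
        longest = max(longest, current)
    return longest
```

A monitoring report might list patients in descending order of `longest_missed_run`, since the longer the sequence, the greater the uncertainty about outcome status.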
Patients who are no longer able or willing to
return to the clinic for scheduled follow-up examinations
are dropouts. The declaration may
be made by the patient (e.g., by announcing an
intent to leave the study because of a lack of
interest or because of a forthcoming move to
another city) or by clinic staff. The latter will be
the case with a patient who disappears or who
does not, for whatever reason, keep his scheduled
appointments. However, a clinic declaration
should not be made until (1) clinic staff have
made a concerted effort to locate the patient if
he has disappeared, and to try to convince him
to return to the clinic for a follow-up examination;
and (2) the patient has missed a specified
number of follow-up clinic visits. The date of the
patient’s last completed follow-up examination
should be used as the date of dropout. The pa
tient should remain classified as a dropout until
or unless he returns to the clinic for a follow-up
examination.
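
The declaration rule above lends itself to a short sketch. The code below treats a patient as a dropout if he has declared so himself, or if clinic staff have made their location effort and the most recent consecutive missed visits reach a specified number. The date of dropout is the date of the last completed examination. All names (`Visit`, `classify`, `missed_needed`, and so on) are hypothetical, and the number of missed visits required is left as a parameter because the text does not fix it.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Visit:
    exam_date: date   # scheduled (or actual) date of the follow-up exam
    completed: bool   # True if the exam was completed within its window


def classify(visits: List[Visit], location_effort_made: bool,
             missed_needed: int,
             patient_declared: bool = False) -> Tuple[str, Optional[date]]:
    """Return ("active", None) or ("dropout", date of last completed exam)."""
    # Count missed visits since the last completed one; a returning
    # patient resets this count and so reverts to active status.
    trailing_missed = 0
    for visit in reversed(visits):
        if visit.completed:
            break
        trailing_missed += 1
    clinic_declares = location_effort_made and trailing_missed >= missed_needed
    if not (patient_declared or clinic_declares):
        return ("active", None)
    completed = [v.exam_date for v in visits if v.completed]
    # The date is None if no follow-up exam was ever completed.
    return ("dropout", max(completed) if completed else None)
```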
Patients who are classified as dropouts may or
may not be lost to follow-up for the outcome of
interest. They are when the diagnosis or measurement
of the primary outcome can only be
done at follow-up examinations performed in
study clinics. They are not when it can be done
outside study clinics (e.g., as in trials with death
as the primary outcome). Similarly, conversion
of a patient from active to dropout status may or
may not affect his treatment compliance (see
Glossary for definition). It will not if the conversion
occurs after treatment has been completed
and if the treatment cannot be reversed or nullified.
It will be tantamount to creating a state of
noncompliance if the conversion requires termination
of an ongoing treatment process (e.g., as
in most chronic drug treatment trials).


The willingness of a patient to remain under
active follow-up will depend on a variety of fac
tors, including:
• The amount of time and inconvenience in
volved in making follow-up visits to the
clinic
• The perceived importance of the procedures
performed at follow-up visits from a health
maintenance point of view
• The potential health benefits associated with
treatment versus potential risks
• The amount of trauma and discomfort pro
duced by the study treatment or procedures
performed
• The number and type of side effects asso
ciated with treatment
The dropout rate may well change over the
course of follow-up, as illustrated in Figure 15-1
for the three treatments continued to the end of
the CDP. The rate declined with time, but only
slightly. The niacin treatment group had the
highest 5-year rate. It was also the group that
had the largest number of patients with treat
ment-related complaints (Coronary Drug Proj
ect Research Group, 1975).
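
For readers unfamiliar with lifetable rates of the kind plotted in Figure 15-1, the following sketch shows one common way of computing them. It assumes the standard actuarial method, in which the dropout rate for interval i is q_i = d_i / (n_i - w_i/2), with n_i patients under follow-up at the start of the interval, d_i dropouts, and w_i patients withdrawn for other reasons (e.g., death or end of follow-up), and the cumulative rate through interval k is 1 minus the product of (1 - q_i) for i up to k. The text does not state which method the CDP used, and the function name and input layout are hypothetical.

```python
from typing import List, Tuple


def cumulative_dropout(intervals: List[Tuple[int, int, int]]) -> List[float]:
    """Actuarial cumulative dropout rate at the end of each interval.

    Each tuple is (n_at_start, dropouts, withdrawn) for one interval,
    e.g., one year of follow-up.
    """
    still_active = 1.0  # estimated proportion not yet dropped out
    rates = []
    for n_at_start, dropouts, withdrawn in intervals:
        # Withdrawn patients are assumed at risk for half the interval.
        at_risk = n_at_start - withdrawn / 2.0
        q = dropouts / at_risk if at_risk > 0 else 0.0
        still_active *= 1.0 - q
        rates.append(1.0 - still_active)
    return rates
```

The cumulative rate is not the simple sum of the interval rates: with a rate of 0.10 in each of two successive years the cumulative figure is 0.19, not 0.20, because second-year dropouts can come only from patients still active at the end of the first year.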
The procedures carried out in conjunction
with follow-up examinations may influence drop
out patterns. For example, a spurt in dropouts
may occur just before an examination involving
a noxious procedure. Similarly, there may be a
peak after patients pass a specified time point in
the trial, especially if they perceive that their
time commitments to the trial are satisfied.
A certain number of dropouts in long-term
trials will occur simply because of patient reloca
tions. Such losses can be reduced in multicenter
trials by transfer of follow-up responsibilities to
sister clinics. The CDP was able to maintain the
clinic visit schedule for several patients in this
way (Coronary Drug Project Research Group,
1973a).
Dropouts should be contacted at periodic in
tervals. The contacts may be made via home
visits, telephone, or mail and should be made
even if they cannot be used to collect outcome
data since they may be useful in persuading pa
tients to return to active follow-up.
Patients who cannot be contacted should be
traced so that contact may be re-established. The
tracing process should be initiated as soon as
possible. Table 15-3 provides a list of some of
the methods that can be used for tracing (see
Section 12.5.12 for a discussion of the types of
identifying and locator information that should

be collected). Simple steps (Part A, Table 15-3),
such as those involved in checking phone and
address directories, may enable clinic staff to
locate most of the “lost” patients, and they
should be carried out before any of the approaches
listed in Part B of Table 15-3 are considered.
Searches carried out by agencies retained for
that purpose should be done discreetly, without
patient contact. This proscription should extend
to the coordinating center or other resource centers
in the trial as well, unless the patient has had
prior contact with the center in question or consented
to such contact when he was enrolled.
The cost of searches carried out by firms, such
as Equifax,¹ will vary from a few dollars to
several hundred, depending on the extent of the
search. Relatively inexpensive searches may locate
the majority of lost patients, whereas a
fairly large investment may be needed to locate
those that are especially hard to find. Some help
in the location process may be provided by governmental
agencies. As a rule, they will not release
address information but they may reveal
whether their record indicates that the patient
has died or may agree to send letters to study
patients that are alive, suggesting that they recontact
the study clinic.

1. Equifax is an Atlanta-based firm that was established to provide
credit and related information for clients of the banking and
insurance industry. A branch of the firm was established in the
1970s for marketing a locator service for follow-up studies.

Table 15-3 Methods for relocating dropouts
A. Ordinary
• Via check for address change through the post office,
  city directories, telephone books, etc.
• Via contact with known friends or relatives of the
  patient
• Via other sources, such as the patient's most recent
  employer, church group, etc.
B. Special
• Via a private agency specializing in locating people
• Via firms maintaining large address files and that
  market a tracing or follow-up service
• Via departments of motor vehicles*
• Via a government agency, such as the Social Security
  or Veterans Administration*
• Via a private or public institution, such as x
• Via the patient’s private doctor*
*May not yield direct contact with patient if the agency or individual
is unwilling to supply the desired address or is
legally constrained from doing so.

Figure 15-1 Lifetable cumulative dropout rates for the clofibrate, niacin, and placebo treatments in the CDP
[Figure: separate curves for placebo, clofibrate, and niacin; horizontal axis, year of follow-up (1 to 7); vertical axis ticks at 0, 5, 10, and 15.]
Year of follow-up   1      2      3      4      5      6      7
N                   4,706  4,421  4,167  3,911  3,651  2,336  729
Note: N denotes total number of patients in clofibrate, niacin, and placebo groups combined. Approximate numbers
for individual treatment groups are 2/9, 2/9, and 5/9 times N for clofibrate, niacin, and placebo, respectively.
Source: Reference citation 107. Adapted with permission of the American Medical Association, Chicago, Ill.
(copyright © 1975).

Contact with the patient or his family is essential
for most forms of follow-up. One notable
exception is for mortality follow-up using the
National Death Index (NDI) (National Center
for Health Statistics, 1981). Table 15-4 lists the
items of information needed for such searches.
The Index contains deaths recorded in the U.S.
since 1979. It contains basic identifying information
for each deceased person, including the
death certificate number and state in which the
certificate is located.
It should be possible, with the search methods
described above, to provide mortality data on
virtually every patient enrolled in a trial. Both
the CDP and UGDP were able to achieve this
goal (without the NDI since it was not operational
when these studies were done). The CDP
had vital status on all but a few of the 5,011
patients covered in the final report on clofibrate
and niacin (Coronary Drug Project Research
Group, 1975). The 1970 publication from the
UGDP on tolbutamide provided mortality data
on all but 5 of the 823 patients included in that
report (University Group Diabetes Program Research
Group, 1970e).

Table 15-4 Data items that may be used in searches of the National Death Index
• Last name*
• First name*
• Middle initial
• Social security number*
• Month,* day, and year* of birth
• Father's surname* (for females)
• Age at death (actual or estimated)
• Sex
• Race
• Marital status
• State of residence
• State of birth
*Considered to be key in checking for a possible record match

15.4 CLOSE-OUT OF PATIENT
FOLLOW-UP
The process of disengaging patients from a trial
may require as much skill and care as the enrollment
process. Recent papers have addressed aspects
of the close-out process (e.g., Hawkins and
Canner, 1978; Klimt and Canner, 1979; Klimt,
1981). Table 15-5 provides a summary list of
considerations that should be addressed in planning
for close-out. (See Chapter 3 and Appendix D
for additional information.)

Table 15-5 Study close-out considerations
• Time schedule (i.e., whether to close out follow-up for
  all patients at the same calendar time or after a fixed
  period of follow-up, see Section 11.7)
• Information to be collected (see Section 15.4)
• Phased treatment disengagement (usually applicable
  only to drug trials, see Section 15.4)
• Nature of recommendations given to patients regarding
  subsequent treatment
• Method for ensuring proper transfer of patient care
  responsibilities to alternate clinic or physician when
  appropriate
• Ensuring patients have ample opportunity to make alternative
  arrangements for care and to have any questions
  answered regarding the trial and its outcome
  before separation
• Method of summarizing baseline and follow-up data for
  subsequent use by patient's private physician
• Nature of patient contact required to document separation
  from trial
• Update of patient locator information and consent (applicable
  only if there is any possibility of having to
  contact patients later on to check their status or to
  recall them for examination)
• Masked trials: Time at which treatment is to be unmasked
  for study staff; for patients
• Masked trials: Amount of information to be collected
  on the efficacy of the mask (see Section 15.4 and
  Section 8.5)

The separation can be an emotional expe
rience for both patients and clinic staff. It should
be based on a detailed plan that has been con
structed and reviewed before the start of close
out. The details of the separation should be dis
cussed with patients well before separation oc
curs. Clinic staff must spend whatever time is
necessary to answer questions and to help find
suitable alternative sources of care. The latter
step is imperative in any trial that has been
providing patients with routine medical care, as
in the UGDP. Investigators in that study dis
cussed care requirements with each patient be
fore departure and made certain of continued
care after the close of the trial. The study clinic
provided the new clinic or physician with a sum
mary (prepared by the coordinating center) of
key baseline and follow-up data assembled on
the patient when transfers of care were involved.
The record generated in conjunction with sep
aration should contain:
• The name of the treatment the patient was on
• The date the patient was informed of the
treatment assignment (for masked trials)

• The date treatment was discontinued (when
appropriate)
• The date of the final close-out visit
• The name of the clinic or physician responsi
ble for future care of the patient
• The treatment recommendation and prescrip
tion (when appropriate)
• A list of materials and information given to
the patient on departure
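
The items just listed map naturally onto a simple record structure. The sketch below is one possible layout, with a helper that reports which required items are still missing. The class and field names are hypothetical, and the choice of which items to treat as always required is an assumption based on the "(when appropriate)" and "(for masked trials)" qualifiers in the list.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SeparationRecord:
    treatment_name: str                  # treatment the patient was on
    closeout_visit_date: date            # date of the final close-out visit
    future_care_provider: str            # clinic or physician taking over
    materials_given: List[str] = field(default_factory=list)
    date_informed_of_assignment: Optional[date] = None  # masked trials
    date_treatment_discontinued: Optional[date] = None  # when appropriate
    recommendation: Optional[str] = None                # when appropriate


def missing_items(rec: SeparationRecord, masked_trial: bool) -> List[str]:
    """List required items that have not been filled in."""
    missing = []
    if not rec.treatment_name:
        missing.append("treatment_name")
    if not rec.future_care_provider:
        missing.append("future_care_provider")
    if masked_trial and rec.date_informed_of_assignment is None:
        missing.append("date_informed_of_assignment")
    return missing
```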

Close-out provides an opportunity to assess
the adequacy of the mask in masked trials. Theo
retically, such checks could be made at various
time points throughout the trial. However, usu
ally they are not carried out because of a desire
to discourage speculation concerning the treat
ment assignments since the assessment involves
asking the masked individual(s) to state a guess
regarding treatment assignment (see also Sec
tion 8.5 and Krol, 1983).
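
As a rough illustration of how such guesses might be tabulated, the sketch below cross-classifies actual assignment against guessed assignment and reports the proportion of correct guesses. The function name and inputs are hypothetical, and the sketch stops at tabulation. How the result is compared with what chance alone would produce is left to the analyst and is not prescribed by the text.

```python
from collections import Counter
from typing import Dict, List, Tuple


def guess_summary(actual: List[str],
                  guessed: List[str]) -> Tuple[Dict[Tuple[str, str], int], float]:
    """Cross-tabulate (actual, guessed) pairs and return the fraction correct."""
    if not actual or len(actual) != len(guessed):
        raise ValueError("need two non-empty lists of equal length")
    table = Counter(zip(actual, guessed))
    correct = sum(n for (a, g), n in table.items() if a == g)
    return dict(table), correct / len(actual)
```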
A key consideration at close-out has to do
with whether to carry out added data collection
on patients as they are separated from the trial
(the same consideration may arise in conjunc
tion with protocol changes involving termina
tion of a particular treatment during the trial).
The wisdom of making such provisions depends
on the importance of the data generated in rela
tion to the aims of the trial. Results obtained for
tests or procedures for which there are no corre
sponding baseline values will be of limited use in
making treatment comparisons if the treatment
groups differ because of losses due to dropouts
or deaths. Investigators in the CDP opted
against introduction of special data collection
schemes during close-out, except for the addi
tion of a few items to facilitate relocation of
patients (Krol, 1983).
The method of terminating therapy in masked
drug trials must be given special consideration.
A dosage step-down scheme may be necessary if
an abrupt cessation of one or more of the drugs
is considered unsafe. In addition, a patient will
want to know the treatment he was on. Hence,
study physicians must be supplied with treat
ment codes well in advance of close-out visits,
especially in trials where time is needed to con
sider alternative courses of therapy before mak
ing a treatment recommendation.
Ideally, any treatment recommendation given
to patients at close-out should be based on find
ings from the trial. However, often this is not
possible, since the final analysis of the results
may not be completed by the time of close-out.
Recommendations may have to be qualified or

simply withheld, especially in designs involving
close-out after a fixed period of follow-up (see
Section 11.7). In such cases, the close-out process
will extend over a period of time as long as
that required for patient enrollment. It may not
be advisable to unmask treatment assignments
in such designs until all patients have been separated
from the trial, unless it is possible to lift the
mask on a per-patient basis (see Section 10.5).
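
The point about timing can be checked with a few lines of arithmetic. The sketch below computes close-out dates under the two schedules (a common calendar date, or a fixed period of follow-up per patient) and the number of days over which close-out is spread. The function names and the enrollment dates used in any example are invented for illustration.

```python
from datetime import date, timedelta
from typing import List, Optional


def closeout_dates(enrollment_dates: List[date],
                   follow_up_days: Optional[int] = None,
                   common_date: Optional[date] = None) -> List[date]:
    """Close-out date per patient under exactly one of the two schedules."""
    if (follow_up_days is None) == (common_date is None):
        raise ValueError("give exactly one of follow_up_days or common_date")
    if common_date is not None:
        # Same calendar time for everyone.
        return [common_date for _ in enrollment_dates]
    # Fixed period of follow-up counted from each patient's enrollment.
    return [d + timedelta(days=follow_up_days) for d in enrollment_dates]


def span_days(dates: List[date]) -> int:
    """Number of days between the earliest and latest date."""
    return (max(dates) - min(dates)).days
```

With a fixed follow-up period every close-out date is an enrollment date shifted by the same amount, so `span_days` of the close-out dates equals `span_days` of the enrollment dates. With a common calendar date the span is zero.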
Patients should be told at close-out if the
clinic plans to keep in touch with them and, if
so, the reason for doing so and the way in which
contact will be maintained (e.g., via mail, tele
phone, or home visits). They should be asked to
sign a consent authorizing the contacts and to
provide updated locator information if contacts
are planned. In fact, it is a good idea to alert
patients to the possibility of future contacts and
to obtain consents for them even if subsequent
contacts are not planned, if there is any chance
they will be needed later on.

15.5 TERMINATION STAGE

Close-out of patient follow-up is only the first
stage in shutting down the trial. It is normally
followed by a series of activities (see Table 15-6
and Section VI of Appendix D) beginning with
completion of the close-out visits and ending
with termination of all funding for the trial. The
time needed for termination is variable and depends
on the trial. A period of a year or longer is
common for trials of the type sketched in Appendix B.
As a rule, clinics will require financial support
for a period of time beyond the patient close-out
stage to complete data transmissions to the data
center and to respond to edit queries from that
center. Support for the data center will have to
extend beyond that for clinics to allow adequate
time for the center to complete analyses of the
results and to prepare them for publication. The
UGDP Coordinating Center continued to receive
funding through April 1982, nearly 7 years
after completion of the last close-out examinations.
The coordinating center in the CDP continued
to operate through 1983, over 9 years
after termination of the close-out stage of that
trial.
One of the last steps in the termination stage
has to do with record storage and disposition.
All study forms and related documents to be
retained (especially those with personal identifiers
on them) should be stored in a secure location.
Forms and related documents should be
destroyed (in compliance with local statutes for
disposition of medical records) if secure storage
cannot be assured and the required period of
storage has passed (see Section 17.6). General
factors to consider in arranging for record storage
and policy questions concerning access to
study data are discussed in Chapter 24.

Table 15-6 Activities in the termination stage
A. General
• Revise organizational structure (at the start of the
  termination stage) to meet special needs of the termination
  stage. Discharge committees no longer
  needed
• Update mortality follow-up for all patients, including
  dropouts
• Carry out final data edit checks
• Establish cutoff date beyond which changes to the
  data system are no longer allowed (needed so data
  files can be “frozen” for final analysis)
• Develop and implement plan for the final disposition
  of the study data forms and related documents,
  such as x-rays, fundus photographs, ECGs, etc.
• Develop plan for dealing with requests for special
  analyses or for access to the study data after termination
  of study funding (see Chapter 24)
• Disseminate study findings and conclusions to study
  investigators and to referral physicians (may be
  done by distributing preprint or reprint of main
  study manuscript)
• Discharge all remaining committees at the end of the
  termination stage
B. Additional activities in drug trials
• Collect sample of study drugs for future laboratory
  analysis in case of questions regarding drug purity
• Dispose of remaining unused study drugs
• Submit final report to the FDA if trial involved an
  INDA or IDEA; cancel INDA or IDEA after acceptance
  of the report by the FDA

15.6 POST-TRIAL PATIENT
FOLLOW-UP
Post-trial follow-up, by definition, takes place
after the termination stage of the trial (see Chapter 3,
Appendix D, and Glossary for further details).
Ideally, the patient personal identifiers
needed for the follow-up should be deposited at
a central location before the trial terminates,
especially in multicenter trials. If this is not
done, the task of assembling the information
after the trial has terminated may make subsequent
follow-up difficult, if not impossible. The
repository should be established at a center that
can assure secure storage, and that is likely to
remain functional into the foreseeable future.
Federal agencies, such as the National Institutes
of Health (NIH), generally are not suitable as a
repository because of their susceptibility to re
quests under the Freedom of Information Act
(see Chapter 24).
There should be a sound rationale for any
post-trial follow-up involving direct patient con
tact. The prime motivation for most post-trial
follow-ups stems from a desire to extend the
period of observation for death or some other
serious but nonfatal event. Another reason may
be to observe patients for a disease or condition
that may be caused or aggravated by treatments
administered during the trial. The usefulness of
the information obtained will depend on the com
pleteness of the follow-up and the nature of
intervening treatments administered after close
out. Interpretation of the results will be easiest if
patients have not been exposed to any additional
treatment after separation from the trial. It will
be problematic if they have been.
The CDP provides an example of post-trial
mortality follow-up. The follow-up was per
formed by the coordinating center, with help
from clinics still in operation when the follow-up
started in 1981. Addresses and other identifying
information on patients were used for tracing
them and for accessing the National Death
Index and other files.
Some trials have provided a form of post-trial
follow-up during the trial. For example, this was
the case for the two discontinued treatments in
the UGDP. Patients assigned to both the tolbu
tamide and phenformin treatments were fol
lowed for mortality (as well as for other nonfatal
events) until separation of all patients in August
of 1975. There has been no further post-trial
follow-up of any of the UGDP treatment groups
since then (University Group Diabetes Program
Research Group, 1982).

-J^
16.2 Ongoing data intake

16. Quality assurance

TiHf tt-l

Quality assurance procedures

• \i<ual check by a member of the clinic staff after a data
form is completed for illegible responses and for un
answered or incorrectly answered items

• Ongoing data processing
• Replication of the coding and data entry process as a
means of error detection
• Computer edit of keyed data for inadmissible codes or

If it ain’t broke, don’t fix it.
Old American Ad*e

I
ii
4 I?

16.1 Introduction
16.2 Ongoing data intake: An essential prereq
uisite for quality assurance
16.3 Data editing
16.4 Replication as a quality control measure
16.5 Monitoring for secular trends
16.6 Data integrity and assurance procedures
16.7 Performance monitoring reports
16.8 Other quality control procedures
16.8.1 Site visits
16.8.2 Quality control committees and centers
16.8.3 Data audits

• Reprogramming an analysis procedure » «
means of checking on its accuracy

Deficiencies anywhere in the chain of event!
from data generation to publication of the re
suits can reduce the quality of the finished pnv
duct and the conclusions reached from the toil
Everyone involved in data collection, analyut
and manuscript writing must perform effectives
to produce a quality end result.
This chapter deals with the mechanics of qual
ity assurance. Other chapters of this book tone*upon issues related to quality assurance. Thev
include:

• Repeat laboratory determinations
• Multiple independent readings of ECGs, fundus photographs,
X rays, tissue slides, etc.
• Independent review of patient death records for classify
ing cause of death
• Submission of masked duplicate specimens or records to
check on the reproducibility of a measurement or
reading procedure


• Generation of periodic reports assessing the compliance
of clinics to the treatment protocol


• Treatment masking (Chapter 8)
• Randomization (Chapters 8 and 10)
• Data form construction (Chapter 12 and Ap
pendix F)
• Production and maintenance of study hand
books and manuals (Chapter 13)
• Testing the data intake and processing system

Table 16-1 Quality assurance procedures
Table 16-2 Types of edit checks
Table 16-3 Edit message rules
Table 16-4 Data integrity checks
Table 16-5 Performance characteristics subject
to ongoing monitoring
Figure 16-1 MPS Coordinating Center edit
message, August 3, 1983
Figure 16-2 MPS Coordinating Center edit
message, October 4, 1983

16.1

missing values
• Data edit queries (directed from the data center to the
clinic) concerning completed data forms
• Generation of periodic status reports concerning the
data collection process

(Chapter 13)
• Database maintenance (Chapter 17)
• Review procedures for study publications
(Chapter 24)
• Activities staging (Appendix D)

INTRODUCTION

tion, or very shortly thereafter. Theoretically,
the entry process could take place as patients are
examined using video displays to remind physi
cians and technicians of items to be entered.
However, on-line data entry of this sort is us
ually not practical. The need to do so during an
examination may distract both the patient and
physician and may complicate the examination.
Further, it is unlikely that all data could be
entered on the spot since much of it may not be
available until some time after the examination
is completed (e.g., as with results of certain laboratory
tests or readings from biopsy material,
ECGs, X rays, etc.). However, even if these problems
could be overcome, documentation of the
data collection process argues against on-line
entry. The data forms and related paper records
are needed to document the data collection and

16.2 ONGOING DATA INTAKE: AN
ESSENTIAL PREREQUISITE FOR
QUALITY ASSURANCE

Quality assurance, as applied to clinical trials, is
any method or procedure for collecting, process
ing, or analyzing study data that is aimed at
maintaining or enhancing their reliability or va
lidity. Examples include (see Table 16-1):

Most of the quality assurance procedures out
lined in Table 16-1 require a continuous and
timely flow of data from the clinic to the data
center to be useful. The data edits and analyses
carried out during the trial to assess data quality
and clinic performance will lose much of their
value if there is a large time gap between data
generation and conversion into computer-read
able formats.
The ideal data intake system is one in which
data are edited and entered on the day of genera-

• Edit procedures to check on the accuracy of
items on completed data forms
• Repeat of a laboratory determination to check
on reproducibility
• Rekeying data as a check for errors in the
entry process
• Carrying out analyses by clinic in a multicen
ter trial to detect performance variations

• Comparison of the performance of clinics in a multicenter trial to detect differences in the quality or com
pleteness of the data generated, as reflected by such
characteristics as number of missed follow-up examinations,
number of dropouts, number of deficient
data forms, etc.
• Reprogramming of a data editing or analysis procedure
as a check on program accuracy or on the quality of
program documentation
• Interim analyses of study data for treatment effects that
can be used to reveal inadequacies or inconsistencies
in the data collected


entry processes, to say nothing of their use in
patient care. Hence, discussion throughout this
book is predicated on the assumption that data
collection always involves completion of paper
forms and records, regardless of where and how
data entry is done.
One viable approach to on-site data entry in
volves completion of a paper form during the
patient examination and then entry of the infor
mation contained on that form as soon after the
examination as possible—ideally, on the same
day or within a few days after the examination.
The entry should be done by clinic personnel
who are familiar with the data collection re
quirements of the trial, and should be subjected
to edits during the entry process. The keyed data
may remain at the clinic for subsequent analyses
or may be transferred to a central data facility
for additional edits, analysis, and storage. The
transfer may take place on-line as the data are
keyed, or may be done off-line either on a fixed
schedule or on demand, as dictated by the data
center. On-line transfer may be via hard-wired
or telephone connections to the central facility.
Off-line transfers may be done by telephone or
by mailing the magnetic records to the central
storage facility.
Systems involving on-site data entry and mul
tiple data generation sites, as in multicenter
trials, are herein referred to as distributed data
entry systems. Those in which data forms are
sent to a data center for entry are referred to as
centralized data entry systems. All but two of the
14 trials sketched in Appendix B had systems
of the latter type. Only the Coronary Artery
Surgery Study (CASS) and Hypertension Pre
vention Trial (HPT) had distributed data entry
systems.
A trial in which each data generation site is
responsible for maintaining its own database
with programs provided from the data center is
herein referred to as having a distributed data
base (e.g., the HPT). A trial in which the only
electronic database that exists is the one main
tained at the central data facility is herein re
ferred to as having a centralized database.
The main advantages of distributed data entry
have to do with the potential for eliminating the
time lag between the data generation and data
entry processes, and with the ability to involve
data collection personnel in the data entry pro
cess. However, in order to work well, the ap
proach requires skilled personnel at the data
center who have the patience and know-how to


select the equipment needed for the system, to
supervise acquisition and installation of it at the
clinic, to train clinic personnel in its operation,
and to develop and maintain the software pack
ages needed for on-site data entry and editing.
The lag time between the generation of a form
and data entry should never be more than a
week or two, regardless of the type of entry
system. The goal should be to establish and
maintain the discipline needed to ensure a timely
flow of forms from the point of origin through
data entry. Designs that allow forms to accumu
late over a specified time interval or in batches of
a certain size before they are forwarded for data
entry should be avoided. The best design is one
in which individual forms proceed to data entry
on a per-form basis without regard to other
forms or conditions. Batching increases the time
from completion of a form to data entry. If some
batching is required for reasons of efficiency, it
should be minimal and should never allow forms
to accumulate for more than a week or two. The
same is true for accumulation of forms at the
data entry site.
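
The lag rule above can be checked mechanically. The following is a minimal sketch, not a description of any system used in the trials cited; the 14-day ceiling is one reading of "a week or two," and the names `MAX_LAG_DAYS`, `entry_lag_days`, and `overdue_forms` are hypothetical.

```python
from datetime import date

# Assumption: "a week or two" is read here as a 14-day ceiling.
MAX_LAG_DAYS = 14


def entry_lag_days(completed: date, keyed: date) -> int:
    """Days between completion of a form and its data entry."""
    return (keyed - completed).days


def overdue_forms(forms, as_of: date, max_lag: int = MAX_LAG_DAYS):
    """Return IDs of forms still unkeyed more than max_lag days after completion.

    forms: iterable of (form_id, completed_date, keyed_date_or_None).
    """
    late = []
    for form_id, completed, keyed in forms:
        if keyed is None and (as_of - completed).days > max_lag:
            late.append(form_id)
    return late
```

A list of this kind, produced per clinic, would show whether forms are accumulating at the point of origin or at the data entry site.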

16.3 DATA EDITING
The term data editing refers to the process of
detecting, querying, and, when appropriate, correcting
values in a data set that are invalid. The
normal editing process involves a series of edit
checks and edit queries. An edit check is an
operation carried out on an item or series of
items on a completed data form for the purpose
of identifying possible errors (see Table 16-2).
An edit query is a question generated from review
of a completed data form that concerns the
accuracy or adequacy of some item of information
on the form and that requires someone at
the generation site to review the information in
order to respond to the query. The query may be
generated by a clerk checking a completed form
for deficiencies, or by a CRT or printer driven
by edit programs.
Edit queries that are written will be referred to
as edit messages. Any edit message that requires
review and possible corrective action should be
printed on hard copy. This does not preclude use
of a CRT for a preliminary display of messages,
but this procedure is not adequate if messages
must be sent to various places in the clinic for
review and action. Special care must be taken to
make certain that the messages are intelligible.
Table 16-3 gives suggested edit message rules.
A sample of such messages, as taken from the
Macular Photocoagulation Study (MPS), is re
produced in Figures 16-1 and 16-2 for a ficti
tious clinic and patient. The two pages relate to

Table 16-3 Edit message rules
• Use a format that facilitates use by clinic personnel, even
if the format is not ideal for data entry
• Test the intelligibility of the messages on personnel who
must deal with the messages
• Avoid the use of esoteric codes, abbreviations, and other
symbols that are not readily understood by personnel
who must respond to the statements
• Identify the patient, examination, form, and item
number on the edit statement
• Allow space on the statement for the respondent to
indicate the action taken
• Group messages for a given patient examination in such
a way as to simplify the task of dealing with them
(e.g., list all laboratory-related edit messages for a
given examination on one page and all messages concerning
clinical evaluation of the patient on another
page, if different personnel are required to deal with
the two types of edit messages)
• Generate duplicate copies of the edit messages to allow
clinics to retain a copy of answered queries


patient 03-072-S, with name code MARV, seen
on July 6, 1983, in connection with his fifth
follow-up clinic visit (second annual examina
tion). The message dated August 3, 1983, relates
to inconsistencies noted in visual acuity mea
surements done on the patient. The message
dated October 4, 1983, relates to discrepancies in
readings of fundus photographs done at the
clinic with those done at the MPS Fundus Pho
tography Reading Center. Clinic personnel are
required to indicate corrected values on the edit
message sheets and to return them to the MPS
Coordinating Center for processing.
The first set of edit checks should be done by
hand at the clinic shortly after a form is com
pleted. A second set of checks, involving a com
bination of hand and computer checks, may be
performed when the data are keyed. The main
advantages of computer checks lie in the ease
and accuracy with which they can be made and
in the ability to use the computer to write and

Figure 16-1 MPS Coordinating Center edit message of August 3, 1983.
(Sample edit message for Clinic 03, patient 03-072-S, name code MARV: old values for visual acuity items 4AR, 4BR, and 4CR are listed with space for corrected values, followed by the query text and lines for the person completing the form and the date.)

Table 16-2 Types of edit checks

Type
    Edit check

• Patient identification and record linkage
    • Check of ID number and name code for transposition errors
    • Check of name for spelling errors
    • Check to make certain all pages of a given form carry the same ID number

• Legibility
    • Check for response checkmarks placed outside designated spaces
    • Check for illegible handwritten replies, spelling errors, etc.

• Form admissibility
    • Check to determine if the form was completed within the specified time window
    • Check to make certain the form completed is the correct one for the indicated examination

• Missing information
    • Check for unanswered items or sections of an otherwise completed form
    • Check to make certain all required forms have been completed

• Consistency
    • Check of information supplied in one section against another section on the same form for inconsistencies
    • Check of information supplied on the same patient on one data form with that from another form completed at the same or at a different examination as a check for possible data inconsistencies

• Range and inadmissible codes
    • Check to identify items with values that exceed specified ranges
    • Check for undefined alphabetic or numerical codes
keep track of the queries. Clinics need periodic
reminders of outstanding queries to ensure they
are addressed (see Chapter 17 for a discussion of
file updates based on edit changes). The computer,
however, should never be a substitute for
the checks performed by staff at the clinic before
forms are forwarded for data entry. An experienced
clinic coordinator, with an eye for errors
and an encyclopedic knowledge of the study protocol,
can do more to enhance the quality of the
data generated than any set of computer checks.
There should be an audit trail for any change
made to a completed data form, regardless of
when and how the change was initiated. The
nature of the deficiency, when it was detected,
the change, and when the change was made
should be noted. Once recorded on a form, data
should not be erased or obliterated. Entries that
are incorrect should be lined out and the new
entries added to the form. Any change, regardless
of when it was made, should be dated and
should carry the initials of the person making the
change.
Data entry personnel should be given explicit
instructions regarding the types of data changes
they may make. Sound practice dictates that
data should be entered as recorded, even if an
item is “clearly” in error and the change required
seems obvious. The temptation is to make an
“obvious” change on the spot, without any checking.
However, there are at least two reasons to
resist the temptation. First, there is always the
chance that the item has been correctly recorded
even though it appears to be in error. Second,
on-the-spot changes will lead to discrepancies
between the computer data file and the original
study records. Such discrepancies, if sizable,
may lead to serious questions concerning the
integrity of the data collection and processing
activity. Both audits of the University Group
Diabetes Program (UGDP) focused on the accuracy
of the data collection and entry processes,
as evidenced by comparisons of values recorded
on the original study forms with entries appearing
in the computer data file of the study (Committee
for the Assessment of Biometric Aspects
of Controlled Trials of Hypoglycemic Agents,
1975; Food and Drug Administration, 1978).
Fortunately, procedures in the UGDP Coordinating
Center required all changes to the computer
file to originate with the original data
forms. It would not have been possible to maintain
a one-to-one correspondence between the
original records and computer file without such
a requirement.

Figure 16-2 MPS Coordinating Center edit message of October 4, 1983.
(Sample edit message from the MPS Reading Center for Clinic 03, patient 03-072-S: old values for items 9b, 10a, and 10b of an annual follow-up form are listed with space for corrected values, followed by the query text and a line for the person completing the form.)

A series of identification and linkage checks
should be performed before any form is added to
the computer file. The ID number recorded
should be checked for transposition errors (e.g.,
via a check digit; see Glossary). No form should
be added to the file unless the ID number and
other identifiers agree (e.g., such as name or
name code).
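
A check digit of the kind mentioned above can be sketched as follows. The scheme shown is an assumption chosen for illustration (a weighted sum modulo 11, which detects any single-digit error and any transposition of adjacent digits); it is not stated to be the scheme used in any trial discussed in this book, and the function names are hypothetical.

```python
def check_digit(id_digits: str) -> str:
    """Compute a mod-11 check character for a string of digits.

    Weights 2, 3, 4, ... are applied from the rightmost digit leftward.
    A remainder of 10 is written as 'X' (an illustrative convention).
    """
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(id_digits), start=2)
    )
    remainder = (11 - total % 11) % 11
    return "X" if remainder == 10 else str(remainder)


def is_valid(id_with_check: str) -> bool:
    """True if the last character is the correct check digit for the rest."""
    body, check = id_with_check[:-1], id_with_check[-1]
    return body.isdigit() and check_digit(body) == check
```

With this scheme, an ID keyed with two adjacent digits interchanged fails the check and the form would be held back from the file until the identifiers are reconciled.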
Admission of a record to the data file may
also depend on time window (see Glossary)
checks needed to ensure that the information in
question was obtained within a specified time
interval. Examinations performed outside the
specified window may either be rejected or assigned to the appropriate time slot, depending
on the philosophy of the study.
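
A time window check reduces to a date comparison. The sketch below assumes a symmetric window around a target date; the window width, the symmetric form, and the name `window_status` are hypothetical, since actual windows depend on the study protocol.

```python
from datetime import date


def window_status(target: date, exam: date, half_width_days: int) -> str:
    """Classify an examination date relative to its permissible time window.

    The window is assumed to run from target - half_width_days to
    target + half_width_days, inclusive.
    """
    offset = (exam - target).days
    if abs(offset) <= half_width_days:
        return "in_window"
    return "early" if offset < 0 else "late"
```

Whether an "early" or "late" form is rejected or reassigned to another time slot is a matter of study policy and is deliberately left outside the function.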
Computer checks made during data entry
should be designed to detect use of inadmissible
codes (e.g., entry of an alphabetic character
when only numeric codes are permissible or use
of an undefined or inadmissible numeric code).
These errors should be corrected before the
generation of edit messages.
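
The checks described in this paragraph might be sketched as a single function applied to each keyed item. This is an illustration only; the code sets, ranges, and the name `check_item` are hypothetical and would in practice come from the study's forms and manuals.

```python
def check_item(value: str, admissible=None, low=None, high=None):
    """Return a list of problems found for one keyed item.

    admissible: optional set of defined codes for the item.
    low, high: optional inclusive numeric limits for the item.
    """
    if value.strip() == "":
        return ["missing"]
    problems = []
    if admissible is not None and value not in admissible:
        problems.append("inadmissible code")
    if low is not None or high is not None:
        if not value.isdigit():
            # e.g., an alphabetic character keyed in a numeric field
            problems.append("non-numeric")
        else:
            number = int(value)
            too_low = low is not None and number < low
            too_high = high is not None and number > high
            if too_low or too_high:
                problems.append("out of range")
    return problems
```

An empty list means the item passed; any other result would hold the record at data entry rather than generate an edit message to the clinic.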
Most editing systems are designed to deal with
one item at a time. There may be some cross
checking of items, but it is usually limited to
items on the same form. Cross checking of items
across forms is generally not done because of the
logistical difficulties involved in making such
checks and because of the limited return in
added undetected errors and deficiencies.
The foundations for data editing should be
laid when the study is designed. The edit require
ments should be specified in the handbooks and
manuals needed for operation of the trial. The
data forms used in the trial, as suggested in
Chapter 12, should include reminder and docu
mentation items (see Section 12.5.13) that re
quire clinic personnel to carry out essential
checks while the forms are being completed and
that remind them of the steps that must be performed
in conjunction with specified data collection
procedures.

16.4 REPLICATION AS A QUALITY
CONTROL MEASURE
Replication of an observation or reading is frequently
used as a check on the quality of the
data obtained. Examples of replication used in
this way are:
• Comparison of two independent measure
ments, such as a laboratory test, to deter
mine if the difference observed is outside a
specified range
• Use of two independent readings of an ECG
to identify items of disagreement for adjud
ication by a third reader
• Comparison of cause of death codes assigned
by two different individuals to identify
areas of disagreement for adjudication by a
third reader
• Averaging two or more consecutive blood
pressure readings made on a patient during
a given clinic visit in order to have a more
reliable estimate of the patient’s “true”
blood pressure
• Rekeying data as a check for errors in the
entry process
• Use of a computer program, written specifically
to duplicate the tasks performed by
another program, to check the accuracy of
results provided with the original program
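
Rekeying, one of the examples listed above, lends itself to a simple comparison of the two keying passes. The sketch below assumes each pass is held as a mapping from item number to keyed value; the representation and the name `keying_discrepancies` are hypothetical.

```python
def keying_discrepancies(first_pass: dict, second_pass: dict):
    """List items whose keyed values differ between two independent passes.

    Returns (item, first_value, second_value) tuples, sorted by item.
    An item present in only one pass is reported with None for the other.
    """
    items = sorted(set(first_pass) | set(second_pass))
    return [
        (item, first_pass.get(item), second_pass.get(item))
        for item in items
        if first_pass.get(item) != second_pass.get(item)
    ]
```

Each discrepancy would be resolved by reference to the original data form, not by choosing between the two keyed values.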

Replicate values obtained from repeat read
ings or from aliquots of the same laboratory
specimen are usually combined by averaging to
yield a single composite value. However, this
approach is not suitable for combining inde
pendent readings made from the same record
involving binary measures (e.g., presence or ab
sence of S-T depression on ECGs). Some form
of adjudication is necessary when the readings
disagree. It may be done by having an “expert”
make the judgment or by asking the individual
readers to reach agreement. It is important to
select readers who work well together and who
interact on a peer basis if the latter approach is
used.
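
The distinction drawn above between continuous and binary replicates can be sketched as two combining rules. The function names are hypothetical, and the use of None to mark a case needing adjudication is an illustrative convention.

```python
from statistics import mean


def combine_continuous(values):
    """Combine replicate determinations of a continuous measure by averaging."""
    return mean(values)


def combine_binary(readings):
    """Combine independent binary readings of the same record.

    Returns the common reading if all readers agree; otherwise returns
    None to indicate that the record must go to adjudication.
    """
    first = readings[0]
    return first if all(reading == first for reading in readings) else None
```

Averaging is deliberately not offered for the binary case, since a mean of disagreeing readings corresponds to no reading any reader actually made.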
A common problem in trials involving labora
tory determinations has to do with the detection
and disposition of outlier values (see Glossary).
Explicit rules are required to indicate the condi
tions under which a determination is to be repeated
and the value or values to be reported in
such cases. The procedures of the laboratory
performing the determinations should be re
viewed when the rules are constructed. Labora
tories differ with regard to the practices they
follow in making repeat determinations because
of suspected errors. Some of those practices can
bias the results reported, for example, as is the
case with a laboratory that does three determina
tions per sample, but reports only the two most
concordant values. The same is true for a labora
tory that opts to make repeat determinations
when the observed inter-aliquot difference for
the first set of determinations exceeds a prespeci
fied limit and then reports only the results of the
second set.
The easiest, and often the best, rule to follow
is one that requires the laboratory to report all
determinations made, without any censoring.
Outlier values which, if retained in the data file,
would have undue influence on means and vari
ances may be eliminated or trimmed when the
analysis tape is written (see Section 17.7).
16.5 MONITORING FOR SECULAR
TRENDS

A secular trend in the readings made from rec
ords, such as ECGs, X rays, fundus photo
graphs, and the like, or from laboratory deter
minations, can be troublesome, especially if
differential by treatment. The possibility of this
happening is minimized when the ordering of the
readings or determinations is independent of
treatment assignment (e.g., in schemes in which
readings or determinations are done on an ongo
ing basis and in the order of generation). How
ever, even so, it is wise to monitor for trends.
The information is useful in characterizing the
magnitude of the trend and in indicating
whether it is differential by treatment. Assurance
in the latter regard is especially important for
readings or determinations that are not masked
with regard to treatment assignment or that are
ordered by treatment assignment. In addition,
characterization of the trend, even if not needed
for making treatment comparisons, is useful
when evaluating follow-up results for a particu
lar treatment group in natural history studies.
The number of repeat determinations or read
ings that are made should be dictated by the
importance attached to detecting time trends
and the total resources available for quality con
trol. The cost of maintaining systems designed to
detect secular trends can be sizable. Only a small

part of the cost may be associated with making
the actual readings or determinations. The larger
costs will be associated with managing the
monitoring system.
Monitoring a laboratory or reading center for
secular trends requires use of known standards
that are subjected to repeat analyses or readings
over the course of the trial. To be useful, the
repeat specimens should be indistinguishable
from other specimens received at the laboratory
or reading center.
Developing a reliable set of standards, at least
for laboratory determinations, is not a trivial
task. The problem would be easily solved if a
single set of standards could be used throughout
the trial. However, most biological substances
degrade with time and, hence, more dynamic
approaches are needed. The Coronary Drug
Project (CDP) created a pool of donor serum.
The pool was aliquoted and then frozen (Canner
et al., 1983c; Hainline et al., 1983). Specimens
from the pool were submitted to the central
laboratory on a time schedule designed to coincide
with actual patients in the trial, using ID
numbers of deceased patients.¹ When a given
pool was near depletion, or the time limit set for
its use was about to expire, a new one was
created. Use of specimens from the new pool
overlapped use of specimens from the old pool,
so as to provide a basis for estimating concentration
differences between the two pools.
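
The two calculations implied above, a trend over time within a pool and an offset between overlapping pools, can be sketched as follows. An ordinary least-squares slope and a difference of means are assumed here as the simplest possible estimators; they are not stated to be the methods used in the CDP, and the function names are hypothetical.

```python
def trend_slope(times, values):
    """Least-squares slope of repeat determinations of a standard on time.

    times: numeric times of determination (e.g., months since start).
    values: the determinations reported for specimens from one pool.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def pool_offset(old_pool_values, new_pool_values):
    """Difference in mean level (new minus old) during the overlap period."""
    old_mean = sum(old_pool_values) / len(old_pool_values)
    new_mean = sum(new_pool_values) / len(new_pool_values)
    return new_mean - old_mean
```

A slope near zero suggests no drift in the laboratory over the period covered; the offset allows results from the new pool to be placed on the same footing as those from the old one.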
Similar monitoring is needed for readings of
ECGs, fundus photographs, X rays, biopsy materials,
etc. However, the mechanics of setting
up and maintaining systems for this purpose
are even more complicated than those required
for laboratory determinations. The CDP used
a system for making repeat ECG readings to
monitor for time-related shifts in reading standards
(Coronary Drug Project Research Group,
1973a). However, the system was difficult to
manage, and it was not easy to keep readers
from identifying repeat tracings, especially those
with distinctive patterns. In any case, the system
was only effective in detecting short-term trends
since the tracings chosen for rereading were selected
from batches of tracings that had been
read in the recent past. Inclusion of records read
in the distant past was not practical because
of date information contained on the tracings.
There was concern that lack of homogeneity of
dates within a reading batch would enable readers
to identify repeat tracings.
Another method sometimes used to control a
reading process involves use of reference measurements
or records to help readers gauge the
degree of abnormality seen in actual records.
This approach was used in the MPS for grading
the severity of certain kinds of eye abnormalities
seen in fundus photographs. The severity of
an observed abnormality was graded by selecting
the photograph from an ordered set of standard
photographs that was most similar to the one in
question.
Concerns regarding secular trends are ob
viated if records are read over a short period of
time at the end of the study and in a random
order with regard to the time of generation and
treatment assignment. However, this approach
suffers from two major disadvantages. First,
postponing readings until the end of data collec
tion means that results from the records in ques
tion will not be available for interim analyses
during the trial (see Chapter 20). Second, wait
ing for the readings may delay preparation of the
final report. Both disadvantages are avoided
with an ongoing reading program that runs over
the course of the trial.


16.6 DATA INTEGRITY AND
ASSURANCE PROCEDURES

1. Use of fictitious ID numbers would have caused the central
laboratory to reject the specimens because of edit checks performed
by it prior to admitting specimens for analysis.

An editorial by Meinert (1980b) discusses factors that may contribute to dishonest practices
in the field of clinical trials. They do occur, but
there is no reason to believe their incidence is
higher in this field than in other areas of re
search. In fact, it may be lower because of the
general emphasis on error detection and quality
control. However, even so, there are good rea
sons for constant vigilance against shady prac
tices. The luxury of replication, used so effec
tively in the laboratory sciences to confirm or
refute findings, is not always feasible in clinical
trials for practical as well as ethical reasons. For
example, it would be difficult to justify addi
tional placebo-controlled trials of hypertensives
in the light of the conclusions from those done
by the VA (Veterans Administration Cooperative
Study Group on Antihypertensive Agents,
1967, 1970) or to replicate the Multiple Risk
Factor Intervention Trial in view of its cost and
the time required to complete it (Multiple Risk
Factor Intervention Trial Research Group,
1982).


Table 16-4 Data integrity checks
• Comparison of information on a patient’s medical chart
with that recorded on a study data form
• Comparison of information on data forms with that in
the computer
• Interviews with support personnel for identification of
questionable or undesirable data practices

• Review of methods for issuing treatment allocations to
check for discrepancies in the administration of the
allocation schedule
• Review of analysis procedures used by the data center
for evidence of a bias for or against a particular treat
ment
• Comparison of the distribution of inter-aliquot differ
ences to detect clinic differences in reading or report
ing procedures
• Independent audit of published reports to determine if
the conclusions are supported by the raw data.

Table 16-4 provides a list of checks that can
be performed to help identify questionable data
practices, whether due to honest errors, careless
oversights, or purposeful acts. The checks, like
others in the trial, should be ongoing since the
problems they are aimed at detecting can occur
at any time over its course.
The best preventive measure is a staff that
appreciates the importance of honesty and integ
rity in all aspects of the trial. The responsibility
for instilling the proper philosophy rests with the
leaders of the trial. They must, by the statements
they make and the actions they take, set a tone
and standard that permeates the entire investiga
tive group.
16.7 PERFORMANCE MONITORING
REPORTS
It is good practice to prepare reports summariz
ing performance characteristics of the trial as it
proceeds. The reports should be prepared by the
data center and should be designed to provide
up-to-date information on all relevant activities
of the trial. Some of the performance character
istics that should be monitored are listed in
Table 16-5. See also Appendix G for sample re
ports.
The information in the report should be re
viewed by the leadership of the trial (e.g., the
steering committee) and should be used as a
basis for initiating corrective action, where ap
propriate. To be useful as a monitoring tool,
reports should indicate the relative standings of

Table 16-5 Performance characteristics subject to ongoing monitoring

A. Clinic characteristics

1. Patient recruitment
• Number of patients screened for enrollment; pro
portion rejected and tabulation of reasons for
rejection*

2. Patient follow-up

• Distribution of enrollment times and median
length of follow-up

• Total number of forms awaiting data entry*
• List of coding and protocol changes implemented
since last report

• List of data processing and programming errors
and likely impact on study results

• Timetable for unfinished tasks

• Number of samples received*

• Total number of dropouts and estimated drop
out rate

• Number of samples received improperly or inadequately identified*

• Number of patients who cannot be located for
follow-up

• Number of samples lost or destroyed*

• Number of forms received with missing parts or
missing supporting records

• Mean and variance of inter-aliquot differences
over time for specified tests

• Number of unanswered edit queries*

• Secular trend analyses based on repeat determi
nations of known standards

• Number of patients who did not accept the assigned treatment*

• Number of patients who received a treatment
other than the one assigned*
• Summary of data on pill counts and other adher
ence tests by treatment group*

A visit to a center in a trial made by person
nel from outside that center for the purpose
of assessing its performance or potential for
performance.
Those making the visit may be from other cen
ters in the trial or from outside the trial. The size
of the visiting team will be dictated by the nature
of the visit. It may be done by just one person or
it may involve a half dozen or more people
depending on needs and circumstances. (See Cas
sel and Ferris, 1984, for details regarding clinic
visiting procedures in an ophthalmic study.) The
"typical” clinic visit in a multicenter trial may
involve the chairman of the study (or his repre
sentative), a director of another clinic in the
trial, the director of the data coordinating center
(or his representative), and the project officer, as
well as other selected resource people (e.g., a
clinic coordinator if there are problems in the
way forms are completed, or a person knowl
edgeable in laboratory methods if there are prob
lems in this area).
The head of the visiting team should prepare a
written report of the visit, based on input from
the entire team. It should indicate when the visit
took place, who made it, who was seen, the areas
of activities reviewed, and the strengths and
weaknesses of the center. When appropriate, it
should contain a list of specific recommenda
tions. It should be sent to the director of the
center visited and to the appropriate leadership
body of the study for review (usually the steering
committee).
Clinic visits may be made on an as-needed
basis or on a set time schedule. The CDP used a
combination of the two approaches. The steer
ing committee requested visits of clinics consid
ered to have performance problems. Clinics that

D. Reading center characteristics
• Number of records received and read*
• Number of records received that were improp
erly labelled or had other deficiencies (tabu
late deficiencies)*
• Analyses of repeat readings as a check on repro
ducibility of readings and as a means of moni
toring for time shifts in the reading process

E. Other performance characteristics

• Number of departures from the treatment proto-

• Status of papers being written

• Summary of other treatment or data collection
protocol violations

• Labelling errors made in drugs dispensed from
the central pharmacy

• Progress in locating patients lost to follow-up

*Report should contain results for the entire study period, for the time period covered since production of the last report, and for the last
one or two preceding time periods.

clinics in multicenter trials with regard to important
functions such as patient recruitment,
completeness of follow-up, number of error-free
forms, etc. The tabulations may be for the entire
study period or for defined time intervals (e.g.,
the last 3 months, 4 to 6 months ago, etc.). The

Site visits

A site visit, used in this context, is:

• Backlog of samples remaining to be analyzed*

• Summary of major events affecting laboratory
operations, such as power outages, particularly those resulting in possible degradation of
frozen samples

• Number of ineligible patients enrolled*

16.8.1

• Number of samples requiring reanalysis and tab
ulation of reasons for reanalysis*

• Number of forms completed since last report
and number that generated edit messages
• Current edit message rate per form contrasted
with rates from previous time periods

4. Protocol adherence

16.8 OTHER QUALITY CONTROL
PROCEDURES

C. Central laboratory characteristics

• Number of dropouts*

• Number of patients enrolled with incomplete
baseline information*

• Number of forms received*

• Number of completed follow-up examinations*

3. Data quantity and quality

• Number of allocations issued*

• Number of allocations returned unused

• Number of missed examinations and number
past due*

B. Data center characteristics

• Summary of major events, such as computer
malfunctions, necessitating use of backup
tapes to restore the data system

• Rate of missed examinations*

statistic is more important than clinic rankings.
Members of the entire investigative group
should have access to the performance monitoring
reports to enable them to gauge their standing
in the study. Peer pressure, exerted via dissemination
of the information, can be helpful in
encouraging clinics with poor performance records
to improve.
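
The caution about rankings can be illustrated with a minimal sketch. It assumes a single performance statistic per clinic (here a rate of missed examinations, where lower is better); the clinic labels, the rates, and the function name `summarize_clinics` are hypothetical and are not taken from any study described here.

```python
from statistics import mean


def summarize_clinics(missed_exam_rates):
    """Rank clinics on one statistic and report the best-to-worst range.

    `missed_exam_rates` maps a clinic label to its rate of missed
    examinations (lower is better). All values are illustrative.
    """
    ranked = sorted(missed_exam_rates.items(), key=lambda item: item[1])
    rates = [rate for _, rate in ranked]
    return {
        "ranking": [clinic for clinic, _ in ranked],  # best clinic first
        "range": max(rates) - min(rates),             # worst minus best
        "mean": mean(rates),
    }


# Two hypothetical studies with identical rankings but very different spreads.
tight = summarize_clinics({"A": 0.040, "B": 0.045, "C": 0.050})
wide = summarize_clinics({"A": 0.040, "B": 0.150, "C": 0.300})
```

Both summaries place the clinics in the same order, yet only the second shows a spread large enough to suggest a performance problem, which is why the range deserves more attention than the rank.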

Performance characteristics subject to ongoing monitoring

• Current rate of recruitment compared with that
required to achieve a prestated recruitment
goal

16.8 Other quality control procedures

Quality assurance

rankings can be helpful in identifying problem
clinics. However, they should be viewed with
caution when used as a basis for taking correc
tive or punitive action involving individual clin
ics. The range of the difference between the best
and worst clinics with regard to a performance

were not visited on this basis were visited rou
tinely over the course of the trial.
The visits should include contacts with senior
staff as well as essential support staff in the clinic
and may involve any or all of the following
activities:
• Private meeting of the site visitors with the
clinic director
• Meeting of the site visitors with members of
the clinic staff
• Inspection of examining and record storage
facilities
• Comparison of data contained on selected
data forms with those contained in the com
puter data file
• Review of file of data forms and related rec
ords to assess completeness and security
against loss or misuse
• Observation of clinic personnel carrying out
specified procedures
• Check of handbooks, manuals, forms, and
other documents on file at the clinic to
assess whether they are up-to-date

• Physical or verbal walk-through of certain
procedures (e.g., the series of examinations
needed to determine patient eligibility, or
the steps followed in the informed consent
process)
• Conversations with actual study patients dur
ing or after enrollment as a check on the
informed consent process
• Private conversations with key support per
sonnel to assess their practices and philoso
phy with regard to data collection
• Private meeting with the clinic director’s chief
concerning special issues
The visiting process should not be limited to
clinics. It should include the data center as well
as other key resource centers in a trial. A “typi
cal" data center visit may include many of the
activities mentioned above as well as:

• Review of methods for inventorying forms
received from clinics
• Review of methods for data entry and verifi
cation
• Assessment of the adequacy of methods for
filing and storing paper records received
from clinics, including the security of the
storage area and methods for protecting
records against loss or unauthorized use
• Review of available computing resources
• Review of method of randomization and of
safeguards to protect against breakdowns
in the randomization process (see Chapter 10)
• Review of data editing procedures
• Review of computer data file structure and
methods for maintaining the analysis data
base
• Review of programming methods both for
data management and analysis, including
an assessment of program documentation
• Comparison of information contained on
original study forms with that in the com
puter data file
• Review of methods for generating analysis
data files and related data reports
• Review of analysis philosophy, especially in
relation to the principles discussed in Chap
ter 18
• Review of methods for backing up the main
data file
• Review of methods for restoring the main
data file or original study records if lost or
destroyed
• Review of master file of key study documents,
such as handbooks, manuals, data forms,
minutes of study committees, etc., for com
pleteness

Some studies, such as the National Coopera
tive Gallstone Study (NCGS), have gone a step
beyond the process outlined above in monitoring
data center operations. The NCGS established a special
monitoring committee, made up of people
from outside the study, with first-hand expe
rience in data coordinating center operations to
review operations in the center (National Co
operative Gallstone Study Group, 1981a). The
committee was responsible for carrying out peri
odic reviews of the center and for reporting re
sults of those visits to the NCGS Steering Com
mittee and Advisory-Review Committee.

16.8.2 Quality control committees and
centers

Certain of the quality control functions in some
of the larger-scale multicenter trials may be per
formed by specifically constituted committees,
as already stated above for the NCGS. For example,
the CDP had a laboratory committee to

Part IV. Data analysis and interpretation

review laboratory standards and methods (Coronary Drug Project Research Group, 1973a). The
Aspirin Myocardial Infarction Study (AMIS)
created a committee that was responsible for
monitoring the performance of all centers in the
trial, primarily via performance monitoring re
ports prepared by the AMIS Coordinating Cen
ter (Aspirin Myocardial Infarction Study Re
search Group, 1980b). Various other committees
in the structure of a trial will have quality con
trol functions.
A few studies, such as the Persantine Aspirin
Reinfarction Study (PARIS), have funded a
quality control center (Persantine Aspirin Rein
farction Trial Research Group, 1980a). The func
tion of the PARIS center was to carry out data
audits by comparing data from original study
forms with those in computer files at the PARIS
Coordinating Center. A second function was to
check on the accuracy of analyses performed by
the Coordinating Center. A third was to serve as
a second analysis center for the study, using
tapes provided by the PARIS Coordinating Cen
ter.
16.8.3

Chapters in This Part

17. The analysis database
18. Data analysis requirements and procedures
19. Questions concerning the design, analysis, and interpretation of clinical trials
20. Interim data analyses for treatment monitoring

The four chapters in this Part deal with issues involved in the analysis and interpretation of
results from trials. The first chapter details issues concerned with database management.
Chapter 18 details general principles to be followed when results are analyzed. It also contains
brief descriptions of commonly used methods of analysis for trials involving a binary event as
the outcome measure. Chapter 19 contains a list of questions and short answers concerning
the design, analysis, and interpretation of clinical trials. Chapter 20 addresses issues involved
in treatment monitoring and provides a brief description of some of the analysis approaches
used for that purpose.

Data audits

A data audit, as used herein, involves an item-by-item
comparison of information recorded on
an original study form with that contained in the
computer file for that form. Such audits, as mentioned
in Section 16.3, were carried out by
groups reviewing the UGDP, after the study was
finished. To be useful as a quality control measure
they must be carried out during the trial.
Ongoing audits of this sort are especially impor
tant in studies with distributed data systems
where forms are keyed at the clinic and, hence,
may never be sent to the data center, as in the
HPT. Clinics in that study are required to for
ward a random sample of completed data forms
to the data coordinating center. Staff at the cen
ter compare entries on the forms with those in
the data file. Discrepancies are noted for review.
A less systematic approach might involve on-the-spot
audits carried out during clinic site visits
and done by arbitrarily selecting a few forms
for comparison with data listings prepared by
the data center in conjunction with the visit.
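
The two steps of an ongoing audit (drawing a random sample of forms, then comparing each sampled form item by item with the data file) can be sketched as follows. This is a minimal illustration under stated assumptions: forms and file records are represented as dictionaries keyed by item name, and the function names, item names, and sampling fraction are hypothetical rather than those of any study cited here.

```python
import random


def select_audit_sample(form_ids, fraction, seed=None):
    """Draw a simple random sample of form IDs to be sent for audit."""
    rng = random.Random(seed)
    ordered = sorted(form_ids)
    sample_size = max(1, round(len(ordered) * fraction))
    return sorted(rng.sample(ordered, sample_size))


def audit_form(paper_form, file_record):
    """Compare a paper form with its computer record, item by item.

    Returns a list of (item, value on paper, value in file) for every
    item on which the two sources disagree, including items that are
    present in one source and missing from the other.
    """
    items = sorted(set(paper_form) | set(file_record))
    return [
        (item, paper_form.get(item), file_record.get(item))
        for item in items
        if paper_form.get(item) != file_record.get(item)
    ]


# Hypothetical form with a transposed digit introduced during keying.
paper = {"sbp": 142, "dbp": 88, "weight_kg": 71}
keyed = {"sbp": 124, "dbp": 88, "weight_kg": 71}
discrepancies = audit_form(paper, keyed)

sample = select_audit_sample(range(1, 101), fraction=0.10, seed=1)
```

Here `discrepancies` lists the single disagreement on `sbp`, which would be noted for review. A fixed seed is used only so that the sample can be reproduced.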

17. The analysis database

Round numbers are always false.
Samuel Johnson

and the analysis database. There will be when all
entries on study forms are made in codified
form. However, this is not always practical, espe
cially if some of the information collected is
recorded in narrative form and is not coded.

17.1 Introduction
17.2 Choice of computing facility
17.3 Organization of programming resources
17.4 Operational requirements for database
maintenance
17.5 Data security precautions
17.6 Filing and storing the original study records
17.7 Preparation of analysis tapes
Table 17-1 General-use versus dedicated com
puting facilities
Table 17-2 Considerations in choosing among
computing facilities
Table 17-3 Precautions and safeguards for data
base operations

17.2 CHOICE OF COMPUTING
FACILITY

17.1 INTRODUCTION

This chapter contains a discussion of issues in
volved in the development and maintenance of
the analysis database. The analyses may be for
the purposes of quality control (Chapter 16),
safety monitoring (Chapter 20), or for preparation
of publications at the end of the trial (Chapters
18 and 25).
The study database, as defined herein, consists
of all data contained on official data forms of
the study. It includes data from all baseline and
follow-up forms, as well as data from laboratory
tests and other procedures (e.g., ECGs, fundus
photographs, liver biopsies, etc.) that are a re
quired part of the study protocol. It does not
include data that are part of a patient’s general
medical record, except to the extent that such
information overlaps that which is needed for
the study.
The analysis database is constructed from the
study database, and consists of all codified infor
mation contained in the latter database. Ideally,
there should be a one-to-one correspondence
between the paper forms generated from a study

Most trials will require use of electronic files to
facilitate analysis of the study results. The choice
of the electronic medium (e.g., tape or disk) and
facility is not as crucial in a short-term trial as in
a long-term one. The choice of the facility will be
between a dedicated one, operated by study per
sonnel for the exclusive use of the study, or a
general-use facility, operated by someone else
and shared with other users, or a combination of
the two kinds of facilities. Table 17-1 outlines
the pros and cons of the two classes of facilities.
Once the type of facility has been chosen, the
next decision has to do with hardware selection
within the class (Table 17-2). The options avail
able may be limited if the decision is to rely on a
general-use facility, especially if the selection is
limited to facilities within the investigator’s own
institution. However, even in such cases there is
usually room for a choice if the institution has
multiple general-use facilities. A comparative
evaluation, including the use of benchmarking
techniques to assess the computing power and
cost of candidate facilities, is needed to make an
informed choice. Consideration should be given
to the experience of staff in the computing facili
ties in database management and data analysis
and to the kinds of software packages available
for those activities.
The existence of good database management
packages, along with standard analysis pack
ages, such as provided in BMDP, SPSS, and
SAS (Devan and Brown, 1979; Dixon, 1981;
Norusis, 1983; Ray, 1982), can markedly reduce

Table 17-1

A. Pros and cons

II. Dedicated facility
A. Pros and cons

• Likely to provide more computing power for the
study than is feasible with a dedicated facility,
but access to the facility may be limited

• Access to computer can be limited to study per
sonnel, thereby avoiding competition with other
users

• Investigators are freed of responsibilities for oper
ation of the facility; however, the operators of a
general-use facility may be insensitive to specific
needs of the trial

• Limited access may make it easier to protect data
files against unauthorized entry
• Amount of computing power and number of hardware
and software options likely to be more
limited than on large general-use facilities

• Number of programming options on a general-use
facility is likely to be greater than on a dedi
cated facility

• Responsibility for operation of the facility rests
with study personnel. May be a disadvantage
depending on the skills and interests of the per
sonnel involved

• Generally provides a wider array of hardware than
available on a dedicated facility

• Protection of data files on the system may be more
difficult than with a dedicated facility

B. Factors favoring choice of dedicated facility

• No general-use facility in the institution housing
the data center, or the facilities that exist are
overloaded

B. Factors favoring choice of general-use facility
• Existence of good general-use facility operated by
staff responsive to user needs and equipped with
hardware needed for the study

• Data processing needs are sizable and will continue
over a long period of time (e.g., > 3 years)

• Total duration of the trial, including the period of
final analysis, relatively short (e.g., < 3 years)

Considerations in choosing among computing facilities

A. Considerations in choosing among different general-use
facilities

• Type and amount of staffing available for advice and
consultation

• Hours of operation and modes of access (e.g., only on
site batch entry versus entry via remote job entry
station or via CRT work station)
• Record of mainframe hardware supplier (e.g., firm
with an established record for sales and service
versus one that is a recent entry into the hardware
field)

• Primary use of the facility (e.g., research versus ad
ministration)
• Compatibility of hardware and software features with
other facilities (especially important if there is a
need to switch facilities during the trial)

• Array of available hardware and software packages,
particularly for data management and data analysis

The requirements for the data system should be
developed by data processing personnel, in col
laboration with the clinical investigators. Devel
opment of programs should not be started until
there is general agreement on the requirements
for data flow and editing. It may be efficient to
vest responsibility for the development and main
tenance of programs needed for operation of the
database and those needed for data analysis with
different groups (e.g., see Meinert et al., 1983).
The majority of programming work early in the
trial will be related to development of the data
management system. The demand for this will
diminish once the basic database management
systems are in place. Programming efforts thereafter
will be limited to those needed for maintenance
of the system and for implementing
changes dictated by hardware or software
changes or by modifications to the study proto
col. The demand for analysis programming will
begin once recruitment is under way. The first
efforts in this regard will relate to analyses
needed for performance and safety monitoring
and later on for manuscript preparation. The
overall demand for programming is likely to
increase over the course of the trial.
The time spent in improving the efficiency of
operating programs should depend on the
number of times they are likely to be used over
the course of the study, the amount of time
required to run them, and the way computer
charges are billed. Most general-use computing

• The existence of staff with the interest and talents
needed for operation of a dedicated facility

• No one in the data center staff has the interest or
talents needed for operation of a dedicated fa
cility

Table 17-2

17.3 ORGANIZATION OF
PROGRAMMING RESOURCES

• Programming and data processing staff needed for
the trial is fairly large (e.g., > 4 full-time equivalents)

• Programming and data processing staff needed for
the trial is small (e.g., < 1 FTE)

the amount of programming time required for
both kinds of activities.
The options available if a dedicated facility is
chosen are greater and more varied. Making an
informed judgment may require months of work
to collect the necessary cost and operating information.
Highly specialized items of equipment,
requiring use of esoteric programming languages,
should be avoided. The cost and inconvenience
involved in converting programs to
operate on some other system may make it impractical
to consider conversions later on.
A crucial cost issue is whether to purchase or
lease the required hardware. Generally, purchase
is cheaper than lease for items used at least three
years. The disadvantage is that purchase may
make it impractical to take advantage of subsequent
upgrades, especially if the upgrades involve
new product lines.

General-use versus dedicated computing facilities

I. General-use facility

17.4 Operational requirements for database maintenance

The analysis database

• Past history of operation, including record of past
hardware upgrades

B. Choosing among different dedicated facilities
• Available hardware and software features, especially
those related to computing power, response time,
database maintenance, and construction of files for
data analysis
• Compatibility of programming languages with other
operating systems
• Past history of vendor in producing and servicing
small-scale dedicated computers
• Nature of details contained in manuals for operating
the facility
• Vendor method of providing updates to the system
and their costs

• Expertise of vendor sales and service personnel
• Level of access to vendor systems personnel for an
swering questions having to do with operation of
the system
• Cost and maintenance charges

• Level of satisfaction expressed by other research users
of the facility
• Charging policy for computer time, on-line data stor
age, printing, etc.

centers have charges for on-line data storage,
number of lines printed, tape or disk I/Os, etc.
Minor changes in the charging algorithm can
have major cost implications for the trial. Re
programming may be necessary to lessen their
impact.
A major issue in the development of any
system has to do with the amount of testing that
is done before programs are released for use in
the trial. Many flaws can be detected via the
reviews that are part of any good programming
effort. On-line testing should not be started until
there has been a successful “walk-through” of
the program. A number of test runs should be
made thereafter. The data sets used for this pur
pose should be typical of data likely to be col
lected as part of the trial. A number of different
data sets should be used to reflect a variety of
conditions.
Operating programs should be sufficiently
well documented to allow someone unfamiliar
with the programs to operate them. The need for
good documentation, although greatest in long
term trials because of the changes in program
ming personnel that can occur, is important for
all trials. Use of a structured programming language,
such as PL/I, can help in this process;
however, there is no substitute for the critical
review of others in testing the adequacy of the
documentation.
17.4 OPERATIONAL
REQUIREMENTS FOR
DATABASE MAINTENANCE

Data will be added to the analysis database in
blocks. Keyed data are usually stored in a tem
porary file until a defined data entry session has
been completed or until after the close of a de
fined time period. Thereafter, the resulting data
block is transmitted to the analysis database for
storage and subsequent manipulation. The up
date schedule will depend on the rate of data
flow and on how and where data are keyed. The
data center in the Coronary Artery Surgery
Study (CASS) gathered information keyed and
temporarily stored at the clinics, by polling
clinic workstations (usually at night) on a weekly
basis. The Coronary Drug Project (CDP) up
dated its main database about every two weeks
(Meinert et al., 1983).
The prime function of the updating process is
to link new data with that already in the analysis
file. This may be accomplished by physically
locating new data for a patient next to that
already on file for the patient or by use of directories
in which new data are added to the end of
the file without regard to location of other data
pertinent to a particular patient. The approach
used will be determined by the type of comput
ing hardware and software features available and
the cost of data retrieval under one structure
versus another.
The computer data file should be designed to
minimize the amount of sorting and hand pro
cessing preparatory to an update, as well as the
amount of computer time needed for the update.
Generally, files that are constructed for easy up
dating are not easy to use for data analysis.
Hence, it is usually necessary to reorganize them
preparatory to any analyses.
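
The directory approach, and the reorganization it requires before analysis, can be sketched in a few lines. This is an illustration under assumed conventions, not the file structure of any study named here: each record is a dictionary carrying `patient_id` and `visit`, and the class name `AppendOnlyFile` is hypothetical.

```python
from collections import defaultdict


class AppendOnlyFile:
    """Data file in which new records are appended in arrival order.

    A directory maps each patient ID to the positions of that patient's
    records, so updates need no sorting of the existing file.
    """

    def __init__(self):
        self.records = []
        self.directory = defaultdict(list)

    def update(self, block):
        """Add one block of newly keyed records to the end of the file."""
        for record in block:
            self.directory[record["patient_id"]].append(len(self.records))
            self.records.append(record)

    def records_for(self, patient_id):
        """Retrieve one patient's records through the directory."""
        return [self.records[i] for i in self.directory.get(patient_id, [])]

    def reorganize_for_analysis(self):
        """Return a copy ordered by patient and visit for analysis use."""
        return sorted(self.records, key=lambda r: (r["patient_id"], r["visit"]))


data_file = AppendOnlyFile()
data_file.update([
    {"patient_id": "002", "visit": 1, "sbp": 150},
    {"patient_id": "001", "visit": 1, "sbp": 142},
])
data_file.update([{"patient_id": "001", "visit": 2, "sbp": 138}])
analysis_view = data_file.reorganize_for_analysis()
```

Updating is cheap because nothing already on file is moved; the cost is deferred to `reorganize_for_analysis`, which mirrors the trade-off between ease of updating and ease of analysis.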
A crucial issue in the updating process has to
do with the disposition of data items that are still
in a state of flux because of outstanding edit
queries (see Chapter 16). Should such items be
added to the analysis database or should they be
excluded until the edit queries have been re
solved? The CDP analysis database excluded all
such data items. They were added to the file, on
an item-by-item basis, as they cleared the edit
process. In contrast, such items were included in the
Aspirin Myocardial Infarction Study (AMIS) analysis
database; however, items with outstanding edit
queries were flagged. The flags remained in place
until the edit queries were resolved and were
used to eliminate questionable data for certain
of the analyses performed.
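
The two ways of handling items with outstanding edit queries can be contrasted in a short sketch. It is an illustration only, not the code of either study: the `DataItem` class, its field names, and the helper functions are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    patient_id: str
    name: str
    value: object
    open_query: bool = False  # True while an edit query is unresolved


def exclude_until_resolved(items):
    """First approach: hold queried items out of the analysis file."""
    return [item for item in items if not item.open_query]


def include_with_flags(items):
    """Second approach: keep every item; the flag travels with it."""
    return list(items)


def for_analysis(analysis_file, drop_flagged):
    """Optionally eliminate flagged items for a particular analysis."""
    return [
        item for item in analysis_file
        if not (drop_flagged and item.open_query)
    ]


def resolve_query(item, corrected_value=None):
    """Clear the flag once the query is answered, correcting if needed."""
    if corrected_value is not None:
        item.value = corrected_value
    item.open_query = False


items = [
    DataItem("001", "sbp", 142),
    DataItem("001", "dbp", 300, open_query=True),  # implausible value queried
]
flagged_file = include_with_flags(items)
```

With flags, the decision to use or discard questionable data is made per analysis through `for_analysis`, whereas exclusion makes that decision once, at update time.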

17.5 DATA SECURITY
PRECAUTIONS
The database of the study must be safeguarded
against loss or unauthorized use (see the next
section and Sections 15.5 and 24.4 for comments
concerning storage of the original study rec
ords). Table 17-3 provides a list of the general
precautions and safeguards that should be taken
in any data operation (Part A), a list of safeguards
applicable to files containing patient identifying
information (Part B), general methods
for protecting data files against misuse (Part C),
and methods for protecting files against loss or
destruction (Part D).
It is the responsibility of the study leadership
to outline data security guidelines for the trial
and to make certain that they are followed. Staff
should be instructed as to their duties and re
sponsibilities regarding data safeguards before
they are allowed access to any study data. They
should be cautioned against the release of data
to anyone except authorized individuals and
then only through approved channels. All em
ployees concerned with data processing should
be given instructions regarding data security and
should be informed (perhaps via statements they
sign) of the types of disciplinary actions, including
immediate dismissal, that can be expected if
those safeguards are ignored or willfully vio
lated.
Several of the large-scale multicenter trials
(e.g., Aspirin Myocardial Infarction Study, Macular
Photocoagulation Study, and Persantine
Aspirin Reinfarction Study) have data systems
that preclude collection of any personal identi
fiers, such as patient name and address, at the
data center. The proscription provides a means
of eliminating any chance for breaches of patient
confidentiality in the data center (see Part B of
Table 17-3 for safeguards used when patient
identifying information is collected).
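
A data flow that keeps personal identifiers out of the data center can be sketched as a split performed at the clinic before transmission. The field names, the `study_id` key, and the function `split_record` are hypothetical and serve only to illustrate the idea.

```python
def split_record(record, identifying_fields, id_field="study_id"):
    """Separate identifying fields from the data sent to the data center.

    Returns two dictionaries, both keyed by the study ID: the first stays
    at the clinic, the second is transmitted.
    """
    kept_at_clinic = {id_field: record[id_field]}
    transmitted = {id_field: record[id_field]}
    for field, value in record.items():
        if field == id_field:
            continue
        if field in identifying_fields:
            kept_at_clinic[field] = value
        else:
            transmitted[field] = value
    return kept_at_clinic, transmitted


# Illustrative choice of identifying fields and a made-up record.
IDENTIFYING_FIELDS = {"name", "address"}
record = {"study_id": "0417", "name": "Jane Doe",
          "address": "12 Elm St", "sbp": 142}
kept_at_clinic, transmitted = split_record(record, IDENTIFYING_FIELDS)
```

Only `transmitted` leaves the clinic. The study ID links the two parts, so it must not be usable by people outside the study to identify a patient.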
The data center has a responsibility to protect
data in its custody against loss or destruction,
whether caused by mistakes, accidents, or pur
poseful acts. A good data center will have the
capability of regenerating the analysis database
via backup files. Ideally, the tapes or disks
containing these files should be stored in a building
remote from the one housing the main database.
At least two sets of backup tapes (or disks)
should be maintained so that one set can be held
in reserve while the other is used to restore the
system. The schedule for generation of updated
tapes or disks for backup purposes will be a
function of the rate at which new information is
added to the analysis database. (See Meinert
et al., 1983 for a description of the CDP backup
system.)
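
The two-set rule can be sketched as a simple rotation in which each new backup overwrites the older set, and a restoration reads the newer set while the other is held in reserve. This is a minimal illustration with in-memory copies standing in for tapes or disks; the class name `TwoSetBackup` and the set labels are hypothetical, and the sketch is not a description of the CDP system.

```python
import copy


class TwoSetBackup:
    """Alternate between two backup sets so one is always in reserve."""

    def __init__(self):
        self.sets = {"A": None, "B": None}
        self.oldest_first = ["A", "B"]

    def write_backup(self, database):
        """Overwrite the older set with a copy of the current database."""
        target = self.oldest_first.pop(0)
        self.sets[target] = copy.deepcopy(database)
        self.oldest_first.append(target)
        return target

    def restore(self):
        """Return (set used, sets held in reserve, restored copy)."""
        for name in reversed(self.oldest_first):
            if self.sets[name] is not None:
                reserve = [n for n in self.oldest_first if n != name]
                return name, reserve, copy.deepcopy(self.sets[name])
        raise RuntimeError("no backup set has been written yet")


backups = TwoSetBackup()
database = {"001": {"sbp": 142}}
first_target = backups.write_backup(database)
database["002"] = {"sbp": 130}
second_target = backups.write_backup(database)

database.clear()  # simulate loss of the main database
used, reserve, restored = backups.restore()
```

The restoration works from a copy of the newer set, so neither set is altered if the attempt fails partway and has to be repeated.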

Table 17-3

Precautions and safeguards for database operations

A. General precautions and safeguards

• Study leadership that is sensitive to needs for data
security
• Staff experienced in the operation of a database and
in protecting it against loss or misuse
• Signed assurance from each employee authorized to
work on the database, stating he understands the
safeguards and precautions to be followed and the
consequences of a willful disregard of them
• Periodic staff meetings to remind database personnel
of required operating procedures and safeguards
• Periodic review of required operating procedures and
established safeguards by study leaders

• Monitoring for adherence to precautions and safe
guards via periodic on-site checks
B. Patient confidentiality safeguards

records.

• Denial of access to any patient record stored in the
data center to persons outside the center without
the express written consent of the patient

C. Safeguards against misuse
• Limit the number of persons in the data center who
have access to the original study forms or any re
lated data file, especially those containing patient
identifying information
• Restrict access to the analysis computer files contain
ing study results through use of passwords or other
means

• Proscribe release of any data listing, tape, etc., without
approval of the study leadership committee

• File completed study forms, data tapes, and disks, in
an attended, locked area
D. Loss safeguards
• Maintain a duplicate file of the original study records
(e.g., by requiring clinics to maintain a copy of
forms and records sent to the data center)

• Separation of the file containing patient identifying
information from other files

• Microfilm original data forms, computer listings,
study manuals, meeting minutes, etc., for storage in
a secure location

• Proscription against use of patient name, name code,
hospital chart or record number, or other unique
identifiers, such as Social Security number, in any
published data listing. Study ID number should not
be published if it is possible for people outside the
study to use that number to identify a patient. Published
UGDP patient listings (University Group Diabetes
Program Research Group, 1970e, 1975,
1977, 1982) were devoid of both clinic and patient
ID number for this reason.

The clinic should retain a copy of all data forms
and related records generated in the trial until all
essential work, including final analysis of the
results, has been completed. This file may be the
only hard copy of study records that exists. This
will be the case in single-center trials without
data centers and in multicenter trials with distributed data entry (see Section 16.2). Generally,
a second paper file is needed if data entry is done
outside the clinic, especially in multicenter trials.
The file used for data entry should be considered
the official file of the study and should contain
the original copy of all paper forms and related

from aborted runs that contain patient identifying
information

• Data flow procedures from the clinic to the data cen
ter that exclude transmission of patient identifying
information
• Electronic storage of patient identifying information
in enciphered form or in a separate file

• Physical separation of pages containing personal iden
tifying information from other pages of the data
forms (especially if forms contain highly sensitive
information)
• Proscription against distribution of data listings that
contain patient name, name code, or any other iden
tifiers easily associated with a specific patient

17.6 FILING AND STORING THE
ORIGINAL STUDY RECORDS

• Establish and maintain a series of backup tapes (or
disks) for the analysis database that will allow resto
ration of it in the event of a system malfunction

• Store copies of backup tapes (or disks) of the main
analysis database in an off-site location or in an on
site fireproof vault
• Establish strict rules to safeguard access to backup
tapes (or disks) to avoid unauthorized use in resto
ration efforts
• Provide backup tapes (or disks) of all essential programs,
such as those needed for editing, inventorying,
storage, retrieval, and analysis of the study
data, as well as programs used for the operating
systems

• Secure procedures for disposing of computer output

• Carry out occasional “fire drills" to test the ability of
the staff in the data center to restore the main
analysis database from backup tapes (or disks)

Decisions must be made as to where to house
records that cannot be easily or reliably reproduced,
such as X rays. Records that are needed
for patient care should remain in the clinic or be
returned to it as soon as they are read and the
information from them has been codified and
keyed. Some records, such as ECG tracings, can
be “duplicated” by making a second tracing
when the patient is examined. However, this
option does not exist if the “duplication” entails
added risks for the patient (e.g., as with X rays).

Both the official and backup paper files
should be stored in locked cabinets in a secure
area. The files should be checked periodically to
make certain needed updates are made and that
they do not become cluttered with superfluous
materials.
The organization of the file will depend on
where the file resides and how it is to be used.
Those housed in the clinic will almost certainly
be organized along patient lines. Those housed
at the data center may be organized in other
ways. For example, the CDP Coordinating Center
found it convenient to arrange paper records
by form type and by edit period (i.e., time period
in which the forms were received). This ordering
was more efficient than an arrangement by patient
ID number and visit because of the data
entry and editing process used by the center.
Data forms and related records stored at the
clinic and data center may be retained in their
original state or on microfilm. If microfilm is
used, the original records should be retained
until microfilm images have been checked for
legibility and proper identification. Destruction
of study forms and related records should be in
accordance with local statutes for medical records.
Data forms, medical records, computer
listings, or microfilm images that contain patient
identifying information should be burned or
shredded. They should not be moved to the disposal
site unless they can be destroyed upon
receipt.
General National Institutes of Health guidelines
require investigators to retain raw study
documents (or microfilm copies of them) for a
minimum of two to three years after expiration
of funding (Department of Health, Education
and Welfare, 1976; Department of Health and
Human Services, 1981, 1982b). Requirements
may extend beyond these limits in any case
where there are legal challenges to the study, or
where the results are under review by some official
government agency. Prudent investigators
will retain study records well beyond the required
legal limit for scientific reasons alone.

17.7 PREPARATION OF
ANALYSIS TAPES

Most data analyses will be done from a tape or
disk created from the analysis database. There
are several reasons for doing so, especially for
interim analyses done for performance or safety
monitoring (see Section 16.7 and Chapter 20).
The principal ones are:

• To allow database maintenance personnel to
continue making updates to the database
without altering the analysis database
• To reduce the number of times the database is
accessed for data analyses (in order to minimize
the chances of programmer errors)
• To enable analysis personnel to rearrange
data, including application of data reduction
and special coding routines, in order
to create a file that is more compact and
suitably arranged for use with data analysis
programs

Theoretically, the updating process could be terminated
while data analyses are being done.
However, termination of the updating process is
not always practical, particularly when data analyses
take weeks to carry out, as may be the case
when preparing complex reports for patient
safety monitoring (see Chapter 20). In any case,
the interruption of data flow into the database
complicates management of the updating process
and reduces the usefulness of edits carried
out in conjunction with the updating process.
It is wise to decide on a target date for generation
of the analysis tape. The date chosen should
correspond to the last major update or change to
the analysis database or to some other event in
the trial, such as close-out of follow-up or termination
of a treatment. The format of the analysis
tape or disk requires careful thought. Organization
of data may be quite different from that of
the analysis database. A decision must be made
as to whether to array data by patient or by
variable. Thought is also needed regarding the
degree to which data are to be reduced as they
are written onto the analysis tape or disk. Verbatim
listings from the analysis database will provide
the analyst with the greatest amount of
flexibility, but they are also more complicated to
use. Generally, some reduction will be necessary,
such as combining codes to reduce the number of
categories and averaging aliquot determinations
or repeat readings.
A decision is also needed regarding the
amount of editing to be done on data written
onto the analysis tape (or disk). Outlier values or
values known to be in error should be identified
when the tape is written to keep the analyst from
having to perform these checks each time a variable
is used.
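The reduction and editing steps described above can be illustrated with a small sketch. It takes a frozen copy of some database rows, averages repeat readings, collapses detailed codes into fewer categories, and flags outliers once, at the time the analysis file is written. The record layout, the field names (`glucose_readings`, `symptom_code`), the recoding map, and the plausibility limits are all hypothetical and chosen only for illustration; they are not taken from any of the trials discussed here.

```python
from statistics import mean

# Hypothetical plausibility limits for fasting blood glucose (mg/100 ml).
GLUCOSE_LIMITS = (40.0, 500.0)

# Hypothetical recoding that collapses five detailed codes into three categories.
SYMPTOM_GROUPS = {0: "none", 1: "mild", 2: "mild", 3: "severe", 4: "severe"}


def reduce_record(record):
    """Return one reduced analysis-file row for a raw database record."""
    readings = record["glucose_readings"]
    glucose = mean(readings) if readings else None  # average repeat readings
    low, high = GLUCOSE_LIMITS
    return {
        "patient_id": record["patient_id"],
        "visit": record["visit"],
        "glucose": glucose,
        # Flag the value rather than delete it, so the analyst decides what to do.
        "glucose_outlier": glucose is not None and not (low <= glucose <= high),
        "symptom_group": SYMPTOM_GROUPS[record["symptom_code"]],
    }


def build_analysis_file(database_rows):
    """Write a reduced copy of the rows, arrayed by patient and then by visit."""
    rows = [reduce_record(r) for r in database_rows]
    return sorted(rows, key=lambda r: (r["patient_id"], r["visit"]))


# Hypothetical database extract taken on the target date.
database = [
    {"patient_id": 2, "visit": 1, "glucose_readings": [640.0, 660.0], "symptom_code": 4},
    {"patient_id": 1, "visit": 1, "glucose_readings": [110.0, 114.0], "symptom_code": 1},
]
analysis_file = build_analysis_file(database)
```

Because the outlier flag is computed once when the file is written, later analyses can simply filter on `glucose_outlier` instead of repeating the check for every use of the variable.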

18. Data analysis requirements and procedures

Another difficulty about statistics is the technical difficulty of calculation.
Before you can even make a mistake in drawing your conclusion from the
correlations established by your statistics you must ascertain the correlations.
George Bernard Shaw

18.1 Basic analysis requirements
18.2 Basic analytic methods
18.2.1 Simple comparisons of proportions
18.2.2 Lifetable analyses
18.2.3 Other descriptive methods
18.3 Adjustment procedures
18.3.1 Subgrouping
18.3.2 Multiple regression
18.4 Comment on significance estimation
Table 18-1 Examples of analysis ground rule violations
Table 18-2 Percentages of UGDP patients with indicated baseline characteristics
Table 18-3 Percentages of PARIS patients with indicated complaint during follow-up
Table 18-4 Hypothetical trial involving comparison of percentage of patients dead at indicated time points
Table 18-5 Lifetable cumulative mortality rates for the placebo and tolbutamide treatments in the UGDP, as of October 7, 1969
Table 18-6 Log rank test for comparing lifetables in Table 18-5
Table 18-7 Percentage distribution of UGDP patients by level of treatment adherence
Table 18-8 Percentage of patients dead within specified subgroups created using selected baseline characteristics
Table 18-9 Observed and adjusted tolbutamide-placebo difference in percent of patients dead
Figure 18-1 Number of deaths in the UGDP through October 7, 1969, by treatment group
Figure 18-2 Plot of observed ESG1-placebo difference in percent of CDP patients dead from lung cancer
Figure 18-3 UGDP cumulative lifetable mortality rates by year of follow-up and by treatment assignment
Figure 18-4 CDP dropout rates as a function of length of follow-up and treatment assignment
Figure 18-5 CDP lifetable plot of the DT4-placebo mortality differences and 2.0 standard error limits for the differences
Figure 18-6 Percent change in fasting blood glucose levels for cohorts of patients followed through the nineteenth follow-up visit

18.1 BASIC ANALYSIS REQUIREMENTS

The essence of a trial emanates from comparisons
of the treatment groups for differences in
outcome. Those comparisons should be made
following ground rules listed below.
Ground rule number 1 Patients used in treatment
comparisons should be counted in the
treatment group to which they were assigned.
Ground rule number 2 The denominator for
a treatment should be all patients assigned
to that treatment.
Ground rule number 3 All events should be
counted in the comparison of primary interest.
Clearly, there are situations in which the first
rule is followed, but the second is violated (e.g.,
certain patients are excluded from analyses because
their treatment was not in “accordance”
with the study protocol). The third rule is an
admonition against analyses in which investigators
elect to present results only for events believed
to be related to the disease process under
treatment (e.g., cardiovascular deaths in a heart
study). See Table 18-1.

Table 18-1 Examples of analysis ground rule violations

Violation: Counting only a portion of the events observed
Example: Carrying out the primary analysis for cause-specific
mortality, ignoring all-cause mortality

Violation: Counting only those events that occur after a specified
period of treatment
Example: Restricting the database for the primary analysis to
30-day postsurgical deaths in a surgery trial, or by
ignoring deaths that occur within a specified time
period after the initiation of treatment in a drug trial

Violation: Using only those patients who received their assigned
treatment or who had perfect (or suitably high) adherence
to the assigned treatment
Example: Exclusion of patients from the database who did not
receive the “full” course of treatment in a drug trial

Violation: Allowing the treatment actually administered to determine
the group in which a patient is counted
Example: Counting a patient allocated to control treatment as
a member of test-treated group because he received
the test treatment

Violation: Using only “evaluable” patients
Example: A cancer trial that ignores results for patients who
failed to develop tumors of a certain size

Violation: Exclusion of ineligible patients enrolled in the trial
Example: Elimination of patients who were judged ineligible
after enrollment by personnel who were aware of
treatment assignment and course of treatment

An unsophisticated investigator can be expected
to rebel at the notion of using data from
patients who refused the assigned treatment or
who were not treated in accordance with the
study protocol for making treatment comparisons.
One temptation is to ignore such patients
and to proceed with analyses as if they were
never enrolled, a violation of the second
ground rule. The only clue offered to readers to
indicate that this was done may be a single telltale
sentence, such as “The analyses in this paper
have been restricted to evaluable patients.” Of
equal concern are cases where data from all
patients are used, but where the primary analysis
is done by the treatment administered rather
than by the one assigned, a violation of the
first ground rule. The main reason for randomizing
in the first place, as noted in Chapter 8, has
to do with the desirability of establishing treatment
groups that are free of patient and physician
selection bias. There is no assurance in this
regard if patients are arbitrarily excluded from
consideration after randomization.
Even if investigators accept the need for analyses
based on the first two ground rules, they
may willfully violate the third one. Counting
rules that call for exclusion of certain events are,
at best, difficult to defend because of their arbitrary
nature. Further, their use can open the
study to serious criticism. The Anturane Reinfarction
Trial (ART) is a case in point. The
published report from the trial drew criticism
because of the failure of study investigators to
count deaths occurring within 7 days of the initiation
of treatment (Anturane Reinfarction Trial
Research Group, 1978; Temple and Pledger,
1980). These exclusions made it difficult to interpret
the mortality results. The concern of critics
stemmed from uncertainty regarding the validity
of the assumption underlying the exclusions
(i.e., that deaths occurring in this time period
were not treatment-related) and the apparent
post hoc nature of the 7-day rule. Clearly, rules
for exclusions devised after the start of data
collection must be viewed with skepticism. The
same is true for any exclusion rule, regardless of
when it was written, which is administered by
personnel who have access to patient treatment
assignments, especially if subjective judgments
are required in administering the rule.
Adherence to the above ground rules can lead
to an underestimate of the true treatment effect,
especially if treatment compliance is low, there
are a lot of treatment crossovers (see Glossary
for definition), or the denominators for the treatment
groups include a lot of patients who could
not be followed for the outcome of interest. The
latter should not be a problem in trials using
mortality as the outcome (see Chapter 15), but
can be in trials with a nonfatal event or a laboratory
or physiological measure as the outcome. A
prudent investigator will carry out supplemental
analyses aimed at quantifying the degree of conservatism
implied. Certainly, there is no proscription
against such analyses so long as they
are accompanied by the primary ones suggested
above. They may include analyses by level of
treatment adherence and for a number of secondary
outcomes as well.
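The effect of the counting rules can be seen in a small sketch. The same set of patient records is tabulated three ways: by assigned treatment (the primary analysis called for by the ground rules), by treatment actually received, and for “evaluable” patients only. The 20 patient records are entirely hypothetical and were constructed so that the three tabulations disagree; the function and field names are illustrative.

```python
def make_patients(assigned, received, died, count):
    """Build `count` identical hypothetical patient records."""
    return [{"assigned": assigned, "received": received, "died": died}
            for _ in range(count)]


# Hypothetical trial with 10 patients assigned to each treatment.
patients = (
    make_patients("test", "test", False, 6)
    + make_patients("test", "test", True, 1)
    + make_patients("test", "none", True, 2)       # refused treatment, died
    + make_patients("test", "none", False, 1)      # refused treatment, alive
    + make_patients("control", "control", False, 6)
    + make_patients("control", "control", True, 2)
    + make_patients("control", "test", False, 2)   # crossed over to test
)


def percent_dead(records, group_of):
    """Percent dead per group; group_of returns None to drop a patient."""
    totals, deaths = {}, {}
    for rec in records:
        group = group_of(rec)
        if group is None:
            continue
        totals[group] = totals.get(group, 0) + 1
        deaths[group] = deaths.get(group, 0) + int(rec["died"])
    return {g: round(100.0 * deaths[g] / totals[g], 1) for g in totals}


def by_assignment(rec):
    # Ground rules 1 and 2: count every patient in the assigned group.
    return rec["assigned"]


def by_treatment_received(rec):
    # Violates ground rule 1 (and drops untreated patients).
    return rec["received"] if rec["received"] != "none" else None


def evaluable_only(rec):
    # Violates ground rule 2: keeps only patients treated as assigned.
    return rec["assigned"] if rec["received"] == rec["assigned"] else None


primary = percent_dead(patients, by_assignment)
as_treated = percent_dead(patients, by_treatment_received)
evaluable = percent_dead(patients, evaluable_only)
```

With these made-up records the primary analysis shows 30.0% dead on the test treatment against 20.0% on control, while both alternative tabulations make the test treatment look better than control (11.1% and 14.3% against 25.0%). The reversal comes entirely from which patients are counted where.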

18.2 BASIC ANALYTIC METHODS

This section provides a review of analytic methods
used for making treatment comparisons in
trials with a clinical event as the primary outcome.
Readers may consult textbooks such as
those by Armitage (1971), Brown and Hollander
(1977), Bulpitt (1983), Buyse et al. (1984),
Elandt-Johnson and Johnson (1980), Fleiss
(1981), Ingelfinger et al. (1983), Kalbfleisch and
Prentice (1980), Lee (1980), Pocock (1983), Shapiro
and Louis (1983), and Tygstrup et al.
(1982), and papers by Cutler and Ederer (1958),
Kaplan and Meier (1958), Mantel and Haenszel
(1959), Mantel (1966), and Peto et al. (1976,
1977), among others, for additional details.

18.2.1 Simple comparisons of proportions

The simplest and often most useful analysis involves
a comparison of the proportion of patients
in the two treatment groups who have
experienced the event of interest. This method of
analysis is valid so long as:

• Patients in the treatment groups were enrolled
over the same time period and are
subject to the same intensity of follow-up
• The loss to follow-up is low and is the same
across treatment groups
• The treatment groups have comparable baseline
characteristics

Outcome analyses based on comparisons of proportions
appear throughout publications of the
trials sketched in Appendix B. Figure 18-1 is
based on UGDP mortality data reported in a
1970 publication on tolbutamide (University
Group Diabetes Program Research Group,
1970e).

[Figure 18-1. Number of deaths in the UGDP through October 7, 1969, by treatment group. Panels: (a) All Causes; (b) Cardiovascular Causes. Bars: PLBO (N = 205), TOLB (N = 204), ISTD (N = 210), IVAR (N = 204).]
Note: p values recorded above the bars are based on chi-square tests for the indicated drug-placebo comparison. The numbers
of patients in the treatment groups are indicated below the bars.
Source: Reference citation 468. Adapted with permission of the American Diabetes Association, Inc., New York.

This method of analysis, while best suited to
binary data, need not be limited to such data if
investigators are willing to convert a polychotomous
or continuous outcome measure to binary
form, as in the National Cooperative Gallstone
Study (NCGS). Investigators in that study
chose to categorize gallstone dissolution data as
an all-or-none phenomenon for the primary analysis,
even though the underlying measure was
continuous (National Cooperative Gallstone
Study Group, 1981a). Investigators in the Macular
Photocoagulation Study (MPS) used a binary
outcome (based on a comparison of baseline
and follow-up visual acuity readings) instead
of mean change in visual acuity as the principal
outcome measure (Macular Photocoagulation
Study Group, 1982, 1983a, 1983b).
Furthermore, use of this mode of summary is
not limited to outcome measures. It is useful in
characterizing differences in the baseline composition
of treatment groups and for comparisons
of various kinds of follow-up data as well. Table
18-2 is an example of a comparison of the distribution
of selected baseline variables that have
been converted to binary form (University
Group Diabetes Program Research Group,
1970e). Table 18-3 illustrates use of proportions
in summarizing follow-up data on observed side
effects (Persantine Aspirin Reinfarction Study
Research Group, 1980b).

Table 18-2 Percentages of UGDP patients with indicated baseline characteristics (denominators given in parentheses)

Baseline characteristic                  PLBO         TOLB         p-value*
Age at entry >55                         41.5 (205)   48.0 (204)   0.18
Male                                     30.7 (205)   30.9 (204)   0.97
Nonwhite                                 49.8 (205)   47.1 (204)   0.59
Fasting blood glucose >110 mg/100 ml     63.5 (203)   72.1 (204)   0.07
Relative body weight >1.25               52.7 (205)   58.8 (204)   0.21
Visual acuity (either eye <20/200)        4.3 (188)    5.2 (192)   0.66

Source: Reference citation 468. Adapted with permission of the American Diabetes Association, Inc., New York.
*Probability of chi-square value as large as or larger than the one observed under the null hypothesis.

Table 18-3 Percentages of PARIS patients with indicated complaint during follow-up

                 Treatment group
Complaint        PR/A     PLBO     Z-value
Stomach pain     15.8      7.7     3.74
Heartburn         9.6      5.2     2.58
Vomiting          2.5      1.0     1.59
Denominator       810      406

Source: Reference citation 376. Adapted with permission of the American Heart Association, Inc., Dallas, Texas.

Table 18-4 Hypothetical trial involving comparison of percentage of patients dead at indicated time points*

                      Cumulative number of          Cumulative percent dead
Calendar time         patients enrolled
from start of trial   Treatment A   Treatment B     Treatment A   Treatment B
1 year                100           100             10.0           2.0
2 years               200           200             13.6           3.5
3 years               300           300             16.2           6.5
4 years               300           300             21.0          12.3
5 years               300           300             24.2          19.4
6 years               300           300             26.7          25.8
7 years               300           300             28.9          31.7
8 years               300           300             31.1          37.2
9 years               300           300             33.1          42.2

*Percentages calculated assuming annual mortality rates (per 100 population) of 10, 8, 5, 4, 3, 3, 3, 3, and 3 for years 1
through 9 of follow-up, respectively, for treatment group A and 2, 3, 8, 8, 8, 8, 8, 8, and 8 for treatment group B. Enrollment assumed to
have taken place on the first day of years 1, 2, and 3.

Statistical evaluation of the difference observed
via a comparison of proportions can be
performed using Fisher’s exact test (Fisher,
1946; see also Chapter 9). The p-value for the
test corresponds to the probability of obtaining a
test-control difference as large or larger than the
one observed under the null hypothesis of no
difference. The p-value may be obtained using
packaged computer programs for the test or from
tables, such as those constructed by Lieberman
and Owen (1961).
The continuity corrected chi-square approximation
to the test can be used if the numerators
for the two percentages being compared are both
> 5 and the denominators are > 30. The p-values
obtained in such cases are indistinguishable from
those obtained with Fisher’s exact test. In fact,
the approximation is reasonably good even if
denominators are as small as 20 (Cochran,
1954).
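As a rough illustration of the tests described in Section 18.2.1, the sketch below computes a chi-square statistic (with and without the continuity correction) and a two-sided Fisher exact p-value for a 2 × 2 table, using only the Python standard library. The counts are the UGDP tolbutamide and placebo deaths quoted elsewhere in this chapter (30 of 204 and 21 of 205). The function names are illustrative, and the two-sided Fisher p-value uses one common convention (summing all tables no more probable than the observed one), which is an assumption rather than the specific procedure used by the UGDP investigators.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d, continuity=True):
    """Chi-square statistic (1 df) and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        diff = max(diff - n / 2.0, 0.0)  # Yates continuity correction
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2.0))  # upper tail of chi-square with 1 df
    return stat, p_value


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value: sum of tables no more likely than observed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


# Rows: tolbutamide, placebo; columns: dead, alive.
dead_tolb, alive_tolb = 30, 204 - 30
dead_plbo, alive_plbo = 21, 205 - 21

uncorrected = chi_square_2x2(dead_tolb, alive_tolb, dead_plbo, alive_plbo, continuity=False)
corrected = chi_square_2x2(dead_tolb, alive_tolb, dead_plbo, alive_plbo, continuity=True)
fisher_p = fisher_exact_two_sided(dead_tolb, alive_tolb, dead_plbo, alive_plbo)
```

The continuity correction always lowers the statistic and so raises the p-value relative to the uncorrected test, which is why it tracks the exact test more closely when the counts are modest.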

18.2.2 Lifetable analyses

The typical trial involves patient recruitment
over an extended period of time and follow-up
through a common calendar time point. Hence,
any analysis done during or at the end of the
trial will involve patients with varying lengths of
follow-up, depending on when they were enrolled.
Simple counts of events, such as shown in
Figure 18-1, are not designed to take account of
follow-up time and hence are insensitive to the
way events accumulate over time. The cumulative
proportion of patients experiencing events
can be the same even though there are marked
differences between the treatment groups as to
when events occur over the course of follow-up,
as illustrated in Table 18-4 for a hypothetical
trial. Note that comparisons of the percent dead
based on tabulations done at the end of calendar
year 6 or before favor treatment B. Those done
at the end of calendar year 7 and thereafter favor
treatment A.
One way of tracking changes over time via
proportions is illustrated in Figure 18-2. This
method of analysis, while useful for safety monitoring
(see Chapter 20), does not give a means of
characterizing the treatment groups with regard
to the rate of occurrence of events. Rate calculations
are ordinarily made using lifetable methods
(such as described by Elandt-Johnson and
Johnson, 1980; Kalbfleisch and Prentice, 1980;
Lee, 1980), as illustrated in Figures 18-3 for the
UGDP and 18-4 for the CDP. Other examples
may be found in publications from the Aspirin
Myocardial Infarction Study (AMIS), Hypertension
Detection and Follow-Up Program
(HDFP), Multiple Risk Factor Intervention
Trial (MRFIT), and PARIS (see Appendix B for
references).

[Figure 18-2. Plot of observed ESG1-placebo difference in percent of CDP patients dead from lung cancer. Horizontal axis: Month of Study (Calendar Time).]
Note: Z values plotted are for observed ESG1-placebo differences in proportions of deaths from lung cancer. Dotted lines
denote Z values corresponding to 0.05 level of significance taking into consideration there were repeated evaluations
of the data for treatment differences over the course of the trial.
Source: Reference citation 105. Adapted with permission of the American Medical Association, Chicago, Ill.
(copyright © 1973).

[Figure 18-3. UGDP cumulative lifetable mortality rates by year of follow-up and by treatment assignment. Curves: TOLB, IVAR, ISTD, PLBO. Horizontal axis: Years of Follow-up.]
Source: Reference citation 468. Reproduced with permission of the
American Diabetes Association, Inc., New York.

The main advantage of the lifetable approach
is that it provides a means of dealing with varying
lengths of follow-up, as illustrated in Table
18-5. The cut-off date for the analysis was October
7, 1969. All patients by that time had been
under follow-up for a minimum of 3 years,
8 months and a maximum of 8 years, 8 months.
Hence, the only attrition during the first 3 years
of follow-up was that due to death. Thereafter, it
was due to both deaths and withdrawals because
of when patients were enrolled. For example,
there were five patients in the tolbutamide-treated
group who were enrolled after October 7,
1965, and who were still alive on October 7,
1969. They were counted as withdrawals during
the fourth year of follow-up since they had not
been in the study long enough to have completed
the fourth year of follow-up.
Statistical comparisons of lifetable rates may
be done using confidence estimation or log rank
tests. The plot of lifetable rates reproduced in
Figure 18-5 uses two standard error limits (i.e.,
approximate 95% confidence intervals) about
the line of no difference to assess the statistical
importance of the DT4-placebo mortality difference.
The log rank test summarized in Table
18-6 is for data given in Table 18-5. (See Mantel
and Haenszel, 1959, Mantel, 1966, and Peto
et al., 1977, for general details regarding the test.)
Ideally, the calculations should be based on
exact time to death, rather than on grouped
data, as given in Table 18-5. However, the difference
between the two methods of calculation will
be small provided the deaths are uniformly distributed
within the intervals and that they are
not concentrated in just one or two of the intervals.
The difference in this example is trivial. Use
of exact time to death yielded a log rank test
value of 1.82 as contrasted with a value of 1.78 for
grouped data.
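The hypothetical trial of Table 18-4, used at the start of this section to show why simple proportions can mislead, can be regenerated with a few lines of code. The sketch below assumes three cohorts of 100 patients per treatment enrolled on the first day of years 1, 2, and 3, with annual mortality rates (per 100) by year of follow-up of 10, 8, 5, 4, 3, 3, 3, 3, 3 for treatment A and 2, 3, 8, 8, 8, 8, 8, 8, 8 for treatment B. These rates reproduce the percentages printed in the table; the function and variable names are illustrative.

```python
def cumulative_percent_dead(annual_rates, enrollment_years=(1, 2, 3), cohort_size=100):
    """Percent of all enrolled patients dead at the end of each calendar year."""
    # surviving[k] = fraction of a cohort still alive after k years of follow-up
    surviving = [1.0]
    for rate in annual_rates:
        surviving.append(surviving[-1] * (1.0 - rate / 100.0))

    results = []
    for year in range(1, len(annual_rates) + 1):
        enrolled = dead = 0.0
        for start in enrollment_years:
            if start <= year:
                followup = year - start + 1  # years this cohort has been followed
                enrolled += cohort_size
                dead += cohort_size * (1.0 - surviving[followup])
        results.append(round(100.0 * dead / enrolled, 1))
    return results


percent_a = cumulative_percent_dead([10, 8, 5, 4, 3, 3, 3, 3, 3])
percent_b = cumulative_percent_dead([2, 3, 8, 8, 8, 8, 8, 8, 8])

# First calendar year in which the simple comparison favors treatment A.
first_year_favoring_a = next(
    year for year, (a, b) in enumerate(zip(percent_a, percent_b), start=1) if a < b
)
```

The direction of the simple comparison flips between calendar years 6 and 7 even though neither treatment's underlying mortality pattern changes, which is the point the table is making: a single tabulation of percent dead depends heavily on when it is made.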

[Figure 18-4. Lifetable cumulative dropout rates for the clofibrate, niacin, and placebo treatments in the CDP. Curves: Niacin, Placebo, Clofibrate. Horizontal axis: Year of Follow-up. N = 4,706, 4,421, 4,167, 3,911, 3,651, 2,336, and 729 for years 1 through 7.]
Note: N denotes total number of patients in clofibrate, niacin, and placebo groups combined. Approximate numbers
for individual treatment groups are 2/9, 2/9, and 5/9 times N for clofibrate, niacin, and placebo, respectively.
Source: Reference citation 107. Adapted with permission of the American Medical Association, Chicago, Ill.
(copyright © 1975).

[Table 18-5. Lifetable cumulative mortality rates for the placebo and tolbutamide treatments in the UGDP, as of October 7, 1969.]

[Figure 18-5. CDP lifetable plot of the DT4-placebo mortality differences and 2.0 standard error limits for the differences. Horizontal axis: Months of Follow-up.]
Source: Reference citation 103. Adapted with permission of the American Medical Association, Chicago, Ill.
(copyright © 1972).

[Figure 18-6. Percent change in fasting blood glucose levels for cohorts of patients followed through the nineteenth follow-up visit. Curves: PLBO (N = 106), TOLB (N = 102), ISTD (N = 116), IVAR (N = 110). Horizontal axis: Follow-up Exam.]
Source: Reference citation 468. Reproduced with permission of the American Diabetes Association, Inc., New York.

18.2.3 Other descriptive methods

Any comparison of outcome by treatment group
should be accompanied by other analyses to help
in interpretation of the results of the trial. Tables
18-2 and 18-7 and Figure 18-6 provide examples
of supporting analyses, as taken from the
UGDP (University Group Diabetes Program Research
Group, 1970e). The results in Table 18-2
are useful for assessing the baseline comparability
of the treatment groups. Table 18-7 was used
to characterize differences among treatment
groups with regard to treatment adherence. Figure
18-6 provides a plot of changes in fasting
blood glucose levels for the cohort of patients
followed through 4.75 years (i.e., through 19
follow-up examinations). Only patients who remained
under active follow-up over this time
period were included in the analysis. A plot of
means, based on the number of patients observed
at each follow-up examination, might
have been used instead. However, the two forms
of analyses are not necessarily interchangeable.
They will yield different results if the composition
of the study population enrolled, with regard
to the variable of interest, changed over the
course of patient recruitment.
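The difference between the two forms of analysis can be seen with a toy example. In the sketch below, early enrollees have four follow-up examinations and late enrollees only two, and the two sets of patients differ in their response. All values are hypothetical and chosen only to make the contrast visible; the function names are illustrative.

```python
from statistics import mean

# Hypothetical percent change in a measurement at successive follow-up exams.
# Early enrollees have four exams by the cut-off date; late enrollees have two.
series_by_patient = {
    "early-1": [-10.0, -12.0, -14.0, -16.0],
    "early-2": [-12.0, -14.0, -16.0, -18.0],
    "late-1": [-2.0, -4.0],
    "late-2": [-4.0, -6.0],
}


def cohort_means(series, n_exams):
    """Means at each exam for the fixed cohort followed through n_exams."""
    cohort = [values for values in series.values() if len(values) >= n_exams]
    return [mean(values[exam] for values in cohort) for exam in range(n_exams)]


def available_means(series, n_exams):
    """Means at each exam using every patient observed at that exam."""
    return [
        mean(values[exam] for values in series.values() if len(values) > exam)
        for exam in range(n_exams)
    ]


fixed_cohort = cohort_means(series_by_patient, 4)
all_observed = available_means(series_by_patient, 4)
```

The fixed cohort shows a steady decline of two points per exam, whereas the plot based on all observed patients drops six points between the second and third exams. That jump reflects only the disappearance of the late enrollees from the average, not any change in individual patients.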

18.3 ADJUSTMENT PROCEDURES

To be valid, the evaluation of treatment effects
must be performed on treatment groups that are
comparable with regard to baseline characteristics.
Usually, the comparability provided by randomization
is adequate. However, randomization
does not guarantee comparability. As noted
in Chapter 10, stratification can be used to assure
comparability for a few variables, but the
distribution with regard to others must be left to
chance. As a result, there can be minor, and
sometimes even major, differences in the baseline
composition of the study groups. The impact
of such differences on treatment comparisons
should be removed using procedures such
as those outlined below.

Table 18-7 Percentage distribution of UGDP patients by level of treatment adherence

                        Treatment group
Level of adherence*     PLBO     TOLB     ISTD     IVAR
Low                     10.2     10.3     12.8     14.8
Intermediate            20.0     15.7     29.6     39.9
High                    69.8     74.0     57.6     45.3
Number of patients      205      204      210      204

Source: Reference citation 468. Reproduced with permission of the
American Diabetes Association, Inc., New York.
*Defined as follows:
Low: Patient took all of prescribed study medication <25% of
all follow-up periods
Intermediate: Patient took all of prescribed study medication
25-74% of all follow-up periods
High: Patient took all of prescribed study medication >75% of
all follow-up periods

Table 18-6 Log rank test for comparing lifetables in Table 18-5

Year of      Number starting interval    Observed deaths       Expected deaths
follow-up    PLBO    TOLB    Total       PLBO  TOLB  Total     PLBO    TOLB    Total
0-1          205.0   204.0   409.0        0     0     0         0.00    0.00    0.00
1-2          205.0   204.0   409.0        5     5    10         5.01    4.99   10.00
2-3          200.0   199.0   399.0        4     5     9         4.51    4.49    9.00
3-4          194.0   191.5   385.5        4     5     9         4.53    4.47    9.00
4-5          176.5   172.0   348.5        4     5     9         4.56    4.44    9.00
5-6          139.5   134.5   274.0        3     4     7         3.56    3.44    7.00
6-7           90.0    86.5   176.5        1     5     6         3.06    2.94    6.00
7-8           46.0    41.5    87.5        0     1     1         0.53    0.47    1.00
Total                                    21    30    51        25.76   25.24   51.00

Source: Reference citation 441.
Log rank $\chi^2 = (21 - 25.76)^2/25.76 + (30 - 25.24)^2/25.24 = 1.78$, p-value = 0.18
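The grouped-data calculations summarized in Table 18-6 can be checked with a short sketch. It takes the interval counts as read from that table, computes the expected deaths for each group under the null hypothesis, forms the log rank statistic, and also derives lifetable cumulative mortality rates. Treating the "number starting interval" entries as effective numbers at risk (already adjusted for withdrawals) is an assumption made here for illustration; the function names are likewise illustrative.

```python
from math import erfc, sqrt

# Interval counts as read from Table 18-6 (follow-up years 0-1 through 7-8).
at_risk = {
    "PLBO": [205.0, 205.0, 200.0, 194.0, 176.5, 139.5, 90.0, 46.0],
    "TOLB": [204.0, 204.0, 199.0, 191.5, 172.0, 134.5, 86.5, 41.5],
}
deaths = {
    "PLBO": [0, 5, 4, 4, 4, 3, 1, 0],
    "TOLB": [0, 5, 5, 5, 5, 4, 5, 1],
}


def cumulative_mortality(n_at_risk, n_deaths):
    """Lifetable cumulative percent dead at the end of each interval."""
    surviving, rates = 1.0, []
    for n, d in zip(n_at_risk, n_deaths):
        surviving *= 1.0 - d / n  # probability of surviving this interval
        rates.append(100.0 * (1.0 - surviving))
    return rates


def log_rank(at_risk, deaths, group_a, group_b):
    """Expected deaths, log rank statistic, and p-value for grouped data."""
    observed_a, observed_b = sum(deaths[group_a]), sum(deaths[group_b])
    expected_a = 0.0
    for n_a, n_b, d_a, d_b in zip(at_risk[group_a], at_risk[group_b],
                                  deaths[group_a], deaths[group_b]):
        # Deaths in the interval are shared in proportion to numbers at risk.
        expected_a += (d_a + d_b) * n_a / (n_a + n_b)
    expected_b = observed_a + observed_b - expected_a
    statistic = ((observed_a - expected_a) ** 2 / expected_a
                 + (observed_b - expected_b) ** 2 / expected_b)
    p_value = erfc(sqrt(statistic / 2.0))  # chi-square with 1 df, upper tail
    return expected_a, expected_b, statistic, p_value


expected_plbo, expected_tolb, statistic, p_value = log_rank(at_risk, deaths, "PLBO", "TOLB")
mortality_plbo = cumulative_mortality(at_risk["PLBO"], deaths["PLBO"])
mortality_tolb = cumulative_mortality(at_risk["TOLB"], deaths["TOLB"])
```

The expected deaths, statistic, and p-value agree with the values printed under Table 18-6 when rounded to two decimals.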

18.3.1 Subgrouping

The simplest approach involves making the required
treatment comparisons in subgroups of
patients that are homogeneous for selected entry
characteristics. This method of adjustment is illustrated
in Table 18-8. All of the subgroups
were formed using measures observed before the
start of treatment. The table indicates the size of
each subgroup and the percentage of patients in
the subgroup who had died as of the analysis
cut-off date, October 7, 1969.
This approach, while simple, has obvious limitations.
Thirty-two (i.e., $2^5$) different subgroups
would be required to simultaneously categorize
patients for the presence or absence of the five
measures represented in Table 18-8. The number

194

18.4 Comment on significance estimation

Data analysis requirements and procedures
Table 18-8 P
Percentages of' patients dead within specified subgroups created using
selected baseline characteristics
Number
Entry risk factor

i

PL BO

teristics (University Group Diabetes Program Re
search Group, 1970e). Results are summarized in
Table 18-9. The CDP used both multiple linear
and multiple logistic regression models to adjust
observed mortality for as many as 54 different
baseline characteristics (Coronary Drug Project
Research Group. 1974, 1975).
The use of regression procedures for adjust
ment has been extended to event rates calculated
from lifetables (Cox, 1972). The method has
been used in studies such as AMIS (Aspirin
Myocardial Reinfarction
Study
Research
Group. 1980b) and PARIS (Persantine Aspirin
Reinfarction Study Research Group, 1980b).

Percent dead

TOLB

PL BO

TOLB

Definite hypertension
Absent
Present

127
74

139
60

11.0
9.5

12.9
16.7

History of digitalis use
No
Yes

193
9

183
15

8.3
55.6

13.1
33.3

History of angina pectoris
No
Yes

192
10

187
14

9.4
30.0

13.9
21.4

Significant ECG abnormality
Absent
Present

193
6

193
8

9.3
33.3

13.0
50.0

Cholesterol
<300 mg/100mI
>300 mg/ 100ml

181
17

169
30

10.5
11.8

14.8
13.3

Any of above cardiovascular risk factors
None
One or more

98
88

100
92

9.2
12.5

11.0
17.4

18 4 COMMENT ON
SIGNIFICANCE ESTIMATION

The p-values resulting from conventional tests of
significance are often used by investigators to
decide whether to characterize a particular result
as being statistically significant. Clearly, pvalues can help in the statistical quantification of
a result, but they should not become a substitute
for rational thought. The acceptance or rejection
of a treatment rarely hinges on whether a differ
ence reaches some arbitrary level of significance.
In fact, the amount of evidence required to con
clude that a test treatment is no better than the
control treatment may be less than that required
to conclude that it is better. Generally, there is
need in the latter case to make certain the benefi
cial effects observed persist—a judgment that

Source Reference citation 468. Adapted with permission of the American Diabetes Associa
tion, Inc.. New York.

i
&

I

of patients in many of the subgroups would be
too small for meaningful comparison.
In addition, the method requires use of arbi
trary cut-points for subgroupings involving con
tinuous variables. The arbitrary nature of the
cut-points selected can raise questions concern
ing the validity of the analyses presented, espe
cially if there is any suspicion that they were
chosen to minimize or maximize observed treat
ment differences.

18.3.2 Multiple regression

An alternative approach that avoids some of these problems and
provides a means of controlling for several sources of variation
simultaneously involves use of regression models represented by
Equations 18.1 and 18.2. (See Cox, 1958, Draper and Smith, 1966,
and Kleinbaum et al., 1982, for details on methods of estimation
using the models.) The models are used to estimate the
probability that a patient experiences the outcome of interest,
given a particular set of entry characteristics. One drawback to
the linear regression model has to do with the possibility of
obtaining probability estimates that lie outside the range of 0
to 1. This possibility is avoided with the logistic model.

Linear¹ multiple regression model

    $y_i = A + \epsilon_i$                                  (18.1)

Logistic multiple regression model

    $y_i = \frac{1}{1 + e^{-A}}$                            (18.2)

where

    $A = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_j x_{ji} + \cdots + \beta_k x_{ki}$
    $y_i$ = outcome observed for the ith patient (either 0 or 1
            for binary outcome measures)
    $x_{ji}$ = value observed for the ith patient and jth entry
            characteristic (j = 1, ..., k)
    $\epsilon_i$ = error associated with $y_i$

and

    $\beta_0, \beta_1, \ldots, \beta_k$ = regression coefficients
            (parameters) to be estimated from observed data
The UGDP used a logistic regression model to adjust observed
mortality results for differences in the distribution of 14
different entry characteristics.

1. Referred to as linear because the model does not involve any
parameter raised to a power other than unity. The term is not a
comment on the shape of the curve arising from the analysis. The
model may yield a curved line or surface depending on the form
taken by the independent variable(s) in the model.

Table 18-9  Observed and adjusted tolbutamide-placebo difference
in percent of patients dead

                             TOLB     PLBO     TOLB-PLBO difference
Observed percent dead        14.7     10.2     4.5
Adjusted* percent dead       14.5     10.2     4.3

Source: Reference citation 468. Adapted with permission of the
American Diabetes Association, Inc., New York.
*Based on logistic regression model using 14 different baseline
characteristics.

can be reached only by continuing follow-up for some time after
the emergence of an important difference.
The question of what constitutes statistical significance is
complex. Methodological problems involved in the interpretation
of conventional tests of significance for safety monitoring are
outlined in the next chapter. However, even if those problems are
ignored, it is still necessary to use a good deal of caution in
the interpretation of p-values. Most trials, even if designed to
focus on a single outcome, will provide data on a variety of
other outcome measures as well. For example, the CDP provided
data on the rate of occurrence of myocardial infarctions,
strokes, and several other nonfatal cardiovascular events, in
addition to death. The p-values obtained for one outcome measure
will not be independent of those obtained using another outcome
measure.

19. Questions concerning the design, analysis, and
interpretation of clinical trials

ments were made using a common allocation
ratio. See question 1a.

There are three kinds of lies: lies, damned lies and statistics.
Benjamin Disraeli

19.1 Introduction
19.2 Questions concerning the study design
19.3 Questions concerning the source of study patients
19.4 Questions concerning randomization
19.5 Questions concerning masking
19.6 Questions concerning the comparability of the treatment groups
19.7 Questions concerning treatment administration
19.8 Questions concerning patient follow-up
19.9 Questions concerning the outcome measure
19.10 Questions concerning data integrity
19.11 Questions concerning data analysis
19.12 Questions concerning conclusions


19.1 INTRODUCTION

This chapter focuses on questions concerning the design,
analysis, and interpretation of study data. Material is presented
in the form of questions and answers and is organized in
categories related to the various aspects of a clinical trial.

19.2 QUESTIONS CONCERNING
THE STUDY DESIGN


1a. Question: Can a new study treatment be added during the
course of the trial?
Answer: Yes, but not without impact on the study design. The
University Group Diabetes Program (UGDP) elected to add a fifth
treatment, phenformin, 18 months after the start of patient
enrollment (University Group Diabetes Program Research Group,
1970d). The allocation ratio of phenformin to tolbutamide to
insulin standard to insulin variable to placebo was fixed at
3:1:1:1:1 and was satisfied after enrollment of every 14th, 28th,
42nd, etc., patient in each of the 6 clinics administering
phenformin. Patients in the other 6 UGDP clinics and the first 32
patients in one of the clinics included in the phenformin portion
of the study were allocated using a ratio of 0:1:1:1:1 in blocks
of 16.
The two different allocation schemes created problems when
treatment comparisons were made involving phenformin-treated
patients (University Group Diabetes Program Research Group,
1975). The decision in the Coronary Drug Project (CDP) to study
aspirin late in the trial avoided these design problems by
setting up a separate trial using patients from discontinued
treatments (Coronary Drug Project Research Group, 1976).
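The arithmetic behind the two allocation schemes can be sketched in Python. The example below assumes a simple permuted-block construction, which is an illustrative assumption and not a description of how the UGDP schedules were actually prepared; the function name and seed are hypothetical. A 3:1:1:1:1 ratio satisfied every 14 patients implies 6 phenformin assignments and 2 for each of the other four treatments per block, while 0:1:1:1:1 in blocks of 16 implies 4 for each of the four original treatments.

```python
import random

def permuted_block(ratio, block_size, rng):
    """Return one shuffled block that satisfies the allocation ratio exactly."""
    unit = sum(ratio.values())
    if block_size % unit != 0:
        raise ValueError("block size must be a multiple of the ratio total")
    repeats = block_size // unit
    block = [name for name, weight in ratio.items()
             for _ in range(weight * repeats)]
    rng.shuffle(block)
    return block

rng = random.Random(1986)  # arbitrary seed so the sketch is repeatable

with_phenformin = {"phenformin": 3, "tolbutamide": 1, "insulin standard": 1,
                   "insulin variable": 1, "placebo": 1}
without_phenformin = dict(with_phenformin, phenformin=0)

block_14 = permuted_block(with_phenformin, 14, rng)
block_16 = permuted_block(without_phenformin, 16, rng)

print({name: block_14.count(name) for name in with_phenformin})
print({name: block_16.count(name) for name in without_phenformin})
```

Because the placebo group receives 2 of every 14 assignments under one scheme and 4 of every 16 under the other, comparisons that pool patients across the two schemes mix different allocation ratios, which is the difficulty the answer describes.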

1b. Question: Can a treatment be deleted from the study design
once the trial has started?
Answer: Yes. Use of the test treatment will have to be stopped
if it is shown to be inferior to the control treatment. The
control treatment will have to be stopped if it is inferior to
the test treatment. The UGDP provides examples of the former kind
of change (University Group Diabetes Program Research Group,
1970e, 1975). The Diabetic Retinopathy Study (DRS) provides an
example of the latter type of change (Diabetic Retinopathy Study
Research Group, 1976, 1978).
A treatment may also be deleted for reasons unrelated to
treatment results. The original design of the DRS included a test
treatment involving photocoagulation with both xenon arc and
argon laser. The treatment was abandoned early in the course of
the trial for practical reasons.
2a. Question: Do all clinics participating in
a multicenter trial have to be in the trial from the
outset?
Answer: No. Results from clinics can be
combined regardless of when they were added to
the trial, provided all clinics followed the same
treatment protocol and all treatment assign-


2b. Question: What if a clinic in a multicenter trial resigns
after it has started patient enrollment? Will the resignation
affect treatment comparisons?
Answer: Clinic resignations are not uncommon. There were two in
both the CDP and the National Cooperative Gallstone Study (NCGS)
(Coronary Drug Project Research Group, 1973a; National
Cooperative Gallstone Study Group, 1981a). They may be initiated
by the clinic because of the death, illness, or departure of a
key person or by the study leadership because of performance
problems.
The loss of a clinic will reduce the overall precision of the
trial unless other clinics are recruited to make up for the loss.
The loss will be minimal if few patients are involved and if
responsibility for the continued care and surveillance of
patients already enrolled can be assumed by another clinic in the
trial. It will be sizable if the clinic had a large number of
patients that cannot be transferred to other clinics in the
trial. Such patients will have to be counted as dropouts and
treated as such for data analyses in the trial. A large number of
dropouts caused by clinic resignations will make it difficult to
detect treatment effects, but they should not invalidate
treatment comparisons provided the allocation ratio in clinics
that have resigned was the same as in the remaining active
clinics. Incidentally, the possibility of clinic resignation in a
multicenter trial is one reason why it is wise to construct the
allocation schedule with clinic as a stratification variable.

3. Question: Is it proper to make modifications to the treatment
protocol during the trial?
Answer: Many times it is not so much a question of propriety as
of necessity. Changes must be made if patient safety is in
question. Other changes may be necessary simply to clear up
ambiguities in the protocol. All changes should be noted and
reported in publications from the trial.

4a. Question: If the required sample size cannot be achieved,
should it be reduced to bring it in line with reality?
Answer: It is always possible to find some combination of α, β,
and Δ which yields the “desired” result (see Chapter 9).
Reduction of the sample size via such manipulations, simply to
bring it in line with expectation, is game playing.
4b. Question: How about revising the sample size calculation
during the trial?
Answer: Revised sample size calculations, based on observed
outcome and dropout rates, can help the investigators and sponsor
decide if more clinics are needed or if the period of follow-up
should be extended to achieve the desired statistical precision.
The calculations should be made using the α, β, and Δ specified
when the trial was planned (see Chapter 9).

4c. Question: Is it all right to change the outcome measure
after the start of the trial as a means of reducing the sample
size requirement?
Answer: Such maneuvers are open to the same criticism as
mentioned in the answer to question 4a. One kind of maneuver
involves a switch from a single event as the prime outcome
measure to a composite event (see Glossary). The expected rate of
occurrence of such an event will be higher than that for any of
its component parts. The higher the expected rate, the easier it
will be to detect a specified relative difference with a given
sample size. However, the “gain” in precision is achieved at the
expense of clinical relevancy. It is more difficult to interpret
the meaning of a finding based on combinations of events than one
that is based on a single set of events.
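The effect of a higher expected event rate on the sample size requirement can be sketched as follows. The example uses a common normal-approximation formula for comparing two proportions, which is an assumption made here for illustration and is not necessarily the formula given in Chapter 9. The event rates, the 25% relative reduction, and the error levels are hypothetical.

```python
import math
from statistics import NormalDist

def patients_per_group(control_rate, relative_reduction, alpha=0.05, beta=0.20):
    """Approximate patients per group for a two-sided comparison of two proportions."""
    treated_rate = control_rate * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(1 - beta)        # type II error
    variance = (control_rate * (1 - control_rate)
                + treated_rate * (1 - treated_rate))
    delta = control_rate - treated_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)

# Hypothetical rates: a single event at 10% versus a composite event at 20%,
# each with the same 25% relative reduction under the test treatment.
single_event = patients_per_group(0.10, 0.25)
composite_event = patients_per_group(0.20, 0.25)
print(single_event, composite_event)
```

Under these assumptions the composite event requires fewer patients per group, which is the "gain" referred to in the answer; the calculation says nothing about the clinical relevance of the composite.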

5. Question: Is it permissible to extend the period of patient
follow-up to compensate for a lower than expected event rate in
the control-treated group or for a shortfall in patient
recruitment?
Answer: Yes.
6. Question: Is it necessary to specify stopping rules for the
trial before it is started?
Answer: No. In fact, many trials are done without any formal
stopping rules for reasons discussed in Chapter 20.

Other related questions: 7, 42, 43, and 47.

19.3 QUESTIONS CONCERNING THE
SOURCE OF STUDY PATIENTS
7. Question: Is it all right to change patient
eligibility criteria once the trial has started?


Answer: Ideally no, but some changes may be necessary. The
likelihood of change is greatest in trials involving long periods
of recruitment and in those in which investigators are having
trouble meeting their sample size goals within the stated time
periods. The changes will not affect the validity of treatment
comparisons if they are independent of the observed treatment
results and if the proportion of patients allocated to the
different treatment groups remains unchanged over the course of
patient enrollment.
8. Question: Will changes in the composition of the study
population enrolled have an impact on treatment comparisons?
Answer: No, assuming the proportion of patients assigned to a
particular treatment, relative to the total number of allocations
made, remains constant over the course of the trial. This is
usually assured with randomization procedures designed to balance
the number of assignments made to the treatment groups at various
points over the course of patient recruitment.

9. Question: Is it useful to collect data on patients screened
for enrollment?
Answer: It is if there is a reliable way to define the base
population at risk of enrollment, as in the Coronary Artery
Surgery Study (CASS). The only patients considered for enrollment
were those who had had a heart catheterization at a study clinic
(Coronary Artery Surgery Study Research Group, 1981). It is not
useful when the base population is ill defined, as in the UGDP.
Investigators in that trial tried to maintain screening logs, but
abandoned the effort because of lack of agreement among them as
to who should be listed in the logs.

would be if all extraneous sources of variation could be
identified before the start of the trial and then controlled in
the assignment process. However, this is rarely, if ever,
possible. The main virtue of randomization is the protection it
provides against patient or physician selection biases in the
treatment assignment process.
11a. Question: Is it acceptable to use an informal, nonauditable
method of random assignment, such as a coin flip?
Answer: Not if it can be avoided. Such methods, even if properly
administered, are difficult to defend if questions are raised
concerning the assignment process. There is no satisfactory way
to dispel doubts concerning the possibility of selection bias
with any nonauditable allocation scheme.


16. Question: Is it necessary to stratify on all important
baseline variables in the randomization process?
Answer: No. Valid treatment comparisons can be made without any
stratification.
17. Question: Is there a limit to the number of variables that
can be controlled via stratification during the randomization
process?
Answer: Definitely. Generally, it is not practical to stratify
on more than two or three variables.

12. Question: Are schemes such as those based on day of the
week, time of day, or order in which patients are seen all right
to use?
Answer: No. All such methods are susceptible to selection biases
and, as a result, may not provide a valid basis for comparisons
in the trial. It is too easy for patients or clinic staff to
discover the assignment rules and then to alter the time or order
in which patients are seen simply to achieve the “desired”
assignments.

19.4 QUESTIONS CONCERNING
RANDOMIZATION

10. Question: Is randomization needed for a valid trial?
Answer: Not necessarily, provided the method of assignment is
free of treatment-related selection biases. In fact, some people
have even argued that randomization is unnecessary (Harville,
1975; Lindley, 1982). Indeed, it

14a. Question: Are the number adaptive schemes, such as the
biased-coin method of randomization in which assignment
probabilities change as a function of previous assignments, a
substitute for blocking?
Answer: Yes. They can serve the same function, as suggested in
Section 10.2.

14b. Question: Are such schemes better than those that rely on
blocking to achieve the desired allocation ratio?
Answer: Yes and no. On the one hand, such methods avoid the
problem of predictability as discussed in Chapter 10, a serious
problem with small blocks of uniform size, especially in unmasked
trials. On the other hand, they can yield longer unbroken runs of
patients who are all assigned to the same treatment. Further, the
schemes are more complicated to administer than schemes involving
blocking.
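A biased-coin scheme of the kind referred to in questions 14a and 14b can be sketched in Python. The rule below follows the general idea of a biased coin for two treatments: assign with probability 1/2 when the groups are balanced and favor the underrepresented group otherwise. The favoring probability of 2/3, the treatment labels, and the function names are assumptions chosen for illustration and are not taken from the text.

```python
import random

def biased_coin_assign(n_a, n_b, rng, p_favor=2 / 3):
    """Assign A or B, favoring whichever group currently has fewer patients."""
    if n_a == n_b:
        prob_a = 0.5
    elif n_a < n_b:
        prob_a = p_favor
    else:
        prob_a = 1 - p_favor
    return "A" if rng.random() < prob_a else "B"

def simulate(n_patients, seed):
    """Generate a sequence of assignments and the resulting group sizes."""
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    sequence = []
    for _ in range(n_patients):
        arm = biased_coin_assign(counts["A"], counts["B"], rng)
        counts[arm] += 1
        sequence.append(arm)
    return sequence, counts

def longest_run(sequence):
    """Length of the longest unbroken run of identical assignments."""
    best = current = 0
    previous = None
    for arm in sequence:
        current = current + 1 if arm == previous else 1
        best = max(best, current)
        previous = arm
    return best

sequence, counts = simulate(100, seed=10)
print(counts, longest_run(sequence))
```

No assignment is ever fully determined by the previous ones, which is the advantage over small fixed blocks; the run length reported by `longest_run` is not bounded by a block size, which is the disadvantage noted in the answer.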

15. Question: Should one use blocks of variable size if blocking
is used?
Answer: Generally, yes, particularly in unmasked trials. The
variation reduces the likelihood that clinic personnel will be
able to predict a treatment assignment.
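A schedule with blocks of variable size can be sketched as follows. The two treatment labels, the candidate block sizes (4, 6, and 8 for a 1:1 ratio), and the function name are hypothetical choices made for illustration; the text does not prescribe particular sizes.

```python
import random

def variable_block_schedule(n_patients, treatments, multiples, rng):
    """Build a schedule from complete blocks whose sizes are chosen at random.

    Each block holds every treatment the same number of times, so the
    allocation ratio is satisfied at the end of every block.
    """
    schedule = []
    block_sizes = []
    while len(schedule) < n_patients:
        repeats = rng.choice(multiples)
        block = list(treatments) * repeats
        rng.shuffle(block)
        schedule.extend(block)
        block_sizes.append(len(block))
    return schedule, block_sizes

rng = random.Random(15)  # arbitrary seed so the sketch is repeatable
schedule, block_sizes = variable_block_schedule(40, ["test", "control"], [2, 3, 4], rng)
print(block_sizes)
print(schedule[:12])
```

Because the block boundaries are not known to clinic staff, counting earlier assignments does not reveal with certainty what the next one will be, even in an unmasked trial.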

11b. Question: How about methods of randomization that base
treatment assignment on a specified digit of the patient’s Social
Security or medical record number? Are they acceptable?
Answer: Again, not if they can be avoided. Most of these methods
fail to satisfy the conditions needed for a sound allocation
scheme, as discussed in Chapters 8 and 10.

13. Question: Should the treatment assignment be blocked?
Answer: Yes. There can be subtle changes in the composition of
the study population as the trial proceeds. Blocking helps to
eliminate the impact secular changes may have on treatment
comparisons (see Chapter 10).

Other related questions: 73.


18. Question: Should one use clinic as a stratification variable
in multicenter trials?
Answer: Generally yes, except in a situation in which there are
so few patients per clinic (as in some multicenter trials
involving an extremely rare disease) that it is impractical to do
so. The characteristics of patients enrolled can vary widely from
clinic to clinic. These differences, if uncontrolled, can
confound treatment comparisons.
19. Question: Is there a way to determine whether randomization
has “worked”?
Answer: No. A random process is defined by the methods
underlying the process. The demographic and baseline
characteristics of patients enrolled in the various treatment
groups can be compared. However, the existence of a large
difference involving an arbitrarily small p-value does not
necessarily mean that the assignments were “nonrandom,” nor that
there was a breakdown in the way in which they were issued. The
difference may be due to chance.


20. Question: Does the lack of baseline comparability among the
treatment groups indicate a breakdown in the randomization
process?
Answer: Not necessarily. It may be due to chance, as noted in
question 19.
21a. Question: Is it all right for the data center to take back
a treatment assignment once it has been revealed to the clinic?
Answer: No. The assignment and the patient for whom it was
intended should be counted in the study once it has been
disclosed. Care should be taken to make certain that the patient
is eligible and willing to participate in the trial before the
assignment is revealed (see Section 10.7).

21b. Question: Should returned assignments (assuming the
envelopes in which they are contained have not been opened) be
reissued?
Answer: They can be, but often are not because of the
difficulties involved in reissuing them.

21c. Question: Can the returned assignments result in measurable
departures from the desired allocation ratio?
Answer: Not if the number returned is small. They could if the
number is large, but even in this case the chance of a sizable
departure is small, unless the number is differential by
treatment group. That is not likely except in cases where
decisions to return assignments are made by personnel who know
the treatment assignments when the decisions are made.
21d. Question: What if a mistake is made in preparing the
assignment and the wrong one is disclosed to clinic personnel?
Should it be taken back?
Answer: No. The assignment should stand as issued once it is
disclosed.
21e. Question: Can such mistakes lead to a departure from the
desired allocation ratio?
Answer: They should not, provided they are independent of
treatment assignment. However, they can raise doubts regarding
the integrity of the study if they occur frequently.

22a. Question: What if the clinic wants to
return an assignment because it was used by
mistake?


Answer: The assignment should stand
as issued once it has been disclosed to clinic
personnel.
22b. Question: What if a clinic wishes to
switch a treatment assignment?
Answer: The assignment should stand
as issued once it has been revealed to clinic
personnel.


23a. Question: What if a clinic administers the wrong treatment
to a patient? Should the assignment be changed to correspond to
the treatment used?
Answer: No. The assignment should stand as issued. The mistake
should be noted when the results of the trial are published.
23b. Question: Will mistakes of the type referred to in Question
23a affect the validity of the trial?
Answer: They may, depending on their frequency and whether they
are treatment related.

24. Question: What if the observed allocation ratio departs from
the one specified in the study design?
Answer: Small departures are to be expected, even with small
block sizes, few allocation strata, and no returned assignments.
Bigger departures can occur with large blocks and multiple
strata. Generally, other than detracting from the esthetic
quality of the allocation design, the departures will not affect
the validity of the trial. An obvious exception is where the
departures are treatment related.
25. Question: Is it a good idea to have a large number of
allocation strata?
Answer: Yes and no. On the one hand, the greater the number of
strata the greater the control of extraneous sources of
variation. On the other hand, numerous strata will complicate
management of the allocation process (see Section 10.3.2).


Other related questions: 7, 8, 49, 50, 51, and 56.


19.5 QUESTIONS CONCERNING
MASKING

26. Question: Is an unmasked trial valid?
Answer: Masking per se is not an indicator of validity. Valid
treatment comparisons can be made without masking. The issue is
whether the data collection process, especially as it relates to
outcome assessment, is subject to treatment-related biases.
27. Question: What if it is impossible to mask?
Answer: This is often the case. The trial should be designed
recognizing the opportunities for treatment-related bias. Bias
control procedures, such as those discussed in Chapter 8, should
be considered.
28. Question: Are there circumstances in masked drug trials in
which the treatment assignment for a specific patient must be
revealed during the course of the trial?
Answer: Yes, a few. However, as noted in Section 8.5, they
should be limited to emergency situations. The preferred approach
is to terminate use of the assigned treatment without revealing
its identity.
29. Question: Are there cases in which an entire set of
assignments must be unmasked during the trial?
Answer: Yes, when a treatment is discontinued during the study.
Clinic personnel will need to identify patients affected by the
change in order to implement it.
30. Question: Should a patient be informed of the treatment
assignment if he is separated from the trial before it is over?
Answer: The answer depends on when the separation occurs, on the
arrangements agreed upon when the patient was enrolled, and on
the health care needs of the patient. Unmasking individual
patients as they depart from the study can create problems in
maintaining the mask for other patients, as discussed in Section
15.4.
31. Question: Should patients in a masked
trial be told of the treatments they were on when
the trial is terminated?
Answer: Yes.

32. Question: Should the effectiveness of
the treatment masking be assessed when the trial
is over?
Answer: Yes, as discussed in Section
15.4. Guesses made by clinic staff and patients
regarding treatment assignments can be used to
make the assessments.
Other related questions: 40, 62, 63, and 64.

19.6 QUESTIONS CONCERNING
THE COMPARABILITY OF THE
TREATMENT GROUPS

33. Question: Are tests of significance helpful in identifying
differences in the baseline characteristics of the treatment
groups?
Answer: Yes, but the results of such tests must be viewed with
caution because of the problems associated with making multiple
comparisons, as mentioned in Section 9.3.12.
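The reason for caution can be sketched with a short calculation. Assuming, purely for illustration, that the baseline comparisons are independent and that the groups truly do not differ, the chance that at least one of k tests yields a p-value below α is 1 - (1 - α)^k. Baseline characteristics are usually correlated, so the figures below are only a rough guide; the function name is hypothetical.

```python
def chance_of_spurious_difference(n_tests, alpha=0.05):
    """Probability of at least one p-value below alpha among independent tests."""
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 5, 14, 30):
    print(n_tests, round(chance_of_spurious_difference(n_tests), 3))
```

Under these assumptions, with 14 baseline characteristics the chance of at least one nominally significant difference already exceeds one half, even when the randomization was carried out properly.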
34. Question: When assessing treatment effects, is there a need
to be concerned with differences in the baseline comparability of
the treatment groups if the differences are small?
Answer: Probably not, but as noted in Section 18.3, it is a good
idea to adjust for baseline differences even if small.
35a. Question: Is it reasonable to expect the treatment groups
to have identical baseline distributions?
Answer: No. The groups will be identical only for those
variables controlled in the randomization process. Differences of
varying sizes will exist for the other variables.

35b. Question: What if at the end of the study one discovers
that an important baseline characteristic was overlooked in the
data collection process? Is it reasonable to expect that variable
to explain the observed treatment difference?
Answer: No. The expected difference among treatment groups for
an unobserved baseline characteristic is the same as that for an
observed characteristic, assuming the groups are the product of a
properly administered randomization scheme.

Other related questions: 7, 8, 57, 71, and 73.

19.7 QUESTIONS CONCERNING
TREATMENT ADMINISTRATION

36. Question: What should be done about treatment protocol
violations detected during the trial?
Answer: Corrective action should be taken to avoid future
violations. The departures noted and actions taken should be
reported in publications from the trial.

37. Question: Is there a reliable way to measure treatment
adherence in drug trials?
Answer: Not really, except in inpatient settings. Various
methods have been used to assess drug adherence in studies
involving outpatient populations. However, all of them have
shortcomings. One method involves use of a tracer substance that
is added to the study drugs and that can be assayed in the blood
or urine of study patients. One of the shortcomings of this
method has to do with formulary problems that arise from the
addition of any tracer substance to existing drugs. The choice of
substances must be limited to those approved by the Food and Drug
Administration and that do not affect the bioavailability or
pharmacology of the drugs. Another problem has to do with the
mechanics of obtaining blood or urine samples for the adherence
test. They are normally collected as part of scheduled follow-up
visits. As a result they can provide a biased view of adherence
if patients change their medicine-taking behavior in preparation
for a forthcoming clinic visit.
Blood or urine tests, designed to detect the presence of the
drug itself, can be used when it is not feasible to use a tracer
substance. However, results from such tests can be quite variable
and may not be specific for the drug. In addition, they suffer
from the same problem mentioned above if tests are performed as
part of a regular clinic visit.
The advent of miniaturized electronic devices has led to
development of electronic pill dispensers that automatically
record the times at which medicines are withdrawn from them.
Comparison of the observed time record with the one prescribed
provides an indirect measure of compliance. Pill counts, based on
medications returned to the clinic by the patient, are sometimes
used as crude measures of adherence. However, these measures have
limited use, especially when patients realize that they are used
to check on adherence.
38a. Question: Should a patient who either
refuses to take his assigned treatment upon entry
into the trial or who refuses to continue the
treatment after entry be retained in the trial?
Answer: Yes. All patients enrolled in the
trial should be retained for follow-up regardless
of treatment course.
38b. Question: Should patients who are started on their assigned
treatment and subsequently found to be ineligible for enrollment
be retained for follow-up?
Answer: Yes, particularly if the assigned treatment is
continued. However, even if a treatment change is required the
patient should continue to be followed.

39. Question: Should patients found to be ineligible for the
trial after randomization be continued on treatment?
Answer: The answer depends on the nature of the treatments
involved. Obviously, treatment should not be continued if there
are contraindications for doing so.
Some study designs require the initiation of treatment before a
final assessment of eligibility is made (e.g., a trial involving
MI patients who are started on treatment in the emergency room).
Treatment may have to be stopped if subsequent tests indicate
that the individual did not have the condition under study.
Termination of treatment may not be sensible if the final
eligibility assessment occurs some time after the start of
treatment and if there is no reason to stop the treatment, as was
the case in the UGDP (University Group Diabetes Program Research
Group, 1970d).

Other related questions: 5, 38, and 65.

19.9 QUESTIONS CONCERNING THE
OUTCOME MEASURE

46. Question: Is it all right to use a composite outcome measure
as the primary outcome measure for a trial?
Answer: Yes, but it is much better to use a single outcome
measure for the primary measure. It is difficult to determine the
clinical relevancy of most combinations of outcomes, particularly
those due to a mixture of disease processes.

47. Question: Should an outcome measure not used in the original
sample size calculation, nor mentioned in the design documents
for the trial, be ignored when results of the trial are analyzed?
Answer: No. All available data should be used in the evaluation
of the study treatments. While it is desirable to be as explicit
as possible in the design stage regarding the primary outcome
measure, failure to designate a variable as an outcome measure
does not preclude its use in data analysis. (See Section 20.5 for
general precautions.)

48. Question: What if the outcome measure is subject to a
treatment-related ascertainment bias?
Answer: An effort should be made to assess the nature and
magnitude of the bias, and a summary of the problem should be
included in the study publication.

Other related questions: 4, 45, 58, 72, 75, and 76.

19.10 QUESTIONS CONCERNING
DATA INTEGRITY

49. Question: What should be done if someone has tampered with
the randomization process?
Answer: The entire set of results from the trial may have to be
discarded if the tampering was widespread. The extent of the
problem, the way the tampering was done, the way in which it was
detected, and the action taken should be reported in the study
publication. It should also indicate if the problem led to a data
purge and, if so, the amount of data purged. If no purge was
made, the paper should indicate why the investigators believe
none was required. It is good practice to perform two sets of
treatment comparisons when purges involving sizable numbers of
patients are made, one set for purged patients and the other set
for all remaining patients. The results of the two analyses
should be included in a publication from the trial.

50. Question: Can exclusion of patients judged to be ineligible
after randomization affect the credence placed in the results?
Answer: It can. Elimination of patients who are randomized and
subsequently found to be ineligible can bias the results if the
judgments on eligibility are made by persons who know the
treatment assignments. Exclusions, if allowed at all (see answer
to question 39), should be based on data collected before
randomization and should be made by individuals masked to
treatment assignment.

51. Question: What should be done with the data from a clinic in
a multicenter trial that withdraws during the course of the
trial?
Answer: The answer depends on the reason for the withdrawal. The
data should be purged from the database if it was due to
questionable data practices. Otherwise they should be retained.
Whenever possible, an effort should be made to continue follow-up
of patients affected by the withdrawal. Sometimes this can be
accomplished by transferring care responsibilities to another
clinic, as suggested in the answer to question 2b.
The elimination of data from a clinic will not necessarily have
any impact on treatment comparisons, provided the proportionate
mix of patients by treatment group in the clinic eliminated is
the same as for the remaining clinics.

52. Question: Is it possible to change data collection or coding
practices during the course of the trial and still have a valid
trial?
Answer: Yes, so long as the changes are independent of observed
treatment effects. However, it is desirable to minimize these
changes for practical as well as scientific reasons.

19.8 QUESTIONS CONCERNING
PATIENT FOLLOW-UP

who continue in the study. These differences may place them at a
higher (or lower) risk of developing the event of interest.

41. Question: Should follow-up of a patient be terminated once
he experiences the event of interest?
Answer: No, except when the event itself precludes further
follow-up. Added follow-up through the close of the trial for new
events can provide additional data for comparison of the
treatment groups.
42. Question: Is there any way to compensate for losses to
follow-up due to dropouts or lack of treatment compliance?
Answer: Yes and no. As noted in Chapter 9, there are ways to
increase the sample size to compensate for anticipated losses.
However, the increases do not protect against bias if the losses
are differential by treatment group.
43. Question: Some studies are designed to
add a new patient for each one who refuses the
assigned treatment, or whenever one drops out.
Is this a useful maneuver?
Answer: It can serve the same purpose
as the sample size adjustment alluded to in the
answer to question 42. However, the practice can
lead to a false sense of security if it is perceived
as a solution to treatment compliance or drop
out problems.
The practice is only useful in preserving the
statistical precision of the trial if patient recruit
ment continues over the entire course of follow
up. It is not a practical means of maintaining the
desired type I and II error protection if most of
the losses are from patients who drop out after
recruitment has been completed.

40. Question: Should clinic personnel be
provided with a supply of placebo tablets for use
in single-masked fashion if it is necessary to stop
a patient’s assigned treatment temporarily be
cause of a suspected drug reaction in a double
masked trial?
Answer: Single-masked administration
of a placebo may be of value when the com
plaints leading to the termination are vague and
there is a desire to determine whether they are
due to a real or an imagined cause. The proce
dure is of less value when the reaction can be
documented with laboratory tests or by some
44. Question: Does it pay to try to get pa
other objective means.
tients back under follow-up once they have
The CDP allowed study physicians to use a
dropped out?
single-masked placebo on patients who ap
Answer: Yes, especially in a long-term
peared to be having drug reactions (Coronary
trial. Periodic contact with patients who have
Drug Project Research Group, 1973a). How dropped out can be useful in convincing some to
ever, their use created a dilemma for physicians
resume treatment and to return to active follow
when they were called upon to answer questions
up (see Section 15.3 for further discussion).
from patients concerning their use. Often they
were placed in the position of having to tell
45. t
Question: Is ..it reasonable to assume
white lies to preserve the mask. The wisdom of that patients who remain under active follow-up
this deception is questionable because of the im- have the same risk of developing the event of
pact it may have on patient-physician relations.
interest as those who do not?
Answer: Often no. Patients who drop
Other related questions: 26, 27, 28, and 29.
out may have different risk factors than those
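
As a rough illustration of the sample size adjustment mentioned in the answers to questions 42 and 43, the sketch below inflates a required sample size to allow for an anticipated loss fraction. The rule n / (1 - loss) is a common simplification assumed here for illustration only; it is not taken from Chapter 9, and the function name and figures are hypothetical. The adjustment addresses statistical precision only and offers no protection against bias when losses are differential by treatment group.

```python
import math


def inflate_sample_size(required: int, expected_loss: float) -> int:
    """Enrollment needed so that about `required` patients remain after a
    fraction `expected_loss` is lost (assumed rule: n / (1 - loss))."""
    if not 0.0 <= expected_loss < 1.0:
        raise ValueError("expected_loss must be in [0, 1)")
    return math.ceil(required / (1.0 - expected_loss))


# Hypothetical figures: 300 patients needed, 25% expected to be lost.
print(inflate_sample_size(300, 0.25))  # 400
```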


53. Question: What should be done with
contrived data?
Answer: The answer depends upon the
extent of the problem and on whether the con
trivance was treatment related. The results of the
entire trial may have to be discarded if the prob
lem is extensive and treatment related, whereas
no purge may be required if it is restricted to a
few isolated cases.
The Multiple Risk Factor Intervention Trial
(MRFIT) elected to retain data from one clinic
in which personnel were alleged to have falsified
blood pressure data for patients being screened
for enrollment (Presberg and Timnick, 1976).
On the other hand, the data center in the Eastern
Cooperative Oncology Study (ECOG) elected to
purge all data contributed by one of its clinics
because of the serious nature and extent of the
falsification (Boston Globe, 1980a, 1980b, 1980c,
and 1980d; Boston Sunday Globe, 1980).
Manuscripts generated from trials in which
data falsification has occurred should indicate
the nature of the problem and the action taken,
if any, to eliminate the questionable data.

Other related questions: 4.


19.11 QUESTIONS CONCERNING
DATA ANALYSIS


Questions concerning the design, analysis, and interpretation of clinical trials

54. Question: What is the basis for pooling
treatment results across clinics in a multicenter
trial?
Answer: It stems from the use of com
mon treatment and data collection procedures,
and from the ongoing quality assurance proce
dures designed to detect and minimize proce
dural differences among study clinics.

55. Question: Is randomization required for a valid analysis?
Answer: No. The main purpose of randomization is to provide a
method of assignment that is free of selection bias.
Randomization theory has been used to form the basis for some
tests of significance, but the theory, per se, is not crucial
for most of the data analyses carried out in the typical
clinical trial.

56. Question: Is one obligated to make treatment comparisons
in subgroups defined when the trial was designed?
Answer: No. In fact, the first analysis should be without
regard to any subgrouping. Secondary analyses may be done
within various subgroups, including randomization strata.

57. Question: Can differences in the baseline composition of
the study groups invalidate treatment comparisons?
Answer: It depends on how large they are and how they
occurred. They can if the differences are an expression of a
treatment-related bias resulting from a breakdown in the
assignment process, but not if they are relatively small and
unrelated to treatment.
Much of the discussion concerning the UGDP results published
in 1970 (University Group Diabetes Program Research Group,
1970e) centered on the comparability of the treatment groups
at the time of randomization. Critics argued that the
constellation of baseline entry characteristics present in the
tolbutamide-treated patients automatically predisposed them to
a higher risk of mortality than was the case for
control-treated patients (Feinstein, 1971; Schor, 1971;
Seltzer, 1972). Arguments concerning comparability persisted
in spite of the fact that the observed differences were within
the range of chance, that adjustment for the differences did
not materially affect the size of the tolbutamide-placebo
difference in mortality, and that analyses by others outside
the UGDP reached similar conclusions regarding tolbutamide
therapy (Committee for the Assessment of Biometric Aspects of
Controlled Trials of Hypoglycemic Agents, 1975; Cornfield,
1971).

58. Question: Is it appropriate to consider more than one
outcome measure in the analysis of the data?
Answer: Yes. As a matter of fact it is often an essential part
of the analysis process. See question 47.

59. Question: Are there dangers in analyses that focus simply
on patients who received the assigned treatment?
Answer: Yes, they can lead to overestimation of the treatment
effect (see Section 18.1).

60. Question: Where should data on patients who did not
receive the assigned treatment be counted?
Answer: The primary analysis should be based on the original
treatment assignment (see Section 18.1). Other analyses,
including those based on classification of patients by
treatment received, may be carried out.

61. Question: How does one take account of changes in a
patient’s adherence to treatment over the course of the trial?
Answer: The problem with varying levels of adherence is common
in drug trials in which patients are expected to remain on
their assigned treatment for long periods of time. The primary
analysis should be by the initial treatment assignment,
without regard to adherence. This analysis can be followed by
others that are designed to take account of observed adherence
levels (e.g., see University Group Diabetes Program Research
Group, 1970e).

62. Question: What should be done with data from a patient
whose treatment is unmasked for medical reasons?
Answer: They should be analyzed in the treatment group
indicated by the randomization. Other analyses may be
performed and reported in which data for such patients are
excluded to determine if doing so affects the magnitude of the
observed treatment effect.

63. Question: What if the treatment masking was ineffective?
Are the data still worth analyzing?
Answer: Masking is never 100% effective. Treatment-related
side effects may reveal the treatment assignment to both
patients and physicians. The validity of treatment comparisons
will depend on whether or not the deficiencies in masking
allowed introduction of treatment-related biases.

64. Question: What should be done with data for patients whose
treatment assignment was needlessly unmasked?
Answer: The analysis approach should be similar to that
outlined for question 61. However, the frequency of frivolous
unmaskings should be noted in the published report. A large
number may be indicative of a lack of regard for the study
protocol by investigators in the trial and may raise general
questions regarding the validity of the study.

65. Question: How does one deal with missing data caused by
losses to follow-up?
Answer: While there is no substitute for complete follow-up,
the usual approach is to carry out a series of analyses, each
requiring a different set of assumptions regarding the rate of
outcome events after patients are lost to follow-up. One of
the analyses should be done assuming a zero event rate over
the periods patients are lost to follow-up. Other analyses may
be done in which all patients lost to follow-up are assumed to
have had the event after loss to follow-up, or alternatively,
in which they are assumed to have experienced the event at the
same rate as a defined portion of the study population (e.g.,
the control-treatment group of patients who remained under
active follow-up). Losses are not a serious source of concern
if the various analyses all support the same basic conclusion
and if they are not differential by treatment group.
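
The series of analyses described in the answer to question 65 can be sketched as follows. This is a minimal illustration, not the procedure of any particular trial: it works with simple event proportions rather than time-based rates, and the class name, function names, and patient counts are all hypothetical. Each scenario imputes events among patients lost to follow-up at an assumed rate and reports the treated-minus-control difference in event proportion.

```python
from dataclasses import dataclass


@dataclass
class Group:
    followed: int  # patients with complete follow-up
    events: int    # events observed among those followed
    lost: int      # patients lost to follow-up


def event_proportion(group: Group, rate_in_lost: float) -> float:
    """Event proportion when lost patients are assumed to have events
    at `rate_in_lost` (an assumption, not an observation)."""
    imputed = group.lost * rate_in_lost
    return (group.events + imputed) / (group.followed + group.lost)


def sensitivity(treated: Group, control: Group) -> dict:
    """Treated-minus-control difference under three assumptions."""
    control_rate = control.events / control.followed
    scenarios = {
        "no events in lost": 0.0,
        "all lost had event": 1.0,
        "control follow-up rate": control_rate,
    }
    return {
        name: event_proportion(treated, r) - event_proportion(control, r)
        for name, r in scenarios.items()
    }


# Hypothetical counts for illustration only.
treated = Group(followed=90, events=18, lost=10)
control = Group(followed=95, events=19, lost=5)
for name, diff in sensitivity(treated, control).items():
    print(f"{name}: {diff:+.3f}")
```

With these made-up counts the sign of the difference changes across scenarios, which is the situation in which losses become a serious concern. Note also that the losses are differential (10 versus 5), which the answer flags separately.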


66. Question: How should aberrant labora
tory results be handled?
Answer: Outlier values, whether they
are a legitimate indicator of some underlying
biological problem or are due to a laboratory or
recording error, may have to be trimmed or
eliminated in analyses involving means or var
iances. The rules for trimming or elimination
should be constructed and administered without
regard to treatment assignment or effect and
should be specified in published reports from the
trial.
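
One way to build a trimming rule that is blind to treatment assignment, as the answer to question 66 requires, is to derive the limits from the pooled data and then apply the same limits to every group. The sketch below assumes a median plus or minus k times the median absolute deviation rule; that rule, the value of k, the function names, and the data are illustrative assumptions rather than anything prescribed in the text.

```python
from statistics import mean, median


def pooled_limits(values_by_group: dict, k: float = 5.0) -> tuple:
    """Trimming limits computed from all values, ignoring group labels."""
    pooled = [v for values in values_by_group.values() for v in values]
    center = median(pooled)
    mad = median([abs(v - center) for v in pooled])
    return center - k * mad, center + k * mad


def trimmed_means(values_by_group: dict, k: float = 5.0) -> dict:
    """Group means after applying the same pooled limits to every group."""
    low, high = pooled_limits(values_by_group, k)
    return {
        group: mean([v for v in values if low <= v <= high])
        for group, values in values_by_group.items()
    }


# Hypothetical laboratory values; 50.0 is an aberrant result.
data = {
    "A": [5.0, 5.2, 4.8, 5.1, 50.0],
    "B": [5.0, 4.9, 5.3, 5.1, 4.7],
}
print(pooled_limits(data))
print(trimmed_means(data))
```

The rule breaks down if the pooled median absolute deviation is zero, so a real analysis plan would need to specify a fallback, and the rule itself should be stated in published reports.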
67. Question: What if there is a secular
trend in the laboratory data generated in a trial?
Will this affect comparisons between treatment
groups?
Answer: It should not, assuming that patients
in all treatment groups were enrolled over
the same time frame and that the time sequence
in which laboratory determinations were per
formed was independent of treatment assign
ment.
68. Question: How should data obtained from interim
unscheduled examinations be handled?
Answer: The first analysis should be done ignoring the
results. A second one may be done with the results included. A
differential rate of interim unscheduled examinations by
treatment group can influence the rate at which nonfatal
events are diagnosed and reported. CDP investigators were
sufficiently concerned about this possibility as to virtually
ignore results from unscheduled examinations when analyzing
the dextrothyroxine results (Coronary Drug Project Research
Group, 1972; 1981).


69. Question: Is it permissible to perform
analyses during the course of the trial to detect
treatment effects?
Answer: Yes. They are not only permissible but
required in any trial in which the treatments
are hazardous, or in which early detection
of a treatment effect may prove beneficial to
patients already in the trial or to those yet to be
enrolled (see Chapter 20).


70. Question: What if there is a major time
lag in the flow of data from the clinic to the data
center? Can this have an impact on the detection
of treatment differences during the trial?
Answer: Yes, especially if the time lag
is differential by treatment group. Procedures
should be established to ensure data flows that
are timely and uniform with regard to treatment
assignment (see Chlebowski and co-workers,
1981).
71a. Question: Is it reasonable to argue that
imbalance in the distribution of an important
but unobserved baseline risk factor could ac
count for an observed treatment difference or
lack of one in a randomized trial?
Answer: Not really. As noted in the
answers to questions 35a and 35b, the expected
distribution of an unobserved characteristic is
the same as for an observed characteristic.


71b. Question: Is a trial invalid if there are
differences among the treatment groups with re
gard to key baseline variables?
Answer: Generally no, unless the differ
ences are due to selection biases arising from a
breakdown in the way treatment allocations
were made.

72. Question: Is it appropriate to use a subset of deaths as
the prime outcome measure?
Answer: The trial may be designed for detection of a specified
difference for a subset of deaths, as was the case in MRFIT
(Multiple Risk Factor Intervention Trial Research Group,
1982). However, the initial analysis should be for mortality
from all causes (see question 75b).

Other related questions: 4, 8, 34, 35, 45, 47, 48,
51, 53, 75, and 76.

19.12 QUESTIONS CONCERNING
CONCLUSIONS

73. Question: Is it really possible to draw any conclusions
from a clinical trial because of the select nature of the
study population involved?
Answer: Yes. Comparisons between treatment groups are valid so
long as all groups have been exposed to the same selection
factors.

74. Question: Is it possible to generalize findings beyond the
population studied and the treatments used?
Answer: Any generalization that goes beyond the study
population must be made with caution and is judgmental rather
than statistical in nature. Treatment effects observed in a
specified population with a particular dosage of a drug may
not be generalizable to a broader population. Similarly, an
effect produced with one formulation of a compound may not be
produced by a sister product. For example, it is tempting to
generalize the UGDP findings on tolbutamide to other
sulfonylurea compounds. However, the study included only one
member of the family (University Group Diabetes Program
Research Group, 1970d). The question of scientific validity
versus generalizability is touched upon by the National
Diet-Heart Study Research Group (1968).

75a. Question: Is it appropriate to base conclusions from a
trial on a nonfatal event if there is differential mortality
by treatment group?
Answer: No. Conclusions based on differences in a nonfatal
outcome are only valid if there is no difference among the
study groups with respect to mortality. A differential
mortality by treatment group may influence the rate of
occurrence of nonfatal events. The treatment group with the
highest mortality rate may have the lowest nonfatal event rate
if death occurs before patients have a chance to develop the
nonfatal event of interest.

75b. Question: Is it appropriate to base conclusions on deaths
due to a specific cause (e.g., cardiovascular deaths)?
Answer: Only if the conclusion is consistent with the one
reached when all deaths are considered.

76. Question: Is it appropriate to base conclusions on an
outcome measure that was not expected to yield a difference
when the trial was designed?
Answer: Yes, especially when the measure has more clinical
relevance than the one used in the design of the trial. The
focus on mortality in assessment of the tolbutamide and
phenformin results in the UGDP, even though the study was
designed to look for differences in nonfatal outcomes, is a
case in point (University Group Diabetes Program Research
Group, 1970d, 1970e, 1975).

Other related questions: 10, 26, 52, 54, 55, 57, 71,
and 72.
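
The point made in the answer to question 75a can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that a fixed fraction of each group dies before the nonfatal event could occur and that survivors in both groups face the same risk of the nonfatal event; the function name and all figures are hypothetical.

```python
def crude_nonfatal_rate(early_mortality: float, risk_if_alive: float) -> float:
    """Nonfatal events per patient enrolled, when only patients who
    survive the early period can go on to develop the event."""
    return (1.0 - early_mortality) * risk_if_alive


# Same 20% risk among survivors in both groups; mortality differs.
high_mortality_group = crude_nonfatal_rate(early_mortality=0.30, risk_if_alive=0.20)
low_mortality_group = crude_nonfatal_rate(early_mortality=0.10, risk_if_alive=0.20)

print(f"high-mortality group: {high_mortality_group:.2f}")  # 0.14
print(f"low-mortality group:  {low_mortality_group:.2f}")   # 0.18
```

Under these assumptions the group with the higher mortality shows the lower crude nonfatal event rate even though the treatment has no effect on the nonfatal event among survivors.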
