Sample size calculator for cluster randomized trials

https://doi.org/10.1016/S0010-4825(03)00039-8

Abstract

Cluster randomized trials, where individuals are randomized in groups, are increasingly being used in healthcare evaluation. The adoption of a clustered design has implications for the design, conduct and analysis of studies. In particular, standard sample sizes have to be inflated for cluster designs, as outcomes for individuals within clusters may be correlated; inflation can be achieved either by increasing the cluster size or by increasing the number of clusters in the study. A sample size calculator is presented for calculating appropriate sample sizes for cluster trials, whilst allowing the implications of both methods of inflation to be considered.

Introduction

Traditionally the patient randomized controlled trial has been widely accepted as the method of choice for evaluating new healthcare interventions [1], [2]. In many situations, however, the use of a cluster randomized trial (where groups rather than individuals are randomized) may be advantageous. For example, when evaluating interventions targeted at the health professional (such as the impact of educational training on good clinical practice), randomizing at the level of the professional rather than the individual patient may be the only feasible method of conducting a randomized trial in this field [3]. Similarly, when assessing a dietary intervention, it is common to randomize families as an intact unit, to avoid the possibility of different members of the same family being assigned to different interventions.

The adoption of a cluster design affects the design, conduct and analysis of the study. One particular impact is on sample size requirements, as clustering reduces the efficiency of the design. The main reason is that standard sample size calculations for patient-based trials account only for variation between individuals. In cluster studies, however, there are two separate components of variation: variation among individuals within clusters and variation in outcome between clusters.
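As a rough illustration of these two components of variation, the sketch below expresses them as an intracluster correlation coefficient (ICC) and as the widely used inflation factor 1 + (m - 1) * ICC for clusters of equal size m. The function names and the variance values are hypothetical, chosen only for illustration, and this is not necessarily the routine implemented in the calculator.

```python
def intracluster_correlation(between_var: float, within_var: float) -> float:
    """Share of the total outcome variance that lies between clusters."""
    total = between_var + within_var
    if between_var < 0 or within_var < 0 or total <= 0:
        raise ValueError("variance components must be non-negative, with a positive total")
    return between_var / total


def design_effect(cluster_size: float, icc: float) -> float:
    """Standard inflation factor for equal cluster sizes: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


if __name__ == "__main__":
    # Hypothetical variance components: 1 unit between clusters, 19 within.
    icc = intracluster_correlation(between_var=1.0, within_var=19.0)
    print("ICC:", icc)
    print("Inflation factor for clusters of 20:", design_effect(20, icc))
```

Under these assumptions, even a small ICC produces a substantial inflation factor once clusters are moderately large, because the factor grows linearly with cluster size.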

This paper presents the development of a sample size calculator for the planning of cluster randomized trials. The impact of clustering on sample size calculations will be outlined, together with potential strategies to achieve the increased sample size required. The algorithms developed for use within the calculator will then be described, together with examples of the use of the calculator.

Impact of clustering on sample size

A fundamental assumption of the patient-randomized trial is that the outcome for an individual patient is completely unrelated to that for any other patient—they are said to be ‘independent’. This assumption no longer holds when cluster randomization is adopted, because patients within any one cluster are more likely to respond in a similar manner. For example, the management of patients in a single hospital is more likely to be consistent than management across a number of hospitals. A

Strategies for increasing sample size

To achieve the increased sample size required, it is possible either to increase the number of clusters to be recruited to the study or to increase the number of subjects included from each cluster. Compared with increasing the number of clusters to be included within a study, however, the effect on power of increasing the cluster size is minimal [8].
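The limited return from larger clusters can be seen numerically. The sketch below assumes equal cluster sizes and the standard inflation factor 1 + (m - 1) * ICC, and reports the "effective" number of independent subjects per arm. The function name and the figures used (10 or 20 clusters per arm, an ICC of 0.05) are illustrative assumptions rather than values taken from the paper.

```python
def effective_sample_size(clusters: int, cluster_size: float, icc: float) -> float:
    """Subjects per arm divided by the inflation factor 1 + (m - 1) * ICC."""
    if clusters < 1 or cluster_size < 1:
        raise ValueError("clusters and cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return clusters * cluster_size / (1.0 + (cluster_size - 1.0) * icc)


if __name__ == "__main__":
    icc = 0.05  # hypothetical intracluster correlation

    # Strategy 1: keep 10 clusters per arm and enlarge each cluster.
    for m in (20, 50, 100, 1000):
        print(f"10 clusters of {m}: effective n = {effective_sample_size(10, m, icc):.1f}")

    # Strategy 2: keep clusters of 20 and recruit more clusters.
    for k in (10, 20, 40):
        print(f"{k} clusters of 20: effective n = {effective_sample_size(k, 20, icc):.1f}")
```

Under these assumptions, with a positive ICC, the effective sample size per arm can never exceed the number of clusters divided by the ICC, however large each cluster becomes, whereas it rises in direct proportion to the number of clusters.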

Whilst increasing the number of clusters is theoretically the more effective method in redressing the loss of efficiency caused by clustering,

Scope of calculator

The sample size calculator was developed to address sample size issues for two group comparisons of either means or proportions, assuming:

  • a completely randomized design;

  • 1:1 randomization; and

  • equal cluster sizes.

Reflecting the desire for clinicians and researchers to consider the trade-off between increasing the cluster size or increasing the number of clusters, the sample size calculator was designed to allow the trade-off to be explicitly considered, through presentation of both options in a

Comparison of means

As detailed in Section 4.2.1 above, to calculate the sample size requirements to detect the difference between two means, the user is required to specify (a) the minimum difference to be detected; (b) the standard deviation; and (c) the desired significance and power settings.
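A minimal sketch of such a calculation is given below. It assumes the usual normal-approximation formula for a two-sided comparison of two means, inflated by 1 + (m - 1) * ICC for equal clusters of size m. The paper's own algorithm is not reproduced here, and the function names, the standard deviation of 15, the ICC of 0.05 and the cluster size of 20 are all hypothetical inputs.

```python
import math
from statistics import NormalDist
from typing import Optional


def individual_n_per_arm(difference: float, sd: float,
                         alpha: float = 0.05, power: float = 0.80) -> float:
    """Unrounded per-arm sample size for an individually randomized trial."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / difference ** 2


def clusters_per_arm(n_individual: float, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm when the cluster size is fixed."""
    inflated = n_individual * (1 + (cluster_size - 1) * icc)
    return math.ceil(inflated / cluster_size)


def cluster_size_for_fixed_clusters(n_individual: float, clusters: int,
                                    icc: float) -> Optional[int]:
    """Cluster size needed when the number of clusters per arm is fixed.

    Returns None when no cluster size can reach the target, which happens
    once clusters <= n_individual * icc.
    """
    denominator = clusters - n_individual * icc
    if denominator <= 0:
        return None
    return math.ceil(n_individual * (1 - icc) / denominator)


if __name__ == "__main__":
    # Hypothetical inputs: difference of 5 units, SD of 15, 5% significance, 80% power.
    n = individual_n_per_arm(difference=5, sd=15)
    print("Per arm, ignoring clustering:", math.ceil(n))
    print("Clusters of 20 needed per arm:", clusters_per_arm(n, 20, icc=0.05))
    print("Cluster size needed with 14 clusters per arm:",
          cluster_size_for_fixed_clusters(n, 14, icc=0.05))
```

The third function illustrates why the two methods of inflation are not interchangeable: below a certain number of clusters, no increase in cluster size recovers the required power under these assumptions.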

For example, consider a study that is evaluating the implementation of a clinical guideline for the management of blood pressure. A difference of 5 mmHg is deemed to be a clinically important change to detect and previous

Discussion

To date, the quality of the design, conduct and analysis of cluster randomized trials has been relatively poor. For example, Simpson et al. [10] showed that of 21 cluster trials identified in two major public health journals, only 4 (19%) had accounted for the clustering in the planning of the trial. Similarly, a review by Divine et al. [11], which examined studies of physicians’ patient-care practices, observed that 70% of studies identified had not appropriately accounted for the clustered

Summary

In this paper, the development of a sample size calculator for the estimation of appropriate sample sizes for cluster randomized trials is described. It allows researchers to explicitly trade off achieving the appropriate sample size through the recruitment of a greater number of clusters against achieving it through increasing the cluster size. It also allows researchers to identify the minimum difference detectable for a given fixed cluster size. It is recognized, however, that the calculator

Acknowledgements

The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Executive Health Department. The views expressed are not necessarily those of the funding body.

References (15)

  • S.J. Pocock, Clinical Trials: A Practical Approach (1983)

  • A. Bowling, Research Methods in Health (1997)

  • A. Donner et al., Design and Analysis of Cluster Randomization Trials in Health Research (2000)

  • A. Donner et al., Design considerations in the estimation of intraclass correlation, Ann. Hum. Genet. (1982)

  • A. Donner et al., Randomization by cluster: sample size requirements and analysis, Am. J. Epidemiol. (1981)

  • L. Kish, Survey Sampling (1965)

  • M.J. Campbell et al., Estimating sample sizes for binary, ordered categorical, and continuous outcomes in two group comparisons, Br. Med. J. (1995)

There are more references available in the full text version of this article.

Marion K. Campbell is the Director of the Health Care Assessment Programme in the Health Services Research Unit, University of Aberdeen, UK. The Health Care Assessment Programme focuses on the evaluation of primarily non-drug technologies. Marion is trained in statistics and is a Chartered Statistician of the Royal Statistical Society. She has over 10 years' experience in health services research and medical statistics. Her main interests are in the methodology of evaluative research, especially the field of cluster randomized trials. She has been involved in the design and conduct of a number of cluster trials, and has also undertaken research into methodological aspects of cluster trials, including factors that influence the magnitude and stability of intracluster correlation coefficients.

Sean Thomson graduated in 1993 from Aberdeen University in Electronics and Computing and began his career in the oil industry. He joined the Health Services Research Unit in 1996 as Information Systems Officer and managed the Unit's computer network as well as programming key systems for a number of research projects. Sean is currently in the marine electronics industry managing the information technology infrastructure of a multi-site sales and manufacturing company.

Craig R. Ramsay joined the Health Services Research Unit in January 1995 as a statistician. He took up the post of Senior Statistician within the Effective Professional Practice Programme in 2000 and has recently taken over responsibility for overseeing the Unit's statistics team. Craig is statistical editor for the Cochrane Effective Practice and Organisation of Care group. He graduated in Mathematics and Statistics at Edinburgh University in 1993. After receiving a Postgraduate Diploma in Information Systems from Napier University, he joined the Information and Statistics Division of the Common Services Agency, Edinburgh, before joining the Unit.

Graeme S. MacLennan joined the Health Services Research Unit in 1998 as a Statistician, having completed a PGCE(S) and a BSc degree in Mathematics at the University of Aberdeen. Current interests include design and analysis of cluster randomized trials and methods for incorporating data from such trials into meta-analyses.

Jeremy M. Grimshaw is the Director of the Clinical Epidemiology Program of the Ottawa Health Research Institute and Director of the Centre for Best Practice, Institute of Population Health, University of Ottawa. He holds a Tier 1 Canadian Research Chair in Health Knowledge Transfer and Uptake and is a Full Professor in the Department of Epidemiology and Community Medicine, University of Ottawa. Prior to this he held a Personal Chair in Health Services Research at the University of Aberdeen, UK, and was the Programme Director of the Effective Professional Practice Programme within the Health Services Research Unit. Dr. Grimshaw received an MBChB (MD equivalent) from the University of Edinburgh, UK. He trained as a family physician prior to undertaking a Ph.D. in health services research at the University of Aberdeen.

He is the Coordinating Editor of the Cochrane Effective Practice and Organization of Care group that aims to support systematic reviews of professional, organizational, financial and regulatory interventions to improve health care delivery and systems. He has been involved in 14 cluster randomized trials and two interrupted time series of different dissemination and implementation strategies. He has also undertaken research into statistical issues in the design, conduct and analysis of cluster randomized trials.