
Education and debate

Statistics notes: Sample size in cluster randomisation

BMJ 1998; 316 doi: https://doi.org/10.1136/bmj.316.7130.549 (Published 14 February 1998) Cite this as: BMJ 1998;316:549
  Sally M Kerry, statistician (a)
  J Martin Bland, professor of medical statistics (b)

  (a) Division of General Practice and Primary Care, St George's Hospital Medical School, London SW17 0RE
  (b) Department of Public Health Sciences

  Correspondence to: Mrs Kerry


    Techniques for estimating sample size for randomised trials are well established,1 2 but most texts do not discuss sample size for trials which randomise groups (clusters) of people rather than individuals. For example, in a study of different preparations to control head lice, all children in the same class were allocated to receive the same preparation. This was done to avoid contamination of the treatment groups through contact with control children in the same class.3 The children in a class cannot be considered independent of one another, and the analysis should take this into account.4 5 There will be some loss of power due to randomising by cluster rather than by individual, and this should be reflected in the sample size calculations. Here we describe sample size calculations for a cluster randomised trial.

    For a conventional randomised trial assessing the difference between two sample means, the number of subjects required in each group, n, to detect a difference of d using a significance level of 5% and a power of 90% is given by n = 21s²/d², where s is the standard deviation of the outcome measure. Other values of power and significance can be used.1
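    The multiplier of 21 is what the standard normal approximation 2(z_0.975 + z_0.90)²s²/d² gives for 5% two sided significance and 90% power. A minimal Python sketch of this calculation follows; the function name n_per_group and the rounding up to a whole number of subjects are illustrative choices, not part of the original note.

        from statistics import NormalDist
        from math import ceil

        # Where the 21 comes from: 2 * (1.960 + 1.282)^2 is approximately 21.
        z = NormalDist()
        multiplier = 2 * (z.inv_cdf(0.975) + z.inv_cdf(0.90)) ** 2

        def n_per_group(s, d):
            """Subjects per group to detect a difference d between two means,
            with standard deviation s, at 5% significance and 90% power."""
            return ceil(multiplier * s ** 2 / d ** 2)

        print(round(multiplier, 2))  # 21.01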

    For a trial using cluster randomisation we need to take the design into account. For a continuous outcome measurement such as serum cholesterol concentration, a simple method of analysis is based on the mean of the observations for all subjects in each cluster and compares these means between the treatment groups. We will denote the variance of observations within one cluster by s_w² and assume that this variance is the same for all clusters. If there are m subjects in each cluster, the variance of a single cluster mean is s_w²/m. The true cluster mean (which is unknown) will vary from cluster to cluster, with variance s_c². The observed variance of the cluster means will be the sum of the variance between clusters and the variance within clusters divided by the cluster size; that is, variance of outcome = s_c² + s_w²/m. Hence we can replace s² by s_c² + s_w²/m in the formula for sample size above to obtain the number of clusters required in each intervention group. To do this we need estimates of s_c² and s_w².
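    A short sketch of this substitution follows, in the same Python setting as above; the function name clusters_per_group and the rounding up to a whole number of clusters are illustrative, not prescribed by the note.

        from math import ceil

        def clusters_per_group(s_c2, s_w2, m, d, multiplier=21):
            """Clusters (e.g. practices) per arm when the analysis compares cluster means.

            The variance of one cluster mean, s_c2 + s_w2 / m, replaces s^2 in the
            individual randomisation formula n = multiplier * s^2 / d^2.
            """
            design_variance = s_c2 + s_w2 / m
            return ceil(multiplier * design_variance / d ** 2)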

    For example, in a proposed study of a behavioural intervention in general practice to lower cholesterol concentrations, practices were to be randomised into two groups, one to offer intensive dietary intervention by practice nurses using a behavioural approach and the other to offer usual general practice care. The outcome measure would be the mean cholesterol concentration in patients attending each practice one year later. Estimates of the between practice and within practice variances were obtained from the Medical Research Council thrombosis prevention trial6 and were s_c² = 0.0046 and s_w² = 1.28 respectively. The minimum difference considered clinically relevant was 0.1 mmol/l. If we recruit 50 patients per practice, we have s² = s_c² + s_w²/m = 0.0046 + 1.28/50 = 0.0302. The number of practices required is n = 21 × 0.0302/0.1² = 63 in each group. We would therefore need 63 practices in each group to detect a difference of 0.1 mmol/l with a power of 90% using a 5% significance level, a total of 3150 patients in each group.
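    The arithmetic can be checked directly; the exact figure is 63.4 practices per group, which the text rounds to 63.

        s2 = 0.0046 + 1.28 / 50           # s_c^2 + s_w^2/m = 0.0302
        n_practices = 21 * s2 / 0.1 ** 2  # 63.4, reported as 63 per group
        patients_per_group = 63 * 50      # 3150
        print(round(s2, 4), round(n_practices, 1), patients_per_group)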

    It can be seen from the formula for the variance of the outcome that when the number of patients within a practice, m, is very large, s_w²/m will be very small, so the overall variance is roughly the same as the variance between practices. In this situation, increasing the number of patients per practice will not increase the power of the study. The table shows the number of practices required for different values of m, the number of subjects per practice. In all situations the total number of subjects required is greater than if simple random allocation had been used.

    Table: Total number of practices required to detect a difference of 0.1 mmol/l cholesterol with 90% power at 5% significance level

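    The published table is not reproduced here, but figures of the same kind can be generated with the sketch below; the values of m are illustrative, and rounding up to whole practices may differ by a practice or so from the published figures.

        from math import ceil

        s_c2, s_w2, d = 0.0046, 1.28, 0.1

        for m in (10, 25, 50, 100, 250, 500):
            per_group = ceil(21 * (s_c2 + s_w2 / m) / d ** 2)
            print(f"m = {m:3d}: {per_group} practices per group, "
                  f"{2 * per_group * m} patients in total")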

    The ratio of the total number of subjects required using cluster randomisation to the number required using simple randomisation is called the design effect. Thus a cluster randomised trial with a large design effect will require many more subjects than a trial of the same intervention which randomises individuals. As the number of patients per practice increases, so does the design effect. In the table, the design effect is very small when m is less than 10, but this would involve recruiting a total of 558 practices, and the nature of the intervention and the difficulty of recruiting practices made this impractical. It was therefore decided to recruit fewer practices. The design effect of using 126 practices with 50 patients from each practice was 1.17, so this design requires the total sample size to be inflated by 17%. If the study involves training practice based staff it may be cost effective to reduce the number of practices even further. If we chose to use 32 practices we would need 500 patients from each practice and the design effect would be 2.98. Thus the cluster design with 32 practices would require the total sample size to be trebled to maintain the same level of power.
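    As a rough check of these design effects, the sketch below assumes that an individually randomised trial would be sized using the within practice variance s_w² = 1.28 (the small between practice component adds little to it); under that assumption the ratios come out at about 1.17 and 2.98, and any small discrepancy from the published figures is down to rounding.

        # Total patients if individuals were randomised: 2 * 21 * s_w^2 / d^2
        n_simple_total = 2 * 21 * 1.28 / 0.1 ** 2  # 5376 patients

        # (practices per group, patients per practice): 63 x 50 gives 126 practices
        # in total, 16 x 500 gives 32 practices in total.
        for practices_per_group, m in ((63, 50), (16, 500)):
            cluster_total = 2 * practices_per_group * m
            print(m, round(cluster_total / n_simple_total, 2))  # 1.17 and 2.98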

    We shall discuss the use of the intracluster correlation coefficient in these calculations in a future statistics note.

    References

    1.
    2.
    3.
    4.
    5.
    6.