Understanding observational studies


The development of evidence-based medicine has encouraged clinicians to consider the best available evidence when making clinical decisions, individualised for the patient.1,2 A hierarchy of levels of evidence about therapy has been described, with meta-analyses and large, well-designed randomised controlled trials at the top.3-5 Lower down the hierarchy are observational studies, a group that includes cohort, case-control and cross-sectional studies. Here we describe the benefits and limitations of observational studies, and how to assess their quality.

Randomised controlled trials (RCTs) can indicate causal relationships between an intervention and an outcome in controlled experimental conditions where the intervention/control is the only key difference between treatment groups.6 In observational studies, the researcher does not determine or influence the intervention or exposure that participants experience during the research, so these studies may support but not prove causality.7 Of particular importance when considering observational studies are the concepts of bias, confounding and validity (see Box 1).

Box 1.

Bias, confounding factors and validity

Bias: any tendency to influence the results of a trial (or their interpretation) other than the experimental intervention.8 Examples include:

  • Selection bias: a systematic difference between the groups at baseline (e.g. a difference in confounding factors).4

  • Performance bias: errors and inconsistencies in the allocation, application and recording of interventions.4

  • Attrition bias: differential drop-out between groups.4

  • Detection bias: assessment of outcomes is not standardised or blinded.4

Confounding factor: a variable that is not the one you are interested in but that may affect the results of the trial.8

Validity: the extent to which a variable or intervention measures what it is supposed to measure or accomplishes what it is supposed to accomplish. The internal validity of a study refers to the integrity of the experimental design. The external validity of a study refers to the appropriateness by which its results can be applied to non-study patients or populations.8

Types of observational studies

There are three main types of observational study (cohort studies, case-control studies and cross-sectional studies) that are distinguished by the objective of the research, how subjects are sampled and the timeline of data collection.5 These factors have a bearing on the extent to which studies may support a hypothesis of causality (see Box 2).

Box 2.

Aspects of an association between exposure and effect that help indicate causality:9,10

Strength of association (e.g. the death rate from lung cancer in cigarette smokers is 9 to 10 times the rate in non-smokers).

Consistency of association across different studies (e.g. the association of smoking with lung cancer in many retrospective and prospective studies).

Specificity of association (e.g. lack of association between smoking and many other types of cancer).

Temporal relationship (e.g. smoking precedes onset of lung cancer).

Dose-response relationship (e.g. the death rate from lung cancer rises linearly with the number of cigarettes smoked daily).

Experimental evidence (e.g. risk of lung cancer falls in those who quit smoking).

Cohort study

The identifying feature of a cohort study is that the subjects are followed over time.5 Cohort studies begin with individuals who are followed for the subsequent development of an outcome.5 In a prospective cohort study, the individuals are recruited at the beginning of the study and followed up into the future for the occurrence of the outcome.5 An example is the Million Women Study, in which 1.3 million UK women were recruited between 1996 and 2001.10 It is investigating how various reproductive and lifestyle factors affect women's health.11

In a retrospective cohort study, the outcomes are known when the study starts and the investigators look back in time to identify the exposures from available records.5

Cohort studies are an appropriate study design when:

  • there is good evidence to suggest an association between an exposure and an outcome;

  • the interval between exposure and development of the outcome is relatively short, which minimises loss to follow-up in prospective cohorts; and

  • the outcome is not too rare (so that the size of the cohort is not too large).5

A cohort study allows the investigators to identify new or ‘incident’ cases of the outcome, look at disease progression, staging and natural history, and assess the temporal relationship that would support causality.5
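Because a cohort study counts incident cases in groups defined by exposure, it supports a direct estimate of risk in each group. The following is a minimal sketch, using hypothetical counts and function names that do not come from any study cited here, of how cumulative incidence and a relative risk might be calculated.

```python
def cumulative_incidence(new_cases, people_at_risk):
    """Proportion of an initially outcome-free group that develops the outcome."""
    if people_at_risk <= 0:
        raise ValueError("people_at_risk must be positive")
    return new_cases / people_at_risk


def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Cumulative incidence in the exposed divided by that in the unexposed."""
    risk_exposed = cumulative_incidence(exposed_cases, exposed_total)
    risk_unexposed = cumulative_incidence(unexposed_cases, unexposed_total)
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the unexposed risk is zero")
    return risk_exposed / risk_unexposed


# Hypothetical cohort (illustrative numbers only): 1,000 exposed and
# 1,000 unexposed people, all free of the outcome at the start of follow-up.
rr = relative_risk(exposed_cases=90, exposed_total=1000,
                   unexposed_cases=10, unexposed_total=1000)
print(f"Relative risk: {rr:.1f}")
```

A relative risk well above 1 corresponds to the 'strength of association' criterion in Box 2, but on its own it does not rule out bias or confounding.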

Limitations of cohort studies include:

  • large sample sizes may involve high data collection costs in prospective studies;12

  • it can be difficult to track subjects for the entire study in prospective studies;5

  • retrospective cohort studies may be limited by the data that were collected in the past, as records are often not available for the exposure of interest or are not sufficiently accurate or detailed;12 and

  • there is a risk of confounding factors differing between groups (selection bias).5
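The last limitation can be illustrated with a stratified comparison. In the sketch below, all counts are hypothetical and age group is assumed to be the confounding factor: older people are both more likely to be exposed and more likely to have the outcome, so the crude comparison suggests an effect that disappears within each age group.

```python
# Hypothetical counts, as (events, people) for each exposure group per stratum.
strata = {
    "older": {"exposed": (160, 800), "unexposed": (40, 200)},
    "younger": {"exposed": (10, 200), "unexposed": (40, 800)},
}


def risk_ratio(exposed, unexposed):
    """Risk in the exposed divided by risk in the unexposed."""
    exposed_events, exposed_total = exposed
    unexposed_events, unexposed_total = unexposed
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


def crude_risk_ratio(strata):
    """Risk ratio after pooling all strata, ignoring the stratifying factor."""
    pooled = {}
    for group in ("exposed", "unexposed"):
        events = sum(stratum[group][0] for stratum in strata.values())
        people = sum(stratum[group][1] for stratum in strata.values())
        pooled[group] = (events, people)
    return risk_ratio(pooled["exposed"], pooled["unexposed"])


stratum_ratios = {
    name: risk_ratio(stratum["exposed"], stratum["unexposed"])
    for name, stratum in strata.items()
}

print(f"Crude risk ratio: {crude_risk_ratio(strata):.3f}")
for name, ratio in stratum_ratios.items():
    print(f"Risk ratio in {name} stratum: {ratio:.3f}")
```

In these invented data the crude ratio is above 2 while the ratio within each age group is 1, which is the pattern expected when the apparent association is produced entirely by the confounding factor.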

Case-control study

Case-control studies begin with individuals who have the outcome of interest (cases) and compare them with individuals who do not have the outcome (controls) according to past history of exposure to a factor.5

Case-control studies are appropriate when:

  • the outcome is rare, as they can generate a lot of information from relatively few subjects;5,13

  • there is reliable evidence of past exposure;5 and

  • there is a long latent period between an exposure and the disease.13

Common problems and limitations of case-control studies include:

  • They are restricted to a single outcome.13

  • Selection bias: controls should be similar to cases in all respects other than having the disease or should be similar to the general population from which the cases arose. Some case-control studies use matching to select controls so that they are similar to cases in specific characteristics (e.g. race, age, sex, socioeconomic status) that could be confounding factors.5

  • Low generalisability: it is desirable to select cases from multiple institutions to obtain more generalisable results.5 Criteria for case eligibility should be carefully specified. Where matching is used, cases that are unable to be matched are often excluded from the analyses; a high proportion excluded could limit the generalisability of the study.

  • Recall bias (a type of performance bias): this arises if cases and controls differentially recall past exposures or if there is more thorough documentation on cases compared with controls.5

  • Interviewer or ascertainment bias (a type of detection bias): study investigators may interview cases more thoroughly regarding past exposures than controls because they know the subject is a case.5
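Because a case-control study samples people by outcome rather than by exposure, risk cannot be estimated directly, and the usual measure of association is the odds ratio. The sketch below uses hypothetical counts; the confidence interval uses the common log-based (Woolf) approximation, which is an assumption of this example rather than a method taken from the sources cited.

```python
import math


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    if cases_unexposed == 0 or controls_exposed == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


def odds_ratio_confidence_interval(cases_exposed, cases_unexposed,
                                   controls_exposed, controls_unexposed, z=1.96):
    """Approximate 95% interval from the standard error of the log odds ratio."""
    cells = (cases_exposed, cases_unexposed, controls_exposed, controls_unexposed)
    if min(cells) == 0:
        raise ValueError("the log-based interval needs all four cells to be non-zero")
    log_or = math.log(odds_ratio(*cells))
    standard_error = math.sqrt(sum(1 / cell for cell in cells))
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Hypothetical study: 100 cases (60 exposed) and 100 controls (30 exposed).
estimate = odds_ratio(60, 40, 30, 70)
lower, upper = odds_ratio_confidence_interval(60, 40, 30, 70)
print(f"Odds ratio: {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The calculation says nothing about whether exposure was recalled or recorded equally well in the two groups, so the biases listed above still need to be assessed separately.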

Cross-sectional study

A cross-sectional study is an observational study in which exposure and outcome are determined simultaneously for each subject (e.g. a survey).5 This type of study can be conducted to estimate the prevalence of a disease, treatment, service or other outcome of interest for a given population at a specific point in time.5

Advantages of cross-sectional studies include:

  • subjects are not deliberately exposed, treated or left untreated, so there are seldom ethical difficulties; and

  • data are collected only once and multiple outcomes can be studied, so this type of study is relatively quick and cheap.13

Limitations and drawbacks of cross-sectional studies include:

  • Because the exposure and outcome are simultaneously assessed, although the investigator may determine that there is an association between an exposure and an outcome, it may not be possible to determine the time sequence between them or causality.5

  • They evaluate prevalent rather than incident outcomes (i.e. having the outcome as opposed to developing the outcome), so there is a bias towards including individuals with more favourable survivorship (people who develop the outcome but die before the study are excluded).5

  • It is difficult to study rare conditions using cross-sectional studies because, even in large samples, there may be no-one with the disease.13

  • There is a risk of confounding.5

  • Any study with a low response rate can be criticised because it can miss significant differences between the responders and non-responders, and volunteers are unlikely to be representative of the general population.13
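Two of the points above lend themselves to a short calculation: the point prevalence that a survey estimates, and the chance that a survey of a rare condition finds no cases at all. This is a rough sketch with hypothetical numbers, and it assumes that respondents are sampled independently from a population with a fixed true prevalence.

```python
def point_prevalence(existing_cases, people_surveyed):
    """Proportion of surveyed people who have the outcome at the time of the survey."""
    if people_surveyed <= 0:
        raise ValueError("people_surveyed must be positive")
    return existing_cases / people_surveyed


def probability_of_no_cases(true_prevalence, sample_size):
    """Chance that a sample contains no cases, assuming independent sampling."""
    if not 0 <= true_prevalence <= 1:
        raise ValueError("true_prevalence must be between 0 and 1")
    return (1 - true_prevalence) ** sample_size


# Hypothetical survey: 125 of 2,000 respondents report the condition.
prevalence = point_prevalence(existing_cases=125, people_surveyed=2000)
print(f"Point prevalence: {prevalence:.2%}")

# Hypothetical rare condition affecting 1 in 100,000 people, surveyed in 10,000.
chance_of_none = probability_of_no_cases(true_prevalence=1e-5, sample_size=10_000)
print(f"Chance of finding no cases: {chance_of_none:.0%}")
```

Under these assumptions a survey of 10,000 people would usually find nobody with the rare condition, which is why a case-control design is often preferred for rare outcomes.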

Observational studies vs. RCTs

The key design differences between RCTs and observational studies are shown in the Table.


Table. Comparison of experimental and observational study designs5,13

Advantages of RCTs

The main benefit of RCTs is the use of a randomisation procedure, which makes the comparison groups similar with respect to both known and unknown prognostic factors at baseline, apart from chance, reducing the possibility of selection bias.4 RCTs also use protocols to standardise interventions and outcome assessment. To reduce the likelihood of performance and detection bias, study outcomes are assessed by investigators who do not know which group participants were allocated to (known as blinded outcome assessment).4 RCTs often employ specific methods for follow-up to reduce attrition, such as strategies to motivate participants and clinicians, and strategies to improve communication with participants or trial sites.14

Disadvantages of RCTs

RCTs are conducted under conditions that do not necessarily reflect everyday clinical practice.15 For example:

  • inclusion criteria often result in selected, homogeneous patient populations with high compliance;

  • exclusion criteria frequently bar patients with comorbidities and co-medications from participation; and

  • quality-assurance measures under study conditions usually cannot be achieved in clinical routine.15

RCTs are expensive and there are some therapies for which RCTs are unethical, unnecessary, inappropriate, impossible or inadequate.1,4 Examples include the assessment of rare adverse effects of treatments, some preventive interventions and policy changes.4 In addition, endpoints that are clinically relevant and important for the patient, such as quality of life, may not be investigated in RCTs.15

In these situations, the evidence may rely on observational studies.4

The future of observational studies

The increasing amount of patient data stored in electronic databases provides a usable source of secondary data (in contrast to conducting a primary study).13 Examples include demographic and lifestyle data from the Census and General Household Survey; morbidity data from GP morbidity statistics, communicable disease surveillance, hospital activity analysis and cancer registration; mortality data from the Office of Population Censuses and Surveys (OPCS) mortality statistics; health service data from immunisation and screening records and confidential inquiry reports; disability information from the OPCS survey of disability; and prescribing data from GP clinical system databases.13

The advantages include:

  • data are available on large numbers of people, reducing the risk of selection bias;13

  • data are collected by people other than the researcher and independently of any specific hypothesis, reducing the risk of performance or detection bias;13

  • the use of previously collected data is efficient and comparatively inexpensive;13 and

  • data are collected in a very standardised way, permitting comparisons over time and between different countries.13

One particular example is the use of patient registries, which collect uniform data to evaluate specified outcomes for a population defined by a particular disease, condition or exposure.16

Advantages of registries include:

  • they can provide clinically relevant data that are meaningful to patients and providers for research on patient-centred outcomes;

  • they can provide data on populations not typically studied in clinical trials (e.g. children, older people, minority groups, pregnant women, those with multiple comorbidities);

  • their non-experimental design (i.e. no randomisation) allows them to be used to examine the impact of clinical practice behaviours on quality of care, prescribing preference, and other important but difficult to quantify covariates.16

Disadvantages of secondary data may include:

  • problems with generalisability if the registry participants are not sufficiently comparable to the target population;16

  • data may not be ideally suited to the testing of the current hypothesis or may be incomplete (e.g. patients may miss visits, drop out of a registry, or decline to complete a patient-reported outcome questionnaire), resulting in performance, attrition or detection bias;13,16

  • potential confounding variables must be considered and appropriate statistical techniques used to address this issue during analysis;16

  • the primary use of the data may be for reasons other than medical research (e.g. GP clinical systems) and, therefore, may include only those events that are relevant for its primary purpose;17

  • concerns over multiple comparisons of data; if one analysis does not find a significant difference, further subsets or outcomes may be analysed until a significant result is found.18
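The concern about multiple comparisons can be quantified. The sketch below assumes that the tests are independent and that each uses the same significance threshold; the Bonferroni adjustment is shown as one widely used correction, not as a method recommended by the sources cited here.

```python
def family_wise_error_rate(alpha, number_of_tests):
    """Chance of at least one false-positive result across independent tests."""
    if number_of_tests < 1:
        raise ValueError("number_of_tests must be at least 1")
    return 1 - (1 - alpha) ** number_of_tests


def bonferroni_threshold(alpha, number_of_tests):
    """Per-test threshold that keeps the overall error rate at or below alpha."""
    if number_of_tests < 1:
        raise ValueError("number_of_tests must be at least 1")
    return alpha / number_of_tests


# Hypothetical secondary analysis that tests 20 subgroups or outcomes at p < 0.05.
overall_risk = family_wise_error_rate(alpha=0.05, number_of_tests=20)
adjusted = bonferroni_threshold(alpha=0.05, number_of_tests=20)
print(f"Chance of at least one spurious 'significant' result: {overall_risk:.0%}")
print(f"Bonferroni-adjusted threshold per test: {adjusted:.4f}")
```

With 20 unplanned comparisons the chance of at least one spurious finding is well over half, which is why analyses of secondary data are more credible when the hypotheses and outcomes are specified in advance.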


Conclusion

Randomised controlled trials (RCTs) are studies of high methodological quality that describe outcomes related to an intervention compared with a control treatment. RCTs have many safeguards against possible sources of bias. However, they are carried out under controlled conditions that may not be representative of routine clinical practice and are expensive to conduct. Observational studies may capture a broader patient population in a real-world context and, therefore, provide information that is more generalisable. Such studies may support but not prove causality and care must be taken to assess the selection criteria for the groups of patients being compared and the measurement of the interventions/exposures and outcomes, to ensure that results are not biased or compromised by confounding factors. The increasing use of databases, such as patient registries, allows secondary analyses to be conducted relatively inexpensively on large datasets, rather than designing expensive primary studies to answer research questions. Such analyses are likely to be employed more in the future, especially to answer patient-centred questions among populations typically excluded from clinical trials. Nevertheless, it is important that the limitations of all such analyses are carefully considered. Of particular concern is the repeated analysis of data until a statistically significant result is found.