Table 1

Types of routine data in healthcare

Columns: Data type; Definition; Characteristics; Examples
Data type: Administrative data
Definition: Data collected as part of the routine administration of healthcare, for example reimbursement and contracting. Secondary uses include the assessment of health outcomes and quality of care.
Characteristics:
Records of attendances, procedures and diagnoses entered manually into the administration system for a hospital or other healthcare organisation and then collated at regional or national level.
Little or no patient or clinician review; no data on severity of illness.
Examples:
Hospital episode statistics (HES; England): Clinical coders review patients’ notes, then assign and enter codes following discharge. A grouper algorithm uses these codes to calculate the payment owed to the care provider [11]. HES data have been used to generate quality metrics, including hospital standardised mortality indicators [8]. They have also been used to build predictive risk models, for example to allow clinicians to identify cohorts at risk of readmission [12] or to allocate scarce resources in real time [13].
Data type: Clinically generated data
Definition: Data collected by healthcare workers to provide diagnosis and treatment as part of clinical care. These data might arise from the patient (for example, reports of symptoms) but are recorded by the clinician. Secondary uses include the surveillance of disease incidence and prevalence.
Characteristics:
Electronic medical record of patient diagnoses and treatment.
Results of laboratory tests.
Compared with administrative data, less standardised in terms of the codes used and less likely to be collated at regional and national levels.
Examples:
Electronic medical record (EMR): In 2012, more than 90% of primary care doctors in Australia, the Netherlands, New Zealand, Norway and the UK reported using an EMR [14]. Linked EMR data have been used in Scotland to create a prospective cohort of patients with diabetes. In addition to being used to integrate patient care, they have been used in research to estimate life expectancy for the patient cohort [15]. An evolution of the EMR (an electronic physiological surveillance system including improved recording of patients’ vital signs) was used to calculate early warning scores that led to a reduction in mortality as part of an advanced predictive risk model [16].
National and regional microbiological surveillance system (UK): Results of clinical tests ordered by clinicians are recorded at laboratory level before being reported regionally and nationally. Reporting is mandatory for certain infections and organisms (eg, Clostridium difficile) and voluntary for others. These data are interpreted using automatic statistical algorithms to detect outbreaks of infectious disease [17].
Data type: Patient-generated data (type 1: clinically directed)
Definition: Data requested by the clinician or healthcare system and reported directly by the patient to monitor patient health.
Characteristics:
Data collected by the patient on clinical metrics (eg, blood pressure), symptoms, or patient-reported outcomes.
Choice of data directed by the healthcare system.
Examples:
Swedish rheumatology quality registry: Uses patient-reported data as a decision support tool to optimise treatment during routine clinic visits and for comparative effectiveness studies. These data have also been used to examine the impact of multiple genetic, lifestyle and other factors on the health of patients [18].
Telehealth: For example, patients with heart failure are asked to supply information on weight or symptoms on a regular basis, using either the telephone [19] or Bluetooth-enabled devices [20].
Data type: Patient-generated data (type 2: individually directed)
Definition: Data that the individual decides to record autonomously, without the direct involvement of a healthcare practitioner, for personal monitoring of symptoms, social networking or peer support.
Characteristics:
Symptoms and treatment recorded by the patient.
Recorded outside the ‘traditional’ healthcare system structures.
Examples:
Patients like me: An online (http://www.patientslikeme.com) quantitative personal research platform for patients with life-changing illnesses. A cross-sectional online survey showed that patients perceived benefit from using these networks to learn about a symptom they had experienced and to understand the side effects of their treatments [21]. Similar platforms exist for mental health [22] and cardiology [23].
Individual and patient activity on social media: Analysis of key terms on Twitter has been used to monitor patient outcomes and perception of care. No clear relationship between Twitter sentiment and other measures of quality has been shown [24, 25]. Search engine usage (Google) has also been used in an attempt to track and predict flu outbreaks, but to date no public health benefit has been demonstrated [26].
Data type: Machine-generated data
Definition: Data automatically generated by a computer process, sensor, etc, to monitor staff or patient behaviour passively.
Characteristics:
Record of individual behaviour as generated by interaction with machines.
The nature of the data recorded is determined by the technology used, and substantial processing is typically required to interpret it.
Examples:
Indoor positioning technologies: Sensors have been used to record the movement of healthcare workers within out-of-hours care [27]. A recent study used sensors on healthcare workers and hand hygiene dispensers to show that healthcare workers were three times more likely to use the gel dispensers when they could see the auditor [28].
Telecare sensors: Telecare aims for remote, passive and automatic monitoring of behaviour within the home, for example for frail older people [29]. A Cochrane review on ‘smart home’ technology found no studies that fulfilled the quality criteria [29], and a larger, more recent randomised study has failed to demonstrate a positive impact of this approach on healthcare usage [30].