General Practice Extraction Service (GPES) Data for Pandemic Planning and Research (GDPPR) Dataset

1. Summary

The information below is retrieved from the Health Data Gateway API developed by NHS England, with additional fields added by UK LLC (indicated by italics).

# define target dataset to document
schema = 'nhsd'
table = 'GDPPR'
version = 'v0003'
# import functions from script helper
import sys
script_fp = "../../../../scripts/"
sys.path.insert(0, script_fp)
from data_doc_helper import DocHelper
# create instance
document = DocHelper(schema, table, version, script_fp)
# display helpers used to render Markdown from a code cell
from IPython.display import display, Markdown
# get api data
dataset = document.get_api_data()
display(Markdown("**NHS England title of dataset:** "+dataset['datasetfields']['metadataquality']['title']))
display(Markdown("***Dataset name in UK LLC TRE:*** *nhsd.GDPPR*"))  
display(Markdown("**Short abstract:** "+dataset['datasetfields']['abstract']))
display(Markdown("***Extended abstract:*** *The existing General Practice Extraction Service (GPES) is used to run regular extracts from General Practices in England that have opted into contributing to the GDPPR dataset. Included in the extracts are all patients currently registered with a GP or with a date of death on or after 1 November 2019 whose health record contains coded information relevant to pandemic planning and research. Data collected include demographic information, diagnoses and findings, medications and other prescribed items, investigations, tests and results, treatments and outcomes, and vaccinations and immunisations. For a full list of variables see the [NHS England Metadata dashboard.](https://digital.nhs.uk/services/data-access-request-service-dars/dars-products-and-services/metadata-dashboard) For further information see the [GDPPR guide for analysts.](https://digital.nhs.uk/coronavirus/gpes-data-for-pandemic-planning-and-research/guide-for-analysts-and-users-of-the-data)*"))
display(Markdown("**Geographical coverage:** "+dataset['datasetfields']['geographicCoverage'][0]))
display(Markdown("**Temporal coverage:** "+dataset['datasetfields']['datasetStartDate']))
display(Markdown("***Data available in UK LLC TRE from:*** *01/06/2020 onwards*"))
display(Markdown("**Typical age range:** "+dataset['datasetfields']['ageBand']))
display(Markdown("**Collection situation:** "+dataset['datasetv2']['provenance']['origin']['collectionSituation'][0]))
display(Markdown("**Purpose:** "+dataset['datasetv2']['provenance']['origin']['purpose'][0]))
display(Markdown("**Source:** "+dataset['datasetv2']['provenance']['origin']['source'][0]))
display(Markdown("**Pathway:** "+dataset['datasetv2']['coverage']['pathway']))
display(Markdown("***Information collected:*** *Demographic information, diagnoses and findings, medications and other prescribed items, investigations, tests and results, treatments and outcomes, and vaccinations and immunisations*"))  
display(Markdown("***Structure of dataset:*** *Each line represents one participant*"))  
display(Markdown("***Update frequency in UK LLC TRE:*** *Quarterly*"))  
display(Markdown("***Dataset versions in UK LLC TRE:*** *TBC*"))
display(Markdown("***Data quality issues:*** *TBC*"))  
display(Markdown("***Restrictions to data usage:*** *Research must be related to COVID-19; data must not be used for any form of performance management of General Practices; and research must be for medical purposes only (medical research) as defined in the NHS Act 2006: [https://www.legislation.gov.uk/ukpga/2006/41/part/13/crossheading/patient-information](https://www.legislation.gov.uk/ukpga/2006/41/part/13/crossheading/patient-information)*"))  
display(Markdown("***Further information:*** *[https://digital.nhs.uk/coronavirus/gpes-data-for-pandemic-planning-and-research/guide-for-analysts-and-users-of-the-data](https://digital.nhs.uk/coronavirus/gpes-data-for-pandemic-planning-and-research/guide-for-analysts-and-users-of-the-data)*"))

NHS England title of dataset: GPES Data for Pandemic Planning and Research (COVID-19)

Dataset name in UK LLC TRE: nhsd.GDPPR

Short abstract: NHS Digital’s fortnightly collection of GP data will provide data to support vital planning and research into coronavirus (COVID-19).

Extended abstract: The existing General Practice Extraction Service (GPES) is used to run regular extracts from General Practices in England that have opted into contributing to the GDPPR dataset. Included in the extracts are all patients currently registered with a GP or with a date of death on or after 1 November 2019 whose health record contains coded information relevant to pandemic planning and research. Data collected include demographic information, diagnoses and findings, medications and other prescribed items, investigations, tests and results, treatments and outcomes, and vaccinations and immunisations. For a full list of variables see the NHS England Metadata dashboard. For further information see the GDPPR guide for analysts.

Geographical coverage: United Kingdom, England

Temporal coverage: 01/01/1900

Data available in UK LLC TRE from: 01/06/2020 onwards

Typical age range: 16-150

Collection situation: PRIMARY CARE

Purpose: ADMINISTRATIVE

Source: EPR

Pathway: NOT APPLICABLE

Information collected: Demographic information, diagnoses and findings, medications and other prescribed items, investigations, tests and results, treatments and outcomes, and vaccinations and immunisations

Structure of dataset: Each line represents one participant

Update frequency in UK LLC TRE: Quarterly

Dataset versions in UK LLC TRE: TBC

Data quality issues: TBC

Restrictions to data usage: Research must be related to COVID-19; data must not be used for any form of performance management of General Practices; and research must be for medical purposes only (medical research) as defined in the NHS Act 2006: https://www.legislation.gov.uk/ukpga/2006/41/part/13/crossheading/patient-information

Further information: https://digital.nhs.uk/coronavirus/gpes-data-for-pandemic-planning-and-research/guide-for-analysts-and-users-of-the-data
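
The summary cell above builds each line by concatenating a label with a nested field from the API response. Concatenation of this kind raises an error if a key is absent or a value is null. The sketch below shows one possible way to read nested fields defensively. The helper `get_nested`, the `"Not available"` placeholder and the abbreviated `sample` dictionary are illustrative assumptions, not part of `DocHelper` or the Health Data Gateway API.

```python
from typing import Any, Sequence


def get_nested(data: Any, path: Sequence[Any], default: str = "Not available") -> str:
    """Follow `path` through nested dicts/lists; return `default` if any step is missing."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, list index out of range, or a non-container value
            return default
    return default if current is None else str(current)


# Hypothetical, abbreviated response shaped like the fields used in the summary cell
sample = {
    "datasetfields": {
        "abstract": "Short abstract text",
        "geographicCoverage": ["United Kingdom,England"],
    },
    "datasetv2": {"coverage": {"pathway": None}},
}

abstract = get_nested(sample, ["datasetfields", "abstract"])
coverage = get_nested(sample, ["datasetfields", "geographicCoverage", 0])
pathway = get_nested(sample, ["datasetv2", "coverage", "pathway"])  # null value
source = get_nested(sample, ["datasetv2", "provenance", "origin", "source", 0])  # absent branch

print("Short abstract: " + abstract)
print("Pathway: " + pathway)
```

With a helper like this, a line such as `"**Pathway:** " + get_nested(dataset, ["datasetv2", "coverage", "pathway"])` would still render when the field is empty, at the cost of hiding the gap behind a placeholder.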

2. Metrics

The tables below summarise the GDPPR dataset in the UK LLC TRE.

Table 1: Number of participants from each Longitudinal Population Study (LPS) represented in the GDPPR dataset in the UK LLC TRE
(Note: counts relate to the most recent extract of NHS England data)

gb_cohort = document.get_cohort_count()
print(gb_cohort.to_markdown(index=False, tablefmt="fancy_grid"))
╒════════════════╤═════════╕
│ cohort         │   count │
╞════════════════╪═════════╡
│ ALSPAC         │    5886 │
├────────────────┼─────────┤
│ BCS70          │    5777 │
├────────────────┼─────────┤
│ BIB            │   27323 │
├────────────────┼─────────┤
│ ELSA           │    6779 │
├────────────────┼─────────┤
│ EPICN          │   14119 │
├────────────────┼─────────┤
│ EXCEED         │    9419 │
├────────────────┼─────────┤
│ FENLAND        │   10105 │
├────────────────┼─────────┤
│ GLAD           │   64761 │
├────────────────┼─────────┤
│ MCS            │   17574 │
├────────────────┼─────────┤
│ NCDS58         │    5901 │
├────────────────┼─────────┤
│ NEXTSTEP       │    5147 │
├────────────────┼─────────┤
│ NIHRBIO_COPING │   16063 │
├────────────────┼─────────┤
│ NSHD46         │    2280 │
├────────────────┼─────────┤
│ TEDS           │    8038 │
├────────────────┼─────────┤
│ TRACKC19       │   13738 │
├────────────────┼─────────┤
│ TWINSUK        │   11711 │
├────────────────┼─────────┤
│ UKHLS          │    6729 │
├────────────────┼─────────┤
│ total          │  231350 │
╘════════════════╧═════════╛
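
The implementation of `get_cohort_count()` is not shown here. As a rough illustration of the kind of calculation behind Table 1, the sketch below counts distinct participants per cohort and appends a total row. It assumes that the total is the sum of the per-cohort counts (which matches the figures in Table 1), so a participant who belongs to two studies would be counted once in each. The function `cohort_counts` and the example rows are made up for illustration.

```python
from collections import defaultdict


def cohort_counts(rows):
    """Return [(cohort, distinct participant count), ...] sorted by cohort, plus a total row."""
    participants = defaultdict(set)
    for cohort, participant_id in rows:
        # A set ignores repeated records for the same participant
        participants[cohort].add(participant_id)
    counts = [(cohort, len(ids)) for cohort, ids in sorted(participants.items())]
    # Assumption: the total is the sum of the per-cohort counts
    counts.append(("total", sum(n for _, n in counts)))
    return counts


# Made-up example records: (cohort, participant identifier)
rows = [
    ("ALSPAC", "a1"),
    ("ALSPAC", "a1"),  # second record for the same participant
    ("ALSPAC", "a2"),
    ("BCS70", "b1"),
]

result = cohort_counts(rows)
for cohort, count in result:
    print(f"{cohort:<10}{count:>6}")
```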

3. Helpful syntax

This section will hold syntax that may be helpful to other researchers working in the UK LLC TRE. For longer scripts, we will include a snippet of the code and a link to the Git repository where the full script can be found.