COVID-19 Second Generation Surveillance System (COVIDSGSS) Dataset

1. Summary

The information below is retrieved from the Health Data Gateway API developed by NHS England, with additional fields added by UK LLC (indicated by italics).

# define target dataset to document
schema = 'nhsd'
table = 'COVIDSGSS'
version = 'v0003'
# import functions from script helper
import sys
script_fp = "../../../../scripts/"
sys.path.insert(0, script_fp)
from data_doc_helper import DocHelper
# create instance
document = DocHelper(schema, table, version, script_fp)
# markdown/code hybrid cell module requirement
from IPython.display import display, Markdown
# get api data
dataset = document.get_api_data()
display(Markdown("**NHS England title of dataset:** "+dataset['datasetfields']['metadataquality']['title']))
display(Markdown("***Dataset name in UK LLC TRE:*** *nhsd.COVIDSGSS*"))  
display(Markdown("**Short abstract:** "+dataset['datasetfields']['abstract']))
display(Markdown("***Extended abstract:*** *UK Health Security Agency's (UKHSA) Second Generation Surveillance System (SGSS) is used to capture routine laboratory surveillance data on infectious diseases from diagnostic laboratories across England. Diagnostic laboratories are required to notify the UKHSA when specified causative agents are found in a human sample. The COVIDSGSS data reflect swab testing offered to those in hospital and NHS key workers (i.e. Pillar 1) and the wider community at drive through test centres, walk in centres, home kits returned by post, care homes, etc. (i.e. Pillar 2).*"))
display(Markdown("**Geographical coverage:** "+dataset['datasetfields']['geographicCoverage'][0]))
display(Markdown("**Temporal coverage:** "+dataset['datasetfields']['datasetStartDate']))
display(Markdown("***Data available in UK LLC TRE from:*** *06/04/2020 onwards*"))
display(Markdown("**Typical age range:** "+dataset['datasetfields']['ageBand']))
display(Markdown("**Collection situation:** "+dataset['datasetv2']['provenance']['origin']['collectionSituation'][0]))
display(Markdown("**Purpose:** "+dataset['datasetv2']['provenance']['origin']['purpose'][0]))
display(Markdown("**Source:** "+dataset['datasetv2']['provenance']['origin']['source'][0]))
display(Markdown("**Pathway:** "+dataset['datasetv2']['coverage']['pathway']))
display(Markdown("***Information collected:*** *Demographic information about people who test positive for SARS-CoV-2. For a full list of variables see the [NHS England Metadata dashboard.](https://digital.nhs.uk/services/data-access-request-service-dars/dars-products-and-services/metadata-dashboard)*"))  
display(Markdown("***Structure of dataset:*** *Each line represents one participant.*"))  
display(Markdown("***Update frequency in UK LLC TRE:*** *Quarterly*"))  
display(Markdown("***Dataset versions in UK LLC TRE:*** *TBC*"))
display(Markdown("***Data quality issues:*** *TBC*"))  
display(Markdown("***Restrictions to data usage:*** *Research must be related to COVID-19 and be for medical purposes only (medical research) as defined in the NHS Act 2006: [https://www.legislation.gov.uk/ukpga/2006/41/part/13/crossheading/patient-information](https://www.legislation.gov.uk/ukpga/2006/41/part/13/crossheading/patient-information)*"))
display(Markdown("***Further information:*** *[https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/sgss-and-sari-watch-data](https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/sgss-and-sari-watch-data)*"))

NHS England title of dataset: Covid-19 Second Generation Surveillance System

Dataset name in UK LLC TRE: nhsd.COVIDSGSS

Short abstract: Data forming the Covid-19 Second Generation Surveillance Systems data set relate to demographic and diagnostic information from Pillar 1 swab testing in PHE labs and NHS hospitals and Pillar 2 Swab testing in the community.

Extended abstract: UK Health Security Agency’s (UKHSA) Second Generation Surveillance System (SGSS) is used to capture routine laboratory surveillance data on infectious diseases from diagnostic laboratories across England. Diagnostic laboratories are required to notify the UKHSA when specified causative agents are found in a human sample. The COVIDSGSS data reflect swab testing offered to those in hospital and NHS key workers (i.e. Pillar 1) and the wider community at drive through test centres, walk in centres, home kits returned by post, care homes, etc. (i.e. Pillar 2).

Geographical coverage: United Kingdom,England

Temporal coverage: 06/04/2020

Data available in UK LLC TRE from: 06/04/2020 onwards

Typical age range: 0-150

Collection situation: IN-PATIENTS

Purpose: CARE

Source: LIMS

Pathway: NOT APPLICABLE

Information collected: Demographic information about people who test positive for SARS-CoV-2. For a full list of variables see the NHS England Metadata dashboard.

Structure of dataset: Each line represents one participant.

Update frequency in UK LLC TRE: Quarterly

Dataset versions in UK LLC TRE: TBC

Data quality issues: TBC

Restrictions to data usage: Research must be related to COVID-19 and be for medical purposes only (medical research) as defined in the NHS Act 2006: https://www.legislation.gov.uk/ukpga/2006/41/part/13/crossheading/patient-information

Further information: https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/sgss-and-sari-watch-data
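The summary cell above indexes directly into the nested API response (for example `dataset['datasetv2']['provenance']['origin']['source'][0]`), so a missing key or an empty list would stop the whole cell. A minimal sketch of a more defensive lookup is given below. The helper name `get_nested`, the fallback text and the `sample` dictionary are hypothetical and are not part of `DocHelper` or the API; only the key names are taken from the cell above.

```python
from typing import Any, Sequence, Union

Key = Union[str, int]


def get_nested(data: Any, path: Sequence[Key], default: Any = "Not available") -> Any:
    """Follow `path` through nested dicts/lists, returning `default` if any step is missing."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, empty list, or a value that cannot be indexed
            return default
    return current


# Hypothetical, trimmed-down response used for illustration only
sample = {
    "datasetfields": {
        "abstract": "Example abstract",
        "geographicCoverage": ["United Kingdom,England"],
    },
    "datasetv2": {"provenance": {"origin": {"purpose": ["CARE"], "source": []}}},
}

# Present field: returns the stored value
print(get_nested(sample, ["datasetfields", "geographicCoverage", 0]))
# Empty list: falls back to the default instead of raising IndexError
print(get_nested(sample, ["datasetv2", "provenance", "origin", "source", 0]))
```

In the summary cell, a call such as `get_nested(dataset, ['datasetfields', 'abstract'])` could then stand in for the direct indexing, assuming a placeholder string is acceptable in the rendered page when a field is absent.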

2. Metrics

The tables below summarise the COVIDSGSS dataset in the UK LLC TRE.

Table 1: The number of participants from each Longitudinal Population Study (LPS) represented in the COVIDSGSS dataset in the UK LLC TRE
(Note: numbers relate to the most recent extract of NHS England data.)

# get participant counts per cohort (LPS)
gb_cohort = document.get_cohort_count()
print(gb_cohort.to_markdown(index=False, tablefmt="fancy_grid"))
#display(gb_cohort)
╒════════════════╤═════════╕
│ cohort         │   count │
╞════════════════╪═════════╡
│ ALSPAC         │    2524 │
├────────────────┼─────────┤
│ BCS70          │    2201 │
├────────────────┼─────────┤
│ BIB            │    9398 │
├────────────────┼─────────┤
│ ELSA           │    1761 │
├────────────────┼─────────┤
│ EPICN          │    2405 │
├────────────────┼─────────┤
│ EXCEED         │    2946 │
├────────────────┼─────────┤
│ FENLAND        │    3135 │
├────────────────┼─────────┤
│ GLAD           │   31484 │
├────────────────┼─────────┤
│ MCS            │    6784 │
├────────────────┼─────────┤
│ NCDS58         │    1684 │
├────────────────┼─────────┤
│ NEXTSTEP       │    2070 │
├────────────────┼─────────┤
│ NIHRBIO_COPING │    7170 │
├────────────────┼─────────┤
│ NSHD46         │     404 │
├────────────────┼─────────┤
│ TEDS           │    3479 │
├────────────────┼─────────┤
│ TRACKC19       │    6363 │
├────────────────┼─────────┤
│ TWINSUK        │    4074 │
├────────────────┼─────────┤
│ UKHLS          │    2660 │
├────────────────┼─────────┤
│ total          │   90542 │
╘════════════════╧═════════╛
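As a quick sanity check on Table 1, the sketch below re-enters the printed counts by hand and confirms that the `total` row equals the sum of the per-cohort rows, then lists the three largest contributors with their share of that total. It uses only the figures shown above and does not query the TRE; the variable names are illustrative. It checks the arithmetic only and makes no assumption about how participants who belong to more than one LPS are counted.

```python
# Counts transcribed from Table 1 on this page (most recent extract shown)
cohort_counts = {
    "ALSPAC": 2524,
    "BCS70": 2201,
    "BIB": 9398,
    "ELSA": 1761,
    "EPICN": 2405,
    "EXCEED": 2946,
    "FENLAND": 3135,
    "GLAD": 31484,
    "MCS": 6784,
    "NCDS58": 1684,
    "NEXTSTEP": 2070,
    "NIHRBIO_COPING": 7170,
    "NSHD46": 404,
    "TEDS": 3479,
    "TRACKC19": 6363,
    "TWINSUK": 4074,
    "UKHLS": 2660,
}
reported_total = 90542  # the "total" row of Table 1

# The total row should equal the sum of the individual cohort rows
computed_total = sum(cohort_counts.values())
assert computed_total == reported_total

# Rank cohorts by count and show each one's share of the total
ranked = sorted(cohort_counts.items(), key=lambda item: item[1], reverse=True)
for cohort, count in ranked[:3]:
    print(f"{cohort:<15}{count:>7}{count / computed_total:>8.1%}")
```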

3. Helpful syntax

Below we will include syntax that may be helpful to other researchers in the UK LLC TRE. For longer scripts, we will include a snippet of the code plus a link to Git where you can find the full script.