{
"cells": [
{
"cell_type": "markdown",
"id": "beb97ca4",
"metadata": {},
"source": [
"# LPS Harmonised Demographic Dataset (reduced)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e3b8790c",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"data": {
"text/markdown": [
">Last modified: 27 Oct 2025"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import sys\n",
"import os\n",
"sys.path.append(os.path.abspath('../../../../scripts/'))\n",
"from data_doc_helper import UKLLCDataSet as DS, last_modified\n",
"API_KEY = os.environ['FASTAPI_KEY']\n",
"ds = DS(\"rtn_lps_sociodemo_harmonised_reduced\")\n",
"last_modified()"
]
},
{
"cell_type": "markdown",
"id": "31920e47",
"metadata": {},
"source": [
"<div style=\"background-color: rgba(0, 178, 169, 0.3); padding: 5px; border-radius: 5px;\"><strong>UK LLC has created a harmonised dataset of key demographic variables across the partner LPS.</strong></div> "
]
},
{
"cell_type": "markdown",
"id": "367be529",
"metadata": {},
"source": [
"<div style=\"background-color: rgb(229, 106, 84, 0.3); padding: 5px; border-radius: 5px;\"><strong>More information about this dataset is available <a href=\"LPS_derived.html#harmonisation-methodology\" target=\"_blank\">here.</a></strong></div>"
]
},
{
"cell_type": "markdown",
"id": "fe073008",
"metadata": {},
"source": [
"## 1. Summary"
]
},
{
"cell_type": "markdown",
"id": "c938a7b1",
"metadata": {},
"source": [
"The reduced LPS harmonised demographic dataset contains harmonised variables for **sex, gender, year of birth** and **ethnic group**. This dataset retains only the **most recent response** provided by a participant for each variable."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "fb0becc7",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"data": {
"text/html": [
"<style type=\"text/css\">\n",
"#T_047ee th {\n",
" text-align: left;\n",
"}\n",
"#T_047ee_row0_col0, #T_047ee_row0_col1, #T_047ee_row1_col0, #T_047ee_row1_col1, #T_047ee_row2_col0, #T_047ee_row2_col1, #T_047ee_row3_col0, #T_047ee_row3_col1, #T_047ee_row4_col0, #T_047ee_row4_col1, #T_047ee_row5_col0, #T_047ee_row5_col1, #T_047ee_row6_col0, #T_047ee_row6_col1, #T_047ee_row7_col0, #T_047ee_row7_col1, #T_047ee_row8_col0, #T_047ee_row8_col1, #T_047ee_row9_col0, #T_047ee_row9_col1, #T_047ee_row10_col0, #T_047ee_row10_col1, #T_047ee_row11_col0, #T_047ee_row11_col1 {\n",
" text-align: left;\n",
"}\n",
"</style>\n",
"<table id=\"T_047ee\" style=\"font-size: 14px\">\n",
" <thead>\n",
" <tr>\n",
" <th id=\"T_047ee_level0_col0\" class=\"col_heading level0 col0\" >Dataset Descriptor</th>\n",
" <th id=\"T_047ee_level0_col1\" class=\"col_heading level0 col1\" >Dataset-specific Information</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td id=\"T_047ee_row0_col0\" class=\"data row0 col0\" >Name of Dataset in TRE</td>\n",
" <td id=\"T_047ee_row0_col1\" class=\"data row0 col1\" >UKLLC_rtn_lps_sociodemo_harmonised_reduced</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row1_col0\" class=\"data row1 col0\" >Citation (APA)</td>\n",
" <td id=\"T_047ee_row1_col1\" class=\"data row1 col1\" >UK Longitudinal Linkage Collaboration. (2025). <i>UK LLC Managed: LPS Harmonised Demographic Dataset (reduced).</i> UK Longitudinal Linkage Collaboration (UK LLC). <a href=\"https://doi.org/10.71760/ukllc-dataset-00437-01\" rel=\"noopener noreferrer\" target=\"_blank\">https://doi.org/10.71760/ukllc-dataset-00437-01</a></td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row2_col0\" class=\"data row2 col0\" >Download Citation</td>\n",
" <td id=\"T_047ee_row2_col1\" class=\"data row2 col1\" > <a href=\"https://api.datacite.org/application/vnd.citationstyles.csl+json/10.71760/ukllc-dataset-00437-01\" rel=\"noopener noreferrer\" target=\"_blank\">Citeproc JSON</a> <a href=\"https://api.datacite.org/application/x-bibtex/10.71760/ukllc-dataset-00437-01\" rel=\"noopener noreferrer\" target=\"_blank\">BibTeX</a> <a href=\"https://api.datacite.org/application/x-research-info-systems/10.71760/ukllc-dataset-00437-01\" rel=\"noopener noreferrer\" target=\"_blank\">RIS</a></td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row3_col0\" class=\"data row3 col0\" >Series</td>\n",
" <td id=\"T_047ee_row3_col1\" class=\"data row3 col1\" > <a href=\"https://guidebook.ukllc.ac.uk/docs/ukllc_managed_data/ukllc_data\">UK LLC Managed</a></td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row4_col0\" class=\"data row4 col0\" >Owner</td>\n",
" <td id=\"T_047ee_row4_col1\" class=\"data row4 col1\" >UK Longitudinal Linkage Collaboration</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row5_col0\" class=\"data row5 col0\" >Temporal Coverage</td>\n",
" <td id=\"T_047ee_row5_col1\" class=\"data row5 col1\" >Unknown - Unknown</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row6_col0\" class=\"data row6 col0\" >Keywords</td>\n",
" <td id=\"T_047ee_row6_col1\" class=\"data row6 col1\" >harmonised sociodemographic ethnicity age sex</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row7_col0\" class=\"data row7 col0\" >Participant Count</td>\n",
" <td id=\"T_047ee_row7_col1\" class=\"data row7 col1\" >331675</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row8_col0\" class=\"data row8 col0\" >Number of variables</td>\n",
" <td id=\"T_047ee_row8_col1\" class=\"data row8 col1\" >8</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row9_col0\" class=\"data row9 col0\" >Number of observations</td>\n",
" <td id=\"T_047ee_row9_col1\" class=\"data row9 col1\" >1426986</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row10_col0\" class=\"data row10 col0\" >Specific Restrictions to Data Use</td>\n",
" <td id=\"T_047ee_row10_col1\" class=\"data row10 col1\" >None</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_047ee_row11_col0\" class=\"data row11 col0\" >Build a Data Request</td>\n",
" <td id=\"T_047ee_row11_col1\" class=\"data row11 col1\" > <a href=\"https://explore.ukllc.ac.uk/\" rel=\"noopener noreferrer\" target=\"_blank\">https://explore.ukllc.ac.uk/</a></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n"
],
"text/plain": [
"<pandas.io.formats.style.Styler at 0x16adc61acf0>"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds.info_table()"
]
},
{
"cell_type": "markdown",
"id": "799ad0f1",
"metadata": {},
"source": [
"## 2. Variables"
]
},
{
"cell_type": "markdown",
"id": "ff885b97",
"metadata": {},
"source": [
"| Variable name | Variable description | \n",
"|---|---|\n",
"| LLC_xxxx_stud_id | Individual identifier (unique to each project in the TRE) |\n",
"| cohort | LPS name |\n",
"| source | LPS dataset holding the original demographic variable(s) for each participant (e.g. ALSPAC_wave1y) |\n",
"| object | Label indicating which of the harmonised variables is represented by the value (e.g. llc_sex, llc_gender) |\n",
"| value | Numeric value for each of the objects |\n",
"| label | Description of what each of the values represents | \n",
"| llc_timestamp | Date (month and year) on which the information was provided by the participant to the LPS | "
]
},
{
"cell_type": "markdown",
"id": "8a469c49",
"metadata": {},
"source": [
"## 3. Version History"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "41fd9910",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"data": {
"text/html": [
"<style type=\"text/css\">\n",
"#T_c699a th {\n",
" text-align: left;\n",
"}\n",
"#T_c699a_row0_col0, #T_c699a_row0_col1, #T_c699a_row1_col0, #T_c699a_row1_col1, #T_c699a_row2_col0, #T_c699a_row2_col1, #T_c699a_row3_col0, #T_c699a_row3_col1, #T_c699a_row4_col0, #T_c699a_row4_col1 {\n",
" text-align: left;\n",
"}\n",
"</style>\n",
"<table id=\"T_c699a\" style=\"font-size: 14px\">\n",
" <thead>\n",
" <tr>\n",
" <th id=\"T_c699a_level0_col0\" class=\"col_heading level0 col0\" >Version</th>\n",
" <th id=\"T_c699a_level0_col1\" class=\"col_heading level0 col1\" >1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td id=\"T_c699a_row0_col0\" class=\"data row0 col0\" >Version Date</td>\n",
" <td id=\"T_c699a_row0_col1\" class=\"data row0 col1\" >04 Jun 2025</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_c699a_row1_col0\" class=\"data row1 col0\" >Number of Variables</td>\n",
" <td id=\"T_c699a_row1_col1\" class=\"data row1 col1\" >8</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_c699a_row2_col0\" class=\"data row2 col0\" >Number of Observations</td>\n",
" <td id=\"T_c699a_row2_col1\" class=\"data row2 col1\" >1426986</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_c699a_row3_col0\" class=\"data row3 col0\" >DOI</td>\n",
" <td id=\"T_c699a_row3_col1\" class=\"data row3 col1\" > <a href=\"https://doi.org/10.71760/ukllc-dataset-00437-01\" rel=\"noopener noreferrer\" target=\"_blank\">10.71760/ukllc-dataset-00437-01</a></td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_c699a_row4_col0\" class=\"data row4 col0\" >Change Log</td>\n",
" <td id=\"T_c699a_row4_col1\" class=\"data row4 col1\" > <a href=\"https://api.datacite.org/dois/10.71760/ukllc-dataset-00437-01/activities\" rel=\"noopener noreferrer\" target=\"_blank\">10.71760/ukllc-dataset-00437-01/activities</a></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n"
],
"text/plain": [
"<pandas.io.formats.style.Styler at 0x16adc527d90>"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds.version_history()"
]
},
{
"cell_type": "markdown",
"id": "bedf5551",
"metadata": {},
"source": [
"## 4. Useful Syntax"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "29e9d5bb",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"data": {
"text/markdown": [
"Below we will include syntax that may be helpful to other researchers in the UK LLC TRE. For longer scripts, we will include a snippet of the code plus a link to Git where you can find the full scripts."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ds.useful_syntax()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "jupbook",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}