Avoidable Admissions Analysis Pipeline Example
Admitted Care Data Set
Site: Lancashire Teaching Hospitals NHS Trust
Created: 2023-02-04
Lead site: ScHARR (School of Health and Related Research), University of Sheffield
References
- Statistical Analysis Plan - https://docs.google.com/document/d/1mpRKxNDbkTPwDhg7S_AaOXwbFQvHjXQRL7Qa31zn7DY/edit
- Data Processing - https://docs.google.com/document/d/1vysTKmvELK-5Rr7Dib3zDp8mCe_lUVr2e5EES23ShbQ/edit
- Analysis Tables - https://docs.google.com/document/d/10PuNTnEG5zTkWOVaMGlfOInleXJ0KA-_5hSH4upwhi4/edit
- LTH GitHub Repo - https://github.com/LTHTR-DST/hdruk_avoidable_admissions
- Pipeline Docs - https://lthtr-dst.github.io/hdruk_avoidable_admissions/
- Collaboration Docs - https://mattstammers.github.io/hdruk_avoidable_admissions_collaboration_docs/
%load_ext autoreload
%autoreload 2
from datetime import datetime
now = datetime.today()
print("Starting pipeline execution at", now)
Starting pipeline execution at 2023-02-05 11:51:15.151223
import os
from pathlib import Path
from dotenv import load_dotenv
load_dotenv("../.env")
dir_data = Path(os.getenv("DIR_DATA"))
dir_data
WindowsPath('T:/Business Intelligence/Data Science/Work/hdruk_avoidable_adms')
import numpy as np
import pandas as pd
from IPython.display import HTML
from avoidable_admissions.data.validate import (
AdmittedCareEpisodeSchema,
AdmittedCareFeatureSchema,
get_schema_properties,
validate_admitted_care_data,
validate_admitted_care_features,
validate_dataframe,
)
from avoidable_admissions.features import feature_maps
from avoidable_admissions.features.build_features import build_admitted_care_features
Load Data
# df_admcare = pd.read_csv(dir_data.joinpath('raw', 'admitted_care.csv'))
df_admcare = pd.read_csv("../data/raw/admitted_care.csv")
# Work on a copy so DQ fixes don't require reloading the data from source every time
dfa = df_admcare.copy()
"Raw dataframe contains %d rows and %d columns" % dfa.shape
'Raw dataframe contains 47483 rows and 63 columns'
Pipeline Overview
See https://lthtr-dst.github.io/hdruk_avoidable_admissions/pipeline/ for more details.
First Validation
This is run on the raw data, typically extracted from SQL databases maintained by Business Intelligence.
Receiving the data in a format that matches the Sheffield specification as closely as possible makes subsequent steps easier.
Always, always, have friends in BI. Who is your Quin?
Admitted Care Episode Validation Rules
Use get_schema_properties(Schema) for an overview of the expectations encoded in the first validation schema.
Help improve the validation rules and documentation (e.g. title, description) by contributing to the GitHub repo or raising an issue.
schema_1 = (
get_schema_properties(AdmittedCareEpisodeSchema)
.sort_values("name")
.set_index("name")
)
schema_1
| name | dtype | nullable | unique | coerce | required | checks | regex | title |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| admiage | int64 | False | False | False | True | [<Check greater_than_or_equal_to: greater_than_or_equal_to(18)>, <Check less_than_or_equal_to: less_than_or_equal_to(130)>] | False | None |
| admidate | date | False | False | False | True | [<Check greater_than_or_equal_to: greater_than_or_equal_to(2021-10-01)>, <Check less_than_or_equal_to: less_than_or_equal_to(2022-09-30)>] | False | None |
| admimeth | str | True | False | False | True | [<Check isin: isin({'31', '32', '28', '12', '24', '2A', '98', '82', '23', '81', '25', '13', '83', '2C', '21', '2D', '22', '2B', '11', '99'})>] | False | None |
| admisorc | str | True | False | False | True | [<Check isin: isin({'53', '54', '66', '55', '65', '39', '37', '19', '98', '87', '79', '51', '88', '86', '52', '40', '56', '85', '29', '49', '99'})>] | False | None |
| admitime | str | True | False | True | True | [<Check str_matches: str_matches(re.compile('2[0-3]\|[01]?[0-9]:[0-5][0-9]'))>] | False | None |
| diag_01 | str | True | False | False | True | [] | False | None |
| diag_[0-9]{2} | str | True | False | False | True | [] | True | None |
| disdest | str | True | False | False | True | [<Check isin: isin({'53', '54', '66', '48', '30', '65', '49', '39', '37', '19', '98', '87', '79', '84', '51', '88', '52', '85', '50', '29', '38', '99'})>] | False | None |
| dismeth | str | True | False | False | True | [<Check isin: isin({'2', '8', '4', '1', '5', '9', '3'})>] | False | None |
| epiorder | int64 | True | False | False | True | [<Check greater_than_or_equal_to: greater_than_or_equal_to(0)>] | False | None |
| ethnos | str | False | False | False | True | [<Check isin: isin({'C', 'H', 'M', 'N', 'J', 'R', 'B', 'L', 'A', 'G', 'K', 'S', 'P', 'E', 'Z', 'F', '99', 'D'})>] | False | None |
| gender | str | False | False | False | True | [<Check isin: isin({'2', 'X', '1', '9', '0'})>] | False | None |
| length_of_stay | float64 | True | False | False | True | [<Check greater_than_or_equal_to: greater_than_or_equal_to(0)>] | False | None |
| opdate_01 | datetime64[ns] | True | False | False | True | [] | False | None |
| opdate_[0-9]{2} | datetime64[ns] | True | False | False | True | [] | True | None |
| opertn_01 | str | True | False | False | True | [] | False | None |
| opertn_[0-9]{2} | str | True | False | False | True | [] | True | None |
| patient_id | int64 | False | False | False | True | [] | False | None |
| procodet | str | False | False | False | True | [] | False | None |
| sitetret | str | False | False | False | True | [] | False | None |
| townsend_score_decile | int64 | False | False | False | True | [<Check greater_than_or_equal_to: greater_than_or_equal_to(0)>, <Check less_than_or_equal_to: less_than_or_equal_to(10)>] | False | None |
| visit_id | int64 | False | True | False | True | [] | False | None |
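For a single column, the checks can be read straight off this table. For example, using the schema_1 frame defined above (which is indexed by column name):
```
# Look up the validation checks that apply to a single column, e.g. gender.
print(schema_1.loc["gender", "checks"])
```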
Validate using AdmittedCareEpisodeSchema
The first time we validate the raw data, it fails with several errors for multiple reasons.
Spending a few minutes reading the schema validation error messages, as well as the Pandera documentation, will save days on this and future projects.
validate_dataframe(dfa, AdmittedCareEpisodeSchema)
is equivalent to validate_admitted_care_data(dfa)
good, bad = validate_dataframe(dfa, AdmittedCareEpisodeSchema)
print("Good dataframe has %d rows" % good.shape[0])
print("Bad dataframe has %d rows" % bad.shape[0])
Schema AdmittedCareEpisodeSchema: A total of 18 schema errors were found.
Error Counts
------------
- column_not_in_dataframe: 1
- schema_component_check: 17
Schema Error Summary
--------------------
failure_cases n_failure_cases
schema_context column check
DataFrameSchema <NA> column_in_dataframe [visit_id] 1
Column admiage greater_than_or_equal_to(18) [0,3,12,14,4,15,2,6,11,5,1,16,17,10,7,8,13,9] 18
admidate greater_than_or_equal_to(2021-10-01) [TypeError("'>=' not supported between instances of 'str' and 'datetime.date'")] 1
less_than_or_equal_to(2022-09-30) [TypeError("'<=' not supported between instances of 'str' and 'datetime.date'")] 1
admimeth isin({'31','32','28','12','24','2A','98','82','23','81','25','13','83','2C','21','2D','22','2B','11','99'}) [27] 1
admisorc dtype('str') [int64] 1
isin({'53','54','66','55','65','39','37','19','98','87','79','51','88','86','52','40','56','85','29','49','99'}) [51,19,66,52,29,56,99,88,40,49,33,87,79,53] 14
disdest dtype('str') [int64] 1
isin({'53','54','66','48','30','65','49','39','37','19','98','87','79','84','51','88','52','85','50','29','38','99'}) [19,99,79,88,29,51,52,85,84,54,38,50,37,66,65,53,87,49,48,98,30] 21
dismeth dtype('str') [int64] 1
isin({'2','8','4','1','5','9','3'}) [1,4,2,8,9] 5
gender dtype('str') [int64] 1
isin({'2','X','1','9','0'}) [2,1,9] 3
opdate_01 dtype('datetime64[ns]') [object] 1
townsend_score_decile dtype('int64') [float64] 1
not_nullable [nan] 1
Usage Tip
---------
Directly inspect all errors by catching the exception:
```
from pandera.errors import SchemaErrors

try:
    schema.validate(dataframe, lazy=True)
except SchemaErrors as err:
    err.failure_cases  # dataframe of schema errors
    err.data  # invalid dataframe
```
No rows pass validation because of the missing-column error (visit_id). See the output above.
Good dataframe has 0 rows
Bad dataframe has 246321 rows
Iterative DQ Fixes
Fix one column or one error at a time and rerun validate_dataframe(dfa, AdmittedCareEpisodeSchema)
in the previous cell to check if it has worked.
If you have made an error in data transformation, rerun dfa = df_admcare.copy()
to start again.
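As a sketch, the fix-and-recheck loop can be wrapped in a small throwaway helper (recheck is hypothetical, not part of the package):
```
# Hypothetical helper: re-validate after each single fix and report progress.
def recheck(frame):
    good, bad = validate_dataframe(frame, AdmittedCareEpisodeSchema)
    print("good: %d rows | bad: %d rows" % (good.shape[0], bad.shape[0]))
    return good, bad

# Example: apply one fix, then recheck.
dfa.gender = dfa.gender.astype(str)
good, bad = recheck(dfa)
```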
Error Behaviour: For errors other than SchemaErrors, validate_dataframe prints the error message and returns an empty good dataframe and a bad dataframe containing every row of the input. Column errors (missing, misspelt, or unexpected additional columns) result in the same behaviour.
The bad dataframe may contain more rows than the input dataframe, as each error generates a new row. For instance, if one input row fails 5 checks in different columns, this generates 5 rows in the bad dataframe, each with details of the specific error in additional columns.
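A hedged sketch for triaging the bad dataframe; the error-detail column names used here (column, check) are assumptions borrowed from Pandera's failure-case output and may differ between package versions:
```
# Assumption: `bad` carries Pandera-style failure-case detail columns.
# Count failures per column/check to decide which DQ fix to tackle first.
if not bad.empty:
    print(bad.groupby(["column", "check"]).size().sort_values(ascending=False))
```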
TODO: vc to fix. If you get a ufunc 'bitwise_or' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' error, this is usually caused by a dtype mismatch: the input data holds datetime values where the validation code expects a date dtype.
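A minimal sketch of the workaround, mirroring the admidate fix applied later in this notebook: coerce the column to plain Python date objects so it matches the schema's date dtype.
```
# Coerce datetime64[ns] values to datetime.date before validation.
dfa["admidate"] = pd.to_datetime(dfa["admidate"], dayfirst=True).dt.date
```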
Feature Maps: These define which values are allowed in each column to pass first validation, how those values are transformed during feature engineering, and which values are allowed in the new columns to pass second validation. https://github.com/LTHTR-DST/hdruk_avoidable_admissions/blob/dev/avoidable_admissions/features/feature_maps.py
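For example, the permitted admission-source codes can be inspected directly (assuming feature_maps.admisorc is a mapping or collection of codes, as the isin checks below suggest):
```
# List the codes that first validation will accept for admisorc.
print(sorted(feature_maps.admisorc))
```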
_Help improve this process by raising a GitHub issue if validation fails when it shouldn't or you run into unexpected errors._
"Starting with %d rows and %d columns" % dfa.shape
'Starting with 47483 rows and 63 columns'
dfa["visit_id"] = dfa.reset_index(drop=True).index
dfa.patient_id = dfa.patient_id.astype(np.int64)
dfa = dfa.loc[dfa.admiage >= 18].copy()
"Dataframe has %d rows and %d columns" % dfa.shape
'Dataframe has 39168 rows and 64 columns'
# Parse all operation date columns (opdate_01, opdate_02, ...) as datetimes
cols_opdate = dfa.filter(regex="opdate").columns
dfa[cols_opdate] = dfa[cols_opdate].apply(pd.to_datetime, dayfirst=True)
# The schema expects a `date` dtype for admidate, not datetime64[ns]
dfa.admidate = pd.to_datetime(dfa.admidate, dayfirst=True).dt.date
# dfa.admimeth = dfa.admimeth.astype(str)
# "27" is not a permitted admimeth code; exclude it. Careful
dfa = dfa[dfa.admimeth != "27"]
"Dataframe has %d rows and %d columns" % dfa.shape
'Dataframe has 39167 rows and 64 columns'
# Keep only admission sources permitted by the specification's feature map
dfa.admisorc = dfa.admisorc.astype(str)
dfa = dfa[dfa.admisorc.isin(feature_maps.admisorc)]
"Dataframe has %d rows and %d columns" % dfa.shape
'Dataframe has 39166 rows and 64 columns'
dfa.gender = dfa.gender.astype(str)
dfa.disdest = dfa.disdest.astype(str)
dfa = dfa[dfa.disdest.isin(feature_maps.disdest)]
"Dataframe has %d rows and %d columns" % dfa.shape
'Dataframe has 39166 rows and 64 columns'
dfa.dismeth = dfa.dismeth.astype(str)
dfa = dfa[dfa.dismeth.isin(feature_maps.dismeth)]
"Dataframe has %d rows and %d columns" % dfa.shape
'Dataframe has 39166 rows and 64 columns'
# Set missing Townsend score deciles to 0 to pass validation; address how this
# is dealt with at the analytics stage. This works around the quirks of what
# NaN means in the Pandas (and Pandera) world.
dfa.townsend_score_decile = dfa.townsend_score_decile.fillna(0).astype(np.int64)
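At the analytics stage, the sentinel can be mapped back to missing; a minimal sketch (0 sits outside the true 1-10 decile range, so the mapping is unambiguous):
```
# Sketch: treat the 0 sentinel as missing again before any deprivation analysis.
townsend = dfa.townsend_score_decile.replace(0, np.nan)
```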
# If everything above worked, this should not throw any errors.
good, bad = validate_admitted_care_data(dfa)
print("Good dataframe has %d rows" % good.shape[0])
print("Bad dataframe has %d rows" % bad.shape[0])
Good dataframe has 39166 rows
Bad dataframe has 0 rows
Feature engineering
See documentation here: https://lthtr-dst.github.io/hdruk_avoidable_admissions/features/
This has been written to Sheffield's specifications and will change if the specification changes.
_Help improve this process by raising a GitHub issue or contributing code._
dff = build_admitted_care_features(good.copy())
Second validation
The second validation, and addressing any DQ issues it surfaces, follows the same process as the first validation.
Admitted Care Feature Validation Rules
The features dataframe should contain all columns from the initial dataframe plus the newly generated features.
Once again, have a look at feature_maps.py for more details.
Let's look at only the new features and the rules that apply to them.
schema_2 = (
get_schema_properties(AdmittedCareFeatureSchema)
.sort_values("name")
.set_index("name")
)
schema_2[~schema_2.index.isin(schema_1.index)]
| name | dtype | nullable | unique | coerce | required | checks | regex | title | description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| admiage_cat | str | False | False | False | True | [<Check isin: isin({'65 - 69', '25 - 29', '75 ... | False | None | None |
| admidayofweek | None | True | False | False | True | [<Check isin: isin({'Sunday', 'Wednesday', 'Mo... | False | None | None |
| admisorc_cat | None | True | False | False | True | [<Check isin: isin({'Medical care', 'Residence... | False | None | None |
| diag_01_acsc | None | False | False | False | True | [<Check isin: isin({'COPD', 'Pneumothorax ', '... | False | None | None |
| diag_seasonal_cat | None | True | False | False | True | [<Check isin: isin({'Respiratory infection', '... | False | None | None |
| disdest_cat | None | True | False | False | True | [<Check isin: isin({'Medical care', 'Died', 'C... | False | None | None |
| dismeth_cat | None | True | False | False | True | [<Check isin: isin({'Died', 'Discharged', 'Unk... | False | None | None |
| ethnos_cat | str | True | False | False | True | [<Check isin: isin({'Other Ethnic Groups', 'As... | False | None | None |
| gender_cat | str | False | False | False | True | [<Check isin: isin({'Female', 'Indeterminate',... | False | None | None |
| length_of_stay_cat | None | True | False | False | True | [<Check isin: isin({'>=2 days', '<2 days'})>] | False | None | None |
| opertn_count | int64 | False | False | False | True | [<Check greater_than_or_equal_to: greater_than... | False | None | None |
| townsend_score_quintile | int64 | True | False | False | True | [<Check in_range: in_range(0, 5)>] | False | None | None |
Fix minor DQ issues and validate using AdmittedCareFeatureSchema
# TODO: vc to fix. Easiest will be to validate this as datetime rather than date
dff.admidate = dff.admidate.dt.date
# TODO: vc to fix. build_features makes this a categorical variable. Needs to be fixed either in build_features or in the validation schema
dff.admiage_cat = dff.admiage_cat.astype(str)
good_f, bad_f = validate_dataframe(dff, AdmittedCareFeatureSchema)
print("Good dataframe has %d rows" % good_f.shape[0])
print("Bad dataframe has %d rows" % bad_f.shape[0])
Good dataframe has 39166 rows
Bad dataframe has 0 rows
Analysis
Refer to https://docs.google.com/document/d/10PuNTnEG5zTkWOVaMGlfOInleXJ0KA-_5hSH4upwhi4/edit for details.
df = good_f.copy()
This section defines the analysis populations for the analysis of the APC data. These populations are referred to in the analysis descriptions that follow.
All Acute Admissions
This analysis population includes all acute admissions.
Filter: (admimeth in 21, 22, 23, 24, 25, 2A, 2B, 2C, 2D, 28) & (epiorder = 1)
These filters should have been applied to the original data extract.
All Acute Admissions for ACSCs
This analysis population includes all acute admissions where the primary diagnosis from the first episode is an ACSC.
Filter: (admimeth in 21, 22, 23, 24, 25, 2A, 2B, 2C, 2D, 28) & (epiorder = 1) & (diag_01 = ACSC_Code_APC)
All Acute Admissions for Non-ACSCs
This analysis population includes all acute admissions where the primary diagnosis from the first episode is not an ACSC.
Filter: (admimeth in 21, 22, 23, 24, 25, 2A, 2B, 2C, 2D, 28) & (epiorder = 1) & (diag_01 != ACSC_Code_APC)
# All acute admissions - this should be redundant if done at extraction
df = df[
(df.admimeth.isin({"21", "22", "23", "24", "25", "2A", "2B", "2C", "2D", "28"}))
& (df.epiorder == 1)
]
# Acute admissions by ACSC status: diag_01_acsc is "-" when the primary
# diagnosis is not an ACSC; collapse this into a binary ACSC / Non-ACSC flag.
df["is_acsc"] = df.diag_01_acsc.where(df.diag_01_acsc == "-", "ACSC").replace(
    "-", "Non-ACSC"
)
Age
pd.concat(
[df.admiage.describe().rename("All"), df.groupby("is_acsc").admiage.describe().T],
axis=1,
).T
| | count | mean | std | min | 25% | 50% | 75% | max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All | 3#.## | 5#.## | 2#.## | 1#.## | 3#.## | 6#.## | 7#.## | 1#.## |
| ACSC | 1#.## | 6#.## | 1#.## | 1#.## | 5#.## | 6#.## | 8#.## | 1#.## |
| Non-ACSC | 2#.## | 5#.## | 2#.## | 1#.## | 3#.## | 5#.## | 7#.## | 1#.## |
categorical_features = {
"admiage_cat": "Age Bands",
"gender_cat": "Gender",
"ethnos_cat": "Ethnicity",
"townsend_score_quintile": "Townsend Score Quintile",
"admisorc_cat": "Admission Source",
"admidayofweek": "Admission Day of Week",
"diag_seasonal_cat": "Seasonal Diagnosis",
"length_of_stay_cat": "Length of Stay",
"disdest_cat": "Discharge Destination",
"dismeth_cat": "Discharge Method",
}
def make_crosstab(colname, tablename):
    # Counts by ACSC status, with a "Total" margin row
    x = pd.crosstab(
        df[colname], df.is_acsc, margins=True, dropna=False, margins_name="Total"
    )
    # Row-wise percentages for the same breakdown
    y = (
        pd.crosstab(
            df[colname],
            df.is_acsc,
            normalize="index",
            dropna=False,
        )
        .mul(100)
        .round(2)
        .rename(columns={"ACSC": "ACSC %", "Non-ACSC": "Non-ACSC %"})
    )
    z = pd.concat([x, y], axis=1).sort_index(axis=1).fillna("-")
    # Prefix the index with the table name so results can be concatenated later
    z.index = pd.MultiIndex.from_tuples([(tablename, i) for i in z.index])
    return z
# HTML Output
out = ""
df_results = []
for k, v in categorical_features.items():
z = make_crosstab(k, v)
df_results.append(z)
out += f"""
### {v}
{z.to_html()}
"""
df_results = pd.concat(df_results)
HTML(out)
Age Bands
| Age Bands | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| 18-19 | 1### | 1#.## | 7### | 8#.## | 9### |
| 20 - 24 | 4### | 2#.## | 1### | 7#.## | 2### |
| 25 - 29 | 4### | 1#.## | 2### | 8#.## | 2### |
| 30 - 34 | 5### | 2#.## | 1### | 7#.## | 2### |
| 35 - 39 | 5### | 2#.## | 1### | 7#.## | 2### |
| 40 - 44 | 6### | 3#.## | 1### | 6#.## | 1### |
| 45 - 49 | 7### | 3#.## | 1### | 6#.## | 1### |
| 50 - 54 | 9### | 4#.## | 1### | 5#.## | 2### |
| 55 - 59 | 1### | 4#.## | 1### | 5#.## | 2### |
| 60 - 64 | 1### | 4#.## | 1### | 5#.## | 2### |
| 65 - 69 | 1### | 4#.## | 1### | 5#.## | 2### |
| 70 - 74 | 1### | 4#.## | 1### | 5#.## | 3### |
| 75 - 79 | 1### | 4#.## | 1### | 5#.## | 3### |
| 80 - 84 | 1### | 4#.## | 1### | 5#.## | 3### |
| >85 | 2### | 4#.## | 2### | 5#.## | 4### |
| Total | 1### | - | 2### | - | 3### |
Gender
| Gender | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| Female | 8### | 3#.## | 1### | 6#.## | 2### |
| Indeterminate | 4### | 3#.## | 9### | 6#.## | 1### |
| Male | 7### | 4#.## | 9### | 5#.## | 1### |
| Total | 1### | - | 2### | - | 3### |
Ethnicity
| Ethnicity | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| Asian or Asian British | 7### | 3#.## | 1### | 6#.## | 2### |
| Black or Black British | 1### | 3#.## | 1### | 6#.## | 3### |
| Mixed | 9### | 3#.## | 2### | 6#.## | 3### |
| Not known | 1### | 0#.## | 1### | 1#.## | 1### |
| Not stated | 5### | 2#.## | 1### | 7#.## | 2### |
| Other Ethnic Groups | 1### | 3#.## | 2### | 6#.## | 4### |
| White | 1### | 4#.## | 2### | 5#.## | 3### |
| Total | 1### | - | 2### | - | 3### |
Townsend Score Quintile
| Townsend Score Quintile | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| 0 | 1### | 2#.## | 2### | 7#.## | 3### |
| 1 | 3### | 3#.## | 5### | 6#.## | 9### |
| 2 | 3### | 3#.## | 4### | 6#.## | 7### |
| 3 | 2### | 3#.## | 3### | 6#.## | 5### |
| 4 | 3### | 3#.## | 5### | 6#.## | 8### |
| 5 | 3### | 4#.## | 4### | 5#.## | 7### |
| Total | 1### | - | 2### | - | 3### |
Admission Source
| Admission Source | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| Care Home | 2### | 1#.## | 1### | 8#.## | 1### |
| Medical care | 8### | 3#.## | 1### | 6#.## | 2### |
| Penal | 5### | 5#.## | 5### | 5#.## | 1### |
| Residence | 1### | 3#.## | 2### | 6#.## | 3### |
| Unknown | 1### | 1#.## | 5### | 8#.## | 6### |
| Total | 1### | - | 2### | - | 3### |
Admission Day of Week
| Admission Day of Week | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| Friday | 2### | 4#.## | 3### | 5#.## | 6### |
| Monday | 2### | 3#.## | 3### | 6#.## | 5### |
| Saturday | 1### | 3#.## | 2### | 6#.## | 4### |
| Sunday | 1### | 3#.## | 2### | 6#.## | 3### |
| Thursday | 2### | 4#.## | 3### | 5#.## | 6### |
| Tuesday | 2### | 3#.## | 3### | 6#.## | 6### |
| Wednesday | 2### | 4#.## | 3### | 5#.## | 6### |
| Total | 1### | - | 2### | - | 3### |
Seasonal Diagnosis
| Seasonal Diagnosis | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| - | 1### | 3#.## | 2### | 6#.## | 3### |
| Chronic disease exacerbation | 1### | 9#.## | 9### | 7#.## | 1### |
| Respiratory infection | 1### | 8#.## | 2### | 1#.## | 2### |
| Total | 1### | - | 2### | - | 3### |
Length of Stay
| Length of Stay | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| <2 days | 7### | 3#.## | 1### | 6#.## | 1### |
| >=2 days | 8### | 4#.## | 1### | 5#.## | 1### |
| Total | 1### | - | 2### | - | 3### |
Discharge Destination
| Discharge Destination | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| Care Home | 2### | 3#.## | 5### | 6#.## | 7### |
| Died | 1### | 3#.## | 3### | 6#.## | 4### |
| Medical care | 1### | 2#.## | 5### | 7#.## | 6### |
| Penal | 3### | 1#.## | 2### | 8#.## | 2### |
| Residence | 1### | 4#.## | 1### | 5#.## | 3### |
| Unknown | 1### | 3#.## | 2### | 6#.## | 4### |
| Total | 1### | - | 2### | - | 3### |
Discharge Method
| Discharge Method | ACSC | ACSC % | Non-ACSC | Non-ACSC % | Total |
| --- | --- | --- | --- | --- | --- |
| Died | 5### | 3#.## | 1### | 6#.## | 1### |
| Discharged | 1### | 3#.## | 2### | 6#.## | 3### |
| Not Applicable | 1### | 0#.## | 1### | 1#.## | 1### |
| Unknown | 1### | 0#.## | 1### | 1#.## | 1### |
| Total | 1### | - | 2### | - | 3### |
Export Output for Sheffield
timestamp = now.strftime("%Y_%m_%dT%H%M%S")
timestamp
'2023_02_05T115115'
Export as CSV
# save df_results to a CSV to send to Sheffield for programmatic meta-analysis
# df_results.to_csv('path/to/results_timestamp.csv')
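A hedged sketch of the export; the destination path is an assumption, substitute your own:
```
# Illustrative only: write the combined crosstab results with the run timestamp.
out_csv = f"../reports/lth_admitted_care_results_{timestamp}.csv"  # assumed path
df_results.to_csv(out_csv)
```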
Export as HTML
output_head = f"""
<h1> LTH Admitted Care Analysis Tables Draft</h1>
<pre>
prepared: {timestamp}
email: datascience@lthtr.nhs.uk
github: <a href='https://github.com/LTHTR-DST/'>https://github.com/LTHTR-DST/</a>
</pre>
"""
fp = f"../reports/lth_admitted_care_analysis_tables_draft_{timestamp}.html"
with open(fp, "w") as f:
x = df_results.to_html()
x = output_head + x
f.write(x)
print(
"Finished pipeline execution in",
round((datetime.today() - now).total_seconds()),
"seconds at",
datetime.today(),
)
Finished pipeline execution in 9 seconds at 2023-02-05 11:51:24.288155
The above outputs can easily be shared, and if all sites use the same code, Sheffield should be able to generate the 'meta-analysis' more easily.
END OF NOTEBOOK