Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/35101
Appears in Collections: Computing Science and Mathematics Research Reports
Title: Investigating Causal links from Observed Features in the first COVID-19 Waves in California
Author(s): Good, Sarah
O'Hare, Anthony
Contact Email: anthony.ohare@stir.ac.uk
Citation: Good S & O'Hare A (2023) Investigating Causal links from Observed Features in the first COVID-19 Waves in California. ArXiv: Ithaca, New York. https://doi.org/10.48550/arXiv.2303.14485
Issue Date: 25-Mar-2023
Date Deposited: 22-May-2023
Abstract: Determining who is at risk from a disease is important in order to protect vulnerable subpopulations during an outbreak. We are currently in a SARS-COV-2 (commonly referred to as COVID-19) pandemic which has had a massive impact across the world, with some communities and individuals seen to have a higher risk of severe outcomes and death from the disease compared to others. These risks are compounded for people of lower socioeconomic status, those who have limited access to health care, higher rates of chronic diseases, such as hypertension, diabetes (type-2), obesity, likely due to the chronic stress of these types of living conditions. Essential workers are also at a higher risk of COVID-19 due to having higher rates of exposure due to the nature of their work. In this study we determine the important features of the pandemic in California in terms of cumulative cases and deaths per 100,000 of population up to the date of 5 July, 2021 (the date of analysis) using Pearson correlation coefficients between population demographic features and cumulative cases and deaths. The most highly correlated features, based on the absolute value of their Pearson Correlation Coefficients in relation to cases or deaths per 100,000, were used to create regression models in two ways: using the top 5 features and using the top 20 features filtered out to limit interactions between features. These models were used to determine a) the most significant features out of these subsets and b) features that approximate different potential forces on COVID-19 cases and deaths (especially in the case of the latter set). Additionally, co-correlations, defined as demographic features not within a given input feature set for the regression models but which are strongly correlated with the features included within, were calculated for all features.
The five features which had the highest correlations to cumulative cases per 100,000 were found to be the following: Overcrowding (% of households), Average Household Size, Hispanic ethnicity (% of population), Ages 0-19 (% population), education level of 9th to 12th with no high school diploma (% of population older than 25 years), and incidence rates of Long-term Diabetes Complications (per 100,000 population). For cumulative deaths per 100,000, the feature set was similar except Overcrowding (% of households) replaced Long-term Diabetes Complications. The feature set for uncorrelated features was the same for both cases and deaths. This set was comprised of Overcrowding (% of households), Wholesale trade (% of workforce employed in), ‘Transportation, warehousing, and utilities’ (% of workforce employed in), and ‘Graduate or professional degree’ (% of population older than 25 years).
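The feature-selection steps described in the abstract (ranking features by the absolute value of their Pearson correlation with an outcome, then filtering the ranked list to limit interactions between features) could be sketched roughly as follows. This is an illustrative sketch only and not the authors' code: the function names, the toy numbers, and the 0.9 pairwise-correlation cutoff are all assumptions made for the example, since the abstract does not state how the filtering was done. The regression step is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_features(features, outcome):
    """Return (name, r) pairs sorted by |r| against the outcome, strongest first."""
    scores = {name: pearson(vals, outcome) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

def filter_correlated(ranked, features, k, max_r):
    """Greedily keep up to k features whose |r| with every kept feature is below max_r.

    The greedy rule and the max_r cutoff are assumptions for illustration.
    """
    kept = []
    for name, _ in ranked:
        if all(abs(pearson(features[name], features[other])) < max_r for other in kept):
            kept.append(name)
        if len(kept) == k:
            break
    return kept

# Made-up values for five hypothetical counties (not data from the report).
features = {
    "overcrowding_pct": [2.0, 4.0, 6.0, 8.0, 10.0],
    "avg_household_size": [2.1, 2.5, 2.8, 3.3, 3.6],
    "grad_degree_pct": [30.0, 12.0, 25.0, 8.0, 10.0],
}
cases_per_100k = [3000.0, 5000.0, 7000.0, 9000.0, 11000.0]

ranked = rank_features(features, cases_per_100k)
kept = filter_correlated(ranked, features, k=2, max_r=0.9)

for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
print("filtered set:", kept)
```

In this toy data the household-size column is dropped from the filtered set because it is almost collinear with overcrowding. In the abstract's terms it would count as a co-correlated feature: outside the input set but strongly correlated with a feature inside it.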
Type: Research Report
URI: http://hdl.handle.net/1893/35101
DOI Link: 10.48550/arXiv.2303.14485
Rights: Licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0)
Affiliation: Mathematics
Licence URL(s): http://creativecommons.org/licenses/by/4.0/

Files in This Item:
File | Description | Size | Format
2303.14485.pdf | Fulltext - Published Version | 2.05 MB | Adobe PDF



This item is protected by original copyright



A file in this item is licensed under a Creative Commons License.
