library(tidyverse)
library(readxl)
library(dplyr)
library(ggplot2)
knitr::opts_chunk$set(echo = TRUE)
Rahul Somu
March 24, 2023
The COVID-19 pandemic has shown how devastating a highly contagious disease can be. It has killed millions of people and disrupted the lives of billions more around the world. The current situation highlights the need for research into the factors and tactics that can help combat future pandemics effectively.
As the pandemic continues to spread, it is critical to understand the variables that affect COVID-19 mortality. This study examines the correlations between a nation’s COVID-19 mortality rate and its population density, median age, GDP per capita, diabetes prevalence, hospital beds per thousand people, and human development index.
The research question for this project is: to what extent do these socioeconomic factors contribute to the variation in COVID-19 mortality rate across the world? The project also aims to derive the relationship between the COVID-19 mortality rate and population density, median age, GDP per capita, diabetes prevalence, hospital beds per thousand people, and human development index.
Dataset:
The dataset contains time series data for 193 countries, with 84,772 records in total across the observation period.
Data source: https://www.kaggle.com/datasets/fedesoriano/coronavirus-covid19-vaccinations-data
df <- read_excel("_data/COVID_Data.xlsx")
df_selected <- df[,c("iso_code","continent","location","date","total_cases_per_million","population_density","median_age",
"gdp_per_capita","diabetes_prevalence","hospital_beds_per_thousand","human_development_index")]
dataset_dim <- (dim(df_selected))
dataset_dim
[1] 84772 11
[1] 193
# A tibble: 6 × 11
iso_code conti…¹ locat…² date total…³ popul…⁴ media…⁵ gdp_p…⁶ diabe…⁷ hospi…⁸
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AFG Asia Afghan… 2020… 0.873 54.4 18.6 1804. 9.59 0.5
2 AFG Asia Afghan… 2020… 1.05 54.4 18.6 1804. 9.59 0.5
3 AFG Asia Afghan… 2020… 1.10 54.4 18.6 1804. 9.59 0.5
4 AFG Asia Afghan… 2020… 1.95 54.4 18.6 1804. 9.59 0.5
5 AFG Asia Afghan… 2020… 2.06 54.4 18.6 1804. 9.59 0.5
6 AFG Asia Afghan… 2020… 2.34 54.4 18.6 1804. 9.59 0.5
# … with 1 more variable: human_development_index <dbl>, and abbreviated
# variable names ¹continent, ²location, ³total_cases_per_million,
# ⁴population_density, ⁵median_age, ⁶gdp_per_capita, ⁷diabetes_prevalence,
# ⁸hospital_beds_per_thousand
iso_code continent location date
Length:84772 Length:84772 Length:84772 Length:84772
Class :character Class :character Class :character Class :character
Mode :character Mode :character Mode :character Mode :character
total_cases_per_million population_density median_age gdp_per_capita
Min. : 0.02 Min. : 1.98 Min. :15.10 Min. : 661.2
1st Qu.: 454.56 1st Qu.: 36.25 1st Qu.:22.20 1st Qu.: 4466.5
Median : 2579.10 Median : 82.60 Median :30.60 Median : 13367.6
Mean : 14350.40 Mean : 361.01 Mean :30.76 Mean : 19633.0
3rd Qu.: 16110.12 3rd Qu.: 205.86 3rd Qu.:39.60 3rd Qu.: 27936.9
Max. :179667.38 Max. :19347.50 Max. :48.20 Max. :116935.6
NA's :1 NA's :4819 NA's :5783 NA's :6696
diabetes_prevalence hospital_beds_per_thousand human_development_index
Min. : 0.990 Min. : 0.100 Min. :0.394
1st Qu.: 5.290 1st Qu.: 1.300 1st Qu.:0.602
Median : 7.110 Median : 2.400 Median :0.756
Mean : 7.651 Mean : 3.047 Mean :0.731
3rd Qu.: 9.740 3rd Qu.: 4.200 3rd Qu.:0.852
Max. :22.020 Max. :13.800 Max. :0.957
NA's :4872 NA's :12687 NA's :5802
Methodology:
Multiple linear regression models will be used to carry out the analysis. The socioeconomic determinants will be the independent variables, whereas the COVID-19 mortality rate will be the dependent variable.
Expected Results:
The findings of this investigation will aid in understanding the variables that affect the COVID-19 mortality rate. Population density, median age, diabetes prevalence, and hospital beds per thousand people are anticipated to have a positive correlation with the COVID-19 mortality rate, whereas GDP per capita and the human development index are anticipated to have a negative correlation. The findings should help in planning public health policies and interventions that lessen the effects of COVID-19.
Conclusion:
The goal of this study is to understand the relationship between socioeconomic factors and the COVID-19 mortality rate. The results will be helpful in planning public health policy and initiatives and will shed light on the factors that affect the COVID-19 death rate.
---
title: "Final Project Check-in 1"
author: "Rahul Somu"
description: "Quantitative Analysis of the Relationship between COVID-19 Mortality Rate and Socioeconomic Factors"
date: "03/24/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
- FinalProject
- checkin1
- Rahul Somu
- dataset
- ggplot2
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
library(readxl)
library(dplyr)
library(ggplot2)
knitr::opts_chunk$set(echo = TRUE)
```
## Overview
The COVID-19 pandemic has shown how devastating a highly contagious disease can be. It has killed millions of people and disrupted the lives of billions more around the world. The current situation highlights the need for research into the factors and tactics that can help combat future pandemics effectively.

As the pandemic continues to spread, it is critical to understand the variables that affect COVID-19 mortality. This study examines the correlations between a nation's COVID-19 mortality rate and its population density, median age, GDP per capita, diabetes prevalence, hospital beds per thousand people, and human development index.

The research question for this project is: to what extent do these socioeconomic factors contribute to the variation in COVID-19 mortality rate across the world? The project also aims to derive the relationship between the COVID-19 mortality rate and population density, median age, GDP per capita, diabetes prevalence, hospital beds per thousand people, and human development index.
## Dataset

The dataset contains time series data for 193 countries, with 84,772 records in total across the observation period.

Data source: https://www.kaggle.com/datasets/fedesoriano/coronavirus-covid19-vaccinations-data
```{r}
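# Load the raw data and keep only the columns used in this analysis.
# Note: total_cases_per_million measures cases, not deaths; a mortality
# column is still needed for the mortality analysis this project describes.
df <- read_excel("_data/COVID_Data.xlsx")
df_selected <- df[, c("iso_code", "continent", "location", "date",
                      "total_cases_per_million", "population_density", "median_age",
                      "gdp_per_capita", "diabetes_prevalence",
                      "hospital_beds_per_thousand", "human_development_index")]
# Number of rows and columns
dataset_dim <- dim(df_selected)
dataset_dim
# Number of distinct countries
countries_count <- length(unique(df_selected$location))
countries_count
countries_list <- unique(df_selected$location)
# First rows and per-column summary statistics
head(df_selected)
summary(df_selected)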
```
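
The data are a daily time series, but the socioeconomic columns appear to be constant within a country (the first Afghanistan rows all repeat the same values). One possible preparation step, sketched below on made-up data, is to keep a single most recent row per country so that each country contributes one observation to a cross-country model. The `toy` tibble and its values are illustrative assumptions, not taken from the dataset, and the sketch assumes `date` can be parsed as an ISO `YYYY-MM-DD` string.

```r
library(dplyr)

# Toy stand-in for df_selected; all values are illustrative assumptions.
toy <- tibble(
  location = c("A", "A", "B", "B"),
  date = c("2020-03-01", "2020-03-02", "2020-03-01", "2020-03-02"),
  total_cases_per_million = c(1.0, 2.5, 0.4, 0.9),
  median_age = c(18.6, 18.6, 41.2, 41.2)
)

# Keep the most recent record per country so each country appears once.
latest_by_country <- toy %>%
  mutate(date = as.Date(date)) %>%
  group_by(location) %>%
  slice_max(date, n = 1, with_ties = FALSE) %>%
  ungroup()

latest_by_country
```

Whether the latest snapshot, an average, or a fixed reference date is the right summary depends on the research question; this is only one option.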
## Methodology

Multiple linear regression models will be used to carry out the analysis. The socioeconomic determinants will be the independent variables, whereas the COVID-19 mortality rate will be the dependent variable.
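
A minimal sketch of what such a model could look like in R with `lm()` follows. It runs on synthetic data: the response column `mortality_rate` is a hypothetical name (the selected columns above do not include a mortality variable), and the simulated values carry no empirical meaning.

```r
set.seed(1)
n <- 50

# Synthetic predictors with the same names as the selected columns.
toy <- data.frame(
  population_density = runif(n, 2, 1000),
  median_age = runif(n, 15, 48),
  gdp_per_capita = runif(n, 600, 100000),
  diabetes_prevalence = runif(n, 1, 22),
  hospital_beds_per_thousand = runif(n, 0.1, 13),
  human_development_index = runif(n, 0.4, 0.95)
)

# Hypothetical response, simulated only so the model can be fitted.
toy$mortality_rate <- 0.05 * toy$median_age + rnorm(n)

# Multiple linear regression: one dependent variable, six predictors.
fit <- lm(
  mortality_rate ~ population_density + median_age + gdp_per_capita +
    diabetes_prevalence + hospital_beds_per_thousand + human_development_index,
  data = toy
)

summary(fit)
```

`summary(fit)` reports one coefficient per predictor plus the intercept, along with standard errors and the overall R-squared.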
## Expected Results

The findings of this investigation will aid in understanding the variables that affect the COVID-19 mortality rate. Population density, median age, diabetes prevalence, and hospital beds per thousand people are anticipated to have a positive correlation with the COVID-19 mortality rate, whereas GDP per capita and the human development index are anticipated to have a negative correlation. The findings should help in planning public health policies and interventions that lessen the effects of COVID-19.
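
One simple way to check expected signs like these is to compute the pairwise correlation of each predictor with the response. The sketch below does this on a tiny made-up data frame; `mortality_rate` is again a hypothetical column name and the numbers are invented for illustration. `use = "complete.obs"` drops rows with missing values, which matters here because the summary output shows many `NA`s.

```r
# Made-up values for illustration only; not from the dataset.
toy <- data.frame(
  mortality_rate = c(1.2, 2.4, 3.1, 4.8, NA),
  median_age = c(19, 25, 31, 42, 38),
  gdp_per_capita = c(40000, 30000, 21000, 9000, 15000)
)

predictors <- c("median_age", "gdp_per_capita")

# Sign of the Pearson correlation between each predictor and the response,
# ignoring rows where either value is missing.
signs <- sapply(predictors, function(v) {
  sign(cor(toy[[v]], toy$mortality_rate, use = "complete.obs"))
})

signs
```

A pairwise correlation ignores the other predictors, so its sign can differ from the sign of the corresponding regression coefficient.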
## Conclusion

The goal of this study is to understand the relationship between socioeconomic factors and the COVID-19 mortality rate. The results will be helpful in planning public health policy and initiatives and will shed light on the factors that affect the COVID-19 death rate.