Final Project: Henry Nguyen

DACSS 601 Spring 2022

Henry Nguyen
2022-05-14

Introduction

In the United States, new infections with human immunodeficiency virus (HIV) have substantially decreased since the introduction of tolerable, highly active antiretroviral therapy. With the introduction of preexposure prophylaxis and the knowledge that someone living with HIV who has an undetectable viral load cannot transmit HIV, we have potent tools to mitigate the HIV epidemic (El-Sadr et al., 2019). In 2019, the CDC issued a brief stating that the southern region of the United States (U.S.) experiences the most significant burden of HIV, a shift from the dense urban cities that previously experienced the most burden (CDC, 2019). The U.S. Census Bureau defines the southern region as Alabama, Arkansas, Delaware, District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, and West Virginia (U.S. Census Bureau, 2022).

As an HIV specialist in a large urban city, I still find the burden of HIV on otherized populations palpable. Over the past several years, several colleagues have discussed whether they should go south to provide HIV primary care and whether it would make a difference.

The initial objective of this project was to see if there is an association between a state’s healthcare GDP and the incidence of HIV in its population. I pivoted to looking only at the incidence of HIV across the U.S. over time, to see the trends in the South and in other regions.

Data

Read the dataset

I read in the data set and tidy it by skipping the note rows at the top of the file, selecting the relevant columns, renaming variables, and converting variables to the appropriate type.

######### CASE DATA 

# Load the packages used below (tidyverse for wrangling, kableExtra for tables)
library(tidyverse)
library(kableExtra)

# Read in data set
# I skipped the first 10 rows, which were notes and titles

HIV.State <- read_csv("HIV.by.State.CSV", skip = 10)

# Select columns that are relevant: Geography, Year, Cases, and Rate per 100k

HIV.State <- select(HIV.State, 2,3,5,6)

# Here I rename two columns: Geography to State, and Rate per 100k to Rate

HIV.State <- rename(HIV.State, State=2, Rate =4)

# Rate is chr data, so I change it to numeric
HIV.State$Rate <- as.numeric(HIV.State$Rate)


######### RATE DATA

# We have case data for 2008-2021 but rate data only for 2008-2019. I don't have access to the state population data the source uses to calculate rates for 2020 and 2021.


# Here I create a new data frame that drops 2020 and 2021 for the rate data.

HIV.Rate.By.State <- filter(HIV.State, Year < 2020)

HIV.Rate.By.State <- group_by(HIV.Rate.By.State, State)

head(HIV.Rate.By.State)
# A tibble: 6 × 4
# Groups:   State [1]
   Year State   Cases  Rate
  <dbl> <chr>   <dbl> <dbl>
1  2019 Alabama   638  15.5
2  2018 Alabama   607  14.8
3  2017 Alabama   650  15.9
4  2016 Alabama   653  16  
5  2015 Alabama   663  16.3
6  2014 Alabama   664  16.4
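
As a minimal sketch of the two tidying steps that matter most here (skipping note rows and converting a character rate column to numeric), the example below builds a small made-up file. The file contents, the "Data not available" placeholder, and the names `tmp` and `toy` are assumptions for illustration, not the real export.

```r
library(readr)

# Hypothetical miniature of the export: two note lines, then the real header
lines <- c(
  "Notes: illustrative file, not the real export",
  "Source: made-up layout",
  "Year,Geography,Cases,Rate per 100000",
  "2019,Alabama,638,15.5",
  "2020,Alabama,586,Data not available"
)
tmp <- tempfile(fileext = ".csv")
writeLines(lines, tmp)

# skip = 2 drops the note lines so the third line is read as the header
toy <- read_csv(tmp, skip = 2)

# The rate column is read as character because of the text placeholder;
# as.numeric() turns anything that is not a number into NA (with a warning)
toy$Rate <- suppressWarnings(as.numeric(toy$`Rate per 100000`))
toy
```

If the real file marks missing rates with text in a similar way, this would explain why Rate is read as chr and why the rates for 2020 and 2021 end up as NA after conversion.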

Variables

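The following variables are found in this table:

Year: the calendar year of the data (2008-2021)
State: the state or the District of Columbia (renamed from Geography)
Cases: the number of incident HIV cases in that year
Rate: incident cases per 100,000 people (available for 2008-2019 only)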

To see the trend in incident cases for each state, I pivoted the data wider so that each year becomes its own column.

##pivot wider to see year over year change for each state

HIV.State.Wide.Year <- HIV.State%>%
  select(1:3)%>%
  pivot_wider(
    names_from = Year,
    values_from = Cases
  )

## Here I use pivot wider to see year over year RATE change for each state

Rate.By.State.Wide.Year <- HIV.Rate.By.State%>%
  select(1,2,4)%>%
  pivot_wider(
    names_from = Year,
    values_from = Rate
  )
###Table.1a

knitr::kable(HIV.State.Wide.Year,caption = "HIV Incidence by State 2008-2021" )%>%
  kable_styling("striped", full_width = F) %>% 
 scroll_box(width = "1000px", height = "400px")
Table 1: HIV Incidence by State 2008-2021
State 2021 2020 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008
Alabama 314 586 638 607 650 653 663 664 630 663 676 682 689 703
Alaska 13 29 27 23 29 37 25 38 23 28 23 37 21 39
Arizona 505 668 761 753 727 714 689 742 685 620 564 615 641 681
Arkansas 264 244 287 278 287 315 271 311 262 247 237 230 279 234
California 2308 3636 4354 4715 4806 5067 5029 5113 4673 5032 5027 5270 5417 5713
Colorado 256 326 461 402 435 421 377 377 309 370 360 420 370 445
Connecticut 136 171 213 259 274 256 276 296 326 288 349 387 343 347
Delaware 55 95 93 91 126 111 104 115 110 137 114 124 149 153
District of Columbia 64 198 255 281 316 351 365 413 490 570 632 788 822 1035
Florida 3463 3424 4378 4530 4557 4651 4595 4483 4295 4383 4575 4639 5041 5770
Georgia 1218 1833 2439 2482 2596 2526 2627 2383 2331 2596 2628 2612 2915 3162
Hawaii 41 50 65 72 77 77 117 98 96 82 78 101 84 82
Idaho 24 32 28 37 46 46 40 22 25 36 35 45 48 49
Illinois 533 1055 1252 1374 1367 1483 1547 1536 1581 1645 1612 1650 1728 1798
Indiana 329 434 486 509 515 489 632 465 474 486 468 485 461 462
Iowa 81 100 100 116 125 132 124 94 119 117 113 114 124 95
Kansas 112 138 131 157 119 147 155 130 147 151 136 135 153 138
Kentucky 269 301 326 378 365 338 340 341 355 360 306 330 343 354
Louisiana 663 725 881 961 996 1108 1095 1199 1123 1018 1188 1111 1171 1063
Maine 25 16 30 30 29 53 47 59 35 49 50 55 56 45
Maryland 506 708 918 991 1020 1097 1168 1230 1294 1311 1412 1717 1668 2011
Massachusetts 157 431 535 649 608 640 598 648 671 706 674 702 683 742
Michigan 460 521 674 715 773 746 723 781 751 784 773 764 801 773
Minnesota 203 226 274 288 277 297 296 311 306 315 294 334 380 330
Mississippi 275 401 477 476 428 427 502 473 470 440 523 454 491 513
Missouri 367 359 488 449 502 512 463 465 462 532 523 567 521 542
Montana 9 15 25 23 32 20 19 14 22 20 21 20 31 22
Nebraska 62 73 81 79 88 75 78 87 80 81 78 116 108 96
Nevada 359 391 512 501 494 509 477 429 431 364 381 371 360 390
New Hampshire 22 29 31 38 32 39 25 41 36 48 40 50 38 43
New Jersey 601 763 1057 1021 1123 1188 1192 1237 1209 1277 1166 1352 1388 1429
New Mexico 93 123 156 135 141 146 137 135 140 115 137 148 160 152
New York 1243 1958 2330 2449 2729 2820 3051 3307 3231 3517 3773 3904 4157 4560
North Carolina 966 1077 1365 1186 1295 1388 1328 1306 1275 1228 1430 1436 1570 1695
North Dakota 15 36 40 36 38 46 20 20 19 9 13 14 13 12
Ohio 484 885 980 973 983 955 925 945 1037 1015 1043 981 1042 1054
Oklahoma 163 235 320 278 299 294 314 303 331 283 309 286 293 285
Oregon 130 181 199 230 203 228 223 238 229 270 239 238 248 286
Pennsylvania 661 773 989 1023 1100 1131 1173 1195 1286 1411 1370 1472 1645 1767
Rhode Island 33 53 72 75 85 71 64 89 76 79 98 115 118 123
South Carolina 446 679 680 712 706 747 670 761 705 695 738 762 750 701
South Dakota 16 34 33 29 39 43 24 30 32 25 19 32 24 28
Tennessee 560 642 773 746 721 716 736 757 769 853 839 850 926 998
Texas 2539 3553 4302 4422 4356 4527 4529 4419 4336 4322 4267 4445 4347 4165
Utah 85 129 135 121 113 139 123 113 109 120 107 83 122 131
Vermont 4 9 11 18 20 5 14 17 13 14 13 20 17 18
Virginia 563 625 822 861 863 903 953 899 943 933 892 993 975 1057
Washington 335 421 483 500 432 425 447 441 442 494 477 540 520 521
West Virginia 91 129 146 84 77 68 71 84 74 79 85 72 73 81
Wisconsin 160 210 211 207 260 229 225 216 243 217 240 251 277 237
Wyoming 3 14 13 12 10 21 17 10 16 7 15 19 20 22


This wide table presents the data differently, with one row per state and one column per year. It is still busy, and trends remain hard to see.
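
The long-to-wide reshaping behind Table 1 can be sketched on a few rows. This is a toy example: the values are copied from Table 1, while the object names `long` and `wide` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Long format: one row per state-year (values copied from Table 1)
long <- tibble(
  Year  = c(2009, 2008, 2009, 2008),
  State = c("Alabama", "Alabama", "Alaska", "Alaska"),
  Cases = c(689, 703, 21, 39)
)

# Wide format: one row per state, one column per year
wide <- long %>%
  pivot_wider(names_from = Year, values_from = Cases)

wide
```

The year columns appear in the order the years are first encountered in the long data, which is why Table 1 runs from 2021 down to 2008.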

Descriptive Statistics

Statistics Total

HIV.Statistics.Total <- HIV.State%>%
  summarise(
    "Case Mean" = mean(Cases),
    "Case Median"  = median(Cases),
    "RATE Mean" = mean(Rate, na.rm = TRUE) ,
    "RATE Median" = median(Rate, na.rm = TRUE)
    )

knitr::kable(HIV.Statistics.Total,
                  caption = "Statistics Total" )
Table 2: Statistics Total
Case Mean Case Median RATE Mean RATE Median
754.1821 352.5 13.02386 9.6

Statistics By State

HIV.Statistics.by.State <- HIV.State%>% 
  group_by(State)%>%
  summarise(
    "Mean Cases by State" = mean(Cases),
    "Median Cases by State"  = median(Cases),
    "Mean Rate by State" = mean(Rate, na.rm = TRUE),
    "Median Rate by State"= median(Rate, na.rm = TRUE)
                  )

knitr::kable(HIV.Statistics.by.State,
                  caption = "Statistics by State" )%>%
  kable_styling("striped", full_width = F) %>% 
 scroll_box(width = "1000px", height = "400px")
Table 3: Statistics by State
State Mean Cases by State Median Cases by State Mean Rate by State Median Rate by State
Alabama 629.85714 658.0 16.408333 16.35
Alaska 28.00000 27.5 4.933333 4.60
Arizona 668.92857 683.0 12.241667 12.40
Arkansas 267.57143 267.5 11.016667 11.05
California 4725.71429 5028.0 15.841667 15.75
Colorado 380.64286 377.0 8.941667 8.80
Connecticut 280.07143 282.0 9.958333 9.60
Delaware 112.64286 112.5 15.325000 14.80
District of Columbia 470.00000 389.0 96.191667 79.50
Florida 4484.57143 4543.5 27.700000 26.50
Georgia 2453.42857 2561.0 31.608333 30.70
Hawaii 80.00000 80.0 7.358333 7.30
Idaho 36.64286 36.5 2.908333 2.90
Illinois 1440.07143 1541.5 14.475000 14.55
Indiana 478.21429 479.5 9.083333 8.85
Iowa 111.00000 115.0 4.450000 4.55
Kansas 139.21429 138.0 5.983333 6.05
Kentucky 336.14286 340.5 9.400000 9.45
Louisiana 1021.57143 1079.0 28.333333 28.85
Maine 41.35714 46.0 3.925000 4.20
Maryland 1217.92857 1199.0 26.875000 25.35
Massachusetts 603.14286 648.5 11.416667 11.45
Michigan 717.07143 757.5 9.033333 9.25
Minnesota 295.07143 296.5 6.858333 6.70
Mississippi 453.57143 471.5 19.308333 19.20
Missouri 482.28571 495.0 9.966667 9.95
Montana 20.92857 20.5 2.625000 2.55
Nebraska 84.42857 80.5 5.725000 5.25
Nevada 426.35714 410.0 18.525000 18.55
New Hampshire 36.57143 38.0 3.366667 3.40
New Jersey 1143.07143 1190.0 16.483333 16.15
New Mexico 137.00000 138.5 8.300000 8.10
New York 3073.50000 3141.0 20.100000 19.70
North Carolina 1324.64286 1317.0 16.766667 15.75
North Dakota 23.64286 19.5 3.858333 3.20
Ohio 950.14286 980.5 10.233333 10.10
Oklahoma 285.21429 293.5 9.500000 9.50
Oregon 224.42857 229.5 7.066667 7.00
Pennsylvania 1214.00000 1184.0 11.991667 11.40
Rhode Island 82.21429 77.5 9.825000 9.05
South Carolina 696.57143 705.5 17.875000 17.80
South Dakota 29.14286 29.5 4.291667 4.25
Tennessee 777.57143 763.0 14.841667 14.00
Texas 4180.64286 4341.5 20.350000 20.35
Utah 116.42857 120.5 5.200000 5.15
Vermont 13.78571 14.0 2.775000 2.85
Virginia 877.28571 901.0 13.433333 13.40
Washington 462.71429 462.0 8.125000 7.75
West Virginia 86.71429 80.0 5.308333 5.00
Wisconsin 227.35714 227.0 4.866667 4.85
Wyoming 14.21429 14.5 3.241667 3.25

Statistics by Year

HIV.Statistics.by.Year <- HIV.State%>% 
  group_by(Year)%>%
  summarise(
    "Mean Cases by Year" = mean(Cases),
    "Median Cases by Year"= median(Cases),
    "Mean Rate by Year" = mean(Rate, na.rm = TRUE),
    "Median Rate by Year" = median(Rate, na.rm = TRUE)
                  )

knitr::kable(HIV.Statistics.by.Year,
                  caption = "Statistics by Year" )%>%
  kable_styling("striped", full_width = F) %>% 
 scroll_box(width = "1000px", height = "400px")
Table 4: Statistics by Year
Year Mean Cases by Year Median Cases by Year Mean Rate by Year Median Rate by Year
2008 924.5490 390 17.17647 11.0
2009 874.9216 370 15.69804 10.5
2010 841.9216 387 14.96078 10.1
2011 807.0588 360 13.60980 10.0
2012 792.9804 364 13.03922 9.5
2013 767.1961 355 12.50196 9.7
2014 781.9608 377 12.34902 9.5
2015 778.4902 365 11.94706 9.5
2016 773.0784 351 11.92157 9.1
2017 750.7647 365 11.45882 9.3
2018 732.9804 378 10.94902 8.7
2019 712.4902 326 10.67451 9.0
2020 583.2157 301 NaN NA
2021 436.9412 256 NaN NA

The descriptive statistics give us more information about the HIV epidemic in the United States. Incident cases decreased dramatically from 2008 to 2021: across the states, the mean fell from 924.5 to 436.9 and the median from 390 to 256. The mean rate per 100,000 people fell from 17.18 in 2008 to 10.67 in 2019. All of these show a marked decrease in incident cases across the United States.
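
Two features of the tables above are easy to reproduce in base R: the NaN and NA entries for 2020 and 2021 in Table 4, and the gap between mean and median. The sketch below is illustrative only. The five case counts are 2021 values copied from Table 1, and the variable names are made up.

```r
# A year with no rate data (as in 2020 and 2021): every Rate value is NA
rate_2020 <- c(NA_real_, NA_real_, NA_real_)

# With na.rm = TRUE nothing is left to summarise:
# mean() of zero values gives NaN, median() of zero values gives NA
mean_2020   <- mean(rate_2020, na.rm = TRUE)
median_2020 <- median(rate_2020, na.rm = TRUE)

# A right-skewed set of 2021 case counts from Table 1
# (Alaska, Maine, Connecticut, Alabama, Florida)
cases <- c(13, 25, 136, 314, 3463)

# One very large state pulls the mean far above the median
mean(cases)
median(cases)
```

The same skew is why the mean number of cases in Tables 2 and 4 is roughly double the median: a handful of populous states carry very large counts.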

Data Transformation

It is hard to tell a story with the tables above. The statistics start to, but it would help to see change over time, which I do below by calculating the year-over-year difference for each state. This alone helps to start telling a story. To highlight when cases increase, I changed the font to red.

##### Lag HIV CASE #####

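Lag.HIV.Case <- HIV.State %>%
  group_by(State)%>%
  arrange(State,Year)%>%
  # Note: Case.Diff.Percent divides by the current year's Cases, not the
  # previous year's (lag(Cases)), so it is not the conventional percent change
  mutate(Case.Diff = Cases - lag(Cases),
         Case.Diff.Percent = ((Cases - lag(Cases))/Cases)*100,
        Increase.Case = Case.Diff > 0,
        Increase.Case.Percent = Case.Diff.Percent >0,
        )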

# Lag.HIV.Case$Increase.Case = as.character(Lag.HIV.Case$Increase.Case)
# Lag.HIV.Case$Increase.Case.Percent = as.character(Lag.HIV.Case$Increase.Case.Percent)



Lag.HIV.Case.WIDE <- Lag.HIV.Case%>%
  select(1,2,5)%>%
  pivot_wider(
    names_from = Year,
    values_from = Case.Diff
  )


Table.Lag.HIV.Case.WIDE<- Lag.HIV.Case.WIDE%>%
   kbl("html",caption = "Year over Year Difference in HIV Cases by State")%>%
  kable_styling()%>%
  column_spec(3,color = if_else( Lag.HIV.Case.WIDE$`2009`>0, "red", "black", "black"))%>%
  column_spec(4,color = if_else( Lag.HIV.Case.WIDE$`2010`>0, "red", "black", "black"))%>%
  column_spec(5,color = if_else( Lag.HIV.Case.WIDE$`2011`>0, "red", "black", "black"))%>%
  column_spec(6,color = if_else( Lag.HIV.Case.WIDE$`2012`>0, "red", "black", "black"))%>%
  column_spec(7,color = if_else( Lag.HIV.Case.WIDE$`2013`>0, "red", "black", "black"))%>%
  column_spec(8,color = if_else( Lag.HIV.Case.WIDE$`2014`>0, "red", "black", "black"))%>%
  column_spec(9,color = if_else( Lag.HIV.Case.WIDE$`2015`> 0, "red", "black", "black"))%>%
  column_spec(10,color = if_else( Lag.HIV.Case.WIDE$`2016`>0, "red", "black", "black"))%>%
  column_spec(11,color = if_else( Lag.HIV.Case.WIDE$`2017`>0, "red", "black", "black"))%>%
  column_spec(12,color = if_else( Lag.HIV.Case.WIDE$`2018`>0, "red", "black", "black"))%>%
  column_spec(13,color = if_else( Lag.HIV.Case.WIDE$`2019`>0, "red", "black", "black"))%>%
  column_spec(14,color = if_else( Lag.HIV.Case.WIDE$`2020`>0, "red", "black", "black"))%>%
  column_spec(15,color = if_else( Lag.HIV.Case.WIDE$`2021`>0, "red", "black", "black"))%>%
  kable_styling("striped", full_width = F) %>% 
 scroll_box(width = "1000px", height = "400px")

Table.Lag.HIV.Case.WIDE
Table 5: Year over Year Difference in HIV Cases by State
State 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021
Alabama NA -14 -7 -6 -13 -33 34 -1 -10 -3 -43 31 -52 -272
Alaska NA -18 16 -14 5 -5 15 -13 12 -8 -6 4 2 -16
Arizona NA -40 -26 -51 56 65 57 -53 25 13 26 8 -93 -163
Arkansas NA 45 -49 7 10 15 49 -40 44 -28 -9 9 -43 20
California NA -296 -147 -243 5 -359 440 -84 38 -261 -91 -361 -718 -1328
Colorado NA -75 50 -60 10 -61 68 0 44 14 -33 59 -135 -70
Connecticut NA -4 44 -38 -61 38 -30 -20 -20 18 -15 -46 -42 -35
Delaware NA -4 -25 -10 23 -27 5 -11 7 15 -35 2 2 -40
District of Columbia NA -213 -34 -156 -62 -80 -77 -48 -14 -35 -35 -26 -57 -134
Florida NA -729 -402 -64 -192 -88 188 112 56 -94 -27 -152 -954 39
Georgia NA -247 -303 16 -32 -265 52 244 -101 70 -114 -43 -606 -615
Hawaii NA 2 17 -23 4 14 2 19 -40 0 -5 -7 -15 -9
Idaho NA -1 -3 -10 1 -11 -3 18 6 0 -9 -9 4 -8
Illinois NA -70 -78 -38 33 -64 -45 11 -64 -116 7 -122 -197 -522
Indiana NA -1 24 -17 18 -12 -9 167 -143 26 -6 -23 -52 -105
Iowa NA 29 -10 -1 4 2 -25 30 8 -7 -9 -16 0 -19
Kansas NA 15 -18 1 15 -4 -17 25 -8 -28 38 -26 7 -26
Kentucky NA -11 -13 -24 54 -5 -14 -1 -2 27 13 -52 -25 -32
Louisiana NA 108 -60 77 -170 105 76 -104 13 -112 -35 -80 -156 -62
Maine NA 11 -1 -5 -1 -14 24 -12 6 -24 1 0 -14 9
Maryland NA -343 49 -305 -101 -17 -64 -62 -71 -77 -29 -73 -210 -202
Massachusetts NA -59 19 -28 32 -35 -23 -50 42 -32 41 -114 -104 -274
Michigan NA 28 -37 9 11 -33 30 -58 23 27 -58 -41 -153 -61
Minnesota NA 50 -46 -40 21 -9 5 -15 1 -20 11 -14 -48 -23
Mississippi NA -22 -37 69 -83 30 3 29 -75 1 48 1 -76 -126
Missouri NA -21 46 -44 9 -70 3 -2 49 -10 -53 39 -129 8
Montana NA 9 -11 1 -1 2 -8 5 1 12 -9 2 -10 -6
Nebraska NA 12 8 -38 3 -1 7 -9 -3 13 -9 2 -8 -11
Nevada NA -30 11 10 -17 67 -2 48 32 -15 7 11 -121 -32
New Hampshire NA -5 12 -10 8 -12 5 -16 14 -7 6 -7 -2 -7
New Jersey NA -41 -36 -186 111 -68 28 -45 -4 -65 -102 36 -294 -162
New Mexico NA 8 -12 -11 -22 25 -5 2 9 -5 -6 21 -33 -30
New York NA -403 -253 -131 -256 -286 76 -256 -231 -91 -280 -119 -372 -715
North Carolina NA -125 -134 -6 -202 47 31 22 60 -93 -109 179 -288 -111
North Dakota NA 1 1 -1 -4 10 1 0 26 -8 -2 4 -4 -21
Ohio NA -12 -61 62 -28 22 -92 -20 30 28 -10 7 -95 -401
Oklahoma NA 8 -7 23 -26 48 -28 11 -20 5 -21 42 -85 -72
Oregon NA -38 -10 1 31 -41 9 -15 5 -25 27 -31 -18 -51
Pennsylvania NA -122 -173 -102 41 -125 -91 -22 -42 -31 -77 -34 -216 -112
Rhode Island NA -5 -3 -17 -19 -3 13 -25 7 14 -10 -3 -19 -20
South Carolina NA 49 12 -24 -43 10 56 -91 77 -41 6 -32 -1 -233
South Dakota NA -4 8 -13 6 7 -2 -6 19 -4 -10 4 1 -18
Tennessee NA -72 -76 -11 14 -84 -12 -21 -20 5 25 27 -131 -82
Texas NA 182 98 -178 55 14 83 110 -2 -171 66 -120 -749 -1014
Utah NA -9 -39 24 13 -11 4 10 16 -26 8 14 -6 -44
Vermont NA -1 3 -7 1 -1 4 -3 -9 15 -2 -7 -2 -5
Virginia NA -82 18 -101 41 10 -44 54 -50 -40 -2 -39 -197 -62
Washington NA -1 20 -63 17 -52 -1 6 -22 7 68 -17 -62 -86
West Virginia NA -8 -1 13 -6 -5 10 -13 -3 9 7 62 -17 -38
Wisconsin NA 40 -26 -11 -23 26 -27 9 4 31 -53 4 -1 -50
Wyoming NA -2 -1 -4 -8 9 -6 7 4 -11 2 1 1 -11
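
The year-over-year differences in the table above come from a grouped lag. A minimal sketch on two states, with case counts copied from Table 1 (the names `toy` and `toy_diff` are made up for illustration):

```r
library(dplyr)

toy <- tibble(
  State = rep(c("Alabama", "Alaska"), each = 3),
  Year  = rep(2008:2010, times = 2),
  Cases = c(703, 689, 682, 39, 21, 37)
)

toy_diff <- toy %>%
  group_by(State) %>%
  # Sort by year within each state so lag() looks at the previous year
  arrange(Year, .by_group = TRUE) %>%
  # The first year of each state has no previous value, so the result is NA
  mutate(Case.Diff = Cases - lag(Cases)) %>%
  ungroup()

toy_diff
```

Grouping by State before calling `lag()` matters: without it, the first year of each state would be compared with the last year of the state listed before it.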
# knitr::kable(Table.Lag.HIV.Case.WIDE,
#                   caption = "Table 2a: Difference in cases by Year")


Lag.HIV.Case.WIDE.Percent <- Lag.HIV.Case%>%
  select(1,2,6)%>%
  pivot_wider(
    names_from = Year,
    values_from = Case.Diff.Percent
  )

# TODO: round the percentages to fewer decimal places for display

Table.Lag.HIV.Case.WIDE.Percent<- Lag.HIV.Case.WIDE.Percent%>%
   kbl("html",caption = "Year Over Year % Difference in HIV Cases by State" )%>%
  kable_styling()%>%
  column_spec(3,color = if_else( Lag.HIV.Case.WIDE$`2009`>0, "red", "black", "black"))%>%
  column_spec(4,color = if_else( Lag.HIV.Case.WIDE$`2010`>0, "red", "black", "black"))%>%
  column_spec(5,color = if_else( Lag.HIV.Case.WIDE$`2011`>0, "red", "black", "black"))%>%
  column_spec(6,color = if_else( Lag.HIV.Case.WIDE$`2012`>0, "red", "black", "black"))%>%
  column_spec(7,color = if_else( Lag.HIV.Case.WIDE$`2013`>0, "red", "black", "black"))%>%
  column_spec(8,color = if_else( Lag.HIV.Case.WIDE$`2014`>0, "red", "black", "black"))%>%
  column_spec(9,color = if_else( Lag.HIV.Case.WIDE$`2015`> 0, "red", "black", "black"))%>%
  column_spec(10,color = if_else( Lag.HIV.Case.WIDE$`2016`>0, "red", "black", "black"))%>%
  column_spec(11,color = if_else( Lag.HIV.Case.WIDE$`2017`>0, "red", "black", "black"))%>%
  column_spec(12,color = if_else( Lag.HIV.Case.WIDE$`2018`>0, "red", "black", "black"))%>%
  column_spec(13,color = if_else( Lag.HIV.Case.WIDE$`2019`>0, "red", "black", "black"))%>%
  column_spec(14,color = if_else( Lag.HIV.Case.WIDE$`2020`>0, "red", "black", "black"))%>%
  column_spec(15,color = if_else( Lag.HIV.Case.WIDE$`2021`>0, "red", "black", "black"))%>%
  kable_styling("striped", full_width = F) %>% 
 scroll_box(width = "1000px", height = "400px")

Table.Lag.HIV.Case.WIDE.Percent
Table 6: Year Over Year % Difference in HIV Cases by State
State 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021
Alabama NA -2.0319303 -1.026393 -0.8875740 -1.9607843 -5.2380952 5.1204819 -0.1508296 -1.5313936 -0.4615385 -7.0840198 4.8589342 -8.8737201 -86.624204
Alaska NA -85.7142857 43.243243 -60.8695652 17.8571429 -21.7391304 39.4736842 -52.0000000 32.4324324 -27.5862069 -26.0869565 14.8148148 6.8965517 -123.076923
Arizona NA -6.2402496 -4.227642 -9.0425532 9.0322581 9.4890511 7.6819407 -7.6923077 3.5014006 1.7881706 3.4528552 1.0512484 -13.9221557 -32.277228
Arkansas NA 16.1290323 -21.304348 2.9535865 4.0485830 5.7251908 15.7556270 -14.7601476 13.9682540 -9.7560976 -3.2374101 3.1358885 -17.6229508 7.575758
California NA -5.4642791 -2.789374 -4.8338970 0.0993641 -7.6824310 8.6055154 -1.6703122 0.7499507 -5.4307116 -1.9300106 -8.2912265 -19.7469747 -57.538995
Colorado NA -20.2702703 11.904762 -16.6666667 2.7027027 -19.7411003 18.0371353 0.0000000 10.4513064 3.2183908 -8.2089552 12.7982646 -41.4110429 -27.343750
Connecticut NA -1.1661808 11.369509 -10.8882521 -21.1805556 11.6564417 -10.1351351 -7.2463768 -7.8125000 6.5693431 -5.7915058 -21.5962441 -24.5614035 -25.735294
Delaware NA -2.6845638 -20.161290 -8.7719298 16.7883212 -24.5454545 4.3478261 -10.5769231 6.3063063 11.9047619 -38.4615385 2.1505376 2.1052632 -72.727273
District of Columbia NA -25.9124088 -4.314721 -24.6835443 -10.8771930 -16.3265306 -18.6440678 -13.1506849 -3.9886040 -11.0759494 -12.4555160 -10.1960784 -28.7878788 -209.375000
Florida NA -14.4614164 -8.665661 -1.3989071 -4.3805613 -2.0488941 4.1936203 2.4374320 1.2040421 -2.0627606 -0.5960265 -3.4719050 -27.8621495 1.126191
Georgia NA -8.4734134 -11.600306 0.6088280 -1.2326656 -11.3685114 2.1821234 9.2881614 -3.9984165 2.6964561 -4.5930701 -1.7630176 -33.0605565 -50.492611
Hawaii NA 2.3809524 16.831683 -29.4871795 4.8780488 14.5833333 2.0408163 16.2393162 -51.9480519 0.0000000 -6.9444444 -10.7692308 -30.0000000 -21.951220
Idaho NA -2.0833333 -6.666667 -28.5714286 2.7777778 -44.0000000 -13.6363636 45.0000000 13.0434783 0.0000000 -24.3243243 -32.1428571 12.5000000 -33.333333
Illinois NA -4.0509259 -4.727273 -2.3573201 2.0060790 -4.0480708 -2.9296875 0.7110537 -4.3155765 -8.4857352 0.5094614 -9.7444089 -18.6729858 -97.936210
Indiana NA -0.2169197 4.948454 -3.6324786 3.7037037 -2.5316456 -1.9354839 26.4240506 -29.2433538 5.0485437 -1.1787819 -4.7325103 -11.9815668 -31.914894
Iowa NA 23.3870968 -8.771930 -0.8849558 3.4188034 1.6806723 -26.5957447 24.1935484 6.0606061 -5.6000000 -7.7586207 -16.0000000 0.0000000 -23.456790
Kansas NA 9.8039216 -13.333333 0.7352941 9.9337748 -2.7210884 -13.0769231 16.1290323 -5.4421769 -23.5294118 24.2038217 -19.8473282 5.0724638 -23.214286
Kentucky NA -3.2069971 -3.939394 -7.8431373 15.0000000 -1.4084507 -4.1055718 -0.2941176 -0.5917160 7.3972603 3.4391534 -15.9509202 -8.3056478 -11.895911
Louisiana NA 9.2228864 -5.400540 6.4814815 -16.6994106 9.3499555 6.3386155 -9.4977169 1.1732852 -11.2449799 -3.6420395 -9.0805902 -21.5172414 -9.351433
Maine NA 19.6428571 -1.818182 -10.0000000 -2.0408163 -40.0000000 40.6779661 -25.5319149 11.3207547 -82.7586207 3.3333333 0.0000000 -87.5000000 36.000000
Maryland NA -20.5635492 2.853815 -21.6005666 -7.7040427 -1.3137558 -5.2032520 -5.3082192 -6.4721969 -7.5490196 -2.9263370 -7.9520697 -29.6610169 -39.920949
Massachusetts NA -8.6383602 2.706553 -4.1543027 4.5325779 -5.2160954 -3.5493827 -8.3612040 6.5625000 -5.2631579 6.3174114 -21.3084112 -24.1299304 -174.522293
Michigan NA 3.4956305 -4.842932 1.1642950 1.4030612 -4.3941411 3.8412292 -8.0221300 3.0831099 3.4928849 -8.1118881 -6.0830861 -29.3666027 -13.260870
Minnesota NA 13.1578947 -13.772455 -13.6054422 6.6666667 -2.9411765 1.6077170 -5.0675676 0.3367003 -7.2202166 3.8194444 -5.1094891 -21.2389381 -11.330049
Mississippi NA -4.4806517 -8.149780 13.1931166 -18.8636364 6.3829787 0.6342495 5.7768924 -17.5644028 0.2336449 10.0840336 0.2096436 -18.9526185 -45.818182
Missouri NA -4.0307102 8.112875 -8.4130019 1.6917293 -15.1515152 0.6451613 -0.4319654 9.5703125 -1.9920319 -11.8040089 7.9918033 -35.9331476 2.179836
Montana NA 29.0322581 -55.000000 4.7619048 -5.0000000 9.0909091 -57.1428571 26.3157895 5.0000000 37.5000000 -39.1304348 8.0000000 -66.6666667 -66.666667
Nebraska NA 11.1111111 6.896552 -48.7179487 3.7037037 -1.2500000 8.0459770 -11.5384615 -4.0000000 14.7727273 -11.3924051 2.4691358 -10.9589041 -17.741936
Nevada NA -8.3333333 2.964960 2.6246719 -4.6703297 15.5452436 -0.4662005 10.0628931 6.2868369 -3.0364372 1.3972056 2.1484375 -30.9462916 -8.913649
New Hampshire NA -13.1578947 24.000000 -25.0000000 16.6666667 -33.3333333 12.1951220 -64.0000000 35.8974359 -21.8750000 15.7894737 -22.5806452 -6.8965517 -31.818182
New Jersey NA -2.9538905 -2.662722 -15.9519726 8.6922475 -5.6244830 2.2635408 -3.7751678 -0.3367003 -5.7880677 -9.9902057 3.4058657 -38.5321101 -26.955075
New Mexico NA 5.0000000 -8.108108 -8.0291971 -19.1304348 17.8571429 -3.7037037 1.4598540 6.1643836 -3.5460993 -4.4444444 13.4615385 -26.8292683 -32.258065
New York NA -9.6944912 -6.480533 -3.4720382 -7.2789309 -8.8517487 2.2981554 -8.3906916 -8.1914894 -3.3345548 -11.4332381 -5.1072961 -18.9989785 -57.522124
North Carolina NA -7.9617834 -9.331476 -0.4195804 -16.4495114 3.6862745 2.3736600 1.6566265 4.3227666 -7.1814672 -9.1905565 13.1135531 -26.7409471 -11.490683
North Dakota NA 7.6923077 7.142857 -7.6923077 -44.4444444 52.6315789 5.0000000 0.0000000 56.5217391 -21.0526316 -5.5555556 10.0000000 -11.1111111 -140.000000
Ohio NA -1.1516315 -6.218145 5.9443912 -2.7586207 2.1215043 -9.7354497 -2.1621622 3.1413613 2.8484232 -1.0277492 0.7142857 -10.7344633 -82.851240
Oklahoma NA 2.7303754 -2.447552 7.4433657 -9.1872792 14.5015106 -9.2409241 3.5031847 -6.8027211 1.6722408 -7.5539568 13.1250000 -36.1702128 -44.171779
Oregon NA -15.3225806 -4.201681 0.4184100 11.4814815 -17.9039301 3.7815126 -6.7264574 2.1929825 -12.3152709 11.7391304 -15.5778894 -9.9447514 -39.230769
Pennsylvania NA -7.4164134 -11.752717 -7.4452555 2.9057406 -9.7200622 -7.6150628 -1.8755328 -3.7135279 -2.8181818 -7.5268817 -3.4378160 -27.9430789 -16.944024
Rhode Island NA -4.2372881 -2.608696 -17.3469388 -24.0506329 -3.9473684 14.6067416 -39.0625000 9.8591549 16.4705882 -13.3333333 -4.1666667 -35.8490566 -60.606061
South Carolina NA 6.5333333 1.574803 -3.2520325 -6.1870504 1.4184397 7.3587385 -13.5820896 10.3078983 -5.8073654 0.8426966 -4.7058824 -0.1472754 -52.242153
South Dakota NA -16.6666667 25.000000 -68.4210526 24.0000000 21.8750000 -6.6666667 -25.0000000 44.1860465 -10.2564103 -34.4827586 12.1212121 2.9411765 -112.500000
Tennessee NA -7.7753780 -8.941176 -1.3110846 1.6412661 -10.9232770 -1.5852048 -2.8532609 -2.7932961 0.6934813 3.3512064 3.4928849 -20.4049844 -14.642857
Texas NA 4.1867955 2.204724 -4.1715491 1.2725590 0.3228782 1.8782530 2.4287922 -0.0441794 -3.9256198 1.4925373 -2.7894003 -21.0807768 -39.936983
Utah NA -7.3770492 -46.987952 22.4299065 10.8333333 -10.0917431 3.5398230 8.1300813 11.5107914 -23.0088496 6.6115702 10.3703704 -4.6511628 -51.764706
Vermont NA -5.8823529 15.000000 -53.8461538 7.1428571 -7.6923077 23.5294118 -21.4285714 -180.0000000 75.0000000 -11.1111111 -63.6363636 -22.2222222 -125.000000
Virginia NA -8.4102564 1.812689 -11.3228700 4.3944266 1.0604454 -4.8943270 5.6663169 -5.5370986 -4.6349942 -0.2322880 -4.7445255 -31.5200000 -11.012433
Washington NA -0.1923077 3.703704 -13.2075472 3.4412955 -11.7647059 -0.2267574 1.3422819 -5.1764706 1.6203704 13.6000000 -3.5196687 -14.7268409 -25.671642
West Virginia NA -10.9589041 -1.388889 15.2941176 -7.5949367 -6.7567568 11.9047619 -18.3098592 -4.4117647 11.6883117 8.3333333 42.4657534 -13.1782946 -41.758242
Wisconsin NA 14.4404332 -10.358566 -4.5833333 -10.5990783 10.6995885 -12.5000000 4.0000000 1.7467249 11.9230769 -25.6038647 1.8957346 -0.4761905 -31.250000
Wyoming NA -10.0000000 -5.263158 -26.6666667 -114.2857143 56.2500000 -60.0000000 41.1764706 19.0476190 -110.0000000 16.6666667 7.6923077 7.1428571 -366.666667
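
The percentages in this table divide the difference by the current year's count, which is why some decreases are larger than 100% in magnitude. The sketch below contrasts that with the conventional percent change, which divides by the previous year's count. It uses Alabama's 2020 and 2021 counts from Table 1, and the variable names are made up for illustration.

```r
# Alabama, from Table 1
prev <- 586  # 2020 cases
curr <- 314  # 2021 cases

# As computed in the code above: difference divided by the current year
pct_current_denominator <- (curr - prev) / curr * 100

# Conventional percent change: difference divided by the previous year
pct_previous_denominator <- (curr - prev) / prev * 100

# Rounding for display keeps the table readable
round(pct_current_denominator, 1)
round(pct_previous_denominator, 1)
```

For table output, `kbl()` and `knitr::kable()` also accept a `digits` argument, which would round every numeric column without changing the underlying data.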
# knitr::kable(Table.Lag.HIV.Case.WIDE.Percent,
#                   caption = "Table 2b: Difference in Case Percentage Year Over Year by State")

###### LAG HIV RATE ######
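lag.HIV.Rate <- HIV.State%>%
  group_by(State)%>%
  filter(Year < 2020)%>%
  arrange(State,Year)%>%
  # Note: Rate.Diff.Percent divides by the current year's Rate, not the
  # previous year's (lag(Rate)), so it is not the conventional percent change
  mutate(Rate.Diff = Rate - lag(Rate),
         Rate.Diff.Percent =  (Rate - lag(Rate))/Rate*100,
        Increase.Rate = Rate.Diff >0,
        Increase.Rate.Percent = Rate.Diff.Percent >0)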
 
  
  #kable()%>%
  #column_spec(3, color = if_else(lag.HIV.Rate$Rate.Diff<0, "black", "red"))

Lag.HIV.Rate.WIDE <- lag.HIV.Rate%>%
  select(1,2,5)%>%
  pivot_wider(
    names_from = Year,
    values_from = Rate.Diff
    )

# print(Lag.HIV.Rate.WIDE, n= 51)


# Table.Lag.HIV.Rate<- lag.HIV.Rate%>%
#   kbl()%>%
#   kable_styling()%>%
#   column_spec(3,color = spec_color(lag.HIV.Rate$Rate.Diff[1:612]))
# 
Table.Lag.HIV.Rate.WIDE<- Lag.HIV.Rate.WIDE%>%
   kbl("html",caption = "Year Over Year Difference in HIV Rates by State" )%>%
  kable_styling()%>%
  column_spec(3,color = if_else( Lag.HIV.Rate.WIDE$`2009`>0, "red", "black", "black"))%>%
  column_spec(4,color = if_else( Lag.HIV.Rate.WIDE$`2010`>0, "red", "black", "black"))%>%
  column_spec(5,color = if_else( Lag.HIV.Rate.WIDE$`2011`>0, "red", "black", "black"))%>%
  column_spec(6,color = if_else( Lag.HIV.Rate.WIDE$`2012`>0, "red", "black", "black"))%>%
  column_spec(7,color = if_else( Lag.HIV.Rate.WIDE$`2013`>0, "red", "black", "black"))%>%
  column_spec(8,color = if_else( Lag.HIV.Rate.WIDE$`2014`>0, "red", "black", "black"))%>%
  column_spec(9,color = if_else( Lag.HIV.Rate.WIDE$`2015`> 0, "red", "black", "black"))%>%
  column_spec(10,color = if_else( Lag.HIV.Rate.WIDE$`2016`>0, "red", "black", "black"))%>%
  column_spec(11,color = if_else( Lag.HIV.Rate.WIDE$`2017`>0, "red", "black", "black"))%>%
  column_spec(12,color = if_else( Lag.HIV.Rate.WIDE$`2018`>0, "red", "black", "black"))%>%
  column_spec(13,color = if_else( Lag.HIV.Rate.WIDE$`2019`>0, "red", "black", "black"))%>%
  kable_styling("striped", full_width = F) %>% 
 scroll_box(width = "1000px", height = "400px")

Table.Lag.HIV.Rate.WIDE
Table 7: Year Over Year Difference in HIV Rates by State
State 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019
Alabama NA -0.5 -0.6 -0.2 -0.4 -0.9 0.8 -0.1 -0.3 -0.1 -1.1 0.7
Alaska NA -3.3 2.7 -2.5 0.8 -0.9 2.5 -2.1 1.9 -1.3 -1.0 0.7
Arizona NA -0.9 -0.3 -1.2 0.9 1.0 0.8 -1.1 0.2 0.0 0.2 -0.1
Arkansas NA 1.9 -2.2 0.2 0.3 0.6 1.9 -1.6 1.7 -1.2 -0.4 0.3
California NA -1.1 -0.9 -0.9 -0.2 -1.3 1.3 -0.5 0.0 -0.9 -0.3 -1.2
Colorado NA -2.0 1.1 -1.6 0.1 -1.5 1.4 -0.2 0.8 0.1 -0.8 1.1
Connecticut NA -0.2 1.2 -1.3 -2.0 1.2 -1.0 -0.7 -0.6 0.6 -0.5 -1.5
Delaware NA -0.8 -3.7 -1.5 2.8 -3.7 0.5 -1.5 0.7 1.7 -4.4 0.1
District of Columbia NA -44.4 -10.9 -31.8 -13.5 -16.2 -14.8 -9.4 -3.2 -6.5 -6.2 -4.6
Florida NA -4.9 -3.3 -0.8 -1.6 -0.9 0.7 0.2 -0.2 -1.0 -0.5 -1.2
Georgia NA -3.7 -3.7 -0.2 -0.8 -3.5 0.3 2.5 -1.6 0.4 -1.7 -0.8
Hawaii NA 0.2 1.0 -2.1 0.3 1.1 0.2 1.5 -3.4 0.0 -0.4 -0.6
Idaho NA -0.1 -0.3 -0.8 0.0 -0.9 -0.2 1.3 0.4 -0.1 -0.7 -0.7
Illinois NA -0.7 -0.8 -0.4 0.3 -0.7 -0.4 0.1 -0.6 -1.1 0.1 -1.1
Indiana NA -0.1 0.4 -0.4 0.3 -0.3 -0.2 3.0 -2.6 0.4 -0.2 -0.4
Iowa NA 1.2 -0.5 -0.1 0.2 0.0 -1.0 1.2 0.3 -0.3 -0.4 -0.6
Kansas NA 0.6 -0.8 0.0 0.6 -0.2 -0.7 1.0 -0.4 -1.1 1.5 -1.1
Kentucky NA -0.3 -0.5 -0.7 1.5 -0.2 -0.4 -0.1 -0.1 0.7 0.3 -1.4
Louisiana NA 2.7 -2.1 1.8 -4.7 2.6 1.8 -2.8 0.2 -2.8 -0.9 -2.1
Maine NA 1.0 -0.2 -0.4 -0.1 -1.2 2.0 -1.0 0.5 -2.1 0.1 0.0
Maryland NA -7.6 0.3 -6.6 -2.3 -0.5 -1.5 -1.3 -1.5 -1.6 -0.7 -1.5
Massachusetts NA -1.2 0.4 -0.6 0.4 -0.7 -0.5 -0.9 0.6 -0.6 0.7 -2.0
Michigan NA 0.3 -0.3 0.0 0.1 -0.4 0.3 -0.7 0.3 0.3 -0.7 -0.5
Minnesota NA 1.1 -1.1 -1.0 0.5 -0.3 0.1 -0.4 0.0 -0.5 0.2 -0.4
Mississippi NA -1.0 -1.8 2.7 -3.4 1.1 0.1 1.1 -3.0 0.0 1.9 0.0
Missouri NA -0.5 0.9 -0.9 0.1 -1.4 0.0 -0.1 1.0 -0.3 -1.1 0.8
Montana NA 1.1 -1.4 0.1 -0.1 0.2 -1.0 0.6 0.1 1.3 -1.0 0.2
Nebraska NA 0.8 0.4 -2.6 0.1 -0.1 0.5 -0.7 -0.2 0.8 -0.6 0.1
Nevada NA -1.6 -0.1 0.3 -1.0 2.7 -0.4 1.7 0.9 -1.0 -0.1 0.0
New Hampshire NA -0.4 1.1 -0.9 0.6 -1.0 0.4 -1.4 1.2 -0.7 0.5 -0.6
New Jersey NA -0.7 -0.7 -2.6 1.4 -0.9 0.3 -0.6 -0.1 -0.9 -1.4 0.5
New Mexico NA 0.4 -1.0 -0.8 -1.3 1.4 -0.3 0.1 0.5 -0.3 -0.4 1.2
New York NA -2.6 -1.4 -1.0 -1.6 -1.8 0.4 -1.6 -1.4 -0.5 -1.6 -0.7
North Carolina NA -1.9 -2.3 -0.2 -2.7 0.3 0.2 0.1 0.5 -1.3 -1.4 1.8
North Dakota NA 0.2 0.1 -0.2 -0.8 1.7 0.1 -0.1 4.2 -1.3 -0.3 0.6
Ohio NA -0.2 -0.6 0.6 -0.3 0.2 -1.0 -0.2 0.3 0.2 -0.1 0.0
Oklahoma NA 0.2 -0.4 0.7 -1.0 1.5 -1.0 0.3 -0.7 0.1 -0.6 1.2
Oregon NA -1.3 -0.4 -0.1 0.9 -1.3 0.2 -0.5 0.0 -0.8 0.7 -1.0
Pennsylvania NA -1.2 -1.7 -1.0 0.3 -1.2 -0.8 -0.2 -0.4 -0.3 -0.7 -0.4
Rhode Island NA -0.6 -0.4 -1.9 -2.1 -0.4 1.4 -2.8 0.8 1.5 -1.1 -0.3
South Carolina NA 1.0 0.0 -0.9 -1.3 0.1 1.1 -2.5 1.6 -1.2 -0.1 -1.0
South Dakota NA -0.6 1.2 -2.0 0.9 0.9 -0.3 -0.9 2.7 -0.6 -1.5 0.5
Tennessee NA -1.6 -1.6 -0.4 0.1 -1.6 -0.4 -0.5 -0.4 -0.1 0.3 0.3
Texas NA 0.5 -0.1 -1.3 -0.1 -0.3 0.0 0.1 -0.4 -1.0 0.0 -0.8
Utah NA -0.6 -1.8 1.1 0.5 -0.6 0.1 0.3 0.6 -1.2 0.2 0.4
Vermont NA -0.2 0.5 -1.3 0.2 -0.2 0.7 -0.5 -1.7 2.8 -0.4 -1.3
Virginia NA -1.4 -0.1 -1.6 0.4 0.0 -0.7 0.7 -0.8 -0.7 -0.1 -0.6
Washington NA -0.2 0.2 -1.2 0.2 -1.0 -0.1 0.0 -0.5 -0.1 1.0 -0.4
West Virginia NA -0.6 -0.1 0.8 -0.4 -0.3 0.6 -0.8 -0.2 0.7 0.4 4.1
Wisconsin NA 0.9 -0.6 -0.3 -0.5 0.6 -0.6 0.1 0.1 0.6 -1.1 0.1
Wyoming NA -0.5 -0.4 -0.9 -1.7 1.8 -1.2 1.4 0.9 -2.3 0.4 0.2
Show code
# knitr::kable(Table.Lag.HIV.Rate.WIDE,
#                   caption = "Table 2c: Difference in Case Rate Year Over Year by State")

Lag.HIV.Rate.WIDE.Percent <- lag.HIV.Rate%>%
  select(1,2,6)%>%
  pivot_wider(
    names_from = Year,
    values_from = Rate.Diff.Percent
    )

# TODO: reduce the decimal places in the output (e.g., via the `digits` argument of kbl())

# Build the styled table; cells with a positive (increasing) percent change are red.
# Colors are taken from the percent table itself so they match the values shown.
Table.Lag.HIV.Rate.WIDE.Percent <- Lag.HIV.Rate.WIDE.Percent %>%
  kbl("html", caption = "Year Over Year % Difference in HIV Rates by State") %>%
  kable_styling("striped", full_width = F) %>%
  column_spec(3, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2009` > 0, "red", "black", "black")) %>%
  column_spec(4, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2010` > 0, "red", "black", "black")) %>%
  column_spec(5, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2011` > 0, "red", "black", "black")) %>%
  column_spec(6, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2012` > 0, "red", "black", "black")) %>%
  column_spec(7, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2013` > 0, "red", "black", "black")) %>%
  column_spec(8, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2014` > 0, "red", "black", "black")) %>%
  column_spec(9, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2015` > 0, "red", "black", "black")) %>%
  column_spec(10, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2016` > 0, "red", "black", "black")) %>%
  column_spec(11, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2017` > 0, "red", "black", "black")) %>%
  column_spec(12, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2018` > 0, "red", "black", "black")) %>%
  column_spec(13, color = if_else(Lag.HIV.Rate.WIDE.Percent$`2019` > 0, "red", "black", "black")) %>%
  scroll_box(width = "1000px", height = "400px")
 
Table.Lag.HIV.Rate.WIDE.Percent
Table 5: Year Over Year % Difference in HIV Rates by State
State 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019
Alabama NA -2.824859 -3.5087719 -1.1834320 -2.4242424 -5.7692308 4.8780488 -0.6134969 -1.8750000 -0.6289308 -7.4324324 4.5161290
Alaska NA -89.189189 42.1875000 -64.1025641 17.0212766 -23.6842105 39.6825397 -50.0000000 31.1475410 -27.0833333 -26.3157895 15.5555556
Arizona NA -7.438016 -2.5423729 -11.3207547 7.8260870 8.0000000 6.0150376 -9.0163934 1.6129032 0.0000000 1.5873016 -0.8000000
Arkansas NA 16.101695 -22.9166667 2.0408163 2.9702970 5.6074766 15.0793651 -14.5454545 13.3858268 -10.4347826 -3.6036036 2.6315789
California NA -6.111111 -5.2631579 -5.5555556 -1.2500000 -8.8435374 8.1250000 -3.2258065 0.0000000 -6.1643836 -2.0979021 -9.1603053
Colorado NA -22.222222 10.8910891 -18.8235294 1.1627907 -21.1267606 16.4705882 -2.4096386 8.7912088 1.0869565 -9.5238095 11.5789474
Connecticut NA -1.724138 9.3750000 -11.3043478 -21.0526316 11.2149533 -10.3092784 -7.7777778 -7.1428571 6.6666667 -5.8823529 -21.4285714
Delaware NA -3.960396 -22.4242424 -10.0000000 15.7303371 -26.2411348 3.4246575 -11.4503817 5.0724638 10.9677419 -39.6396396 0.8928571
District of Columbia NA -27.871940 -7.3450135 -27.2727273 -13.0940834 -18.6421174 -20.5270458 -14.9920255 -5.3781513 -12.2641509 -13.2478632 -10.9004739
Florida NA -15.170279 -11.3793103 -2.8368794 -6.0150376 -3.5019455 2.6515152 0.7518797 -0.7575758 -3.9370079 -2.0080321 -5.0632911
Georgia NA -10.081744 -11.2121212 -0.6097561 -2.5000000 -12.2807018 1.0416667 7.9872204 -5.3872054 1.3289037 -5.9859155 -2.8985507
Hawaii NA 2.564103 11.3636364 -31.3432836 4.2857143 13.5802469 2.4096386 15.3061224 -53.1250000 0.0000000 -6.6666667 -11.1111111
Idaho NA -2.564103 -8.3333333 -28.5714286 0.0000000 -47.3684211 -11.7647059 43.3333333 11.7647059 -3.0303030 -26.9230769 -36.8421053
Illinois NA -4.294479 -5.1612903 -2.6490066 1.9480519 -4.7619048 -2.7972028 0.6944444 -4.3478261 -8.6614173 0.7812500 -9.4017094
Indiana NA -1.149425 4.3956044 -4.5977011 3.3333333 -3.4482759 -2.3529412 26.0869565 -29.2134831 4.3010753 -2.1978022 -4.5977011
Iowa NA 24.000000 -11.1111111 -2.2727273 4.3478261 0.0000000 -27.7777778 25.0000000 5.8823529 -6.2500000 -9.0909091 -15.7894737
Kansas NA 9.090909 -13.7931034 0.0000000 9.3750000 -3.2258065 -12.7272727 15.3846154 -6.5573770 -22.0000000 23.0769231 -20.3703704
Kentucky NA -3.125000 -5.4945055 -8.3333333 15.1515152 -2.0618557 -4.3010753 -1.0869565 -1.0989011 7.1428571 2.9702970 -16.0919540
Louisiana NA 8.490566 -7.0707071 5.7142857 -17.5373134 8.8435374 5.7692308 -9.8591549 0.6993007 -10.8527132 -3.6144578 -9.2105263
Maine NA 20.000000 -4.1666667 -9.0909091 -2.3255814 -38.7096774 39.2156863 -24.3902439 10.8695652 -84.0000000 3.8461538 0.0000000
Maryland NA -21.590909 0.8450704 -22.8373702 -8.6466165 -1.9157088 -6.0975610 -5.5793991 -6.8807339 -7.9207921 -3.5897436 -8.3333333
Massachusetts NA -9.836066 3.1746032 -5.0000000 3.2258065 -5.9829060 -4.4642857 -8.7378641 5.5045872 -5.8252427 6.3636364 -22.2222222
Michigan NA 3.125000 -3.2258065 0.0000000 1.0638298 -4.4444444 3.2258065 -8.1395349 3.3707865 3.2608696 -8.2352941 -6.2500000
Minnesota NA 12.643678 -14.4736842 -15.1515152 7.0422535 -4.4117647 1.4492754 -6.1538462 0.0000000 -8.3333333 3.2258065 -6.8965517
Mississippi NA -4.878049 -9.6256684 12.6168224 -18.8888889 5.7591623 0.5208333 5.4187192 -17.3410405 0.0000000 9.8958333 0.0000000
Missouri NA -4.761905 7.8947368 -8.5714286 0.9433962 -15.2173913 0.0000000 -1.0989011 9.9009901 -3.0612245 -12.6436782 8.4210526
Montana NA 28.947368 -58.3333333 4.0000000 -4.1666667 7.6923077 -62.5000000 27.2727273 4.3478261 36.1111111 -38.4615385 7.1428571
Nebraska NA 10.810811 5.1282051 -50.0000000 1.8867925 -1.9230769 8.7719298 -14.0000000 -4.1666667 14.2857143 -12.0000000 1.9607843
Nevada NA -9.523810 -0.5988024 1.7647059 -6.2500000 14.4385027 -2.1857923 8.5000000 4.3062201 -5.0251256 -0.5050505 0.0000000
New Hampshire NA -11.764706 24.4444444 -25.0000000 14.2857143 -31.2500000 11.1111111 -63.6363636 35.2941176 -25.9259259 15.6250000 -23.0769231
New Jersey NA -3.664922 -3.8043478 -16.4556962 8.1395349 -5.5214724 1.8072289 -3.7500000 -0.6289308 -6.0000000 -10.2941176 3.5460993
New Mexico NA 4.081633 -11.3636364 -10.0000000 -19.4029851 17.2839506 -3.8461538 1.2658228 5.9523810 -3.7037037 -5.1948052 13.4831461
New York NA -10.276680 -5.8577406 -4.3668122 -7.5117371 -9.2307692 2.0100503 -8.7431694 -8.2840237 -3.0487805 -10.8108108 -4.9645390
North Carolina NA -9.313726 -12.7071823 -1.1173184 -17.7631579 1.9354839 1.2738854 0.6329114 3.0674847 -8.6666667 -10.2941176 11.6883117
North Dakota NA 8.333333 4.0000000 -8.6956522 -53.3333333 53.1250000 3.0303030 -3.1250000 56.7567568 -21.3114754 -5.1724138 9.3750000
Ohio NA -1.851852 -5.8823529 5.5555556 -2.8571429 1.8691589 -10.3092784 -2.1052632 3.0612245 2.0000000 -1.0101010 0.0000000
Oklahoma NA 2.061856 -4.3010753 7.0000000 -11.1111111 14.2857143 -10.5263158 3.0612245 -7.6923077 1.0869565 -6.9767442 12.2448980
Oregon NA -16.666667 -5.4054054 -1.3698630 10.9756098 -18.8405797 2.8169014 -7.5757576 0.0000000 -13.7931034 10.7692308 -18.1818182
Pennsylvania NA -7.792208 -12.4087591 -7.8740157 2.3076923 -10.1694915 -7.2727273 -1.8518519 -3.8461538 -2.9702970 -7.4468085 -4.4444444
Rhode Island NA -4.545454 -3.1250000 -17.4311927 -23.8636364 -4.7619048 14.2857143 -40.0000000 10.2564103 16.1290323 -13.4146341 -3.7974684
South Carolina NA 5.050505 0.0000000 -4.7619048 -7.3863636 0.5649718 5.8510638 -15.3374233 8.9385475 -7.1856287 -0.6024096 -6.4102564
South Dakota NA -16.666667 25.0000000 -71.4285714 24.3243243 19.5652174 -6.9767442 -26.4705882 44.2622951 -10.9090909 -37.5000000 11.1111111
Tennessee NA -9.039548 -9.9378882 -2.5477707 0.6329114 -11.2676056 -2.8985507 -3.7593985 -3.1007752 -0.7812500 2.2900763 2.2388060
Texas NA 2.262443 -0.4545455 -6.2801932 -0.4854369 -1.4778325 0.0000000 0.4901961 -2.0000000 -5.2631579 0.0000000 -4.3956044
Utah NA -10.526316 -46.1538462 22.0000000 9.0909091 -12.2448980 2.0000000 5.6603774 10.1694915 -25.5319149 4.0816327 7.5471698
Vermont NA -6.250000 13.5135135 -54.1666667 7.6923077 -8.3333333 22.5806452 -19.2307692 -188.8888889 75.6756757 -12.1212121 -65.0000000
Virginia NA -9.395973 -0.6756757 -12.1212121 2.9411765 0.0000000 -5.4263566 5.1470588 -6.2500000 -5.7851240 -0.8333333 -5.2631579
Washington NA -2.127660 2.0833333 -14.2857143 2.3255814 -13.1578947 -1.3333333 0.0000000 -7.1428571 -1.4492754 12.6582278 -5.3333333
West Virginia NA -12.765957 -2.1739130 14.8148148 -8.0000000 -6.3829787 11.3207547 -17.7777778 -4.6511628 14.0000000 7.4074074 43.1578947
Wisconsin NA 15.254237 -11.3207547 -6.0000000 -11.1111111 11.7647059 -13.3333333 2.1739130 2.1276596 11.3207547 -26.1904762 2.3255814
Wyoming NA -11.111111 -9.7560976 -28.1250000 -113.3333333 54.5454545 -57.1428571 40.0000000 20.4545455 -109.5238095 16.0000000 7.4074074
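
The percentages above print with up to seven decimal places, which makes the table hard to scan. As a minimal sketch (not the code used to build the table), the values could be rounded before display. The object names `wide` and `rounded` are hypothetical, and the two rows reuse values from the table above.

```r
library(dplyr)

# Hypothetical two-row excerpt of the wide percent table (values copied from above).
wide <- tibble(
  State  = c("Alabama", "Alaska"),
  `2009` = c(-2.824859, -89.189189),
  `2010` = c(-3.5087719, 42.1875)
)

# Round every year column to one decimal place; State is left untouched.
rounded <- wide %>%
  mutate(across(-State, ~ round(.x, 1)))

rounded
```

Alternatively, the `digits` argument of `kbl()` rounds at display time without changing the underlying data.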

Visualize Data

Plotting these data can help show how the epidemic has evolved over time. To see the overall trend, the first two figures plot cases and rates by state and year. With one line per state, these are busy graphs.

Show code
# visualization of trend in cases by state from 2008-2021
Figure.1a<- HIV.State%>%
  ggplot(mapping=aes(Year,Cases))+
  geom_point(position = "jitter")+
  geom_line(mapping = aes(color= State))+
  theme(legend.position = "none")+
  labs(y = "Cases", 
       title = "Figure 1a: HIV Cases by State",
       subtitle = "2008 - 2021"
  )

Figure.1a
Show code
# visualization of trend in rate by state from 2008-2019
Figure.1b<- HIV.State%>%
  ggplot(mapping = aes(Year,Rate))+
  geom_line(mapping = aes(Year,Rate,color = State))+
  theme(legend.position = "none")+
  labs(y = "Rate per 100k",
       title = "Figure 1b: HIV Incidence Rate by State",
       subtitle = "2008-2019")

Figure.1b

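To see state-specific trends, the following figures use facet wrapping, with one panel per state, and color each point by a logical condition (whether the year-over-year difference is greater than zero). This shows the years in which cases or rates increased or decreased in each state.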

Show code
Figure.2a<- Lag.HIV.Case%>%
  group_by(Year)%>%
  ggplot(mapping = aes(Year,Case.Diff))+
  geom_point(aes(color = Case.Diff>0), position = "jitter")+
  facet_wrap(~State, scales = "free")+
  labs(y = "Lagging Cases",
       title = "Figure 2a: State HIV Case Difference by Lagging Year")+
  guides(x = guide_axis(angle = 90))

Figure.2b <- lag.HIV.Rate%>%
  group_by(Year)%>%
  ggplot(mapping = aes(Year,Rate.Diff))+
  geom_point(aes(color = Rate.Diff>0), position = "jitter")+
  facet_wrap(~State,scales = "free")+
  labs(y= "Lagging Rate",
       title = "Figure 2b: State HIV Incidence Rate Difference by Lagging Year")+
  guides(x = guide_axis(angle = 90))
Figure 2a
Figure 2b

Figures 2a and 2b allow us to follow the epidemic over the years by state. To summarize the change across states, we look at the proportion of states with a positive or negative year-over-year change in each year, using the following bar graphs.
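
The same summary can also be computed as numbers. The following is a minimal sketch on made-up data, assuming a year-over-year difference column like `Rate.Diff` and a logical flag like `Increase.Rate`; the toy values and the names `toy_diff` and `share_by_year` are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: year-over-year rate differences for four states.
toy_diff <- tibble(
  State     = rep(c("A", "B", "C", "D"), times = 2),
  Year      = rep(c(2009, 2010), each = 4),
  Rate.Diff = c(1.2, -0.5, -0.3, 0.0, -1.0, 0.4, 0.7, 2.1)
)

share_by_year <- toy_diff %>%
  mutate(Increase.Rate = Rate.Diff > 0) %>%   # TRUE when the rate went up
  group_by(Year) %>%
  summarise(
    States           = n(),                   # states observed that year
    Increasing       = sum(Increase.Rate),    # count of states with an increase
    Share.Increasing = mean(Increase.Rate)    # proportion of states with an increase
  )

share_by_year
```

In a bar chart, `geom_bar(position = "fill")` would scale each year's bar to 1 so that the bars show shares rather than counts.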

Show code
Figure.3a <- Lag.HIV.Case%>%
  select(Year,State,Increase.Case)%>%
  group_by(Year,State)%>%
  arrange(Year)%>%
  ggplot()+
  geom_bar(mapping = aes(x=Year, fill = Increase.Case)
  )+
  labs(title = "Proportion of States with Increasing or Decreasing Cases by Year")

Figure.3a
Show code
Figure.3b <- lag.HIV.Rate%>%
  select(Year,State,Increase.Rate)%>%
  group_by(Year,State)%>%
  arrange(Year)%>%
  ggplot()+
  geom_bar(mapping = aes(x=Year, fill=Increase.Rate))+
  labs(title = "Proportion of States with Increasing or Decreasing Rates by Year")
  
Figure.3b

Reflection

I took this class because, at the time, there was limited exposure to R offered in the Master of Public Health program. As a data-driven nurse practitioner who worked in community medicine and has since moved over to the science side of the pharmaceutical industry, I wanted to learn R. In medicine, there is a significant push for real-world data, which more often than not are obtained from large data repositories, analyzed, and then published.

The process for this project started very slowly. I looked at several data sets, including the data sets provided by the class, but I could not settle on a data set or a combination of data sets. As I progressed through the tutorials, I felt I understood the concepts they taught. When I started to dive into different data sets, I found out how challenging wrangling and tidying data could be. After talking with some of my colleagues who are also HIV specialists, I settled on looking at HIV incidence data. We talked about the “epidemic of the south” and how we could help. Is it possible for us to go south and provide quality HIV and primary care, and would it even matter? I looked at this data set hoping to identify whether HIV is genuinely an epidemic only in the southern states or whether we also see concerning rates in other states. I also pulled health care GDP data, as I was curious whether there is an inverse relationship between a state’s health care GDP and its HIV rates.

The project started with difficulty. Although the data were tidy overall, contending with 714 unique rows covering fifty states plus the District of Columbia over thirteen years was much more complicated than expected. To move forward with the final project, I decided to look only at the HIV incidence data and no longer attempt to look at a correlation with health care GDP by state. Finding the story this data set could tell was difficult. Pivoting wider by year to look at year-over-year data by state yielded a table filled with numbers that still did not convey much visually. The same held when graphing trends in cases and rates by state, year over year: there was so much going on that outliers obscured the visualizations. Solving these issues took many trials. I needed numbers that told a story to move forward, so I looked at the numerical and percentage change in cases, year over year, by state using the lag() function. This took time, as the numbers didn’t add up until I correctly grouped and arranged the data. Next, I wanted to visualize this in a way that told a story. I started by creating a table in which the font color for each positive observation is red (red because a positive change in rate/percentage is not good) and all other observations are black. Learning kable was also more difficult than I expected. It taught me how important self-learning will be in applying R to my future work and research.
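
The grouping and arranging step described above can be sketched as follows. This is a minimal illustration on made-up data, not the project's actual code: the toy values and the names `toy` and `toy_lag` are hypothetical, and the percent change here is taken relative to the prior year's rate, which is an assumption and may differ from the calculation behind the tables in this report.

```r
library(dplyr)

# Hypothetical toy data: two states, three years, deliberately unsorted.
toy <- tibble(
  State = c("A", "B", "A", "B", "A", "B"),
  Year  = c(2009, 2008, 2008, 2010, 2010, 2009),
  Rate  = c(9, 20, 10, 22, 12, 25)
)

toy_lag <- toy %>%
  arrange(State, Year) %>%   # lag() follows row order, so sort by year first
  group_by(State) %>%        # keep each state's series separate
  mutate(
    Rate.Diff         = Rate - lag(Rate),            # change from the prior year
    Rate.Diff.Percent = 100 * Rate.Diff / lag(Rate)  # assumed: relative to prior year
  ) %>%
  ungroup()

toy_lag
```

Without `group_by(State)`, the first year of one state would be compared with the last year of the previous state, and without `arrange()`, the differences would follow whatever order the rows happened to be in.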

Conclusion

These data confirm that there has been a substantial decrease in incident HIV infections since 2008. They affirm that the disease burden is driven by large numbers of incident cases in the southern states. We also see that states in other regions have had increases in HIV cases/rates over the past few years, although at a much lower level. HIV in these states is likely different from HIV in the U.S. South. In contrast to the socioeconomic factors and health care access issues affecting HIV treatment and prevention in the U.S. South (CDC, 2019), the few infections seen in non-southern states may be related to an HIV cluster; for example, the HIV cluster seen in Indiana in 2015 (Goodnough, 2015).

I am still curious whether there is an inverse relationship, or any relationship at all, between health care GDP and the incidence of HIV by state. The next step I would take is to work with the health care GDP data and see if there is an association with incident HIV infection by state.

References

Centers for Disease Control and Prevention. NCHHSTP AtlasPlus. Updated 2017. https://www.cdc.gov/nchhstp/atlas/index.htm. Accessed February 2022.

Centers for Disease Control and Prevention. HIV in the southern United States: CDC Issue Brief. 2016; https://www.cdc.gov/hiv/pdf/policies/cdc-hiv-in-the-south-issue-brief.pdf. Accessed 10 Aug 2019.

El-Sadr, W. M., Mayer, K. H., Rabkin, M., & Hodder, S. L. (2019). AIDS in America—back in the headlines at long last. New England Journal of Medicine, 380(21), 1985-1987.

Goodnough, A. (2015). Rural Indiana struggles to contend with HIV outbreak. New York Times, 5.

R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/

U.S. Census Bureau. Census Regions and Divisions of the United States. Available at: https://www2.census.gov/geo/pdfs/maps-data/maps/reference/us_regdiv.pdf. Accessed May 2022.

Wickham, H., ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Wickham, H. et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

Wickham, H., François, R., Henry, L., and Müller, K. (2021). dplyr: A Grammar of Data Manipulation. R package version 1.0.7. https://CRAN.R-project.org/package=dplyr

Wickham, H., & Grolemund, G. (2016). R for data science: Visualize, model, transform, tidy, and import data. O'Reilly Media.

Zhu, H (2021). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.3.4. https://CRAN.R-project.org/package=kableExtra

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Nguyen (2022, May 19). Data Analytics and Computational Social Science: Final Project: Henry Nguyen. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomhenryfnp901430/

BibTeX citation

@misc{nguyen2022final,
  author = {Nguyen, Henry},
  title = {Data Analytics and Computational Social Science: Final Project: Henry Nguyen},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomhenryfnp901430/},
  year = {2022}
}