Statistical Analysis of Indian Education System
This project analyzes seven dataframes covering 2013-2016 on the Indian education system, extracted from the Indian Government's data portal, data.gov.in. The main objective of the project is to study how school facilities (access to computers, sanitation and electricity) affect the dropout ratio at each level (primary, upper primary, secondary and higher secondary) in all states of the country. I also aim to analyze the correlations among all the dataframes linked to the dropout ratio.
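Loading the packages
library(readr)  # provides read_csv(), used to load every CSV below
library(dplyr)  # provides select(), used in the tidying step
Reading Dataframe-1 | Gross Enrollment Ratio from 2013-2016 across all Indian States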
gross_enrollment_ratio <- read_csv("601 Major Project/gross-enrollment-ratio.csv")
View(gross_enrollment_ratio)
head(gross_enrollment_ratio)
# A tibble: 6 x 14
State_UT Year Primary_Boys Primary_Girls Primary_Total
<chr> <chr> <dbl> <dbl> <dbl>
1 Andaman & Nicobar Is~ 2013~ 95.9 92.0 93.9
2 Andhra Pradesh 2013~ 96.6 96.9 96.7
3 Arunachal Pradesh 2013~ 129. 128. 128.
4 Assam 2013~ 112. 115. 113.
5 Bihar 2013~ 95.0 101. 98.0
6 Chandigarh 2013~ 88.4 96.1 91.8
# ... with 9 more variables: Upper_Primary_Boys <dbl>,
# Upper_Primary_Girls <dbl>, Upper_Primary_Total <dbl>,
# Secondary_Boys <dbl>, Secondary_Girls <dbl>,
# Secondary_Total <dbl>, Higher_Secondary_Boys <chr>,
# Higher_Secondary_Girls <chr>, Higher_Secondary_Total <chr>
colnames(gross_enrollment_ratio)
[1] "State_UT" "Year"
[3] "Primary_Boys" "Primary_Girls"
[5] "Primary_Total" "Upper_Primary_Boys"
[7] "Upper_Primary_Girls" "Upper_Primary_Total"
[9] "Secondary_Boys" "Secondary_Girls"
[11] "Secondary_Total" "Higher_Secondary_Boys"
[13] "Higher_Secondary_Girls" "Higher_Secondary_Total"
Datatypes of each column
str(gross_enrollment_ratio)
spec_tbl_df [110 x 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ State_UT : chr [1:110] "Andaman & Nicobar Islands" "Andhra Pradesh" "Arunachal Pradesh" "Assam" ...
$ Year : chr [1:110] "2013-14" "2013-14" "2013-14" "2013-14" ...
$ Primary_Boys : num [1:110] 95.9 96.6 129.1 111.8 95 ...
$ Primary_Girls : num [1:110] 92 96.9 127.8 115.2 101.2 ...
$ Primary_Total : num [1:110] 93.9 96.7 128.5 113.4 98 ...
$ Upper_Primary_Boys : num [1:110] 94.7 82.8 112.6 87.8 80.6 ...
$ Upper_Primary_Girls : num [1:110] 89 84.4 115.3 98.7 94.9 ...
$ Upper_Primary_Total : num [1:110] 91.8 83.6 113.9 93.1 87.2 ...
$ Secondary_Boys : num [1:110] 102.9 73.8 88.4 65.6 57.7 ...
$ Secondary_Girls : num [1:110] 97.4 76.8 84.9 77.2 63 ...
$ Secondary_Total : num [1:110] 100.2 75.2 86.7 71.2 60.1 ...
$ Higher_Secondary_Boys : chr [1:110] "105.4" "59.83" "65.16" "31.78" ...
$ Higher_Secondary_Girls: chr [1:110] "96.61" "60.83" "65.38" "34.27" ...
$ Higher_Secondary_Total: chr [1:110] "101.28" "60.3" "65.27" "32.94" ...
- attr(*, "spec")=
.. cols(
.. State_UT = col_character(),
.. Year = col_character(),
.. Primary_Boys = col_double(),
.. Primary_Girls = col_double(),
.. Primary_Total = col_double(),
.. Upper_Primary_Boys = col_double(),
.. Upper_Primary_Girls = col_double(),
.. Upper_Primary_Total = col_double(),
.. Secondary_Boys = col_double(),
.. Secondary_Girls = col_double(),
.. Secondary_Total = col_double(),
.. Higher_Secondary_Boys = col_character(),
.. Higher_Secondary_Girls = col_character(),
.. Higher_Secondary_Total = col_character()
.. )
- attr(*, "problems")=<externalptr>
Tidying the data
# Recode the placeholder strings "NR" and "@" as missing values
gross_enrollment_ratio[ gross_enrollment_ratio == "NR" ] <- NA
gross_enrollment_ratio[ gross_enrollment_ratio == "@" ] <- NA
select(gross_enrollment_ratio, 'Higher_Secondary_Boys', 'Higher_Secondary_Girls', 'Higher_Secondary_Total')
# A tibble: 110 x 3
Higher_Secondary_Boys Higher_Secondary_Girls Higher_Secondary_Total
<chr> <chr> <chr>
1 105.4 96.61 101.28
2 59.83 60.83 60.3
3 65.16 65.38 65.27
4 31.78 34.27 32.94
5 23.33 24.17 23.7
6 90.5 92.88 91.49
7 58.27 56.16 57.23
8 37.77 41.99 39.64
9 34.37 64.55 44.36
10 98.88 102.3 100.42
# ... with 100 more rows
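The output above shows that the three Higher_Secondary columns are still of type character after the placeholders are recoded. A minimal sketch of one way to finish the conversion in base R follows. The data frame `ger` is a toy stand-in for gross_enrollment_ratio, its state labels are invented, and the placement of "NR" and "@" in it is assumed for illustration only.

```r
# Toy stand-in for gross_enrollment_ratio (labels and placeholder positions are illustrative).
ger <- data.frame(
  State_UT = c("State A", "State B", "State C"),
  Higher_Secondary_Boys = c("105.4", "NR", "@"),
  Higher_Secondary_Girls = c("96.61", "60.83", "NR"),
  stringsAsFactors = FALSE
)

# Pick out every Higher_Secondary_* column by name.
hs_cols <- grep("^Higher_Secondary_", names(ger), value = TRUE)

# Recode the placeholders as NA, then convert character to numeric.
ger[hs_cols] <- lapply(ger[hs_cols], function(x) {
  x[x %in% c("NR", "@")] <- NA
  as.numeric(x)
})

str(ger)
```

Recoding the placeholders explicitly before calling as.numeric() avoids coercion warnings and makes any unexpected non-numeric string surface as a warning instead of silently becoming NA.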
Reading Dataframe-2 | Dropout Ratio across all Indian States from 2013-2016
dropout_ratio <- read_csv("601 Major Project/dropout-ratio.csv")
View(dropout_ratio)
head(dropout_ratio)
# A tibble: 6 x 14
State_UT year Primary_Boys Primary_Girls Primary_Total
<chr> <chr> <chr> <chr> <chr>
1 A & N Islands 2012-13 0.83 0.51 0.68
2 A & N Islands 2013-14 1.35 1.06 1.21
3 A & N Islands 2014-15 0.47 0.55 0.51
4 Andhra Pradesh 2012-13 3.3 3.05 3.18
5 Andhra Pradesh 2013-14 4.31 4.39 4.35
6 Andhra Pradesh 2014-15 6.57 6.89 6.72
# ... with 9 more variables: `Upper Primary_Boys` <chr>,
# `Upper Primary_Girls` <chr>, `Upper Primary_Total` <chr>,
# `Secondary _Boys` <chr>, `Secondary _Girls` <chr>,
# `Secondary _Total` <chr>, HrSecondary_Boys <chr>,
# HrSecondary_Girls <chr>, HrSecondary_Total <chr>
colnames(dropout_ratio)
[1] "State_UT" "year" "Primary_Boys"
[4] "Primary_Girls" "Primary_Total" "Upper Primary_Boys"
[7] "Upper Primary_Girls" "Upper Primary_Total" "Secondary _Boys"
[10] "Secondary _Girls" "Secondary _Total" "HrSecondary_Boys"
[13] "HrSecondary_Girls" "HrSecondary_Total"
Datatype of each column
str(dropout_ratio)
spec_tbl_df [110 x 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ State_UT : chr [1:110] "A & N Islands" "A & N Islands" "A & N Islands" "Andhra Pradesh" ...
$ year : chr [1:110] "2012-13" "2013-14" "2014-15" "2012-13" ...
$ Primary_Boys : chr [1:110] "0.83" "1.35" "0.47" "3.3" ...
$ Primary_Girls : chr [1:110] "0.51" "1.06" "0.55" "3.05" ...
$ Primary_Total : chr [1:110] "0.68" "1.21" "0.51" "3.18" ...
$ Upper Primary_Boys : chr [1:110] "Uppe_r_Primary" "NR" "1.44" "3.21" ...
$ Upper Primary_Girls: chr [1:110] "1.09" "1.54" "1.95" "3.51" ...
$ Upper Primary_Total: chr [1:110] "1.23" "0.51" "1.69" "3.36" ...
$ Secondary _Boys : chr [1:110] "5.57" "8.36" "11.47" "12.21" ...
$ Secondary _Girls : chr [1:110] "5.55" "5.98" "8.16" "13.25" ...
$ Secondary _Total : chr [1:110] "5.56" "7.2" "9.87" "12.72" ...
$ HrSecondary_Boys : chr [1:110] "17.66" "18.94" "21.05" "2.66" ...
$ HrSecondary_Girls : chr [1:110] "10.15" "12.2" "12.21" "NR" ...
$ HrSecondary_Total : chr [1:110] "14.14" "15.87" "16.93" "0.35" ...
- attr(*, "spec")=
.. cols(
.. State_UT = col_character(),
.. year = col_character(),
.. Primary_Boys = col_character(),
.. Primary_Girls = col_character(),
.. Primary_Total = col_character(),
.. `Upper Primary_Boys` = col_character(),
.. `Upper Primary_Girls` = col_character(),
.. `Upper Primary_Total` = col_character(),
.. `Secondary _Boys` = col_character(),
.. `Secondary _Girls` = col_character(),
.. `Secondary _Total` = col_character(),
.. HrSecondary_Boys = col_character(),
.. HrSecondary_Girls = col_character(),
.. HrSecondary_Total = col_character()
.. )
- attr(*, "problems")=<externalptr>
Tidying the data
# Recode "NR" and stray header text that leaked into the data as missing values
dropout_ratio[ dropout_ratio == "NR" ] <- NA
dropout_ratio[ dropout_ratio == "Upper Primary_Boys" ] <- NA
dropout_ratio[ dropout_ratio == "Uppe_r_Primary" ] <- NA
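Two issues remain in this dataframe according to the str() output: several column names contain spaces (for example `Secondary _Boys`), and every measure is stored as character. The sketch below shows one possible clean-up in base R. The data frame `dropout` is a small stand-in built from a few values visible in the output above, and the rule used to normalise the names is my own assumption, not something the source prescribes.

```r
# Small stand-in for dropout_ratio, using column names and values seen in the output above.
dropout <- data.frame(
  State_UT = c("A & N Islands", "A & N Islands", "Andhra Pradesh"),
  year = c("2012-13", "2013-14", "2012-13"),
  `Upper Primary_Boys` = c("Uppe_r_Primary", "NR", "3.21"),
  `Secondary _Boys` = c("5.57", "8.36", "12.21"),
  HrSecondary_Girls = c("10.15", "12.2", "NR"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Collapse any run of spaces and underscores into a single underscore.
names(dropout) <- gsub("[ _]+", "_", names(dropout))

# Recode the known placeholders, then convert every measure column to numeric.
value_cols <- setdiff(names(dropout), c("State_UT", "year"))
dropout[value_cols] <- lapply(dropout[value_cols], function(x) {
  x[x %in% c("NR", "Uppe_r_Primary")] <- NA
  as.numeric(x)
})

str(dropout)
```

After this step the columns can be used in arithmetic and in cor(), which would otherwise fail on character input.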
Reading Dataframe-3 | Percentage of Schools with access to computers
percentage_of_schools_with_comps <- read_csv("601 Major Project/percentage-of-schools-with-comps.csv")
View(percentage_of_schools_with_comps)
head(percentage_of_schools_with_comps)
# A tibble: 6 x 13
State_UT year Primary_Only Primary_with_U_~ Primary_with_U_~
<chr> <chr> <dbl> <dbl> <dbl>
1 Andaman & Nico~ 2013~ 30.4 73.7 89.7
2 Andaman & Nico~ 2014~ 30.9 76.5 92.1
3 Andaman & Nico~ 2015~ 28.4 78.6 92.5
4 Andhra Pradesh 2013~ 12.7 42.7 87.0
5 Andhra Pradesh 2014~ 10.3 44.2 88.5
6 Andhra Pradesh 2015~ 11.5 44.8 89.5
# ... with 8 more variables: U_Primary_Only <dbl>,
# U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
# U_Primary_With_Sec <dbl>, Sec_Only <dbl>, Sec_with_HrSec. <dbl>,
# HrSec_Only <dbl>, `All Schools` <dbl>
colnames(percentage_of_schools_with_comps)
[1] "State_UT"
[2] "year"
[3] "Primary_Only"
[4] "Primary_with_U_Primary"
[5] "Primary_with_U_Primary_Sec_HrSec"
[6] "U_Primary_Only"
[7] "U_Primary_With_Sec_HrSec"
[8] "Primary_with_U_Primary_Sec"
[9] "U_Primary_With_Sec"
[10] "Sec_Only"
[11] "Sec_with_HrSec."
[12] "HrSec_Only"
[13] "All Schools"
Datatype of each column
str(percentage_of_schools_with_comps)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ State_UT : chr [1:110] "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andhra Pradesh" ...
$ year : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
$ Primary_Only : num [1:110] 30.4 30.9 28.4 12.7 10.3 ...
$ Primary_with_U_Primary : num [1:110] 73.7 76.5 78.6 42.7 44.1 ...
$ Primary_with_U_Primary_Sec_HrSec: num [1:110] 89.7 92.1 92.5 87 88.5 ...
$ U_Primary_Only : num [1:110] 0 100 0 45.5 50 ...
$ U_Primary_With_Sec_HrSec : num [1:110] 100 94.7 94.7 17.1 62.2 ...
$ Primary_with_U_Primary_Sec : num [1:110] 97.9 100 100 68.2 68.4 ...
$ U_Primary_With_Sec : num [1:110] 0 0 0 73.2 76.6 ...
$ Sec_Only : num [1:110] 0 0 0 60 71 ...
$ Sec_with_HrSec. : num [1:110] 100 100 100 33.3 66.7 ...
$ HrSec_Only : num [1:110] 0 0 0 19.3 41.6 ...
$ All Schools : num [1:110] 53.1 57.2 57 29.6 28.1 ...
- attr(*, "spec")=
.. cols(
.. State_UT = col_character(),
.. year = col_character(),
.. Primary_Only = col_double(),
.. Primary_with_U_Primary = col_double(),
.. Primary_with_U_Primary_Sec_HrSec = col_double(),
.. U_Primary_Only = col_double(),
.. U_Primary_With_Sec_HrSec = col_double(),
.. Primary_with_U_Primary_Sec = col_double(),
.. U_Primary_With_Sec = col_double(),
.. Sec_Only = col_double(),
.. Sec_with_HrSec. = col_double(),
.. HrSec_Only = col_double(),
.. `All Schools` = col_double()
.. )
- attr(*, "problems")=<externalptr>
Reading Dataframe-4 | Percentage of Schools with Electricity
percentage_of_schools_with_electricity <- read_csv("601 Major Project/percentage-of-schools-with-electricity.csv")
View(percentage_of_schools_with_electricity)
head(percentage_of_schools_with_electricity)
# A tibble: 6 x 13
State_UT year Primary_Only Primary_with_U_~ Primary_with_U_~
<chr> <chr> <dbl> <dbl> <dbl>
1 Andaman & Nico~ 2013~ 82.4 96.0 100
2 Andaman & Nico~ 2014~ 80.7 96.3 100
3 Andaman & Nico~ 2015~ 82.1 97.6 100
4 Andhra Pradesh 2013~ 87.7 93.6 99.3
5 Andhra Pradesh 2014~ 91.1 94.7 100
6 Andhra Pradesh 2015~ 91.6 95.6 100
# ... with 8 more variables: U_Primary_Only <dbl>,
# U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
# U_Primary_With_Sec <dbl>, Sec_Only <dbl>, Sec_with_HrSec. <dbl>,
# HrSec_Only <dbl>, `All Schools` <dbl>
colnames(percentage_of_schools_with_electricity)
[1] "State_UT"
[2] "year"
[3] "Primary_Only"
[4] "Primary_with_U_Primary"
[5] "Primary_with_U_Primary_Sec_HrSec"
[6] "U_Primary_Only"
[7] "U_Primary_With_Sec_HrSec"
[8] "Primary_with_U_Primary_Sec"
[9] "U_Primary_With_Sec"
[10] "Sec_Only"
[11] "Sec_with_HrSec."
[12] "HrSec_Only"
[13] "All Schools"
Datatype of each column
str(percentage_of_schools_with_electricity)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ State_UT : chr [1:110] "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andhra Pradesh" ...
$ year : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
$ Primary_Only : num [1:110] 82.4 80.7 82.1 87.7 91.1 ...
$ Primary_with_U_Primary : num [1:110] 96 96.3 97.6 93.6 94.7 ...
$ Primary_with_U_Primary_Sec_HrSec: num [1:110] 100 100 100 99.3 100 ...
$ U_Primary_Only : num [1:110] 0 100 0 100 100 ...
$ U_Primary_With_Sec_HrSec : num [1:110] 100 100 100 67.5 86.1 ...
$ Primary_with_U_Primary_Sec : num [1:110] 100 100 100 96.2 97.6 ...
$ U_Primary_With_Sec : num [1:110] 0 0 0 96.2 97.1 ...
$ Sec_Only : num [1:110] 0 0 0 97.5 93.5 ...
$ Sec_with_HrSec. : num [1:110] 100 100 100 100 83.3 ...
$ HrSec_Only : num [1:110] 0 0 0 91.3 93.2 ...
$ All Schools : num [1:110] 88.9 88.9 90.1 90.3 92.8 ...
- attr(*, "spec")=
.. cols(
.. State_UT = col_character(),
.. year = col_character(),
.. Primary_Only = col_double(),
.. Primary_with_U_Primary = col_double(),
.. Primary_with_U_Primary_Sec_HrSec = col_double(),
.. U_Primary_Only = col_double(),
.. U_Primary_With_Sec_HrSec = col_double(),
.. Primary_with_U_Primary_Sec = col_double(),
.. U_Primary_With_Sec = col_double(),
.. Sec_Only = col_double(),
.. Sec_with_HrSec. = col_double(),
.. HrSec_Only = col_double(),
.. `All Schools` = col_double()
.. )
- attr(*, "problems")=<externalptr>
Reading Dataframe-5 | Percentage of Schools with water facility
percentage_of_schools_with_water_facility <- read_csv("601 Major Project/percentage-of-schools-with-water-facility.csv")
View(percentage_of_schools_with_water_facility)
head(percentage_of_schools_with_water_facility)
# A tibble: 6 x 13
`State/UT` Year Primary_Only Primary_with_U_~ Primary_with_U_~
<chr> <chr> <dbl> <dbl> <dbl>
1 Andaman & Nico~ 2013~ 98.2 98.7 100
2 Andaman & Nico~ 2014~ 99.6 98.8 100
3 Andaman & Nico~ 2015~ 100 100 100
4 Andhra Pradesh 2013~ 86.9 94.5 99.7
5 Andhra Pradesh 2014~ 91.8 96.1 100
6 Andhra Pradesh 2015~ 93.9 97.0 100
# ... with 8 more variables: U_Primary_Only <dbl>,
# U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
# U_Primary_With_Sec <dbl>, Sec_Only <dbl>, Sec_with_HrSec. <dbl>,
# HrSec_Only <dbl>, `All Schools` <dbl>
colnames(percentage_of_schools_with_water_facility)
[1] "State/UT"
[2] "Year"
[3] "Primary_Only"
[4] "Primary_with_U_Primary"
[5] "Primary_with_U_Primary_Sec_HrSec"
[6] "U_Primary_Only"
[7] "U_Primary_With_Sec_HrSec"
[8] "Primary_with_U_Primary_Sec"
[9] "U_Primary_With_Sec"
[10] "Sec_Only"
[11] "Sec_with_HrSec."
[12] "HrSec_Only"
[13] "All Schools"
Datatype of each column
str(percentage_of_schools_with_water_facility)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ State/UT : chr [1:110] "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andhra Pradesh" ...
$ Year : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
$ Primary_Only : num [1:110] 98.2 99.5 100 86.9 91.8 ...
$ Primary_with_U_Primary : num [1:110] 98.7 98.8 100 94.5 96.1 ...
$ Primary_with_U_Primary_Sec_HrSec: num [1:110] 100 100 100 99.7 100 ...
$ U_Primary_Only : num [1:110] 0 100 0 90.9 100 ...
$ U_Primary_With_Sec_HrSec : num [1:110] 100 100 100 87.3 90 ...
$ Primary_with_U_Primary_Sec : num [1:110] 100 100 100 98.8 99.6 ...
$ U_Primary_With_Sec : num [1:110] 0 0 0 96 97.5 ...
$ Sec_Only : num [1:110] 0 0 0 97.5 100 100 0 0 0 88.3 ...
$ Sec_with_HrSec. : num [1:110] 100 100 100 100 100 ...
$ HrSec_Only : num [1:110] 0 0 0 97.5 98.4 ...
$ All Schools : num [1:110] 98.7 99.5 100 90.3 93.7 ...
- attr(*, "spec")=
.. cols(
.. `State/UT` = col_character(),
.. Year = col_character(),
.. Primary_Only = col_double(),
.. Primary_with_U_Primary = col_double(),
.. Primary_with_U_Primary_Sec_HrSec = col_double(),
.. U_Primary_Only = col_double(),
.. U_Primary_With_Sec_HrSec = col_double(),
.. Primary_with_U_Primary_Sec = col_double(),
.. U_Primary_With_Sec = col_double(),
.. Sec_Only = col_double(),
.. Sec_with_HrSec. = col_double(),
.. HrSec_Only = col_double(),
.. `All Schools` = col_double()
.. )
- attr(*, "problems")=<externalptr>
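Unlike the other facility tables, this one names its key columns `State/UT` and `Year` instead of State_UT and year. If the tables are later combined, the keys probably need to match. The following is a minimal sketch of one way to align them. `standardise_keys` is a hypothetical helper introduced here for illustration, and the two one-row data frames are stand-ins built from the first rows shown above.

```r
# One-row stand-ins with the key column names seen in the outputs above.
water <- data.frame(
  `State/UT` = "Andaman & Nicobar Islands",
  Year = "2013-14",
  Primary_Only = 98.2,
  check.names = FALSE
)
electricity <- data.frame(
  State_UT = "Andaman & Nicobar Islands",
  year = "2013-14",
  Primary_Only = 82.4
)

# Hypothetical helper: rename whichever key spelling is present to State_UT / year.
standardise_keys <- function(df) {
  names(df)[names(df) %in% c("State/UT", "State_UT")] <- "State_UT"
  names(df)[names(df) %in% c("Year", "year")] <- "year"
  df
}

water <- standardise_keys(water)
electricity <- standardise_keys(electricity)

names(water)
names(electricity)
```

The same helper could be applied to the gross enrollment table, which uses State_UT together with a capitalised Year column.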
Reading Dataframe-6 | Percentage of Schools with boys' toilets
schools_with_boys_toilet <- read_csv("601 Major Project/schools-with-boys-toilet.csv")
View(schools_with_boys_toilet)
head(schools_with_boys_toilet)
# A tibble: 6 x 13
State_UT year Primary_Only Primary_with_U_~ Primary_with_U_~
<chr> <chr> <dbl> <dbl> <dbl>
1 Andaman & Nico~ 2013~ 91.6 97.4 100
2 Andaman & Nico~ 2014~ 100 100 100
3 Andaman & Nico~ 2015~ 100 100 100
4 Andhra Pradesh 2013~ 53.0 62.6 82.0
5 Andhra Pradesh 2014~ 57.9 76.5 96
6 Andhra Pradesh 2015~ 99.6 99.9 99.0
# ... with 8 more variables: U_Primary_Only <dbl>,
# U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
# U_Primary_With_Sec <dbl>, Sec_Only <dbl>, Sec_with_HrSec. <dbl>,
# HrSec_Only <dbl>, `All Schools` <dbl>
colnames(schools_with_boys_toilet)
[1] "State_UT"
[2] "year"
[3] "Primary_Only"
[4] "Primary_with_U_Primary"
[5] "Primary_with_U_Primary_Sec_HrSec"
[6] "U_Primary_Only"
[7] "U_Primary_With_Sec_HrSec"
[8] "Primary_with_U_Primary_Sec"
[9] "U_Primary_With_Sec"
[10] "Sec_Only"
[11] "Sec_with_HrSec."
[12] "HrSec_Only"
[13] "All Schools"
Datatype of each column
str(schools_with_boys_toilet)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ State_UT : chr [1:110] "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andhra Pradesh" ...
$ year : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
$ Primary_Only : num [1:110] 91.6 100 100 53 57.9 ...
$ Primary_with_U_Primary : num [1:110] 97.4 100 100 62.6 76.5 ...
$ Primary_with_U_Primary_Sec_HrSec: num [1:110] 100 100 100 82 96 ...
$ U_Primary_Only : num [1:110] 0 100 0 45.5 75 ...
$ U_Primary_With_Sec_HrSec : num [1:110] 100 100 100 64.1 93.3 ...
$ Primary_with_U_Primary_Sec : num [1:110] 100 100 100 76.2 91.4 ...
$ U_Primary_With_Sec : num [1:110] 0 0 0 60.6 78 ...
$ Sec_Only : num [1:110] 0 0 0 59.3 80.7 ...
$ Sec_with_HrSec. : num [1:110] 100 100 100 85.7 60 ...
$ HrSec_Only : num [1:110] 0 0 0 73.4 86.5 ...
$ All Schools : num [1:110] 94.5 100 100 56.9 65.3 ...
- attr(*, "spec")=
.. cols(
.. State_UT = col_character(),
.. year = col_character(),
.. Primary_Only = col_double(),
.. Primary_with_U_Primary = col_double(),
.. Primary_with_U_Primary_Sec_HrSec = col_double(),
.. U_Primary_Only = col_double(),
.. U_Primary_With_Sec_HrSec = col_double(),
.. Primary_with_U_Primary_Sec = col_double(),
.. U_Primary_With_Sec = col_double(),
.. Sec_Only = col_double(),
.. Sec_with_HrSec. = col_double(),
.. HrSec_Only = col_double(),
.. `All Schools` = col_double()
.. )
- attr(*, "problems")=<externalptr>
Reading Dataframe-7 | Percentage of Schools with girls' toilets
schools_with_girls_toilet <- read_csv("601 Major Project/schools-with-girls-toilet.csv")
View(schools_with_girls_toilet)
head(schools_with_girls_toilet)
# A tibble: 6 x 13
State_UT year Primary_Only Primary_with_U_~ Primary_with_U_~
<chr> <chr> <dbl> <dbl> <dbl>
1 All India 2013~ 88.7 96.0 98.8
2 All India 2014~ 91.2 96.9 99.5
3 All India 2015~ 97.0 99.0 99.7
4 Andaman & Nico~ 2013~ 89.7 97.4 100
5 Andaman & Nico~ 2014~ 100 100 100
6 Andaman & Nico~ 2015~ 100 100 100
# ... with 8 more variables: U_Primary_Only <dbl>,
# U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
# U_Primary_With_Sec <dbl>, Sec_Only <dbl>, Sec_with_HrSec. <dbl>,
# HrSec_Only <dbl>, `All Schools` <dbl>
colnames(schools_with_girls_toilet)
[1] "State_UT"
[2] "year"
[3] "Primary_Only"
[4] "Primary_with_U_Primary"
[5] "Primary_with_U_Primary_Sec_HrSec"
[6] "U_Primary_Only"
[7] "U_Primary_With_Sec_HrSec"
[8] "Primary_with_U_Primary_Sec"
[9] "U_Primary_With_Sec"
[10] "Sec_Only"
[11] "Sec_with_HrSec."
[12] "HrSec_Only"
[13] "All Schools"
Datatype of each column
str(schools_with_girls_toilet)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ State_UT : chr [1:110] "All India" "All India" "All India" "Andaman & Nicobar Islands" ...
$ year : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
$ Primary_Only : num [1:110] 88.7 91.2 97 89.7 100 ...
$ Primary_with_U_Primary : num [1:110] 96 96.9 99 97.4 100 ...
$ Primary_with_U_Primary_Sec_HrSec: num [1:110] 98.8 99.5 99.7 100 100 ...
$ U_Primary_Only : num [1:110] 91.4 91.4 96.3 0 100 ...
$ U_Primary_With_Sec_HrSec : num [1:110] 98.2 99.2 99.6 100 100 ...
$ Primary_with_U_Primary_Sec : num [1:110] 97.3 98.2 99.3 100 100 ...
$ U_Primary_With_Sec : num [1:110] 94.4 96.6 98.8 0 0 ...
$ Sec_Only : num [1:110] 99.1 90.3 95.2 0 0 ...
$ Sec_with_HrSec. : num [1:110] 98.4 94 98.3 100 100 ...
$ HrSec_Only : num [1:110] 76.1 90.9 96.2 0 0 ...
$ All Schools : num [1:110] 91.2 93.1 97.5 93.4 100 ...
- attr(*, "spec")=
.. cols(
.. State_UT = col_character(),
.. year = col_character(),
.. Primary_Only = col_double(),
.. Primary_with_U_Primary = col_double(),
.. Primary_with_U_Primary_Sec_HrSec = col_double(),
.. U_Primary_Only = col_double(),
.. U_Primary_With_Sec_HrSec = col_double(),
.. Primary_with_U_Primary_Sec = col_double(),
.. U_Primary_With_Sec = col_double(),
.. Sec_Only = col_double(),
.. Sec_with_HrSec. = col_double(),
.. HrSec_Only = col_double(),
.. `All Schools` = col_double()
.. )
- attr(*, "problems")=<externalptr>
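With all seven dataframes loaded, the stated objective (relating the dropout ratio to school facilities) requires joining tables on state and year. The sketch below illustrates the mechanics in base R using four state-year rows copied from the outputs above. Pairing the dropout Primary_Total column with the computers Primary_Only column is my assumption about which columns correspond, and four rows from two states only demonstrate the steps, not a result.

```r
# Primary dropout ratio: rows copied from the dropout_ratio output above.
dropout <- data.frame(
  State_UT = c("A & N Islands", "A & N Islands", "Andhra Pradesh", "Andhra Pradesh"),
  year = c("2013-14", "2014-15", "2013-14", "2014-15"),
  Primary_Total = c(1.21, 0.51, 4.35, 6.72)
)

# Percentage of primary-only schools with computers: rows copied from the output above.
comps <- data.frame(
  State_UT = c("Andaman & Nicobar Islands", "Andaman & Nicobar Islands",
               "Andhra Pradesh", "Andhra Pradesh"),
  year = c("2013-14", "2014-15", "2013-14", "2014-15"),
  Primary_Only = c(30.4, 30.9, 12.7, 10.3)
)

# The two sources spell this territory differently, so harmonise before joining.
dropout$State_UT[dropout$State_UT == "A & N Islands"] <- "Andaman & Nicobar Islands"

# Inner join on state and year; only state-years present in both tables survive.
merged <- merge(dropout, comps, by = c("State_UT", "year"))

# Pearson correlation between dropout ratio and computer access.
r <- cor(merged$Primary_Total, merged$Primary_Only, use = "complete.obs")
r
```

Two details from the outputs above matter for the real join: the dropout table starts at 2012-13 while the facility tables start at 2013-14, so an inner join keeps only the overlapping years, and the girls' toilet table contains "All India" aggregate rows that would likely need to be dropped first.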
Preliminary Research Questions:
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Pola (2022, May 19). Data Analytics and Computational Social Science: HW-3 | Major Project Dataset and Preliminary Research Questions. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomniharika896286/
BibTeX citation
@misc{pola2022hw-3,
  author = {Pola, Niharika},
  title = {Data Analytics and Computational Social Science: HW-3 | Major Project Dataset and Preliminary Research Questions},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomniharika896286/},
  year = {2022}
}