Impact on the Indian Education System | A deep-dive data analysis of student Enrollment and Dropout ratios

Data Science Fundamental - Final Paper | Exploratory Data Analysis of the Indian Education System with special focus on the available facilities

Niharika Pola
2022-05-12

INTRODUCTION

India is a country of 28 states and 8 Union Territories (UTs). Universal and compulsory education is a cherished dream of the Republic of India. Primary education is a basic right of every citizen, yet a recent study on school dropouts stated that 62% of students drop out every year, which is raising concern in the country. Education enriches people’s understanding of their own country and the world. It improves the quality of life and promotes creativity, productivity and entrepreneurship, leading to the economic development of the country. As the nation with the second-largest population in the world, India needs to give its citizens access to quality education.

This motivated me to analyze the Indian Education System, specifically its enrollment and dropout data. I found 7 data sets on data.gov.in, the official data website run by the Indian government. As this is my first project in the field of data analysis, I wanted to work on a topic I have always been passionate about: education. I am very proud of and happy with the findings of my study.

In this project I have worked on 7 data sets related to the Indian Education System from 2013-2016. The first two data sets cover the Gross Enrollment Ratio and Dropout Ratio; the remaining 5 cover the availability of basic facilities (Water, Electricity, Boys' and Girls' Toilets, and Computers) in schools. Every data set has State/Union Territory, Year and Percentage data across the various levels of education: Primary, Upper Primary, Secondary and Higher Secondary.

Lower Primary/Primary - Nursery to Class 1st

Upper Primary - Class 1st to 5th

Secondary - Class 6th to 8th

Higher Secondary - Class 9th and 10th

The aim of this project is to perform Exploratory Data Analysis (EDA) of the 7 data sets to:

  1. Analyze the Gross Enrollment Ratio and Dropout Ratio in the above-mentioned classes, all over India and across states.

  2. Compare the states with the lowest dropout ratio against the available-facilities data sets.

  3. Find out the impact of the non-availability of these facilities on the dropout ratio.

  4. Analyze the trends of the available-facilities data sets across India.

and to provide a few recommendations to the Indian Government based on the analysis.

Loading the packages

Dataset-1 | Gross Enrollment Ratio from 2013-2016 across all Indian States

Gross Enrollment Ratio (GER), or Gross Enrollment Index (GEI), is a statistical measure used in the education sector to describe enrollment at different grade levels (such as elementary, middle and high school). It is the ratio of the number of students enrolled at a given level, regardless of age, to the population of the official age group for that level.

The GER can be over 100% as it includes students who may be older or younger than the official age group.

For instance, in India it improved from 25.8 to 26.3. The GER includes students who are repeating a grade, those who enrolled late and are older than their classmates, and those who have advanced quickly and are younger than their classmates. This allows total enrollment to exceed the population that corresponds to that level of education.

Calculation Method

a = number of students enrolled in a given level of education

b = population of the age group that corresponds to the given level of education

GER = a / b × 100
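As a quick check of the formula, here is a minimal sketch in R. The enrollment and population figures are made up for illustration and do not come from the data sets, and the helper name `compute_ger` is hypothetical.

```r
# Hypothetical helper: GER as a percentage, following GER = a / b * 100
compute_ger <- function(enrolled, age_group_population) {
  stopifnot(all(age_group_population > 0))
  enrolled / age_group_population * 100
}

# Made-up figures for illustration only
enrolled   <- c(primary = 1285, secondary = 752)   # a: students enrolled at the level
population <- c(primary = 1000, secondary = 1000)  # b: official age-group population

ger_example <- compute_ger(enrolled, population)
print(ger_example)
```

In this toy case the primary value comes out above 100 because more students are enrolled than there are children in the official age group, which is the over-age and under-age effect described above.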

Reading Dataset-1

gross_enrollment_ratio <- read_csv("601 Major Project/gross-enrollment-ratio.csv")
dim(gross_enrollment_ratio)
[1] 110  14
head(gross_enrollment_ratio)
# A tibble: 6 x 14
  State_UT              Year  Primary_Boys Primary_Girls Primary_Total
  <chr>                 <chr>        <dbl>         <dbl>         <dbl>
1 Andaman & Nicobar Is~ 2013~         95.9          92.0          93.9
2 Andhra Pradesh        2013~         96.6          96.9          96.7
3 Arunachal Pradesh     2013~        129.          128.          128. 
4 Assam                 2013~        112.          115.          113. 
5 Bihar                 2013~         95.0         101.           98.0
6 Chandigarh            2013~         88.4          96.1          91.8
# ... with 9 more variables: Upper_Primary_Boys <dbl>,
#   Upper_Primary_Girls <dbl>, Upper_Primary_Total <dbl>,
#   Secondary_Boys <dbl>, Secondary_Girls <dbl>,
#   Secondary_Total <dbl>, Higher_Secondary_Boys <chr>,
#   Higher_Secondary_Girls <chr>, Higher_Secondary_Total <chr>
colnames(gross_enrollment_ratio)
 [1] "State_UT"               "Year"                  
 [3] "Primary_Boys"           "Primary_Girls"         
 [5] "Primary_Total"          "Upper_Primary_Boys"    
 [7] "Upper_Primary_Girls"    "Upper_Primary_Total"   
 [9] "Secondary_Boys"         "Secondary_Girls"       
[11] "Secondary_Total"        "Higher_Secondary_Boys" 
[13] "Higher_Secondary_Girls" "Higher_Secondary_Total"
str(gross_enrollment_ratio)
spec_tbl_df [110 x 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ State_UT              : chr [1:110] "Andaman & Nicobar Islands" "Andhra Pradesh" "Arunachal Pradesh" "Assam" ...
 $ Year                  : chr [1:110] "2013-14" "2013-14" "2013-14" "2013-14" ...
 $ Primary_Boys          : num [1:110] 95.9 96.6 129.1 111.8 95 ...
 $ Primary_Girls         : num [1:110] 92 96.9 127.8 115.2 101.2 ...
 $ Primary_Total         : num [1:110] 93.9 96.7 128.5 113.4 98 ...
 $ Upper_Primary_Boys    : num [1:110] 94.7 82.8 112.6 87.8 80.6 ...
 $ Upper_Primary_Girls   : num [1:110] 89 84.4 115.3 98.7 94.9 ...
 $ Upper_Primary_Total   : num [1:110] 91.8 83.6 113.9 93.1 87.2 ...
 $ Secondary_Boys        : num [1:110] 102.9 73.8 88.4 65.6 57.7 ...
 $ Secondary_Girls       : num [1:110] 97.4 76.8 84.9 77.2 63 ...
 $ Secondary_Total       : num [1:110] 100.2 75.2 86.7 71.2 60.1 ...
 $ Higher_Secondary_Boys : chr [1:110] "105.4" "59.83" "65.16" "31.78" ...
 $ Higher_Secondary_Girls: chr [1:110] "96.61" "60.83" "65.38" "34.27" ...
 $ Higher_Secondary_Total: chr [1:110] "101.28" "60.3" "65.27" "32.94" ...
 - attr(*, "spec")=
  .. cols(
  ..   State_UT = col_character(),
  ..   Year = col_character(),
  ..   Primary_Boys = col_double(),
  ..   Primary_Girls = col_double(),
  ..   Primary_Total = col_double(),
  ..   Upper_Primary_Boys = col_double(),
  ..   Upper_Primary_Girls = col_double(),
  ..   Upper_Primary_Total = col_double(),
  ..   Secondary_Boys = col_double(),
  ..   Secondary_Girls = col_double(),
  ..   Secondary_Total = col_double(),
  ..   Higher_Secondary_Boys = col_character(),
  ..   Higher_Secondary_Girls = col_character(),
  ..   Higher_Secondary_Total = col_character()
  .. )
 - attr(*, "problems")=<externalptr> 

As the output shows, 3 columns (Higher_Secondary_Boys, Higher_Secondary_Girls, Higher_Secondary_Total) were read as character instead of double because some observations contain the placeholders NR and @. The data needs to be cleaned.
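One way to find out which placeholder strings are hiding in a character column is to attempt the numeric conversion and list whatever fails. The sketch below uses a small made-up vector rather than the real column.

```r
# Toy column mimicking Higher_Secondary_Total; values are illustrative
higher_secondary_total <- c("101.28", "60.3", "NR", "32.94", "@")

# as.numeric() yields NA (with a warning) for anything that is not a number
converted <- suppressWarnings(as.numeric(higher_secondary_total))

# Tokens that were present in the raw data but failed to convert
bad_tokens <- unique(
  higher_secondary_total[is.na(converted) & !is.na(higher_secondary_total)]
)
print(bad_tokens)
```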

Data Wrangling

# Replace the placeholder strings "NR" and "@" with NA
gross_enrollment_ratio[ gross_enrollment_ratio == "NR" ] <- NA
gross_enrollment_ratio[ gross_enrollment_ratio == "@" ] <- NA
ger1 <- data.frame(gross_enrollment_ratio)
# Drop every row that has a missing value in any column
ger <- na.exclude(ger1)
# The three Higher Secondary columns were read as text; convert them to numbers
ger$Higher_Secondary_Boys <- as.numeric(ger$Higher_Secondary_Boys)
ger$Higher_Secondary_Girls <- as.numeric(ger$Higher_Secondary_Girls)
ger$Higher_Secondary_Total <- as.numeric(ger$Higher_Secondary_Total)

str(ger)
'data.frame':   108 obs. of  14 variables:
 $ State_UT              : chr  "Andaman & Nicobar Islands" "Andhra Pradesh" "Arunachal Pradesh" "Assam" ...
 $ Year                  : chr  "2013-14" "2013-14" "2013-14" "2013-14" ...
 $ Primary_Boys          : num  95.9 96.6 129.1 111.8 95 ...
 $ Primary_Girls         : num  92 96.9 127.8 115.2 101.2 ...
 $ Primary_Total         : num  93.9 96.7 128.5 113.4 98 ...
 $ Upper_Primary_Boys    : num  94.7 82.8 112.6 87.8 80.6 ...
 $ Upper_Primary_Girls   : num  89 84.4 115.3 98.7 94.9 ...
 $ Upper_Primary_Total   : num  91.8 83.6 113.9 93.1 87.2 ...
 $ Secondary_Boys        : num  102.9 73.8 88.4 65.6 57.7 ...
 $ Secondary_Girls       : num  97.4 76.8 84.9 77.2 63 ...
 $ Secondary_Total       : num  100.2 75.2 86.7 71.2 60.1 ...
 $ Higher_Secondary_Boys : num  105.4 59.8 65.2 31.8 23.3 ...
 $ Higher_Secondary_Girls: num  96.6 60.8 65.4 34.3 24.2 ...
 $ Higher_Secondary_Total: num  101.3 60.3 65.3 32.9 23.7 ...
 - attr(*, "na.action")= 'exclude' Named int [1:2] 26 99
  ..- attr(*, "names")= chr [1:2] "26" "99"
all_india_ger <- filter(ger,  State_UT=="All India") %>% 
  arrange(Year)
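The wrangling step above replaces NR and @ after loading and then converts the columns by hand. An alternative, sketched here on a small inline CSV rather than the real file, is to declare the placeholders as missing values when reading, so that readr parses the columns as numeric from the start. The state names and values in the toy CSV are illustrative.

```r
library(readr)

# Inline CSV standing in for the real file; rows are illustrative
csv_text <- "State_UT,Year,Higher_Secondary_Total
Example State A,2013-14,60.3
Example State B,2013-14,NR
Example State C,2013-14,@"

# Treat the placeholder strings as missing while parsing
ger_read_demo <- read_csv(
  I(csv_text),
  na = c("", "NA", "NR", "@"),
  show_col_types = FALSE
)

str(ger_read_demo$Higher_Secondary_Total)
```

This keeps the affected rows in the table with NA in the relevant cells; whether to drop them afterwards remains a separate decision.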

Plotting All India girls enrollment ratio:

all_india_ger_girls <- select(all_india_ger,Year, ends_with("girls")) 
head(all_india_ger_girls)
     Year Primary_Girls Upper_Primary_Girls Secondary_Girls
1 2013-14        102.65               92.75           76.47
2 2014-15        101.43               95.29           78.94
3 2015-16        100.69               97.57           80.97
  Higher_Secondary_Girls
1                  51.58
2                  53.81
3                  56.41
  fig1 <- pivot_longer(all_india_ger_girls, c(Primary_Girls, Upper_Primary_Girls, Secondary_Girls, Higher_Secondary_Girls), names_to = "Education_Level", values_to = "GER") 
  ggplot(fig1, aes(x=Year, y=GER, fill=Education_Level)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-1: Gross Enrollment Ratio of Girls in India") +  geom_text(aes(label=GER), size = 4, position = position_dodge(width = .9), vjust = 0, color = "black") + theme_classic()

Findings:

Plotting All India boys enrollment ratio:

all_india_ger_boys <- select(all_india_ger, Year, ends_with("boys"))
head(all_india_ger_boys)
     Year Primary_Boys Upper_Primary_Boys Secondary_Boys
1 2013-14       100.20              86.31          76.80
2 2014-15        98.85              87.71          78.13
3 2015-16        97.87              88.72          79.16
  Higher_Secondary_Boys
1                 52.77
2                 54.57
3                 55.95
  fig2 <- pivot_longer(all_india_ger_boys, c(Primary_Boys, Upper_Primary_Boys, Secondary_Boys, Higher_Secondary_Boys), names_to = "Education_Level", values_to = "GER") 
  ggplot(fig2, aes(x=Year, y=GER, fill=Education_Level)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-2: Gross Enrollment Ratio of boys in India") + geom_text(aes(label=GER), size = 4, position = position_dodge(width = .9), vjust = 0, color = "black") + theme_classic()

Findings:

Plotting All India total enrollment ratio:

fig3 <- pivot_longer(all_india_ger, c(Primary_Boys, Primary_Girls, Primary_Total, Upper_Primary_Boys, Upper_Primary_Girls, Upper_Primary_Total, Secondary_Boys, Secondary_Girls, Secondary_Total, Higher_Secondary_Girls, Higher_Secondary_Boys, Higher_Secondary_Total), names_to = "Education_Level")
ggplot(fig3, aes(x=value, y=Education_Level)) + geom_boxplot(color="red") + geom_text(aes(label=value), size=4, vjust=0) + labs(title = "Fig-3: All India Enrollment Ratio - Education Wise ", x="Enrollment Percentage", y="Education Level") + facet_wrap(~Year) 

Finding:


State-Wise Gross Enrollment Analysis

states_ger <- filter(ger,  State_UT != "All India") 
head(states_ger)
                   State_UT    Year Primary_Boys Primary_Girls
1 Andaman & Nicobar Islands 2013-14        95.88         91.97
2            Andhra Pradesh 2013-14        96.62         96.87
3         Arunachal Pradesh 2013-14       129.12        127.77
4                     Assam 2013-14       111.77        115.16
5                     Bihar 2013-14        95.03        101.15
6                Chandigarh 2013-14        88.42         96.09
  Primary_Total Upper_Primary_Boys Upper_Primary_Girls
1         93.93              94.70               88.98
2         96.74              82.81               84.38
3        128.46             112.64              115.27
4        113.43              87.85               98.69
5         97.96              80.60               94.92
6         91.85              99.93              103.02
  Upper_Primary_Total Secondary_Boys Secondary_Girls Secondary_Total
1               91.83         102.89           97.36          100.16
2               83.57          73.76           76.77           75.20
3              113.94          88.37           84.89           86.65
4               93.13          65.60           77.20           71.21
5               87.24          57.66           62.96           60.08
6              101.27          92.08           92.16           92.11
  Higher_Secondary_Boys Higher_Secondary_Girls Higher_Secondary_Total
1                105.40                  96.61                 101.28
2                 59.83                  60.83                  60.30
3                 65.16                  65.38                  65.27
4                 31.78                  34.27                  32.94
5                 23.33                  24.17                  23.70
6                 90.50                  92.88                  91.49

I used a Google Maps API key to fetch the map of India and the latitude and longitude coordinates of the states, then merged the coordinates with my existing data set.

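library(ggmap)
# Read the API key from an environment variable instead of hard-coding it
# (the variable name GOOGLE_MAPS_API_KEY is an assumption)
register_google(key = Sys.getenv("GOOGLE_MAPS_API_KEY"))
# Look up longitude and latitude for each state name
coordinates <- geocode(states_ger$State_UT)
# states_ger and coordinates share no column name, so merge() returns every pairing of their rows
plot <- merge(states_ger,coordinates)
head(plot)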
                   State_UT    Year Primary_Boys Primary_Girls
1 Andaman & Nicobar Islands 2013-14        95.88         91.97
2            Andhra Pradesh 2013-14        96.62         96.87
3         Arunachal Pradesh 2013-14       129.12        127.77
4                     Assam 2013-14       111.77        115.16
5                     Bihar 2013-14        95.03        101.15
6                Chandigarh 2013-14        88.42         96.09
  Primary_Total Upper_Primary_Boys Upper_Primary_Girls
1         93.93              94.70               88.98
2         96.74              82.81               84.38
3        128.46             112.64              115.27
4        113.43              87.85               98.69
5         97.96              80.60               94.92
6         91.85              99.93              103.02
  Upper_Primary_Total Secondary_Boys Secondary_Girls Secondary_Total
1               91.83         102.89           97.36          100.16
2               83.57          73.76           76.77           75.20
3              113.94          88.37           84.89           86.65
4               93.13          65.60           77.20           71.21
5               87.24          57.66           62.96           60.08
6              101.27          92.08           92.16           92.11
  Higher_Secondary_Boys Higher_Secondary_Girls Higher_Secondary_Total
1                105.40                  96.61                 101.28
2                 59.83                  60.83                  60.30
3                 65.16                  65.38                  65.27
4                 31.78                  34.27                  32.94
5                 23.33                  24.17                  23.70
6                 90.50                  92.88                  91.49
       lon      lat
1 92.65864 11.74009
2 92.65864 11.74009
3 92.65864 11.74009
4 92.65864 11.74009
5 92.65864 11.74009
6 92.65864 11.74009
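Every row in the output above carries the same lon/lat pair. That is what base R's `merge()` produces when the two data frames have no column name in common: it returns the Cartesian product of their rows. The sketch below reproduces the effect on toy data (state names and coordinates are made up) and shows a row-wise alternative, which assumes the coordinates are in the same order as the rows they were looked up from.

```r
# Toy stand-ins for states_ger and the geocode() result; coordinates are made up
states_demo <- data.frame(
  State_UT = c("State A", "State B", "State C"),
  Primary_Total = c(93.9, 96.7, 128.5)
)
coords_demo <- data.frame(lon = c(10, 20, 30), lat = c(1, 2, 3))

# No shared column names: merge() pairs every row with every row (3 x 3 = 9 rows)
crossed <- merge(states_demo, coords_demo)
print(nrow(crossed))

# Row-wise binding keeps one coordinate pair per input row,
# assuming coords_demo is in the same order as states_demo
aligned <- cbind(states_demo, coords_demo)
print(aligned)
```

A further option would be to geocode each distinct state name once and then join the result back on State_UT, which also avoids repeated lookups for the same state across years.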

The map below is a terrain-style map of India. I wanted to show my data on a choropleth map; however, as far as I could find, RStudio has ready-made choropleth maps for the world and the USA but not for other countries, and ggmap supports only a few map types (“terrain”, “satellite”, “hybrid” and “roadmap”), not choropleth. I feel this is a drawback of RStudio as well as ggmap.

map <- get_map(location = 'India', zoom = 5, maptype= 'terrain', scale = "auto")

Plotting the Education Level - wise GER data in the map

# alpha is a fixed setting, so it is passed outside aes() to keep it out of the legend
ggmap(map) + geom_point(data=plot, aes(x=lon, y=lat, size=Primary_Boys, colour=Primary_Boys), alpha=0.5)
ggmap(map) + geom_point(data=plot, aes(x=lon, y=lat, size=Upper_Primary_Boys, colour=Upper_Primary_Boys), alpha=0.5)
ggmap(map) + geom_point(data=plot, aes(x=lon, y=lat, size=Secondary_Boys, colour=Secondary_Boys), alpha=0.5)
ggmap(map) + geom_point(data=plot, aes(x=lon, y=lat, size=Higher_Secondary_Boys, colour=Higher_Secondary_Boys), alpha=0.5)

ggmap(map) + geom_point(data=plot, aes(x=lon, y=lat, size=Primary_Girls, colour=Primary_Girls), alpha=0.5)
ggmap(map) + geom_point(data=plot, aes(x=lon, y=lat, size=Upper_Primary_Girls, colour=Upper_Primary_Girls), alpha=0.5)
ggmap(map) + geom_point(data=plot, aes(x=lon, y=lat, size=Secondary_Girls, colour=Secondary_Girls), alpha=0.5)
ggmap(map) + geom_point(data=plot, aes(x=lon, y=lat, size=Higher_Secondary_Girls, colour=Higher_Secondary_Girls), alpha=0.5)

Findings from the maps:

The analysis would have been much clearer if the map were a choropleth; I take this as scope for improvement in my next projects.


states_ger %>% 
  select(Year, State_UT, Primary_Boys,Upper_Primary_Boys, Secondary_Boys, Higher_Secondary_Boys) %>% 
  group_by(Year) %>% 
  summarise(avg_pb=mean(Primary_Boys), avg_upb=mean(Upper_Primary_Boys), avg_sb=mean(Secondary_Boys), avg_hsb=mean(Higher_Secondary_Boys)) 
# A tibble: 3 x 5
  Year    avg_pb avg_upb avg_sb avg_hsb
  <chr>    <dbl>   <dbl>  <dbl>   <dbl>
1 2013-14   105.    96.9   87.2    60.0
2 2014-15   102.    97.0   88.0    60.4
3 2015-16   100.    98.1   86.9    58.2
states_ger %>% 
  select(Year, Primary_Girls,Upper_Primary_Girls, Secondary_Girls, Higher_Secondary_Girls) %>% 
  group_by(Year) %>% 
  summarise(avg_pg=mean(Primary_Girls), avg_upg=mean(Upper_Primary_Girls), avg_sg=mean(Secondary_Girls), avg_hsg=mean(Higher_Secondary_Girls))
# A tibble: 3 x 5
  Year    avg_pg avg_upg avg_sg avg_hsg
  <chr>    <dbl>   <dbl>  <dbl>   <dbl>
1 2013-14   106.    99.8   88.0    60.5
2 2014-15   103.   102.    89.6    62.2
3 2015-16   101.   104.    89.4    61.8
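The per-year averages above list each column by hand. A more compact form, sketched here on toy data with the same column naming pattern, uses dplyr's `across()` to average every matching column in one call. The values are illustrative, not taken from the data set.

```r
library(dplyr)

# Toy data with the same column pattern as states_ger; values are illustrative
ger_toy <- tibble(
  Year = c("2013-14", "2013-14", "2014-15", "2014-15"),
  Primary_Girls = c(100, 110, 98, 104),
  Secondary_Girls = c(80, 90, 82, 92)
)

# Average every *_Girls column per year in one call
girls_avg <- ger_toy %>%
  group_by(Year) %>%
  summarise(across(ends_with("_Girls"), mean, .names = "avg_{.col}"))

print(girls_avg)
```

Note that a mean taken across states is unweighted: each state counts equally regardless of its population, so it need not match the All India figures.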

Findings:

My further analysis focuses on the dropout percentage and on whether there is any correlation between Gross Enrollment and Dropout.
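A correlation between the two measures requires the enrollment and dropout tables to be matched row by row first. The sketch below shows one way to do that on toy data: join on state and year, then compute a Pearson correlation. All names and values here are made up for illustration.

```r
# Toy tables; every value and state name here is made up for illustration
ger_toy <- data.frame(
  state = c("State A", "State B", "State C", "State D"),
  year = "2013-14",
  primary_ger = c(95, 100, 110, 120)
)
drop_toy <- data.frame(
  state = c("State A", "State B", "State C", "State D"),
  year = "2013-14",
  primary_dropout = c(2, 4, 9, 12)
)

# Match rows on state and year before comparing the two measures
joined <- merge(ger_toy, drop_toy, by = c("state", "year"))

# Pearson correlation between enrollment and dropout at the primary level
r <- cor(joined$primary_ger, joined$primary_dropout)
print(round(r, 3))
```

With the real files the keys would need harmonising first: the enrollment data spells the state as "Andaman & Nicobar Islands" while the dropout data uses "A & N Islands", and the two cover different sets of years.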

Dataset-2 | Dropout Ratio/Percentage across all Indian States from 2012-2015

There are varying definitions of Dropout Ratio on the web, so I will keep it simple here. A dropout is any student who leaves school for any reason before graduation or completion of a program of studies without transferring to another school; the Dropout Ratio is the share of students at a given level who do so, expressed as a percentage.

Reading Dataset-2

dropout_ratio <- read_csv("601 Major Project/dropout-ratio.csv")
head(dropout_ratio)
# A tibble: 6 x 14
  State_UT       year    Primary_Boys Primary_Girls Primary_Total
  <chr>          <chr>   <chr>        <chr>         <chr>        
1 A & N Islands  2012-13 0.83         0.51          0.68         
2 A & N Islands  2013-14 1.35         1.06          1.21         
3 A & N Islands  2014-15 0.47         0.55          0.51         
4 Andhra Pradesh 2012-13 3.3          3.05          3.18         
5 Andhra Pradesh 2013-14 4.31         4.39          4.35         
6 Andhra Pradesh 2014-15 6.57         6.89          6.72         
# ... with 9 more variables: `Upper Primary_Boys` <chr>,
#   `Upper Primary_Girls` <chr>, `Upper Primary_Total` <chr>,
#   `Secondary _Boys` <chr>, `Secondary _Girls` <chr>,
#   `Secondary _Total` <chr>, HrSecondary_Boys <chr>,
#   HrSecondary_Girls <chr>, HrSecondary_Total <chr>
colnames(dropout_ratio)
 [1] "State_UT"            "year"                "Primary_Boys"       
 [4] "Primary_Girls"       "Primary_Total"       "Upper Primary_Boys" 
 [7] "Upper Primary_Girls" "Upper Primary_Total" "Secondary _Boys"    
[10] "Secondary _Girls"    "Secondary _Total"    "HrSecondary_Boys"   
[13] "HrSecondary_Girls"   "HrSecondary_Total"  

Datatype of each column

str(dropout_ratio)
spec_tbl_df [110 x 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ State_UT           : chr [1:110] "A & N Islands" "A & N Islands" "A & N Islands" "Andhra Pradesh" ...
 $ year               : chr [1:110] "2012-13" "2013-14" "2014-15" "2012-13" ...
 $ Primary_Boys       : chr [1:110] "0.83" "1.35" "0.47" "3.3" ...
 $ Primary_Girls      : chr [1:110] "0.51" "1.06" "0.55" "3.05" ...
 $ Primary_Total      : chr [1:110] "0.68" "1.21" "0.51" "3.18" ...
 $ Upper Primary_Boys : chr [1:110] "Uppe_r_Primary" "NR" "1.44" "3.21" ...
 $ Upper Primary_Girls: chr [1:110] "1.09" "1.54" "1.95" "3.51" ...
 $ Upper Primary_Total: chr [1:110] "1.23" "0.51" "1.69" "3.36" ...
 $ Secondary _Boys    : chr [1:110] "5.57" "8.36" "11.47" "12.21" ...
 $ Secondary _Girls   : chr [1:110] "5.55" "5.98" "8.16" "13.25" ...
 $ Secondary _Total   : chr [1:110] "5.56" "7.2" "9.87" "12.72" ...
 $ HrSecondary_Boys   : chr [1:110] "17.66" "18.94" "21.05" "2.66" ...
 $ HrSecondary_Girls  : chr [1:110] "10.15" "12.2" "12.21" "NR" ...
 $ HrSecondary_Total  : chr [1:110] "14.14" "15.87" "16.93" "0.35" ...
 - attr(*, "spec")=
  .. cols(
  ..   State_UT = col_character(),
  ..   year = col_character(),
  ..   Primary_Boys = col_character(),
  ..   Primary_Girls = col_character(),
  ..   Primary_Total = col_character(),
  ..   `Upper Primary_Boys` = col_character(),
  ..   `Upper Primary_Girls` = col_character(),
  ..   `Upper Primary_Total` = col_character(),
  ..   `Secondary _Boys` = col_character(),
  ..   `Secondary _Girls` = col_character(),
  ..   `Secondary _Total` = col_character(),
  ..   HrSecondary_Boys = col_character(),
  ..   HrSecondary_Girls = col_character(),
  ..   HrSecondary_Total = col_character()
  .. )
 - attr(*, "problems")=<externalptr> 

Data Wrangling

library(janitor)
dropout_ratio <- clean_names(dropout_ratio)
dim(dropout_ratio)
[1] 110  14
dropout_ratio[ dropout_ratio == "NR" ] <- NA
#dropout_ratio[ dropout_ratio == "upper_primary_boys" ] <- NA
dropout_ratio[ dropout_ratio == "Uppe_r_Primary" ] <- NA
dropout_ratio <- data.frame(dropout_ratio)
dropout_ratio <- na.exclude(dropout_ratio)
dim(dropout_ratio)
[1] 55 14
# Convert every measurement column (all but state_ut and year) from character to numeric
num_cols <- setdiff(names(dropout_ratio), c("state_ut", "year"))
dropout_ratio[num_cols] <- lapply(dropout_ratio[num_cols], as.numeric)
str(dropout_ratio)
'data.frame':   55 obs. of  14 variables:
 $ state_ut           : chr  "A & N Islands" "Andhra Pradesh" "Arunachal  Pradesh" "Arunachal Pradesh" ...
 $ year               : chr  "2014-15" "2013-14" "2013-14" "2012-13" ...
 $ primary_boys       : num  0.47 4.31 11.54 15.84 11.51 ...
 $ primary_girls      : num  0.55 4.39 10.22 14.44 10.09 ...
 $ primary_total      : num  0.51 4.35 10.89 15.16 10.82 ...
 $ upper_primary_boys : num  1.44 3.46 4.44 5.86 5.31 7.89 7.6 6.47 3.31 3.7 ...
 $ upper_primary_girls: num  1.95 4.12 6.74 9.06 8.08 6.55 6.54 5.22 5.09 4.4 ...
 $ upper_primary_total: num  1.69 3.78 5.59 7.47 6.71 7.2 7.05 5.85 4.13 4.02 ...
 $ secondary_boys     : num  11.5 11.9 16.1 14 18.3 ...
 $ secondary_girls    : num  8.16 13.37 12.75 11.77 15.81 ...
 $ secondary_total    : num  9.87 12.65 14.49 12.93 17.11 ...
 $ hr_secondary_boys  : num  21.05 12.65 18.57 7.85 19.37 ...
 $ hr_secondary_girls : num  12.21 10.85 15.49 2.14 17.44 ...
 $ hr_secondary_total : num  16.93 11.79 17.07 5.11 18.42 ...
 - attr(*, "na.action")= 'exclude' Named int [1:55] 1 2 4 6 12 13 14 15 16 17 ...
  ..- attr(*, "names")= chr [1:55] "1" "2" "4" "6" ...
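Dropping incomplete rows reduced the table from 110 to 55 observations, because a row is removed as soon as any one of its columns is missing. Before doing that, it can help to see where the missing values sit. The sketch below, on toy data with illustrative values, counts missing values per column and contrasts a row-wise drop with a per-column summary that skips missing values.

```r
# Toy data frame; values are illustrative
drop_na_demo <- data.frame(
  state_ut = c("State A", "State B", "State C", "State D"),
  primary_total = c(0.68, NA, 4.35, 6.72),
  hr_secondary_total = c(14.14, 15.87, NA, 0.35)
)

# Missing values per column
na_per_column <- colSums(is.na(drop_na_demo))
print(na_per_column)

# Rows that survive a row-wise drop versus rows usable for one column
rows_complete <- nrow(na.omit(drop_na_demo))
rows_primary  <- sum(!is.na(drop_na_demo$primary_total))

# Column-wise summary that skips missing values instead of dropping rows
mean_primary <- mean(drop_na_demo$primary_total, na.rm = TRUE)
print(c(rows_complete = rows_complete, rows_primary = rows_primary))
```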

Plotting Education-Wise All India dropout percentage:

all_india_drop <- filter(dropout_ratio, state_ut=="All India") 
dim(all_india_drop)
[1]  1 14
fig4 <- pivot_longer(all_india_drop, c(primary_boys, primary_girls, primary_total, upper_primary_boys, upper_primary_girls, upper_primary_total, secondary_boys, secondary_girls, secondary_total, hr_secondary_girls, hr_secondary_boys, hr_secondary_total), names_to = "EducationLevel")
ggplot(fig4, aes(x=value, y=EducationLevel)) + geom_boxplot(color="red") + geom_text(aes(label=value), size=4) + labs(title = "Fig-4: All India Dropout Ratio - Education Wise ", x="Dropout Percentage", y="Education Level") + facet_wrap(~year) 

Findings:

Correlation between Gross Enrollment Ratio and Dropout Ratios:

Filtering out the State-Wise Dropouts data:

states_drop <- filter(dropout_ratio,  state_ut != "All India") 
head(states_drop)
            state_ut    year primary_boys primary_girls primary_total
1      A & N Islands 2014-15         0.47          0.55          0.51
2     Andhra Pradesh 2013-14         4.31          4.39          4.35
3 Arunachal  Pradesh 2013-14        11.54         10.22         10.89
4  Arunachal Pradesh 2012-13        15.84         14.44         15.16
5  Arunachal Pradesh 2014-15        11.51         10.09         10.82
6              Assam 2012-13         7.02          5.46          6.24
  upper_primary_boys upper_primary_girls upper_primary_total
1               1.44                1.95                1.69
2               3.46                4.12                3.78
3               4.44                6.74                5.59
4               5.86                9.06                7.47
5               5.31                8.08                6.71
6               7.89                6.55                7.20
  secondary_boys secondary_girls secondary_total hr_secondary_boys
1          11.47            8.16            9.87             21.05
2          11.95           13.37           12.65             12.65
3          16.08           12.75           14.49             18.57
4          13.99           11.77           12.93              7.85
5          18.33           15.81           17.11             19.37
6          25.65           27.79           26.77              4.87
  hr_secondary_girls hr_secondary_total
1              12.21              16.93
2              10.85              11.79
3              15.49              17.07
4               2.14               5.11
5              17.44              18.42
6               4.50               4.69
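The output above lists "Arunachal  Pradesh" (two spaces) and "Arunachal Pradesh" as if they were different states. A minimal sketch of normalising whitespace in the names, using a toy vector, could look like this:

```r
# State names as they might appear in the raw file; the second has two spaces
state_ut <- c("Arunachal Pradesh", "Arunachal  Pradesh", " Assam ")

# Trim the ends and collapse internal runs of whitespace to a single space
state_clean <- gsub("\\s+", " ", trimws(state_ut))

print(unique(state_clean))
```

Applying the same idea to the state_ut column before filtering or faceting would keep both spellings in the same group.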

Analysis of the dropout ratio of Primary Boys:

primary_boys_drop <- states_drop[c("state_ut", "year", "primary_boys")] 
slice_min(primary_boys_drop, primary_boys)
  state_ut    year primary_boys
1  Gujarat 2012-13         0.21
top10 <- arrange(primary_boys_drop, desc(primary_boys))
top10 <- slice_head(top10, n=10)
top10
             state_ut    year primary_boys
1            Nagaland 2013-14        19.09
2             Manipur 2013-14        17.27
3   Arunachal Pradesh 2012-13        15.84
4  Arunachal  Pradesh 2013-14        11.54
5   Arunachal Pradesh 2014-15        11.51
6             Manipur 2012-13        10.24
7             Mizoram 2014-15        10.17
8      Madhya Pradesh 2013-14         9.91
9       Uttar Pradesh 2014-15         9.08
10              Assam 2013-14         8.19
ggplot(top10, aes(x=year, y=primary_boys, fill=year)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-5: top-10 states with highest dropout rate of boys in India ", subtitle = "Education Level - Primary ", y="dropout percentage", caption="data from the Government of India") + geom_text(aes(label=primary_boys), size = 4, position = position_dodge(width = .9), vjust = 1, color = "white") + theme_dark() + facet_wrap(~state_ut)
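The same select, sort and take-ten pattern recurs for each education level and gender in the analyses that follow. One way to avoid repeating it is a small helper; the function name `top_n_dropout` and the toy data below are hypothetical.

```r
# Hypothetical helper: top n rows of `df` by the column named in `col`
top_n_dropout <- function(df, col, n = 10) {
  ranked <- df[order(df[[col]], decreasing = TRUE), c("state_ut", "year", col)]
  head(ranked, n)
}

# Toy data; values are illustrative
drop_rank_demo <- data.frame(
  state_ut = c("State A", "State B", "State C", "State D"),
  year = c("2013-14", "2013-14", "2012-13", "2014-15"),
  primary_boys = c(19.09, 0.21, 15.84, 9.08),
  primary_girls = c(19.74, 0.50, 14.44, 8.04)
)

top2_boys <- top_n_dropout(drop_rank_demo, "primary_boys", n = 2)
print(top2_boys)
```

Because each row is a state-year pair, a state can appear more than once in such a ranking, as Arunachal Pradesh and Manipur do in the table above.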

Analysis of the dropout ratio of Primary Girls:

primary_girls_drop <- states_drop[c("state_ut", "year", "primary_girls")] 
slice_min(primary_girls_drop, primary_girls)
     state_ut    year primary_girls
1 Daman & Diu 2014-15          0.29
top10 <- arrange(primary_girls_drop, desc(primary_girls))
top10 <- slice_head(top10, n=10)
top10
             state_ut    year primary_girls
1            Nagaland 2013-14         19.74
2             Manipur 2013-14         18.74
3   Arunachal Pradesh 2012-13         14.44
4      Madhya Pradesh 2013-14         10.40
5  Arunachal  Pradesh 2013-14         10.22
6   Arunachal Pradesh 2014-15         10.09
7             Mizoram 2014-15         10.03
8             Manipur 2012-13          9.48
9       Uttar Pradesh 2014-15          8.04
10           Nagaland 2012-13          7.03
ggplot(top10, aes(x=year, y=primary_girls, fill=year)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-6: top-10 states with highest dropout rate of girls in India ", subtitle = "Education Level - Primary ", y="dropout percentage", caption="data from the Government of India") + geom_text(aes(label=primary_girls), size = 4, position = position_dodge(width = .9), vjust = 1, color = "white") + theme_dark() + facet_wrap(~state_ut)

Analysis of dropout ratio of Upper Primary Boys:

upper_primary_boys_drop <- states_drop[c("state_ut", "year", "upper_primary_boys")] 
slice_min(upper_primary_boys_drop, upper_primary_boys)
    state_ut    year upper_primary_boys
1 Puducherry 2012-13               0.33
top10 <- arrange(upper_primary_boys_drop, desc(upper_primary_boys))
top10 <- slice_head(top10, n=10)
top10
         state_ut    year upper_primary_boys
1        Nagaland 2013-14              18.08
2        Nagaland 2012-13              10.15
3  Madhya Pradesh 2013-14               9.88
4       Jharkhand 2014-15               9.01
5           Assam 2012-13               7.89
6        Nagaland 2014-15               7.87
7           Assam 2013-14               7.60
8         Manipur 2013-14               7.48
9    Chhattisgarh 2014-15               6.47
10         Sikkim 2013-14               6.35
ggplot(top10, aes(x=year, y=upper_primary_boys, fill=year)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-7: top-10 states with highest dropout rate of boys in India ", subtitle = "Education Level - Upper Primary ",  y="dropout percentage", caption="data from the Government of India") + geom_text(aes(label=upper_primary_boys), size = 4, position = position_dodge(width = .9), vjust = 1, color = "white") + theme_dark() + facet_wrap(~state_ut)

Analysis of dropout ratio of Upper Primary Girls:

upper_primary_girls_drop <- states_drop[c("state_ut", "year", "upper_primary_girls")] 
slice_min(upper_primary_girls_drop, upper_primary_girls)
          state_ut    year upper_primary_girls
1 Himachal Pradesh 2012-13                0.49
top10 <- arrange(upper_primary_girls_drop, desc(upper_primary_girls))
top10 <- slice_head(top10, n=10)
top10
            state_ut    year upper_primary_girls
1           Nagaland 2013-14               17.63
2     Madhya Pradesh 2013-14               13.57
3           Nagaland 2012-13                9.51
4  Arunachal Pradesh 2012-13                9.06
5          Jharkhand 2014-15                8.96
6            Gujarat 2014-15                8.54
7            Gujarat 2012-13                8.19
8  Arunachal Pradesh 2014-15                8.08
9            Gujarat 2013-14                8.04
10          Nagaland 2014-15                7.97
ggplot(top10, aes(x=year, y=upper_primary_girls, fill=year)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-8: top-10 states with highest dropout rate of girls in India ", subtitle = "Education Level - Upper Primary ", y="dropout percentage", caption="data from the Government of India") + geom_text(aes(label=upper_primary_girls), size = 4, position = position_dodge(width = .9), vjust = 1, color = "white") + theme_dark() + facet_wrap(~state_ut)

Analysis of dropout ratio of Secondary Boys:

secondary_boys_drop <- states_drop[c("state_ut", "year", "secondary_boys")] 
slice_min(secondary_boys_drop, secondary_boys)
          state_ut    year secondary_boys
1 Himachal Pradesh 2014-15           6.31
top10 <- arrange(secondary_boys_drop, desc(secondary_boys))
top10 <- slice_head(top10, n=10)
top10
               state_ut    year secondary_boys
1             Karnataka 2012-13          40.70
2           Daman & Diu 2014-15          34.45
3              Nagaland 2013-14          34.14
4  Dadra & Nagar Haveli 2013-14          30.02
5                 Assam 2013-14          28.59
6               Tripura 2014-15          28.03
7              Nagaland 2012-13          26.70
8               Gujarat 2014-15          26.29
9                 Assam 2012-13          25.65
10       Madhya Pradesh 2013-14          25.21
ggplot(top10, aes(x=year, y=secondary_boys, fill=year)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-9: top-10 states with highest dropout rate of boys in India ", subtitle = "Education Level - secondary ",  y="dropout percentage", caption="data from the Government of India") + geom_text(aes(label=secondary_boys), size = 4, position = position_dodge(width = .9), vjust = 1, color = "white") + theme_dark() + facet_wrap(~state_ut)

Analysis of dropout ratio of Secondary Girls:

secondary_girls_drop <- states_drop[c("state_ut", "year", "secondary_girls")] 
slice_min(secondary_girls_drop, secondary_girls)
          state_ut    year secondary_girls
1 Himachal Pradesh 2014-15             5.8
top10 <- arrange(secondary_girls_drop, desc(secondary_girls))
top10 <- slice_head(top10, n=10)
top10
               state_ut    year secondary_girls
1             Karnataka 2012-13           39.07
2              Nagaland 2013-14           36.08
3                 Assam 2013-14           32.10
4           Daman & Diu 2014-15           29.73
5               Tripura 2014-15           28.83
6        Madhya Pradesh 2013-14           27.91
7                 Assam 2012-13           27.79
8               Tripura 2012-13           26.99
9  Dadra & Nagar Haveli 2013-14           26.83
10             Nagaland 2012-13           26.33
ggplot(top10, aes(x=year, y=secondary_girls, fill=year)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-10: top-10 states with highest dropout rate of girls in India ", subtitle = "Education Level - secondary ",  y="dropout percentage", caption="data from the Government of India") + geom_text(aes(label=secondary_girls), size = 4, position = position_dodge(width = .9), vjust = 1, color = "white") + theme_dark() + facet_wrap(~state_ut)

Analysis of dropout ratio of Higher-Secondary boys:

hr_secondary_boys_drop <- states_drop[c("state_ut", "year", "hr_secondary_boys")] 
slice_min(hr_secondary_boys_drop, hr_secondary_boys)
        state_ut    year hr_secondary_boys
1 Madhya Pradesh 2013-14              0.52
top10 <- arrange(hr_secondary_boys_drop, desc(hr_secondary_boys))
top10 <- slice_head(top10, n=10)
top10
             state_ut    year hr_secondary_boys
1         Daman & Diu 2014-15             44.38
2       A & N Islands 2014-15             21.05
3           Karnataka 2012-13             19.47
4   Arunachal Pradesh 2014-15             19.37
5            Nagaland 2012-13             18.67
6  Arunachal  Pradesh 2013-14             18.57
7            Nagaland 2013-14             15.36
8         Daman & Diu 2013-14             14.48
9              Sikkim 2013-14             14.11
10    Jammu & Kashmir 2014-15             13.85
ggplot(top10, aes(x=year, y=hr_secondary_boys, fill=year)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-11: top-10 states with highest dropout rate of boys in India ", subtitle = "Education Level - Higher secondary ",  y="dropout percentage", caption="data from the Government of India") + geom_text(aes(label=hr_secondary_boys), size = 4, position = position_dodge(width = .9), vjust = 1, color = "white") + theme_dark() + facet_wrap(~state_ut)
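The table above lists both "Arunachal Pradesh" and "Arunachal  Pradesh" (two spaces), so `facet_wrap(~state_ut)` treats them as two different states. A minimal sketch of how the names could be normalised before plotting, using base R only; `clean_state` is a hypothetical helper and not part of the original analysis:

```r
# Hypothetical helper: trim the ends and collapse runs of whitespace
# so that spelling variants of the same state compare equal.
clean_state <- function(x) {
  gsub("\\s+", " ", trimws(x))
}

# Two spellings copied from the output above, plus a padded example.
raw_names <- c("Arunachal Pradesh", "Arunachal  Pradesh", " Nagaland ")
cleaned_names <- clean_state(raw_names)
unique(cleaned_names)
```

Applying the same step to `states_drop$state_ut` before slicing would keep both Arunachal Pradesh rows in a single facet. It would not reconcile naming differences such as "A & N Islands" versus "Andaman & Nicobar Islands", which need an explicit lookup.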

Analysis of dropout ratio of Higher Secondary Girls:

hr_secondary_girls_drop <- states_drop[c("state_ut", "year", "hr_secondary_girls")] 
slice_min(hr_secondary_girls_drop, hr_secondary_girls)
  state_ut    year hr_secondary_girls
1  Gujarat 2012-13                0.3
top10 <- arrange(hr_secondary_girls_drop, desc(hr_secondary_girls))
top10 <- slice_head(top10, n=10)
top10
             state_ut    year hr_secondary_girls
1         Daman & Diu 2014-15              36.05
2            Nagaland 2012-13              17.87
3   Arunachal Pradesh 2014-15              17.44
4  Arunachal  Pradesh 2013-14              15.49
5           Telangana 2013-14              13.20
6            Nagaland 2013-14              12.96
7       A & N Islands 2014-15              12.21
8              Sikkim 2013-14              11.92
9           Karnataka 2012-13              11.26
10    Jammu & Kashmir 2014-15              11.20
ggplot(top10, aes(x=year, y=hr_secondary_girls, fill=year)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-12: top-10 states with highest dropout rate of girls in India ", subtitle = "Education Level - Higher secondary ",  y="dropout percentage", caption="data from the Government of India") + geom_text(aes(label=hr_secondary_girls), size = 4, position = position_dodge(width = .9), vjust = 1, color = "white") + theme_dark() + facet_wrap(~state_ut, scales = "free_y")
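The six blocks above repeat the same select, sort and slice steps with only the column name changing. As a sketch, assuming dplyr is loaded as in the rest of this report, the pattern could be wrapped in one function; `top_n_dropout` and `states_drop_demo` are hypothetical names, and the demo rows are copied from the upper primary girls output above:

```r
library(dplyr)

# Small stand-in for states_drop, using four rows printed earlier.
states_drop_demo <- data.frame(
  state_ut = c("Nagaland", "Madhya Pradesh", "Nagaland", "Himachal Pradesh"),
  year = c("2013-14", "2013-14", "2012-13", "2012-13"),
  upper_primary_girls = c(17.63, 13.57, 9.51, 0.49)
)

# Hypothetical helper: keep state, year and one dropout column,
# sort descending on that column and return the first n rows.
top_n_dropout <- function(data, column, n = 10) {
  data %>%
    select(state_ut, year, all_of(column)) %>%
    arrange(desc(.data[[column]])) %>%
    slice_head(n = n)
}

top2 <- top_n_dropout(states_drop_demo, "upper_primary_girls", n = 2)
top2
```

Calling the helper once per column (for example `"secondary_boys"` or `"hr_secondary_girls"`) would give the same tables as the repeated code.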

Findings:

My key findings from the analysis of dropout percentage of boys and girls across all the 4 education levels are as follows:

Boys:

Girls:

My analysis shows that Nagaland, Karnataka and Daman & Diu have the highest dropout rates for both boys and girls at the primary and upper primary, secondary, and higher secondary levels respectively.

Gujarat is doing well in terms of dropout percentage and, as with the Gross Enrollment Ratio, the dropout figures are better for girls than for boys.

My further analysis will focus on the states of Nagaland, Karnataka and Daman & Diu to find out whether the availability of facilities in schools is affecting the dropout percentage. The study will also look at those states having few or no facilities at all.
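One way to set up that comparison is to join the dropout table to a facility table on state and year. The sketch below uses base R `merge` and only values that are printed elsewhere in this report for Nagaland; the data frame names and the renamed column `comps_upper_primary` are hypothetical. The dropout data covers 2012-13 to 2014-15 while the facility data covers 2013-14 to 2015-16, so only the overlapping years survive the join.

```r
# Dropout percentages for upper primary girls in Nagaland (from the top-10 table).
dropout_demo <- data.frame(
  state_ut = c("Nagaland", "Nagaland"),
  year = c("2013-14", "2014-15"),
  upper_primary_girls = c(17.63, 7.97)
)

# Percentage of upper primary schools with computers in Nagaland.
computers_demo <- data.frame(
  State_UT = c("Nagaland", "Nagaland", "Nagaland"),
  year = c("2013-14", "2014-15", "2015-16"),
  comps_upper_primary = c(59.1, 68.8, 76.9)
)

# Inner join on state and year; the key columns are named differently
# in the two sources, so both sides are given explicitly.
combined <- merge(
  dropout_demo, computers_demo,
  by.x = c("state_ut", "year"),
  by.y = c("State_UT", "year")
)
combined
```

With only two or three overlapping years per state, such a table supports a side-by-side reading at most; it is too small to establish that a facility causes a change in dropout.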


Dataset-3 | Percentage of Schools with access to computers in India

Reading Dataset-3

schools_with_comps <- read_csv("601 Major Project/percentage-of-schools-with-comps.csv")
colnames(schools_with_comps)
 [1] "State_UT"                        
 [2] "year"                            
 [3] "Primary_Only"                    
 [4] "Primary_with_U_Primary"          
 [5] "Primary_with_U_Primary_Sec_HrSec"
 [6] "U_Primary_Only"                  
 [7] "U_Primary_With_Sec_HrSec"        
 [8] "Primary_with_U_Primary_Sec"      
 [9] "U_Primary_With_Sec"              
[10] "Sec_Only"                        
[11] "Sec_with_HrSec."                 
[12] "HrSec_Only"                      
[13] "All Schools"                     
schools_with_comps <- rename(schools_with_comps, primary="Primary_Only", upper_primary="U_Primary_Only", secondary="Sec_Only", hr_secondary="HrSec_Only")

Plotting Education-Level wise percentage of All India access to computers:

All_India <- filter(schools_with_comps, State_UT=="All India") %>% 
  select(year, primary, upper_primary, secondary, hr_secondary)
All_India <- pivot_longer(All_India, c(primary, upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage") 
ggplot(All_India, aes(x=year, y=Percentage, fill=Education_Level)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-13: Percentage of Schools with access to Computer facility all over india") + geom_text(aes(label=Percentage), size = 4, position = position_dodge(width = .9), vjust = 1, color = "black") + theme_classic() + facet_wrap(~Education_Level)

Findings:

Plotting State-Wise Primary level percentage with access to computers:

library(tidytext)
primary_wise<- select(schools_with_comps, State_UT, year, primary) %>% 
  filter(State_UT != "All India")

primary_wise <- arrange(primary_wise, primary)
primary_wise <- slice_head(primary_wise, n=50)
ggplot(primary_wise, aes(x=primary, y=State_UT, fill=State_UT))+geom_bar(stat="identity")+facet_wrap(~year) + labs(title = "Fig-14: States with lowest percentage of Computer Facility", subtitle = "Education Level - Primary",  x="percentage", y="State name") + geom_text(aes(label=primary), size = 3, position = position_dodge(width = .9), vjust = 0, color = "black")

Findings:

Nagaland is among these states, which may be related to its high percentage of dropouts. Limited access to computers might be one of the reasons for its high dropout percentage in primary schools.

library(kableExtra)
upper_primary_wise<- select(schools_with_comps, State_UT, year, upper_primary) %>% 
  filter(State_UT != "All India") %>% 
  arrange(upper_primary) %>% 
  slice(1:20)

kable(upper_primary_wise, digits = 4, align = "ccccccc", col.names = c("State/Union Territory", "Year", "Percentage"), caption = "Table1 : State-wise Percentage of Upper Primary Schools having lowest access to computers", color="black") %>%
  kable_styling(font_size = 15) %>%
  row_spec(c(1,1,1))
Table 1: State-wise Percentage of Upper Primary Schools having lowest access to computers
State/Union Territory Year Percentage
Andaman & Nicobar Islands 2013-14 0.00
Andaman & Nicobar Islands 2015-16 0.00
Chandigarh 2013-14 0.00
Chandigarh 2014-15 0.00
Chandigarh 2015-16 0.00
Puducherry 2013-14 0.00
Sikkim 2015-16 0.00
Telangana 2015-16 0.00
West Bengal 2013-14 7.93
Odisha 2013-14 8.24
Jammu And Kashmir 2013-14 8.82
Odisha 2014-15 9.19
Odisha 2015-16 9.28
West Bengal 2014-15 9.32
West Bengal 2015-16 9.97
Bihar 2013-14 10.79
Jammu And Kashmir 2014-15 11.19
Jammu And Kashmir 2015-16 11.28
Bihar 2015-16 11.64
Bihar 2014-15 11.65
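Sorted in ascending order, the table is dominated by rows that are exactly 0.00. The `str()` outputs for the electricity, water and boys-toilet datasets show the same 0, 100, 0 pattern for upper primary schools in Andaman & Nicobar Islands, which may mean that a 0 stands for "no schools of this type reported" rather than "no school has the facility". That reading is an assumption, not something the source data confirms. A minimal sketch of how such zeros could be set aside before ranking, with three rows copied from Table 1 and hypothetical variable names:

```r
# Three rows copied from the table above.
upper_primary_demo <- data.frame(
  State_UT = c("Chandigarh", "West Bengal", "Odisha"),
  year = c("2013-14", "2013-14", "2013-14"),
  upper_primary = c(0.00, 7.93, 8.24)
)

# ASSUMPTION: an exact 0 means "not reported / no such schools",
# so it is recoded to NA instead of being ranked as the lowest value.
upper_primary_demo$upper_primary_clean <- ifelse(
  upper_primary_demo$upper_primary == 0,
  NA,
  upper_primary_demo$upper_primary
)

# which.min() skips NA, so this returns the lowest reported non-zero row.
lowest_reported <- upper_primary_demo[
  which.min(upper_primary_demo$upper_primary_clean),
]
lowest_reported
```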

Findings:

Analyzing Primary and Upper-Primary data of Nagaland

nagaland <- filter(schools_with_comps, State_UT=="Nagaland")
nagaland <- select(nagaland, State_UT, year, primary, upper_primary)
nagaland
# A tibble: 3 x 4
  State_UT year    primary upper_primary
  <chr>    <chr>     <dbl>         <dbl>
1 Nagaland 2013-14    4.98          59.1
2 Nagaland 2014-15    4.97          68.8
3 Nagaland 2015-16    5.53          76.9
ggplot() + geom_line(data=nagaland, mapping=aes(x=year, y=primary, group=State_UT), size=1, color="red") + geom_point(data=nagaland, mapping=aes(x=year, y=primary, group=State_UT), color="black") +
  geom_line(data=nagaland, mapping=aes(x=year, y=upper_primary, group=State_UT), color="blue", size=1) + geom_point(data=nagaland, mapping=aes(x=year, y=upper_primary, group=State_UT), color="black") + labs(title = "Fig-15: Percentage of Schools with access to Computers in Nagaland State", subtitle = "Education Level - Primary and Upper Primary (Red-Primary, Blue-Upper Primary)",  x="year", y="percentage of schools") 
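Fig-15 draws one hard-coded layer per education level and explains the colours in the subtitle. As an alternative sketch, assuming tidyr and ggplot2 are loaded as elsewhere in this report, the same values can be reshaped to long format so that a single `geom_line()` maps colour to the education level and ggplot builds the legend. `nagaland_demo`, `nagaland_long` and `fig15_alt` are hypothetical names, and the values are copied from the tibble printed above.

```r
library(tidyr)
library(ggplot2)

# Values copied from the Nagaland tibble above.
nagaland_demo <- data.frame(
  year = c("2013-14", "2014-15", "2015-16"),
  primary = c(4.98, 4.97, 5.53),
  upper_primary = c(59.1, 68.8, 76.9)
)

# One row per year and education level.
nagaland_long <- pivot_longer(
  nagaland_demo,
  c(primary, upper_primary),
  names_to = "Education_Level",
  values_to = "Percentage"
)

# A single line layer; colour and grouping come from Education_Level.
fig15_alt <- ggplot(
  nagaland_long,
  aes(x = year, y = Percentage,
      group = Education_Level, color = Education_Level)
) +
  geom_line() +
  geom_point(color = "black") +
  labs(x = "year", y = "percentage of schools")
```

Printing `fig15_alt` renders the plot. The long format is the same shape the report already uses for the All India bar charts.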

Plotting States with high access to computer facility:

secondary_wise<- select(schools_with_comps, State_UT, year, secondary) %>% 
  filter(State_UT != "All India")

secondary_wise <- arrange(secondary_wise, desc(secondary))
secondary_wise
# A tibble: 107 x 3
   State_UT         year    secondary
   <chr>            <chr>       <dbl>
 1 Daman & Diu      2013-14     100  
 2 Himachal Pradesh 2013-14     100  
 3 Kerala           2014-15     100  
 4 Kerala           2015-16     100  
 5 Nagaland         2015-16     100  
 6 Punjab           2013-14     100  
 7 Daman & Diu      2014-15      92.3
 8 Daman & Diu      2015-16      92.3
 9 Maharashtra      2015-16      90.5
10 Maharashtra      2014-15      88.4
# ... with 97 more rows
secondary_wise <- slice_head(secondary_wise, n=40)
ggplot(secondary_wise, aes(x=secondary, y=State_UT, fill=State_UT))+geom_bar(stat="identity")+facet_wrap(~year) + labs(title = "Fig-16: States with highest percentage of Computer Facility", subtitle = "Education Level - secondary",  x="percentage", y="State name") + geom_text(aes(label=secondary), size = 3, position = position_dodge(width = .1), vjust = 0, color = "black")

Plotting Secondary level access to computers in the state of Karnataka

karnataka <-  filter(schools_with_comps, State_UT=="Karnataka")
karnataka <- select(karnataka, State_UT, year, secondary)
karnataka
# A tibble: 3 x 3
  State_UT  year    secondary
  <chr>     <chr>       <dbl>
1 Karnataka 2013-14      67.0
2 Karnataka 2014-15      69.9
3 Karnataka 2015-16      69.3
ggplot(karnataka, aes(x=year, y=secondary, group=State_UT)) + geom_line(size=1, color="purple") + geom_point() + geom_text(aes(label=secondary), size = 5) + labs(title = "Fig-17: Percentage of Schools with access to Computers in the state of Karnataka", subtitle = "Education Level - Secondary",  x="year", y="Percentage")

Plotting States with no access to computers in the Higher Secondary Schools

hr_secondary_wise<- select(schools_with_comps, State_UT, year, hr_secondary) %>% 
  filter(State_UT != "All India") %>% 
  arrange(hr_secondary) %>% 
  slice(1:20)

kable(hr_secondary_wise, digits = 4, align = "ccccccc", col.names = c("State/Union Territory", "Year", "Percentage"), caption = "Table2 : State-wise percentage of Higher Secondary Schools having zero access to computers") %>%
  kable_styling(font_size = 15) %>%
  row_spec(c(1,1,1))
Table 2: State-wise percentage of Higher Secondary Schools having zero access to computers
State/Union Territory Year Percentage
Andaman & Nicobar Islands 2013-14 0
Andaman & Nicobar Islands 2014-15 0
Andaman & Nicobar Islands 2015-16 0
Arunachal Pradesh 2014-15 0
Arunachal Pradesh 2015-16 0
Chandigarh 2013-14 0
Chandigarh 2014-15 0
Chandigarh 2015-16 0
Chhattisgarh 2013-14 0
Dadra & Nagar Haveli 2013-14 0
Dadra & Nagar Haveli 2014-15 0
Dadra & Nagar Haveli 2015-16 0
Delhi 2013-14 0
Delhi 2014-15 0
Haryana 2013-14 0
Lakshadweep 2013-14 0
Lakshadweep 2014-15 0
Lakshadweep 2015-16 0
Odisha 2013-14 0
Odisha 2014-15 0

Findings:

Dataset-4 | Percentage of Schools with access to Electricity in India

Reading Dataset-4

schools_with_electricity <- read_csv("601 Major Project/percentage-of-schools-with-electricity.csv")
head(schools_with_electricity)
# A tibble: 6 x 13
  State_UT        year  Primary_Only Primary_with_U_~ Primary_with_U_~
  <chr>           <chr>        <dbl>            <dbl>            <dbl>
1 Andaman & Nico~ 2013~         82.4             96.0            100  
2 Andaman & Nico~ 2014~         80.7             96.3            100  
3 Andaman & Nico~ 2015~         82.1             97.6            100  
4 Andhra Pradesh  2013~         87.7             93.6             99.3
5 Andhra Pradesh  2014~         91.1             94.7            100  
6 Andhra Pradesh  2015~         91.6             95.6            100  
# ... with 8 more variables: U_Primary_Only <dbl>,
#   U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
#   U_Primary_With_Sec <dbl>, Sec_Only <dbl>, Sec_with_HrSec. <dbl>,
#   HrSec_Only <dbl>, `All Schools` <dbl>
colnames(schools_with_electricity)
 [1] "State_UT"                        
 [2] "year"                            
 [3] "Primary_Only"                    
 [4] "Primary_with_U_Primary"          
 [5] "Primary_with_U_Primary_Sec_HrSec"
 [6] "U_Primary_Only"                  
 [7] "U_Primary_With_Sec_HrSec"        
 [8] "Primary_with_U_Primary_Sec"      
 [9] "U_Primary_With_Sec"              
[10] "Sec_Only"                        
[11] "Sec_with_HrSec."                 
[12] "HrSec_Only"                      
[13] "All Schools"                     
schools_with_electricity <- rename(schools_with_electricity, primary="Primary_Only", upper_primary="U_Primary_Only", secondary="Sec_Only", hr_secondary="HrSec_Only")

Datatype of each column

str(schools_with_electricity)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ State_UT                        : chr [1:110] "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andhra Pradesh" ...
 $ year                            : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
 $ primary                         : num [1:110] 82.4 80.7 82.1 87.7 91.1 ...
 $ Primary_with_U_Primary          : num [1:110] 96 96.3 97.6 93.6 94.7 ...
 $ Primary_with_U_Primary_Sec_HrSec: num [1:110] 100 100 100 99.3 100 ...
 $ upper_primary                   : num [1:110] 0 100 0 100 100 ...
 $ U_Primary_With_Sec_HrSec        : num [1:110] 100 100 100 67.5 86.1 ...
 $ Primary_with_U_Primary_Sec      : num [1:110] 100 100 100 96.2 97.6 ...
 $ U_Primary_With_Sec              : num [1:110] 0 0 0 96.2 97.1 ...
 $ secondary                       : num [1:110] 0 0 0 97.5 93.5 ...
 $ Sec_with_HrSec.                 : num [1:110] 100 100 100 100 83.3 ...
 $ hr_secondary                    : num [1:110] 0 0 0 91.3 93.2 ...
 $ All Schools                     : num [1:110] 88.9 88.9 90.1 90.3 92.8 ...
 - attr(*, "spec")=
  .. cols(
  ..   State_UT = col_character(),
  ..   year = col_character(),
  ..   Primary_Only = col_double(),
  ..   Primary_with_U_Primary = col_double(),
  ..   Primary_with_U_Primary_Sec_HrSec = col_double(),
  ..   U_Primary_Only = col_double(),
  ..   U_Primary_With_Sec_HrSec = col_double(),
  ..   Primary_with_U_Primary_Sec = col_double(),
  ..   U_Primary_With_Sec = col_double(),
  ..   Sec_Only = col_double(),
  ..   Sec_with_HrSec. = col_double(),
  ..   HrSec_Only = col_double(),
  ..   `All Schools` = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 

Plotting Education Level wise percentage with access to Electricity all over India

All_India <- filter(schools_with_electricity, State_UT=="All India") %>% 
  select(year, primary, upper_primary, secondary, hr_secondary)
All_India <- pivot_longer(All_India, c(primary, upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage") 
ggplot(All_India, aes(x=year, y=Percentage, fill=Education_Level)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-18: Percentage of Schools with access to Electricity all over india") + geom_text(aes(label=Percentage), size = 4, position = position_dodge(width = .9), vjust = 1, color = "black") + theme_classic() + facet_wrap(~Education_Level)

Findings:


states_electricity <- filter(schools_with_electricity, State_UT != "All India")

nagaland <- filter(states_electricity, State_UT == "Nagaland") %>% 
  select(year, primary, upper_primary )

nagaland <- pivot_longer(nagaland, c(primary,upper_primary), names_to = "Education_Level", values_to = "Percentage")
ggplot(nagaland, aes(x=year, y=Percentage, fill=Education_Level)) + geom_bar(position="dodge", stat="identity") + coord_polar() + geom_text(aes(label=Percentage), size = 4, position = position_dodge(width = .9), vjust = 0, color = "black") + labs(title = "Fig-19: Percentage of Schools with access to Electricity in the state of Nagaland", subtitle = "Education Level - Primary and Upper Primary" )

karnataka <- filter(states_electricity, State_UT == "Karnataka") %>% 
  select(year, secondary, State_UT )

ggplot(karnataka, aes(x=year, y=secondary, group=State_UT)) + geom_line(color="red", size=1) + geom_point() + geom_text(aes(label=secondary), size = 4, position = position_dodge(width = .6), vjust = 0, color = "black") + labs(title = "Fig-20: Percentage of Schools with access to Electricity in the state of Karnataka", subtitle = "Education Level - Secondary" )

states_electricity
# A tibble: 107 x 13
   State_UT            year  primary Primary_with_U_~ Primary_with_U_~
   <chr>               <chr>   <dbl>            <dbl>            <dbl>
 1 Andaman & Nicobar ~ 2013~   82.4              96.0            100  
 2 Andaman & Nicobar ~ 2014~   80.7              96.3            100  
 3 Andaman & Nicobar ~ 2015~   82.1              97.6            100  
 4 Andhra Pradesh      2013~   87.7              93.6             99.3
 5 Andhra Pradesh      2014~   91.1              94.7            100  
 6 Andhra Pradesh      2015~   91.6              95.6            100  
 7 Arunachal Pradesh   2013~   19.7              53.6             92.2
 8 Arunachal Pradesh   2014~   21.5              55.0             96.8
 9 Arunachal Pradesh   2015~   22.6              53.9             95.5
10 Assam               2013~    9.51             51.1             81.2
# ... with 97 more rows, and 8 more variables: upper_primary <dbl>,
#   U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
#   U_Primary_With_Sec <dbl>, secondary <dbl>, Sec_with_HrSec. <dbl>,
#   hr_secondary <dbl>, `All Schools` <dbl>
Daman_Diu <- filter(states_electricity, State_UT == "Daman & Diu") %>% 
  select(State_UT, year, hr_secondary )

kable(Daman_Diu, digits = 4, align = "ccccccc", col.names = c("State/Union Territory", "Year", "Percentage"), caption = "Table3 : Daman & Diu Percentage of Schools with Electricity") %>%
  kable_styling(font_size = 15) %>%
  row_spec(c(1,1,1))
Table 3: Daman & Diu Percentage of Schools with Electricity
State/Union Territory Year Percentage
Daman & Diu 2013-14 100
Daman & Diu 2014-15 100
Daman & Diu 2015-16 100

Findings:

Dataset-5 | Percentage of Schools with water facility in India

Reading Dataset-5

schools_with_water <- read_csv("601 Major Project/percentage-of-schools-with-water-facility.csv")
head(schools_with_water)
# A tibble: 6 x 13
  `State/UT`      Year  Primary_Only Primary_with_U_~ Primary_with_U_~
  <chr>           <chr>        <dbl>            <dbl>            <dbl>
1 Andaman & Nico~ 2013~         98.2             98.7            100  
2 Andaman & Nico~ 2014~         99.6             98.8            100  
3 Andaman & Nico~ 2015~        100              100              100  
4 Andhra Pradesh  2013~         86.9             94.5             99.7
5 Andhra Pradesh  2014~         91.8             96.1            100  
6 Andhra Pradesh  2015~         93.9             97.0            100  
# ... with 8 more variables: U_Primary_Only <dbl>,
#   U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
#   U_Primary_With_Sec <dbl>, Sec_Only <dbl>, Sec_with_HrSec. <dbl>,
#   HrSec_Only <dbl>, `All Schools` <dbl>
colnames(schools_with_water)
 [1] "State/UT"                        
 [2] "Year"                            
 [3] "Primary_Only"                    
 [4] "Primary_with_U_Primary"          
 [5] "Primary_with_U_Primary_Sec_HrSec"
 [6] "U_Primary_Only"                  
 [7] "U_Primary_With_Sec_HrSec"        
 [8] "Primary_with_U_Primary_Sec"      
 [9] "U_Primary_With_Sec"              
[10] "Sec_Only"                        
[11] "Sec_with_HrSec."                 
[12] "HrSec_Only"                      
[13] "All Schools"                     
schools_with_water <- rename(schools_with_water, State_UT="State/UT", primary="Primary_Only", upper_primary="U_Primary_Only", secondary="Sec_Only", hr_secondary="HrSec_Only" )
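The water dataset names its key columns `State/UT` and `Year`, while the other facility datasets use `State_UT` and `year`, which is why the later water code has to refer to `Year` with a capital letter. A minimal sketch of a single renaming helper that could be applied to every facility file, assuming dplyr 1.0 or later; `standardise_names` and `water_demo` are hypothetical, and the demo row is copied from the `head()` output above:

```r
library(dplyr)

# Hypothetical helper: map every known source name to one common name.
# any_of() ignores lookup entries that are absent from a given file.
standardise_names <- function(data) {
  lookup <- c(
    State_UT = "State/UT",
    year = "Year",
    primary = "Primary_Only",
    upper_primary = "U_Primary_Only",
    secondary = "Sec_Only",
    hr_secondary = "HrSec_Only"
  )
  rename(data, any_of(lookup))
}

# One row with the water dataset's original column names.
water_demo <- data.frame(
  `State/UT` = "Andaman & Nicobar Islands",
  Year = "2013-14",
  Primary_Only = 98.2,
  check.names = FALSE
)

standardised <- standardise_names(water_demo)
names(standardised)
```

With identical key names in every dataset, the plotting code for each facility could use `year` throughout.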

Datatype of each column

str(schools_with_water)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ State_UT                        : chr [1:110] "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andhra Pradesh" ...
 $ Year                            : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
 $ primary                         : num [1:110] 98.2 99.5 100 86.9 91.8 ...
 $ Primary_with_U_Primary          : num [1:110] 98.7 98.8 100 94.5 96.1 ...
 $ Primary_with_U_Primary_Sec_HrSec: num [1:110] 100 100 100 99.7 100 ...
 $ upper_primary                   : num [1:110] 0 100 0 90.9 100 ...
 $ U_Primary_With_Sec_HrSec        : num [1:110] 100 100 100 87.3 90 ...
 $ Primary_with_U_Primary_Sec      : num [1:110] 100 100 100 98.8 99.6 ...
 $ U_Primary_With_Sec              : num [1:110] 0 0 0 96 97.5 ...
 $ secondary                       : num [1:110] 0 0 0 97.5 100 100 0 0 0 88.3 ...
 $ Sec_with_HrSec.                 : num [1:110] 100 100 100 100 100 ...
 $ hr_secondary                    : num [1:110] 0 0 0 97.5 98.4 ...
 $ All Schools                     : num [1:110] 98.7 99.5 100 90.3 93.7 ...
 - attr(*, "spec")=
  .. cols(
  ..   `State/UT` = col_character(),
  ..   Year = col_character(),
  ..   Primary_Only = col_double(),
  ..   Primary_with_U_Primary = col_double(),
  ..   Primary_with_U_Primary_Sec_HrSec = col_double(),
  ..   U_Primary_Only = col_double(),
  ..   U_Primary_With_Sec_HrSec = col_double(),
  ..   Primary_with_U_Primary_Sec = col_double(),
  ..   U_Primary_With_Sec = col_double(),
  ..   Sec_Only = col_double(),
  ..   Sec_with_HrSec. = col_double(),
  ..   HrSec_Only = col_double(),
  ..   `All Schools` = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 

Plotting Education-Level wise percentage of schools with water facility in India

All_India <- filter(schools_with_water, State_UT=="All India") %>% 
  select(Year, primary, upper_primary, secondary, hr_secondary)

All_India <- pivot_longer(All_India, c(primary, upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage") 
ggplot(All_India, aes(x=Year, y=Percentage, fill=Education_Level)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-21: Percentage of Schools with Water Facility all over India") + geom_text(aes(label=Percentage), size = 4, position = position_dodge(width = .98), vjust = 1, color = "black") + theme_classic() + facet_wrap(~Education_Level)

Finding: It is encouraging that schools at every education level have more than 90% access to a water facility, although coverage is still not 100%.


states_water <- filter(schools_with_water, State_UT != "All India")

nagaland <- filter(states_water, State_UT == "Nagaland") %>% 
  select(Year, primary, upper_primary )

nagaland <- pivot_longer(nagaland, c(primary,upper_primary), names_to = "Education_Level", values_to = "Percentage")
ggplot(nagaland, aes(x=Year, y=Percentage, fill=Education_Level)) + geom_bar(position="dodge", stat="identity") + coord_polar() + geom_text(aes(label=Percentage), size = 4, position = position_dodge(width = .9), vjust = 0, color = "black") + labs(title = "Fig-22: Percentage of Schools with access to Drinking Water in the state of Nagaland", subtitle = "Education Level - Primary and Upper Primary" )

karnataka <- filter(schools_with_water, State_UT == "Karnataka") %>% 
  select(Year, secondary, State_UT )

ggplot(karnataka, aes(x=Year, y=secondary, group=State_UT)) + geom_line(color="red", size=1) + geom_point() + geom_text(aes(label=secondary), size = 4, position = position_dodge(width = .6), vjust = 0, color = "black") + labs(title = "Fig-23: Percentage of Schools with access to Drinking Water in the state of Karnataka", subtitle = "Education Level - Secondary" )

Daman_Diu <- filter(schools_with_water, State_UT == "Daman & Diu") %>% 
  select(State_UT, Year, hr_secondary )

kable(Daman_Diu, digits = 4, align = "ccccccc", col.names = c("State/Union Territory", "Year", "Percentage"), caption = "Table4 : Daman & Diu Percentage of Schools with Drinking Water Facility") %>%
  kable_styling(font_size = 15) %>%
  row_spec(c(1,1,1))
Table 4: Daman & Diu Percentage of Schools with Drinking Water Facility
State/Union Territory Year Percentage
Daman & Diu 2013-14 100
Daman & Diu 2014-15 100
Daman & Diu 2015-16 100

Dataset-6 | Percentage of Schools with boys toilet all over India

Reading Dataset-6

schools_with_boys_toilet <- read_csv("601 Major Project/schools-with-boys-toilet.csv")
head(schools_with_boys_toilet)
# A tibble: 6 x 13
  State_UT        year  Primary_Only Primary_with_U_~ Primary_with_U_~
  <chr>           <chr>        <dbl>            <dbl>            <dbl>
1 Andaman & Nico~ 2013~         91.6             97.4            100  
2 Andaman & Nico~ 2014~        100              100              100  
3 Andaman & Nico~ 2015~        100              100              100  
4 Andhra Pradesh  2013~         53.0             62.6             82.0
5 Andhra Pradesh  2014~         57.9             76.5             96  
6 Andhra Pradesh  2015~         99.6             99.9             99.0
# ... with 8 more variables: U_Primary_Only <dbl>,
#   U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
#   U_Primary_With_Sec <dbl>, Sec_Only <dbl>, Sec_with_HrSec. <dbl>,
#   HrSec_Only <dbl>, `All Schools` <dbl>
colnames(schools_with_boys_toilet)
 [1] "State_UT"                        
 [2] "year"                            
 [3] "Primary_Only"                    
 [4] "Primary_with_U_Primary"          
 [5] "Primary_with_U_Primary_Sec_HrSec"
 [6] "U_Primary_Only"                  
 [7] "U_Primary_With_Sec_HrSec"        
 [8] "Primary_with_U_Primary_Sec"      
 [9] "U_Primary_With_Sec"              
[10] "Sec_Only"                        
[11] "Sec_with_HrSec."                 
[12] "HrSec_Only"                      
[13] "All Schools"                     
schools_with_boys_toilet <- rename(schools_with_boys_toilet, primary="Primary_Only", upper_primary="U_Primary_Only", secondary="Sec_Only", hr_secondary="HrSec_Only")

Datatype of each column

str(schools_with_boys_toilet)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ State_UT                        : chr [1:110] "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andaman & Nicobar Islands" "Andhra Pradesh" ...
 $ year                            : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
 $ primary                         : num [1:110] 91.6 100 100 53 57.9 ...
 $ Primary_with_U_Primary          : num [1:110] 97.4 100 100 62.6 76.5 ...
 $ Primary_with_U_Primary_Sec_HrSec: num [1:110] 100 100 100 82 96 ...
 $ upper_primary                   : num [1:110] 0 100 0 45.5 75 ...
 $ U_Primary_With_Sec_HrSec        : num [1:110] 100 100 100 64.1 93.3 ...
 $ Primary_with_U_Primary_Sec      : num [1:110] 100 100 100 76.2 91.4 ...
 $ U_Primary_With_Sec              : num [1:110] 0 0 0 60.6 78 ...
 $ secondary                       : num [1:110] 0 0 0 59.3 80.7 ...
 $ Sec_with_HrSec.                 : num [1:110] 100 100 100 85.7 60 ...
 $ hr_secondary                    : num [1:110] 0 0 0 73.4 86.5 ...
 $ All Schools                     : num [1:110] 94.5 100 100 56.9 65.3 ...
 - attr(*, "spec")=
  .. cols(
  ..   State_UT = col_character(),
  ..   year = col_character(),
  ..   Primary_Only = col_double(),
  ..   Primary_with_U_Primary = col_double(),
  ..   Primary_with_U_Primary_Sec_HrSec = col_double(),
  ..   U_Primary_Only = col_double(),
  ..   U_Primary_With_Sec_HrSec = col_double(),
  ..   Primary_with_U_Primary_Sec = col_double(),
  ..   U_Primary_With_Sec = col_double(),
  ..   Sec_Only = col_double(),
  ..   Sec_with_HrSec. = col_double(),
  ..   HrSec_Only = col_double(),
  ..   `All Schools` = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 

Plotting Education Level wise percentage of schools with boys toilet in India

All_India <- filter(schools_with_boys_toilet, State_UT=="All India") %>% 
  select(year, primary, upper_primary, secondary, hr_secondary)

All_India <- pivot_longer(All_India, c(primary, upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage")
ggplot(All_India, aes(x=year, y=Percentage, fill=Education_Level)) +
  geom_bar(position = "dodge", stat = "identity") + labs(title = "Fig-24: Percentage of Schools with Boys toilet all over india") + geom_text(aes(label=Percentage), size = 4, position = position_dodge(width = .98), vjust = 1, color = "black") + theme_classic() + facet_wrap(~Education_Level)

Findings:


states <- c("Nagaland", "Karnataka", "Daman & Diu")
states_with_boys_toilet <- filter(schools_with_boys_toilet, State_UT %in% states) %>% 
  pivot_longer(c(primary,upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage") %>% 
  select(State_UT, year, Education_Level, Percentage)
  
ggplot(states_with_boys_toilet, aes(x=year, y=Percentage, fill=Education_Level)) + geom_bar(position = "dodge", stat = "identity") + facet_wrap(~State_UT) + theme_dark() + labs(title = "Fig-25: Percentage of Schools with Boys toilet", subtitle = "Daman & Diu, Karnataka, Nagaland")
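When a column is filtered against a vector of several states, membership has to be tested with `%in%`. Comparing with `==` recycles the vector along the rows, so a row is kept only when its state happens to line up with the recycled position. A small base R sketch with a hypothetical nine-row data frame (three states, three years) shows the difference:

```r
# Hypothetical frame: three rows per state, ordered as in the source files.
demo <- data.frame(
  State_UT = rep(c("Daman & Diu", "Karnataka", "Nagaland"), each = 3),
  year = rep(c("2013-14", "2014-15", "2015-16"), times = 3)
)
states <- c("Nagaland", "Karnataka", "Daman & Diu")

# `==` compares row 1 with states[1], row 2 with states[2], and so on,
# recycling `states`; most rows of the three states are dropped.
rows_with_equals <- sum(demo$State_UT == states)

# `%in%` asks, for every row, whether its state is in the set.
rows_with_in <- sum(demo$State_UT %in% states)

c(equals = rows_with_equals, in_operator = rows_with_in)
```

In this sketch `==` keeps 3 of the 9 rows while `%in%` keeps all 9, so a plot built on the `==` version would silently miss years for each state.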


States with no boys toilets in India

states_with_no_boys_toilet <- filter(schools_with_boys_toilet, State_UT != "All India", upper_primary==0, secondary==0, hr_secondary==0) %>%
  pivot_longer(c(upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage") %>% 
  select(State_UT, year, Education_Level, Percentage)

kable(states_with_no_boys_toilet, digits = 4, align = "cccc", col.names = c("State/Union Territory", "Year", "Education Level", "Percentage"), caption = "States with no boys toilet") %>%
  kable_styling(font_size = 15)
Table 4: States with no boys toilet
State/Union Territory Year Education Level Percentage
Andaman & Nicobar Islands 2013-14 upper_primary 0
Andaman & Nicobar Islands 2013-14 secondary 0
Andaman & Nicobar Islands 2013-14 hr_secondary 0
Andaman & Nicobar Islands 2015-16 upper_primary 0
Andaman & Nicobar Islands 2015-16 secondary 0
Andaman & Nicobar Islands 2015-16 hr_secondary 0
Arunachal Pradesh 2013-14 upper_primary 0
Arunachal Pradesh 2013-14 secondary 0
Arunachal Pradesh 2013-14 hr_secondary 0
Chandigarh 2013-14 upper_primary 0
Chandigarh 2013-14 secondary 0
Chandigarh 2013-14 hr_secondary 0
Chandigarh 2014-15 upper_primary 0
Chandigarh 2014-15 secondary 0
Chandigarh 2014-15 hr_secondary 0
Chandigarh 2015-16 upper_primary 0
Chandigarh 2015-16 secondary 0
Chandigarh 2015-16 hr_secondary 0
Dadra & Nagar Haveli 2013-14 upper_primary 0
Dadra & Nagar Haveli 2013-14 secondary 0
Dadra & Nagar Haveli 2013-14 hr_secondary 0

Dataset-7 | Percentage of Schools with girls toilets in India

Reading Dataset-7

schools_with_girls_toilet <- read_csv("601 Major Project/schools-with-girls-toilet.csv")
schools_with_girls_toilet <- rename(schools_with_girls_toilet, primary="Primary_Only", upper_primary="U_Primary_Only", secondary="Sec_Only", hr_secondary="HrSec_Only")
head(schools_with_girls_toilet)
# A tibble: 6 x 13
  State_UT             year  primary Primary_with_U_~ Primary_with_U_~
  <chr>                <chr>   <dbl>            <dbl>            <dbl>
1 All India            2013~    88.7             96.0             98.8
2 All India            2014~    91.2             96.9             99.5
3 All India            2015~    97.0             99.0             99.7
4 Andaman & Nicobar I~ 2013~    89.7             97.4            100  
5 Andaman & Nicobar I~ 2014~   100              100              100  
6 Andaman & Nicobar I~ 2015~   100              100              100  
# ... with 8 more variables: upper_primary <dbl>,
#   U_Primary_With_Sec_HrSec <dbl>, Primary_with_U_Primary_Sec <dbl>,
#   U_Primary_With_Sec <dbl>, secondary <dbl>, Sec_with_HrSec. <dbl>,
#   hr_secondary <dbl>, `All Schools` <dbl>

Datatype of each column

str(schools_with_girls_toilet)
spec_tbl_df [110 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ State_UT                        : chr [1:110] "All India" "All India" "All India" "Andaman & Nicobar Islands" ...
 $ year                            : chr [1:110] "2013-14" "2014-15" "2015-16" "2013-14" ...
 $ primary                         : num [1:110] 88.7 91.2 97 89.7 100 ...
 $ Primary_with_U_Primary          : num [1:110] 96 96.9 99 97.4 100 ...
 $ Primary_with_U_Primary_Sec_HrSec: num [1:110] 98.8 99.5 99.7 100 100 ...
 $ upper_primary                   : num [1:110] 91.4 91.4 96.3 0 100 ...
 $ U_Primary_With_Sec_HrSec        : num [1:110] 98.2 99.2 99.6 100 100 ...
 $ Primary_with_U_Primary_Sec      : num [1:110] 97.3 98.2 99.3 100 100 ...
 $ U_Primary_With_Sec              : num [1:110] 94.4 96.6 98.8 0 0 ...
 $ secondary                       : num [1:110] 99.1 90.3 95.2 0 0 ...
 $ Sec_with_HrSec.                 : num [1:110] 98.4 94 98.3 100 100 ...
 $ hr_secondary                    : num [1:110] 76.1 90.9 96.2 0 0 ...
 $ All Schools                     : num [1:110] 91.2 93.1 97.5 93.4 100 ...
 - attr(*, "spec")=
  .. cols(
  ..   State_UT = col_character(),
  ..   year = col_character(),
  ..   Primary_Only = col_double(),
  ..   Primary_with_U_Primary = col_double(),
  ..   Primary_with_U_Primary_Sec_HrSec = col_double(),
  ..   U_Primary_Only = col_double(),
  ..   U_Primary_With_Sec_HrSec = col_double(),
  ..   Primary_with_U_Primary_Sec = col_double(),
  ..   U_Primary_With_Sec = col_double(),
  ..   Sec_Only = col_double(),
  ..   Sec_with_HrSec. = col_double(),
  ..   HrSec_Only = col_double(),
  ..   `All Schools` = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 

Plotting Education-Wise percentage of schools with girls toilets in India

All_India <- filter(schools_with_girls_toilet, State_UT=="All India") %>% 
  select(year, primary, upper_primary, secondary, hr_secondary)

All_India <- pivot_longer(All_India, c(primary, upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage")
# Bars per year, labelled with the percentage, one panel per education level
ggplot(All_India, aes(x = year, y = Percentage, fill = Education_Level)) +
  geom_bar(position = "dodge", stat = "identity") +
  geom_text(aes(label = Percentage), size = 4, position = position_dodge(width = .98), vjust = 1, color = "black") +
  labs(title = "Fig-26: Percentage of Schools with Girls toilet all over India") +
  theme_classic() +
  facet_wrap(~Education_Level)


states <- c("Nagaland", "Karnataka", "Daman & Diu")
# %in% tests membership; == would recycle `states` row by row and silently drop matching rows
states_with_girls_toilet <- filter(schools_with_girls_toilet, State_UT %in% states) %>% 
  pivot_longer(c(primary, upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage") %>% 
  select(State_UT, year, Education_Level, Percentage)

ggplot(states_with_girls_toilet, aes(x = year, y = Percentage, fill = Education_Level)) +
  geom_bar(position = "dodge", stat = "identity") +
  facet_wrap(~State_UT) +
  theme_dark() +
  labs(title = "Fig-27: Percentage of Schools with Girls toilet", subtitle = "Daman & Diu, Karnataka, Nagaland")

States with no girls toilets in India

states_with_no_girls_toilet <- filter(schools_with_girls_toilet, State_UT != "All India", upper_primary==0, secondary==0, hr_secondary==0) %>%
  pivot_longer(c(upper_primary, secondary, hr_secondary), names_to = "Education_Level", values_to = "Percentage") %>% 
  select(State_UT, year, Education_Level, Percentage)

kable(states_with_no_girls_toilet, digits = 4, align = "cccc", col.names = c("State/Union Territory", "Year", "Education Level", "Percentage"), caption = "States with no girls toilet") %>%
  kable_styling(font_size = 15)
Table 5: States with no girls toilet
State/Union Territory Year Education Level Percentage
Andaman & Nicobar Islands 2013-14 upper_primary 0
Andaman & Nicobar Islands 2013-14 secondary 0
Andaman & Nicobar Islands 2013-14 hr_secondary 0
Andaman & Nicobar Islands 2015-16 upper_primary 0
Andaman & Nicobar Islands 2015-16 secondary 0
Andaman & Nicobar Islands 2015-16 hr_secondary 0
Chandigarh 2013-14 upper_primary 0
Chandigarh 2013-14 secondary 0
Chandigarh 2013-14 hr_secondary 0
Chandigarh 2014-15 upper_primary 0
Chandigarh 2014-15 secondary 0
Chandigarh 2014-15 hr_secondary 0
Chandigarh 2015-16 upper_primary 0
Chandigarh 2015-16 secondary 0
Chandigarh 2015-16 hr_secondary 0
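
The zeros in this table may deserve a second look. In the str() output above, Andaman & Nicobar Islands in 2013-14 shows 0 for upper_primary, secondary and hr_secondary, while the All Schools figure for the same row is 93.4. A 0 could therefore mean that no schools of that single-level category were reported, rather than that such schools lack girls toilets. This is an assumption on my part; the data source does not confirm it. A minimal sketch of such a check, using two rows copied from the output above (the names `sample_rows`, `all_schools` and `suspect` are illustrative):

```r
# Two rows copied from the str() output of the girls toilet dataset;
# `all_schools` stands in for the `All Schools` column
sample_rows <- data.frame(
  State_UT = c("All India", "Andaman & Nicobar Islands"),
  year = c("2013-14", "2013-14"),
  primary = c(88.7, 89.7),
  upper_primary = c(91.4, 0),
  secondary = c(99.1, 0),
  hr_secondary = c(76.1, 0),
  all_schools = c(91.2, 93.4),
  stringsAsFactors = FALSE
)

level_cols <- c("upper_primary", "secondary", "hr_secondary")

# TRUE where all three single-level categories are exactly 0
all_zero <- rowSums(sample_rows[level_cols] == 0) == length(level_cols)

# Rows where the single-level categories are 0 but the overall figure is not:
# candidates for "category not present" rather than "no toilets"
suspect <- sample_rows[all_zero & sample_rows$all_schools > 0, ]
suspect[, c("State_UT", "year", "all_schools")]
```

If the check flags a row, the zero percentages for that state and year are better read alongside the combined-category columns before concluding that toilets are absent.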

Conclusion:

Summary of my key Findings:

Gross Enrollment Ratio:

Dropout Ratio:

Boys:

Girls:

My analysis shows that Nagaland, Karnataka and Daman & Diu have the highest dropout rates for boys and girls at the primary and upper primary, secondary, and higher secondary levels, respectively.

Gujarat is doing well in terms of dropout percentage and, as with the Gross Enrollment Ratio, the dropout rates are more favorable for girls than for boys.

Access to basic facilities in schools (Electricity, Water, Toilets and Computers):

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Pola (2022, May 19). Data Analytics and Computational Social Science: Imapct on the Indian Education System | A deep dive data analysis on the Enrollment and Dropout student ratio. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httprpubscomniharika901864/

BibTeX citation

@misc{pola2022imapct,
  author = {Pola, Niharika},
  title = {Data Analytics and Computational Social Science: Imapct on the Indian Education System | A deep dive data analysis on the Enrollment and Dropout student ratio},
  url = {https://github.com/DACSS/dacss_course_website/posts/httprpubscomniharika901864/},
  year = {2022}
}