HW#2 - Getting familiar with the US public schools data set

Using the public school characteristics data set collected in the 2017-2018 school year

Brittany Kenison
09-16-2021

Introduction

Let’s get familiar with the data set by looking at the first few records:

# A tibble: 6 x 79
      X     Y OBJECTID NCESSCH  NMCNTY   SURVYEAR STABR LEAID ST_LEAID
  <dbl> <dbl>    <dbl> <chr>    <chr>    <chr>    <chr> <chr> <chr>   
1 -149.  61.6        1 0200510~ Matanus~ 2017-20~ AK    0200~ AK-33   
2 -157.  71.3        2 0200610~ North S~ 2017-20~ AK    0200~ AK-36   
3 -151.  60.5        3 0200390~ Kenai P~ 2017-20~ AK    0200~ AK-24   
4 -151.  60.6        4 0200390~ Kenai P~ 2017-20~ AK    0200~ AK-24   
5 -151.  60.6        5 0200390~ Kenai P~ 2017-20~ AK    0200~ AK-24   
6 -133.  56.1        6 0200700~ Prince ~ 2017-20~ AK    0200~ AK-44   
# ... with 70 more variables: LEA_NAME <chr>, SCH_NAME <chr>,
#   LSTREET1 <chr>, LSTREET2 <chr>, LSTREET3 <lgl>, LCITY <chr>,
#   LSTATE <chr>, LZIP <chr>, LZIP4 <chr>, PHONE <chr>, GSLO <chr>,
#   GSHI <chr>, VIRTUAL <chr>, TOTFRL <dbl>, FRELCH <dbl>,
#   REDLCH <dbl>, PK <dbl>, KG <dbl>, G01 <dbl>, G02 <dbl>,
#   G03 <dbl>, G04 <dbl>, G05 <dbl>, G06 <dbl>, G07 <dbl>, G08 <dbl>,
#   G09 <dbl>, G10 <dbl>, G11 <dbl>, G12 <dbl>, G13 <lgl>, ...
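
The preview above has the shape of what `head()` prints for a tibble. The post does not show the code or the file it read, so the following is only a minimal sketch: `schools` is a hypothetical name, and the small data frame stands in for the real data, using a few values as they appear (rounded) in the preview.

```r
# Stand-in data: three rows and five of the 79 columns, with values copied
# from the rounded preview above. The real file and its name are not shown
# in the post, so `schools` is a hypothetical object.
schools <- data.frame(
  X = c(-149, -157, -151),
  Y = c(61.6, 71.3, 60.5),
  OBJECTID = c(1, 2, 3),
  STABR = c("AK", "AK", "AK"),
  ST_LEAID = c("AK-33", "AK-36", "AK-24"),
  stringsAsFactors = FALSE
)

# head() returns the first six rows by default; `n` changes how many.
first_rows <- head(schools, n = 2)
print(first_rows)
```

On the full data set, calling `head()` with no `n` argument would give the six rows shown in the preview.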

Rows and Columns

Hmm. That looks like a lot of data. How many rows and columns are here?

[1] 100729     79
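
The two numbers above are in the order `dim()` reports them: rows first, then columns. As a rough sketch, again using a hypothetical stand-in data frame rather than the real file:

```r
# Stand-in data frame (hypothetical): six rows, two columns.
schools <- data.frame(
  OBJECTID = 1:6,
  STABR = rep("AK", 6),
  stringsAsFactors = FALSE
)

# dim() returns c(number of rows, number of columns).
shape <- dim(schools)
print(shape)

# nrow() and ncol() return each count separately.
n_rows <- nrow(schools)
n_cols <- ncol(schools)
cat("rows:", n_rows, "columns:", n_cols, "\n")
```

Applied to the data set in this post, the same call reports 100729 rows and 79 columns.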

Column Names

Finally, the column names may help us understand the types of information collected.

 [1] "X"                "Y"                "OBJECTID"        
 [4] "NCESSCH"          "NMCNTY"           "SURVYEAR"        
 [7] "STABR"            "LEAID"            "ST_LEAID"        
[10] "LEA_NAME"         "SCH_NAME"         "LSTREET1"        
[13] "LSTREET2"         "LSTREET3"         "LCITY"           
[16] "LSTATE"           "LZIP"             "LZIP4"           
[19] "PHONE"            "GSLO"             "GSHI"            
[22] "VIRTUAL"          "TOTFRL"           "FRELCH"          
[25] "REDLCH"           "PK"               "KG"              
[28] "G01"              "G02"              "G03"             
[31] "G04"              "G05"              "G06"             
[34] "G07"              "G08"              "G09"             
[37] "G10"              "G11"              "G12"             
[40] "G13"              "TOTAL"            "MEMBER"          
[43] "AM"               "HI"               "BL"              
[46] "WH"               "HP"               "TR"              
[49] "FTE"              "LATCOD"           "LONCOD"          
[52] "ULOCALE"          "STUTERATIO"       "STITLEI"         
[55] "AMALM"            "AMALF"            "ASALM"           
[58] "ASALF"            "HIALM"            "HIALF"           
[61] "BLALM"            "BLALF"            "WHALM"           
[64] "WHALF"            "HPALM"            "HPALF"           
[67] "TRALM"            "TRALF"            "TOTMENROL"       
[70] "TOTFENROL"        "STATUS"           "UG"              
[73] "AE"               "SCHOOL_TYPE_TEXT" "SY_STATUS_TEXT"  
[76] "SCHOOL_LEVEL"     "AS"               "CHARTER_TEXT"    
[79] "MAGNET_TEXT"     
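
A listing like the one above is what `colnames()` returns. With 79 names, it can also help to filter them by pattern. The sketch below assumes a hypothetical stand-in data frame carrying a handful of the real column names, and picks out the columns named `G` followed by two digits (`G01`, `G02`, and so on).

```r
# Stand-in data frame (hypothetical) with a few of the column names
# listed above; the values are placeholders, not real data.
schools <- data.frame(
  GSLO = "PK", GSHI = "12", PK = 0, KG = 0,
  G01 = 0, G02 = 0, TOTAL = 0,
  stringsAsFactors = FALSE
)

# colnames() returns the column names as a character vector.
all_names <- colnames(schools)
print(all_names)

# Keep only names that are "G" followed by exactly two digits.
grade_cols <- grep("^G[0-9]{2}$", all_names, value = TRUE)
print(grade_cols)
```

The pattern deliberately excludes `GSLO` and `GSHI`, which start with `G` but do not end in two digits.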

The End

Now that we are more familiar with the data set, the next blog post will start to wrangle the data for our eventual analysis.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Kenison (2021, Sept. 16). DACSS 601 Fall 2021: HW#2 - Getting familiar with the US public schools data set. Retrieved from https://mrolfe.github.io/DACSS601Fall21/posts/2021-09-15-examining-characteristics-of-public-schools/

BibTeX citation

@misc{kenison2021hw#2,
  author = {Kenison, Brittany},
  title = {DACSS 601 Fall 2021: HW#2 - Getting familiar with the US public schools data set},
  url = {https://mrolfe.github.io/DACSS601Fall21/posts/2021-09-15-examining-characteristics-of-public-schools/},
  year = {2021}
}