Reading in data and creating a post
Author

Linda Humphrey

Published

March 1, 2023

Code
library(tidyverse)
library(readxl)
library(lubridate)
library(psych)
library(DT)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to

  1. read in a data set, and describe the data using both words and any supporting information (e.g., tables, etc)
  2. provide summary statistics for different interesting groups within the data, and interpret those statistics

Read in the Data

Read in one (or more) of the following data sets, available in the posts/_data folder, using the correct R package and command.

  • railroad*.csv or StateCounty2012.xls ⭐
  • FAOstat*.csv or birds.csv ⭐⭐⭐
  • hotel_bookings.csv ⭐⭐⭐⭐
Code
# Exploring hotel_bookings data
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv") 

#View data structure

str(dataset)
spc_tbl_ [119,390 × 32] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ hotel                         : chr [1:119390] "Resort Hotel" "Resort Hotel" "Resort Hotel" "Resort Hotel" ...
 $ is_canceled                   : num [1:119390] 0 0 0 0 0 0 0 0 1 1 ...
 $ lead_time                     : num [1:119390] 342 737 7 13 14 14 0 9 85 75 ...
 $ arrival_date_year             : num [1:119390] 2015 2015 2015 2015 2015 ...
 $ arrival_date_month            : chr [1:119390] "July" "July" "July" "July" ...
 $ arrival_date_week_number      : num [1:119390] 27 27 27 27 27 27 27 27 27 27 ...
 $ arrival_date_day_of_month     : num [1:119390] 1 1 1 1 1 1 1 1 1 1 ...
 $ stays_in_weekend_nights       : num [1:119390] 0 0 0 0 0 0 0 0 0 0 ...
 $ stays_in_week_nights          : num [1:119390] 0 0 1 1 2 2 2 2 3 3 ...
 $ adults                        : num [1:119390] 2 2 1 1 2 2 2 2 2 2 ...
 $ children                      : num [1:119390] 0 0 0 0 0 0 0 0 0 0 ...
 $ babies                        : num [1:119390] 0 0 0 0 0 0 0 0 0 0 ...
 $ meal                          : chr [1:119390] "BB" "BB" "BB" "BB" ...
 $ country                       : chr [1:119390] "PRT" "PRT" "GBR" "GBR" ...
 $ market_segment                : chr [1:119390] "Direct" "Direct" "Direct" "Corporate" ...
 $ distribution_channel          : chr [1:119390] "Direct" "Direct" "Direct" "Corporate" ...
 $ is_repeated_guest             : num [1:119390] 0 0 0 0 0 0 0 0 0 0 ...
 $ previous_cancellations        : num [1:119390] 0 0 0 0 0 0 0 0 0 0 ...
 $ previous_bookings_not_canceled: num [1:119390] 0 0 0 0 0 0 0 0 0 0 ...
 $ reserved_room_type            : chr [1:119390] "C" "C" "A" "A" ...
 $ assigned_room_type            : chr [1:119390] "C" "C" "C" "A" ...
 $ booking_changes               : num [1:119390] 3 4 0 0 0 0 0 0 0 0 ...
 $ deposit_type                  : chr [1:119390] "No Deposit" "No Deposit" "No Deposit" "No Deposit" ...
 $ agent                         : chr [1:119390] "NULL" "NULL" "NULL" "304" ...
 $ company                       : chr [1:119390] "NULL" "NULL" "NULL" "NULL" ...
 $ days_in_waiting_list          : num [1:119390] 0 0 0 0 0 0 0 0 0 0 ...
 $ customer_type                 : chr [1:119390] "Transient" "Transient" "Transient" "Transient" ...
 $ adr                           : num [1:119390] 0 0 75 75 98 ...
 $ required_car_parking_spaces   : num [1:119390] 0 0 0 0 0 0 0 0 0 0 ...
 $ total_of_special_requests     : num [1:119390] 0 0 0 0 1 1 0 1 1 0 ...
 $ reservation_status            : chr [1:119390] "Check-Out" "Check-Out" "Check-Out" "Check-Out" ...
 $ reservation_status_date       : Date[1:119390], format: "2015-07-01" "2015-07-01" ...
 - attr(*, "spec")=
  .. cols(
  ..   hotel = col_character(),
  ..   is_canceled = col_double(),
  ..   lead_time = col_double(),
  ..   arrival_date_year = col_double(),
  ..   arrival_date_month = col_character(),
  ..   arrival_date_week_number = col_double(),
  ..   arrival_date_day_of_month = col_double(),
  ..   stays_in_weekend_nights = col_double(),
  ..   stays_in_week_nights = col_double(),
  ..   adults = col_double(),
  ..   children = col_double(),
  ..   babies = col_double(),
  ..   meal = col_character(),
  ..   country = col_character(),
  ..   market_segment = col_character(),
  ..   distribution_channel = col_character(),
  ..   is_repeated_guest = col_double(),
  ..   previous_cancellations = col_double(),
  ..   previous_bookings_not_canceled = col_double(),
  ..   reserved_room_type = col_character(),
  ..   assigned_room_type = col_character(),
  ..   booking_changes = col_double(),
  ..   deposit_type = col_character(),
  ..   agent = col_character(),
  ..   company = col_character(),
  ..   days_in_waiting_list = col_double(),
  ..   customer_type = col_character(),
  ..   adr = col_double(),
  ..   required_car_parking_spaces = col_double(),
  ..   total_of_special_requests = col_double(),
  ..   reservation_status = col_character(),
  ..   reservation_status_date = col_date(format = "")
  .. )
 - attr(*, "problems")=<externalptr> 

Add any comments or documentation as needed. More challenging data may require additional code chunks and documentation.

Describe the data

Using a combination of words and results of R commands, can you provide a high level description of the data? Describe as efficiently as possible where/how the data was (likely) gathered, indicate the cases and variables (both the interpretation and any details you deem useful to the reader to fully understand your chosen data).

The data record hotel bookings with arrival dates from 2015 to 2017. Each row is one booking.

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv") 
# Finding summary statistics for 'adults'

summary(dataset$adults)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   2.000   2.000   1.856   2.000  55.000 

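The data describe booking demand for two types of hotel, a resort hotel and a city hotel. The combined dataset has 119,390 observations and 32 variables, as the str() output above shows: 40,060 bookings for the Resort Hotel and 79,330 for the City Hotel. In the adults column, the median booking is for 2 adults, but the maximum of 55 and the minimum of 0 suggest some unusual or erroneous records.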
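As a quick check on these counts, the per-hotel split can be tabulated directly. The sketch below hard-codes the two group sizes quoted above rather than reading the file, and the assignment of each count to a hotel type is an assumption (it is consistent with the 66% City Hotel share reported later in this post). With the real data, count(dataset, hotel) should give the same table.

```r
# Group sizes quoted in the text; which hotel each belongs to is an assumption.
bookings <- c("Resort Hotel" = 40060, "City Hotel" = 79330)

total <- sum(bookings)               # total number of bookings
share <- round(bookings / total, 3)  # proportion of bookings per hotel

total
share
```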

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv") 
# Calculating the summary of the dataset

summary(dataset)
    hotel            is_canceled       lead_time   arrival_date_year
 Length:119390      Min.   :0.0000   Min.   :  0   Min.   :2015     
 Class :character   1st Qu.:0.0000   1st Qu.: 18   1st Qu.:2016     
 Mode  :character   Median :0.0000   Median : 69   Median :2016     
                    Mean   :0.3704   Mean   :104   Mean   :2016     
                    3rd Qu.:1.0000   3rd Qu.:160   3rd Qu.:2017     
                    Max.   :1.0000   Max.   :737   Max.   :2017     
                                                                    
 arrival_date_month arrival_date_week_number arrival_date_day_of_month
 Length:119390      Min.   : 1.00            Min.   : 1.0             
 Class :character   1st Qu.:16.00            1st Qu.: 8.0             
 Mode  :character   Median :28.00            Median :16.0             
                    Mean   :27.17            Mean   :15.8             
                    3rd Qu.:38.00            3rd Qu.:23.0             
                    Max.   :53.00            Max.   :31.0             
                                                                      
 stays_in_weekend_nights stays_in_week_nights     adults      
 Min.   : 0.0000         Min.   : 0.0         Min.   : 0.000  
 1st Qu.: 0.0000         1st Qu.: 1.0         1st Qu.: 2.000  
 Median : 1.0000         Median : 2.0         Median : 2.000  
 Mean   : 0.9276         Mean   : 2.5         Mean   : 1.856  
 3rd Qu.: 2.0000         3rd Qu.: 3.0         3rd Qu.: 2.000  
 Max.   :19.0000         Max.   :50.0         Max.   :55.000  
                                                              
    children           babies              meal             country         
 Min.   : 0.0000   Min.   : 0.000000   Length:119390      Length:119390     
 1st Qu.: 0.0000   1st Qu.: 0.000000   Class :character   Class :character  
 Median : 0.0000   Median : 0.000000   Mode  :character   Mode  :character  
 Mean   : 0.1039   Mean   : 0.007949                                        
 3rd Qu.: 0.0000   3rd Qu.: 0.000000                                        
 Max.   :10.0000   Max.   :10.000000                                        
 NA's   :4                                                                  
 market_segment     distribution_channel is_repeated_guest
 Length:119390      Length:119390        Min.   :0.00000  
 Class :character   Class :character     1st Qu.:0.00000  
 Mode  :character   Mode  :character     Median :0.00000  
                                         Mean   :0.03191  
                                         3rd Qu.:0.00000  
                                         Max.   :1.00000  
                                                          
 previous_cancellations previous_bookings_not_canceled reserved_room_type
 Min.   : 0.00000       Min.   : 0.0000                Length:119390     
 1st Qu.: 0.00000       1st Qu.: 0.0000                Class :character  
 Median : 0.00000       Median : 0.0000                Mode  :character  
 Mean   : 0.08712       Mean   : 0.1371                                  
 3rd Qu.: 0.00000       3rd Qu.: 0.0000                                  
 Max.   :26.00000       Max.   :72.0000                                  
                                                                         
 assigned_room_type booking_changes   deposit_type          agent          
 Length:119390      Min.   : 0.0000   Length:119390      Length:119390     
 Class :character   1st Qu.: 0.0000   Class :character   Class :character  
 Mode  :character   Median : 0.0000   Mode  :character   Mode  :character  
                    Mean   : 0.2211                                        
                    3rd Qu.: 0.0000                                        
                    Max.   :21.0000                                        
                                                                           
   company          days_in_waiting_list customer_type           adr         
 Length:119390      Min.   :  0.000      Length:119390      Min.   :  -6.38  
 Class :character   1st Qu.:  0.000      Class :character   1st Qu.:  69.29  
 Mode  :character   Median :  0.000      Mode  :character   Median :  94.58  
                    Mean   :  2.321                         Mean   : 101.83  
                    3rd Qu.:  0.000                         3rd Qu.: 126.00  
                    Max.   :391.000                         Max.   :5400.00  
                                                                             
 required_car_parking_spaces total_of_special_requests reservation_status
 Min.   :0.00000             Min.   :0.0000            Length:119390     
 1st Qu.:0.00000             1st Qu.:0.0000            Class :character  
 Median :0.00000             Median :0.0000            Mode  :character  
 Mean   :0.06252             Mean   :0.5714                              
 3rd Qu.:0.00000             3rd Qu.:1.0000                              
 Max.   :8.00000             Max.   :5.0000                              
                                                                         
 reservation_status_date
 Min.   :2014-10-17     
 1st Qu.:2016-02-01     
 Median :2016-08-07     
 Mean   :2016-07-30     
 3rd Qu.:2017-02-08     
 Max.   :2017-09-14     
                        

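The output above summarizes every column of the hotel_bookings dataset. About 37% of bookings were canceled (the mean of is_canceled is 0.3704), the median lead time is 69 days, and the average daily rate (adr) ranges from -6.38 to 5400, so both ends of that range deserve a closer look before any revenue analysis.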
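The summary shows a negative minimum and a very large maximum for adr, and a maximum of 55 for adults. A minimal sketch of how such rows could be flagged for inspection follows. The data frame toy and the cut-offs are illustrative assumptions, not part of the original analysis.

```r
# Toy rows standing in for the real data; values are chosen for illustration.
toy <- data.frame(
  adults = c(2, 0, 55, 1),
  adr    = c(98.0, -6.38, 75.0, 5400.0)
)

# Flag rows with a negative rate, a very high rate, or an unusual party size.
# The cut-offs (1000 and 10) are arbitrary assumptions, not values from the post.
toy$suspect <- toy$adr < 0 | toy$adr > 1000 | toy$adults == 0 | toy$adults > 10

toy[toy$suspect, ]
```

Flagging rather than deleting keeps the decision about what counts as an error separate from the code that finds candidates.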

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv") 
# Calculating null value count for all columns

colSums(is.na(dataset))
                         hotel                    is_canceled 
                             0                              0 
                     lead_time              arrival_date_year 
                             0                              0 
            arrival_date_month       arrival_date_week_number 
                             0                              0 
     arrival_date_day_of_month        stays_in_weekend_nights 
                             0                              0 
          stays_in_week_nights                         adults 
                             0                              0 
                      children                         babies 
                             4                              0 
                          meal                        country 
                             0                              0 
                market_segment           distribution_channel 
                             0                              0 
             is_repeated_guest         previous_cancellations 
                             0                              0 
previous_bookings_not_canceled             reserved_room_type 
                             0                              0 
            assigned_room_type                booking_changes 
                             0                              0 
                  deposit_type                          agent 
                             0                              0 
                       company           days_in_waiting_list 
                             0                              0 
                 customer_type                            adr 
                             0                              0 
   required_car_parking_spaces      total_of_special_requests 
                             0                              0 
            reservation_status        reservation_status_date 
                             0                              0 

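The output above counts the missing (NA) values in each column. Only children has any, with 4. Note that agent, company, and country use the text "NULL" for unknown values; read_csv() reads these as ordinary strings, so they are not counted here.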
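colSums(is.na()) only counts true NA values. In the str() output earlier, agent and company hold the text "NULL", which read_csv() treats as an ordinary string. A minimal sketch, assuming those strings mark missing values, passes them to the na argument of read_csv(). The two-row CSV here is made up for illustration.

```r
library(readr)

# Made-up two-row CSV that mimics the "NULL" strings seen in agent and company.
csv_text <- "agent,company\nNULL,110\n304,NULL\n"

# Treat the text "NULL" as missing, in addition to readr's defaults.
toy <- read_csv(I(csv_text), na = c("", "NA", "NULL"), show_col_types = FALSE)

colSums(is.na(toy))
```

With the missing markers declared at read time, both columns can also be parsed as numbers instead of text.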

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv") 
# Generating a table for the first 100 rows in the dataset.

knitr::kable(head(dataset,n = 100), "pandoc")
hotel is_canceled lead_time arrival_date_year arrival_date_month arrival_date_week_number arrival_date_day_of_month stays_in_weekend_nights stays_in_week_nights adults children babies meal country market_segment distribution_channel is_repeated_guest previous_cancellations previous_bookings_not_canceled reserved_room_type assigned_room_type booking_changes deposit_type agent company days_in_waiting_list customer_type adr required_car_parking_spaces total_of_special_requests reservation_status reservation_status_date
Resort Hotel 0 342 2015 July 27 1 0 0 2 0 0 BB PRT Direct Direct 0 0 0 C C 3 No Deposit NULL NULL 0 Transient 0.00 0 0 Check-Out 2015-07-01
Resort Hotel 0 737 2015 July 27 1 0 0 2 0 0 BB PRT Direct Direct 0 0 0 C C 4 No Deposit NULL NULL 0 Transient 0.00 0 0 Check-Out 2015-07-01
Resort Hotel 0 7 2015 July 27 1 0 1 1 0 0 BB GBR Direct Direct 0 0 0 A C 0 No Deposit NULL NULL 0 Transient 75.00 0 0 Check-Out 2015-07-02
Resort Hotel 0 13 2015 July 27 1 0 1 1 0 0 BB GBR Corporate Corporate 0 0 0 A A 0 No Deposit 304 NULL 0 Transient 75.00 0 0 Check-Out 2015-07-02
Resort Hotel 0 14 2015 July 27 1 0 2 2 0 0 BB GBR Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 98.00 0 1 Check-Out 2015-07-03
Resort Hotel 0 14 2015 July 27 1 0 2 2 0 0 BB GBR Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 98.00 0 1 Check-Out 2015-07-03
Resort Hotel 0 0 2015 July 27 1 0 2 2 0 0 BB PRT Direct Direct 0 0 0 C C 0 No Deposit NULL NULL 0 Transient 107.00 0 0 Check-Out 2015-07-03
Resort Hotel 0 9 2015 July 27 1 0 2 2 0 0 FB PRT Direct Direct 0 0 0 C C 0 No Deposit 303 NULL 0 Transient 103.00 0 1 Check-Out 2015-07-03
Resort Hotel 1 85 2015 July 27 1 0 3 2 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 82.00 0 1 Canceled 2015-05-06
Resort Hotel 1 75 2015 July 27 1 0 3 2 0 0 HB PRT Offline TA/TO TA/TO 0 0 0 D D 0 No Deposit 15 NULL 0 Transient 105.50 0 0 Canceled 2015-04-22
Resort Hotel 1 23 2015 July 27 1 0 4 2 0 0 BB PRT Online TA TA/TO 0 0 0 E E 0 No Deposit 240 NULL 0 Transient 123.00 0 0 Canceled 2015-06-23
Resort Hotel 0 35 2015 July 27 1 0 4 2 0 0 HB PRT Online TA TA/TO 0 0 0 D D 0 No Deposit 240 NULL 0 Transient 145.00 0 0 Check-Out 2015-07-05
Resort Hotel 0 68 2015 July 27 1 0 4 2 0 0 BB USA Online TA TA/TO 0 0 0 D E 0 No Deposit 240 NULL 0 Transient 97.00 0 3 Check-Out 2015-07-05
Resort Hotel 0 18 2015 July 27 1 0 4 2 1 0 HB ESP Online TA TA/TO 0 0 0 G G 1 No Deposit 241 NULL 0 Transient 154.77 0 1 Check-Out 2015-07-05
Resort Hotel 0 37 2015 July 27 1 0 4 2 0 0 BB PRT Online TA TA/TO 0 0 0 E E 0 No Deposit 241 NULL 0 Transient 94.71 0 0 Check-Out 2015-07-05
Resort Hotel 0 68 2015 July 27 1 0 4 2 0 0 BB IRL Online TA TA/TO 0 0 0 D E 0 No Deposit 240 NULL 0 Transient 97.00 0 3 Check-Out 2015-07-05
Resort Hotel 0 37 2015 July 27 1 0 4 2 0 0 BB PRT Offline TA/TO TA/TO 0 0 0 E E 0 No Deposit 8 NULL 0 Contract 97.50 0 0 Check-Out 2015-07-05
Resort Hotel 0 12 2015 July 27 1 0 1 2 0 0 BB IRL Online TA TA/TO 0 0 0 A E 0 No Deposit 240 NULL 0 Transient 88.20 0 0 Check-Out 2015-07-02
Resort Hotel 0 0 2015 July 27 1 0 1 2 0 0 BB FRA Corporate Corporate 0 0 0 A G 0 No Deposit NULL 110 0 Transient 107.42 0 0 Check-Out 2015-07-02
Resort Hotel 0 7 2015 July 27 1 0 4 2 0 0 BB GBR Direct Direct 0 0 0 G G 0 No Deposit 250 NULL 0 Transient 153.00 0 1 Check-Out 2015-07-05
Resort Hotel 0 37 2015 July 27 1 1 4 1 0 0 BB GBR Online TA TA/TO 0 0 0 F F 0 No Deposit 241 NULL 0 Transient 97.29 0 1 Check-Out 2015-07-06
Resort Hotel 0 72 2015 July 27 1 2 4 2 0 0 BB PRT Direct Direct 0 0 0 A A 1 No Deposit 250 NULL 0 Transient 84.67 0 1 Check-Out 2015-07-07
Resort Hotel 0 72 2015 July 27 1 2 4 2 0 0 BB PRT Direct Direct 0 0 0 A A 1 No Deposit 250 NULL 0 Transient 84.67 0 1 Check-Out 2015-07-07
Resort Hotel 0 72 2015 July 27 1 2 4 2 0 0 BB PRT Direct Direct 0 0 0 D D 1 No Deposit 250 NULL 0 Transient 99.67 0 1 Check-Out 2015-07-07
Resort Hotel 0 127 2015 July 27 1 2 5 2 0 0 HB GBR Offline TA/TO TA/TO 0 0 0 D I 0 No Deposit 115 NULL 0 Contract 94.95 0 1 Check-Out 2015-07-01
Resort Hotel 0 78 2015 July 27 1 2 5 2 0 0 BB PRT Offline TA/TO TA/TO 0 0 0 D D 0 No Deposit 5 NULL 0 Transient 63.60 1 0 Check-Out 2015-07-08
Resort Hotel 0 48 2015 July 27 1 2 5 2 0 0 BB IRL Offline TA/TO TA/TO 0 0 0 D D 0 No Deposit 8 NULL 0 Contract 79.50 0 0 Check-Out 2015-07-08
Resort Hotel 1 60 2015 July 27 1 2 5 2 0 0 BB PRT Online TA TA/TO 0 0 0 E E 0 No Deposit 240 NULL 0 Transient 107.00 0 2 Canceled 2015-05-11
Resort Hotel 0 77 2015 July 27 1 2 5 2 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 94.00 0 0 Check-Out 2015-07-08
Resort Hotel 0 99 2015 July 27 1 2 5 2 0 0 BB PRT Online TA TA/TO 0 0 0 D D 0 No Deposit 240 NULL 0 Transient 87.30 1 1 Check-Out 2015-07-08
Resort Hotel 0 118 2015 July 27 1 4 10 1 0 0 BB NULL Direct Direct 0 0 0 A A 2 No Deposit NULL NULL 0 Transient 62.00 0 2 Check-Out 2015-07-15
Resort Hotel 0 95 2015 July 27 1 4 11 2 0 0 BB GBR Offline TA/TO TA/TO 0 0 0 D D 0 No Deposit 241 NULL 0 Transient 63.86 0 0 Check-Out 2015-07-16
Resort Hotel 1 96 2015 July 27 1 2 8 2 0 0 BB PRT Direct Direct 0 0 0 E E 0 No Deposit NULL NULL 0 Transient 108.30 0 2 Canceled 2015-05-29
Resort Hotel 0 69 2015 July 27 2 2 4 2 0 0 BB IRL Offline TA/TO TA/TO 0 0 0 A C 0 No Deposit 175 NULL 0 Transient 65.50 0 0 Check-Out 2015-07-08
Resort Hotel 1 45 2015 July 27 2 1 3 3 0 0 BB PRT Online TA TA/TO 0 0 0 D D 0 No Deposit 241 NULL 0 Transient 108.80 0 1 Canceled 2015-05-19
Resort Hotel 1 40 2015 July 27 2 1 3 3 0 0 BB PRT Online TA TA/TO 0 0 0 D D 0 No Deposit 241 NULL 0 Transient 108.80 0 1 Canceled 2015-06-19
Resort Hotel 0 15 2015 July 27 2 1 3 2 0 0 BB ESP Online TA TA/TO 0 0 0 A C 0 No Deposit 240 NULL 0 Transient 98.00 0 0 Check-Out 2015-07-06
Resort Hotel 0 36 2015 July 27 2 1 3 3 0 0 BB PRT Online TA TA/TO 0 0 0 D D 0 No Deposit 241 NULL 0 Transient 108.80 0 1 Check-Out 2015-07-06
Resort Hotel 1 43 2015 July 27 2 1 3 3 0 0 BB PRT Online TA TA/TO 0 0 0 D D 0 No Deposit 241 NULL 0 Transient 108.80 0 0 Canceled 2015-05-23
Resort Hotel 0 70 2015 July 27 2 2 3 2 0 0 HB ROU Direct Direct 0 0 0 E E 0 No Deposit 250 NULL 0 Transient 137.00 0 1 Check-Out 2015-07-07
Resort Hotel 1 45 2015 July 27 2 2 3 2 0 0 BB PRT Online TA TA/TO 0 0 0 G G 0 No Deposit 241 NULL 0 Transient 117.81 0 0 Canceled 2015-05-18
Resort Hotel 0 45 2015 July 27 2 2 3 2 0 0 BB IRL Offline TA/TO TA/TO 0 0 0 D D 0 No Deposit 8 NULL 0 Contract 79.50 0 0 Check-Out 2015-07-07
Resort Hotel 0 16 2015 July 27 2 2 3 2 0 0 BB ESP Direct Direct 0 0 0 F F 0 No Deposit NULL NULL 0 Transient 123.00 0 0 Check-Out 2015-07-07
Resort Hotel 0 70 2015 July 27 2 2 3 2 0 0 HB ROU Direct Direct 0 0 0 E E 0 No Deposit 250 NULL 0 Transient 137.00 0 1 Check-Out 2015-07-07
Resort Hotel 0 107 2015 July 27 2 2 5 2 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 110.70 0 2 Check-Out 2015-07-09
Resort Hotel 1 47 2015 July 27 2 2 5 2 2 0 BB PRT Online TA TA/TO 0 0 0 G G 0 No Deposit 240 NULL 0 Transient 153.00 0 0 Canceled 2015-06-02
Resort Hotel 0 96 2015 July 27 2 2 5 2 0 0 BB ESP Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 134 NULL 0 Transient 58.95 0 1 Check-Out 2015-07-09
Resort Hotel 0 113 2015 July 27 2 2 5 2 0 0 BB NOR Offline TA/TO TA/TO 0 0 0 E E 0 No Deposit 156 NULL 0 Transient-Party 82.88 0 2 Check-Out 2015-07-09
Resort Hotel 0 90 2015 July 27 2 2 5 2 0 0 HB GBR Offline TA/TO TA/TO 0 0 0 A B 1 No Deposit 243 NULL 0 Contract 82.35 0 0 Check-Out 2015-07-09
Resort Hotel 0 50 2015 July 27 2 2 5 2 0 0 HB IRL Online TA TA/TO 0 0 0 E F 1 No Deposit 241 NULL 0 Transient 119.35 0 1 Check-Out 2015-07-09
Resort Hotel 0 113 2015 July 27 2 2 5 2 0 0 BB NOR Offline TA/TO TA/TO 0 0 0 D D 0 No Deposit 156 NULL 0 Transient-Party 67.58 0 2 Check-Out 2015-07-09
Resort Hotel 0 93 2015 July 27 2 3 8 2 0 0 BB IRL Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 156 NULL 0 Contract 56.01 0 0 Check-Out 2015-07-13
Resort Hotel 0 76 2015 July 27 2 4 10 2 0 0 BB OMN Offline TA/TO TA/TO 0 0 0 D D 0 No Deposit 243 NULL 0 Contract 110.70 0 0 Check-Out 2015-07-16
Resort Hotel 0 3 2015 July 27 2 0 1 2 0 0 BB ESP Online TA TA/TO 0 0 0 A C 0 No Deposit 240 NULL 0 Transient 88.20 1 0 Check-Out 2015-07-03
Resort Hotel 0 1 2015 July 27 2 0 1 2 0 0 BB ARG Online TA TA/TO 0 0 0 H H 0 No Deposit 240 NULL 0 Transient 147.00 1 0 Check-Out 2015-07-03
Resort Hotel 0 1 2015 July 27 2 0 1 2 2 0 BB ESP Direct Direct 0 0 0 C C 0 No Deposit NULL NULL 0 Transient 107.00 1 2 Check-Out 2015-07-03
Resort Hotel 0 0 2015 July 27 2 0 1 2 0 0 BB PRT Direct Direct 0 0 0 H H 0 No Deposit NULL NULL 0 Transient 147.00 0 0 Check-Out 2015-07-03
Resort Hotel 0 0 2015 July 27 2 0 1 2 0 0 BB PRT Online TA TA/TO 0 0 0 A D 0 No Deposit 240 NULL 0 Transient 117.90 0 2 Check-Out 2015-07-03
Resort Hotel 0 0 2015 July 27 2 0 1 2 0 0 BB PRT Direct Direct 0 0 0 G G 0 No Deposit NULL NULL 0 Transient 123.00 0 0 Check-Out 2015-07-03
Resort Hotel 0 14 2015 July 27 2 0 2 2 0 0 BB USA Online TA TA/TO 0 0 0 A C 0 No Deposit 242 NULL 0 Transient 98.00 0 1 Check-Out 2015-07-04
Resort Hotel 0 10 2015 July 27 2 0 2 2 0 0 BB PRT Online TA TA/TO 0 0 0 G G 0 No Deposit 241 NULL 0 Transient 117.81 0 0 Check-Out 2015-07-04
Resort Hotel 0 5 2015 July 27 2 0 2 2 0 0 BB IRL Online TA TA/TO 0 0 0 E E 0 No Deposit 240 NULL 0 Transient 135.00 1 2 Check-Out 2015-07-04
Resort Hotel 0 17 2015 July 27 2 0 3 2 0 0 BB ESP Direct Direct 0 0 0 F F 0 No Deposit 250 NULL 0 Transient 133.00 0 1 Check-Out 2015-07-05
Resort Hotel 0 93 2015 July 27 2 0 3 2 0 0 BB IRL Offline TA/TO TA/TO 0 0 0 A C 0 No Deposit 115 NULL 0 Contract 58.95 0 0 Check-Out 2015-07-05
Resort Hotel 1 3 2015 July 27 2 0 3 2 0 0 HB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 136.33 0 2 Canceled 2015-06-29
Resort Hotel 0 10 2015 July 27 3 0 2 2 2 0 BB USA Online TA TA/TO 0 0 0 G H 0 No Deposit 240 NULL 0 Transient 153.00 1 0 Check-Out 2015-07-05
Resort Hotel 0 3 2015 July 27 3 0 2 2 0 0 BB ESP Online TA TA/TO 0 0 0 A C 0 No Deposit 240 NULL 0 Transient 110.50 0 0 Check-Out 2015-07-05
Resort Hotel 0 51 2015 July 27 3 0 2 2 0 0 BB POL Online TA TA/TO 0 0 0 D D 0 No Deposit 242 NULL 0 Transient 97.00 0 0 Check-Out 2015-07-05
Resort Hotel 1 71 2015 July 27 3 0 2 3 0 0 BB PRT Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 242 NULL 0 Transient 110.30 0 2 Canceled 2015-06-16
Resort Hotel 1 63 2015 July 27 3 0 2 2 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 82.00 0 2 Canceled 2015-06-18
Resort Hotel 1 62 2015 July 27 3 0 2 2 0 0 BB PRT Online TA TA/TO 0 0 0 D D 0 No Deposit 240 NULL 0 Transient 97.00 0 1 Canceled 2015-07-03
Resort Hotel 1 101 2015 July 27 3 0 2 2 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 73.80 0 1 Canceled 2015-06-12
Resort Hotel 0 2 2015 July 27 3 0 2 2 0 0 HB PRT Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 3 NULL 0 Transient 91.50 0 0 Check-Out 2015-07-05
Resort Hotel 0 15 2015 July 27 3 0 2 2 0 0 BB ESP Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 114.50 0 0 Check-Out 2015-07-05
Resort Hotel 1 51 2015 July 27 3 0 2 3 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 242 NULL 0 Transient 110.30 0 0 Canceled 2015-06-09
Resort Hotel 0 3 2015 July 27 3 1 2 2 0 0 BB ESP Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 90.90 1 0 Check-Out 2015-07-06
Resort Hotel 1 48 2015 July 27 3 1 2 2 0 0 BB PRT Online TA TA/TO 0 0 0 E E 0 No Deposit 240 NULL 0 Transient 123.00 0 0 Canceled 2015-05-26
Resort Hotel 0 2 2015 July 27 3 2 2 1 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 122.00 0 0 Check-Out 2015-07-07
Resort Hotel 0 72 2015 July 27 3 2 2 2 0 0 BB DEU Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 105 NULL 0 Transient 110.70 1 0 Check-Out 2015-07-07
Resort Hotel 0 81 2015 July 27 3 2 6 3 0 0 BB PRT Offline TA/TO TA/TO 0 0 0 D D 0 No Deposit 5 NULL 0 Transient 85.86 0 0 Check-Out 2015-07-11
Resort Hotel 0 99 2015 July 27 3 2 7 2 0 0 BB FRA Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 40 NULL 0 Contract 58.95 0 0 Check-Out 2015-07-12
Resort Hotel 1 368 2015 July 27 3 3 7 2 0 0 BB PRT Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 40 NULL 0 Contract 55.68 0 0 Canceled 2015-05-19
Resort Hotel 0 364 2015 July 27 3 3 7 2 0 0 BB GBR Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 40 NULL 0 Contract 55.68 0 0 Check-Out 2015-07-13
Resort Hotel 1 81 2015 July 27 3 3 7 2 0 0 HB PRT Direct Direct 0 0 0 A A 2 No Deposit 250 NULL 0 Transient 124.00 0 1 Canceled 2015-06-09
Resort Hotel 0 99 2015 July 27 3 3 7 2 0 0 HB GBR Offline TA/TO TA/TO 0 0 0 E E 0 No Deposit 115 NULL 0 Contract 111.15 0 0 Check-Out 2015-07-13
Resort Hotel 0 324 2015 July 27 3 4 10 2 0 0 HB GBR Offline TA/TO TA/TO 0 0 0 E E 0 No Deposit 40 NULL 0 Contract 134.73 0 0 Check-Out 2015-07-17
Resort Hotel 0 69 2015 July 27 3 4 10 2 0 0 BB GBR Online TA TA/TO 0 0 0 F F 1 No Deposit 241 NULL 0 Transient 92.45 0 1 Check-Out 2015-07-17
Resort Hotel 1 79 2015 July 27 3 6 15 2 1 0 BB PRT Offline TA/TO TA/TO 0 0 0 A A 0 No Deposit 242 NULL 0 Transient 108.73 0 2 Canceled 2015-04-15
Resort Hotel 0 12 2015 July 27 3 0 1 2 0 0 BB GBR Online TA TA/TO 0 0 0 A C 0 No Deposit 240 NULL 0 Transient 73.80 0 0 Check-Out 2015-07-04
Resort Hotel 0 9 2015 July 27 3 0 1 2 0 0 BB PRT Online TA TA/TO 0 0 0 A C 0 No Deposit 240 NULL 0 Transient 98.00 1 2 Check-Out 2015-07-04
Resort Hotel 0 1 2015 July 27 3 0 1 2 0 0 BB ESP Online TA TA/TO 0 0 0 A C 0 No Deposit 240 NULL 0 Transient 131.00 0 1 Check-Out 2015-07-04
Resort Hotel 0 21 2015 July 27 3 0 1 2 0 0 BB PRT Online TA TA/TO 0 0 0 E E 0 No Deposit 240 NULL 0 Transient 123.00 0 0 Check-Out 2015-07-04
Resort Hotel 0 9 2015 July 27 3 0 1 2 0 0 BB USA Online TA TA/TO 0 0 0 C C 0 No Deposit 241 NULL 0 Transient 94.71 0 0 Check-Out 2015-07-04
Resort Hotel 0 109 2015 July 27 3 0 1 2 0 0 BB BEL Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 123.00 0 2 Check-Out 2015-07-04
Resort Hotel 1 109 2015 July 27 3 0 2 2 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 123.00 0 1 Canceled 2015-05-26
Resort Hotel 1 72 2015 July 27 3 0 2 2 0 0 BB PRT Online TA TA/TO 0 0 0 A A 0 No Deposit 240 NULL 0 Transient 73.80 0 1 Canceled 2015-06-29
Resort Hotel 1 63 2015 July 27 3 2 5 2 0 0 BB PRT Online TA TA/TO 0 0 0 F F 0 No Deposit 242 NULL 0 Transient 117.00 0 1 Canceled 2015-05-13
Resort Hotel 0 63 2015 July 27 3 2 5 3 0 0 HB ESP Offline TA/TO TA/TO 0 0 0 E E 0 No Deposit 105 NULL 0 Transient 196.54 0 1 Check-Out 2015-07-10
Resort Hotel 0 101 2015 July 27 3 2 5 2 1 0 BB PRT Online TA TA/TO 0 0 0 D D 0 No Deposit 240 NULL 0 Transient 99.30 1 2 Check-Out 2015-07-10
Resort Hotel 0 102 2015 July 27 3 2 5 2 0 0 BB DEU Direct Direct 0 0 0 E E 0 No Deposit 250 NULL 0 Transient 90.95 0 0 Check-Out 2015-07-10

The above table shows the first 100 rows in the dataset.

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv") 
# Finding summary statistics for 'children'

summary(dataset$children)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
 0.0000  0.0000  0.0000  0.1039  0.0000 10.0000       4 

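The output above shows the summary statistics for the children column. Most bookings include no children: the median and third quartile are both 0, the mean is about 0.10, the maximum is 10, and 4 values are missing.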
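summary() reports the NA count separately, but functions such as mean() return NA when any value is missing unless na.rm = TRUE is set. A small sketch with made-up values:

```r
# Made-up values with one missing entry, mimicking the NA's in `children`.
children <- c(0, 0, 1, NA, 2)

mean(children)                # NA: the missing value propagates
mean(children, na.rm = TRUE)  # 0.75: mean of the four observed values
sum(is.na(children))          # 1: number of missing values
```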

Provide Grouped Summary Statistics

Conduct some exploratory data analysis, using dplyr commands such as group_by(), select(), filter(), and summarise(). Find the central tendency (mean, median, mode) and dispersion (standard deviation, min/max/quantile) for different subgroups within the data set.
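As a rough illustration of the pattern this prompt describes, the sketch below computes central tendency and dispersion per group with group_by() and summarise(). The tibble toy is made up, and stat_mode() is a hypothetical helper, since base R has no built-in function for the statistical mode.

```r
library(dplyr)

# Made-up bookings standing in for hotel_bookings.csv.
toy <- tibble(
  hotel     = c("City Hotel", "City Hotel", "City Hotel",
                "Resort Hotel", "Resort Hotel", "Resort Hotel"),
  lead_time = c(10, 20, 20, 5, 50, 95)
)

# Hypothetical helper: most frequent value; ties go to the smallest value.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

lead_by_hotel <- toy %>%
  group_by(hotel) %>%
  summarise(
    mean_lead   = mean(lead_time),
    median_lead = median(lead_time),
    mode_lead   = stat_mode(lead_time),
    sd_lead     = sd(lead_time),
    min_lead    = min(lead_time),
    max_lead    = max(lead_time),
    q75_lead    = quantile(lead_time, 0.75, names = FALSE)
  )

lead_by_hotel
```

On the real data, the same pipeline would start from dataset instead of toy, with any numeric column in place of lead_time.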

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv") 
# To show the distribution of the data
multi.hist(dataset[,sapply(dataset, is.numeric)])

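The histograms above suggest that 2016 has the most bookings and that arrivals are concentrated in the second half of the year, with August the busiest month. Because arrival_date_month is a character column, it is excluded from these histograms; the seasonal pattern is read from arrival_date_week_number.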
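Since multi.hist() only plots numeric columns, the month column does not appear in the histograms. One way to check a monthly claim directly is to tabulate the months in calendar order. The sketch below uses made-up values in place of dataset$arrival_date_month.

```r
# Made-up arrival months standing in for dataset$arrival_date_month.
arrival_month <- c("July", "August", "August", "January", "August", "July")

# month.name is a built-in vector of English month names in calendar order.
arrival_month <- factor(arrival_month, levels = month.name)

bookings_per_month <- table(arrival_month)
bookings_per_month

# Busiest month in the toy data
names(which.max(bookings_per_month))
```

Setting the factor levels explicitly keeps the table (and any later plot) in calendar order rather than alphabetical order.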

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv")
# Room Summary
room_summary <- dataset %>%
  filter(is_canceled == 0) %>% 
  group_by(reserved_room_type) %>%
  summarize(room_count = n()) %>% 
  arrange(-room_count)

knitr::kable(room_summary)
reserved_room_type   room_count
A                         52364
D                         13099
E                          4621
F                          2017
G                          1331
B                           750
C                           624
H                           356
L                             4

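Among non-canceled bookings, A is by far the most frequently reserved room type (52,364 bookings, about four times as many as type D), so the corporation may want to consider increasing its number of type A rooms.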
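To put the counts in the table above in perspective, they can be converted to shares of all non-canceled bookings. This sketch copies the counts from the table instead of recomputing them from the file.

```r
# Non-canceled bookings per reserved room type, copied from the table above.
room_count <- c(A = 52364, D = 13099, E = 4621, F = 2017, G = 1331,
                B = 750, C = 624, H = 356, L = 4)

# Share of all non-canceled bookings, as a percentage rounded to one decimal.
room_share <- round(100 * room_count / sum(room_count), 1)
room_share
```

On these counts, type A accounts for roughly 70% of non-canceled bookings.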

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv")
ggplot(dataset,aes(reserved_room_type,fill = (hotel))) +
  geom_bar(position = 'dodge') +
  ylab("Number Of Bookings") +
  xlab("Room Type") +
  ggtitle("Room Type Preferred") +
  labs(fill = 'Hotel Type')

The City Hotel generates the majority of bookings, about 66% of all reservations (79,330 of 119,390) and roughly twice as many as the Resort Hotel, so the corporation may want to design its strategies accordingly.

Code
library(readr)
dataset<- read_csv("~/Desktop/Dacss601_spring2023/posts/_data/hotel_bookings.csv")

# country with the most Guests
data_country <- dataset %>%
  group_by(country) %>%
  summarise(booking_count = n()) %>%
  arrange(desc(booking_count))

# Plot the ten countries with the most bookings
top_n(data_country, 10, booking_count) %>%
  ggplot(aes(country, booking_count)) +
  geom_bar(stat = "identity", width = 0.25, fill = "blue")

The chart shows the ten countries with the most bookings. PRT (Portugal) has the highest number of bookings. Note that these counts include canceled bookings.
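By default ggplot2 places a character x-axis in alphabetical order, so the tallest bar is not necessarily first. A minimal sketch of ordering the bars by count with reorder() follows. The counts in toy are made up; the real values would come from data_country.

```r
library(ggplot2)

# Made-up counts; the real values come from data_country.
toy <- data.frame(
  country       = c("ESP", "FRA", "GBR", "PRT"),
  booking_count = c(30, 40, 50, 100)
)

# reorder() sorts the factor levels by booking_count; the minus sign puts the
# largest count first.
toy$country <- reorder(toy$country, -toy$booking_count)

p <- ggplot(toy, aes(country, booking_count)) +
  geom_col(fill = "blue") +
  labs(x = "Country (ISO code)", y = "Number of bookings")
p
```

geom_col() is equivalent to geom_bar(stat = "identity") and is used here only for brevity.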

Explain and Interpret

Be sure to explain why you choose a specific group. Comment on the interpretation of any interesting differences between groups that you uncover. This section can be integrated with the exploratory data analysis, just be sure it is included.

The hotels’ ultimate goal is to increase their earnings, so they want to understand and focus on whatever drives bookings. The most important findings are that the City Hotel receives the majority of reservations (and, by extension, likely most of the revenue, although revenue was not computed here), that guests from PRT make the most bookings, and that A is the most popular room type. These findings can help the hotels plan for guest volume, make the necessary preparations, and target their marketing.