Final Project Proposal

finalpart1
Niyati Sharma
Initial proposal for my final project
Author: Niyati Sharma

Published: October 11, 2022

Code
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.4.1 
✔ readr   2.1.2      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Code
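# dplyr and ggplot2 are already attached by library(tidyverse);
# the explicit calls are harmless and kept for clarity
library(dplyr)
library(ggplot2)

# Show the code of every chunk alongside its output
knitr::opts_chunk$set(echo = TRUE)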

Introduction

Credit risk is the risk of loss resulting from a borrower’s failure to repay the principal and interest owed to the lender. The purpose of credit analysis is to determine the creditworthiness of borrowers by measuring the risk of loss to which the lender is exposed. When calculating the credit risk of a particular borrower, lenders analyze documents such as the borrower’s income statement, balance sheet, and credit reports, which reveal the borrower’s financial situation. They evaluate the characteristics of the borrower and the conditions of the loan to estimate the probability of default and the subsequent risk of financial loss.

Research Question

Q1. How does credit risk depend on the age of the borrower?

Q2. Which factor has the dominant influence on credit risk?

Q3. Does credit risk depend on loan_intent?

Hypothesis

According to research, when assessing the credit risk of a particular borrower, lenders consider factors that reflect the borrower’s capacity to repay, including income, character, home ownership, and credit history. I will check the relationship of age and income with credit risk using a new dataset.
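As a rough sketch of how Q1 and Q3 could be explored, the default rate can be compared across age bands and across loan intents. The toy rows below are made up for illustration; only the column names come from the dataset. The age-band edges are arbitrary, and reading loan_status as 1 = default, 0 = no default is an assumption.

```r
library(dplyr)

# Toy data for illustration only: the values are made up,
# the column names follow the dataset
toy <- tibble(
  person_age  = c(22, 24, 27, 29, 33, 38, 45, 52),
  loan_intent = c("EDUCATION", "MEDICAL", "EDUCATION", "PERSONAL",
                  "MEDICAL", "PERSONAL", "EDUCATION", "MEDICAL"),
  loan_status = c(1, 0, 0, 1, 1, 0, 0, 0)  # assumed: 1 = default, 0 = no default
)

# Q1: share of defaults within each age band (band edges are arbitrary)
by_age <- toy %>%
  mutate(age_band = cut(person_age, breaks = c(20, 25, 35, Inf), right = FALSE)) %>%
  group_by(age_band) %>%
  summarise(n = n(), default_rate = mean(loan_status), .groups = "drop")

# Q3: share of defaults within each loan intent
by_intent <- toy %>%
  group_by(loan_intent) %>%
  summarise(n = n(), default_rate = mean(loan_status), .groups = "drop")

print(by_age)
print(by_intent)
```

With the real data, the same two pipelines would run on df instead of toy. Differences in default_rate between groups are only descriptive and would still need a formal test or model.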

Dataset

This dataset contains columns simulating credit bureau data, that is, factors on which credit risk depends. It has 32,581 rows and 12 columns. The variables of interest to me are income, age, employment length, and home ownership.

Code
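# readr is already attached by library(tidyverse)
library(readr)
# Note: this absolute path is specific to the author's machine
df <- read_csv("C:/Users/Lenovo/Downloads/credit_risk_dataset_1.csv")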
Rows: 32581 Columns: 12
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): person_home_ownership, loan_intent, loan_grade, cb_person_default_o...
dbl (8): person_age, person_income, person_emp_length, loan_amnt, loan_int_r...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Code
summary(df)
   person_age     person_income     person_home_ownership person_emp_length
 Min.   : 20.00   Min.   :   4000   Length:32581          Min.   :  0.00   
 1st Qu.: 23.00   1st Qu.:  38500   Class :character      1st Qu.:  2.00   
 Median : 26.00   Median :  55000   Mode  :character      Median :  4.00   
 Mean   : 27.73   Mean   :  66075                         Mean   :  4.79   
 3rd Qu.: 30.00   3rd Qu.:  79200                         3rd Qu.:  7.00   
 Max.   :144.00   Max.   :6000000                         Max.   :123.00   
                                                          NA's   :895      
 loan_intent         loan_grade          loan_amnt     loan_int_rate  
 Length:32581       Length:32581       Min.   :  500   Min.   : 5.42  
 Class :character   Class :character   1st Qu.: 5000   1st Qu.: 7.90  
 Mode  :character   Mode  :character   Median : 8000   Median :10.99  
                                       Mean   : 9589   Mean   :11.01  
                                       3rd Qu.:12200   3rd Qu.:13.47  
                                       Max.   :35000   Max.   :23.22  
                                                       NA's   :3116   
  loan_status     loan_percent_income cb_person_default_on_file
 Min.   :0.0000   Min.   :0.0000      Length:32581             
 1st Qu.:0.0000   1st Qu.:0.0900      Class :character         
 Median :0.0000   Median :0.1500      Mode  :character         
 Mean   :0.2182   Mean   :0.1702                               
 3rd Qu.:0.0000   3rd Qu.:0.2300                               
 Max.   :1.0000   Max.   :0.8300                               
                                                               
 cb_person_cred_hist_length
 Min.   : 2.000            
 1st Qu.: 3.000            
 Median : 4.000            
 Mean   : 5.804            
 3rd Qu.: 8.000            
 Max.   :30.000
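
The summary above shows missing values in person_emp_length and loan_int_rate, a maximum person_age of 144, and a maximum person_emp_length of 123. A minimal sketch of how such rows might be counted and flagged before analysis follows. The toy rows are made up, and both cutoffs (age above 100, employment length greater than age) are assumptions rather than rules taken from the dataset.

```r
library(dplyr)

# Toy rows for illustration only (made-up values, same column names as df)
toy <- tibble(
  person_age        = c(23, 144, 31, 26),
  person_emp_length = c(2, 4, 123, NA),
  loan_int_rate     = c(10.99, NA, 7.90, 13.47)
)

# Count missing values in every column
na_counts <- toy %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Flag rows with implausible values; the cutoffs are assumptions
flagged <- toy %>%
  mutate(
    implausible = person_age > 100 |
      (!is.na(person_emp_length) & person_emp_length > person_age)
  )

# Keep only plausible rows; rows with NA are kept and can be handled separately
cleaned <- flagged %>%
  filter(!implausible) %>%
  select(-implausible)

print(na_counts)
print(cleaned)
```

Whether to drop, cap, or impute such rows is a design choice. Flagging first keeps that decision visible and reversible.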