RMarkdown Demo

Intro to RMarkdown

Sean Conway
5/28/2022

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For Python users, RMarkdown is very similar to a Jupyter Notebook.

When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

2+2
[1] 4

The primary purpose of RMarkdown documents is the sharing of a data analysis with others.

For this class, we will use the distill package to create RMarkdown posts. distill is a great R package for creating professional-looking blog posts from RMarkdown files. Make sure you have distill installed on your computer.

R operations

Below is an R code chunk. To create a code chunk, type three backticks followed by {r} to begin and three backticks to end. We’ll use this chunk to load the ggplot2 and dplyr libraries.
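The chunk itself does not appear in the rendered text here, so the following is a minimal sketch of what it would contain, assuming both packages are already installed. In the .Rmd source, these lines sit between the opening and closing chunk delimiters.

```r
# Assumed contents of the chunk described above.
library(ggplot2)  # plotting
library(dplyr)    # data manipulation; also makes as_tibble() available
```

Loading packages once near the top of the document makes them available to every later chunk.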

Subheadings

This is a subheading. To create lower-level headings, simply add more # symbols.

Sub-subheadings

You can write plain text in the white space.
But make sure to always leave a blank line (press Enter twice) when you want to go to a new line.
This applies to headings as well.

This is my line.
This is another line.

R Coding in RMarkdown

You can include as many R code chunks as you want in an RMarkdown document. Any objects you create in one code chunk will remain available in your environment for later chunks.

Running code chunks

x <- c(1,2,3,4)
x + 2
[1] 3 4 5 6
x/3
[1] 0.3333333 0.6666667 1.0000000 1.3333333

And x is still available in our environment:

x
[1] 1 2 3 4
1
[1] 1
2
[1] 2
2
[1] 2
3
[1] 3
2
[1] 2

RMarkdown is great for showing your work to others. Regular R scripts (files that end in .R) are great for computationally intensive work, but for simpler analyses that you need to present to others, RMarkdown is hard to beat.

Directories in R/RMarkdown

library(readr)
my_data <- read_delim("../data.txt")
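For context on the path above: .. refers to the parent of the current working directory, and when a document is knitted, relative paths are by default resolved from the folder that contains the .Rmd file. The sketch below uses base R only to inspect such a path; data.txt is simply the file name from the example above and is not assumed to exist.

```r
# Build the same relative path as above in a platform-independent way.
data_path <- file.path("..", "data.txt")
data_path

# Show the absolute location that the relative path points to.
# mustWork = FALSE avoids an error if the file is not there.
normalizePath(data_path, mustWork = FALSE)

# Check that the file exists before trying to read it.
file.exists(data_path)
```

If file.exists() returns FALSE, the working directory is probably not where you expect; getwd() shows where R is currently looking.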

Example visualization in RMarkdown

Here we use the built-in dataset mtcars to visualize the relationship between weight and miles per gallon in cars.

mtcars <- as_tibble(mtcars)
ggplot(mtcars, aes(wt,mpg))+
  geom_point()+
  geom_smooth(method="lm")+
  labs(x="weight",y="mpg")+
  theme_classic()

Code Chunk Options

There are various settings for code chunks. Sometimes you may want to include a computation in your script, but you don’t need readers to see it. For example, say I want to simulate some data, but I don’t need to bog down readers with my simulation code. I can use the echo=F setting to make sure that the simulation code is hidden.

Below, I simulate some data from a linear regression model (the details of this are irrelevant). Setting echo=F will hide this from the reader.
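Because the simulation chunk is hidden, its code does not appear in this post. The following is only a sketch of what such a chunk might look like. The seed, sample size, coefficients, and noise level are assumptions chosen for illustration, not the values behind the plot in this post. The only detail taken from the text is that the result is a data frame named dat with columns x and y. In the .Rmd source, the chunk header would carry the echo=FALSE option (echo=F is shorthand for the same setting).

```r
# Hypothetical simulation; all numeric values are illustrative assumptions.
set.seed(123)                        # make the random draws reproducible
n <- 100                             # assumed number of observations
x <- rnorm(n, mean = 0, sd = 1)      # predictor
y <- 2 + 0.5 * x + rnorm(n, sd = 1)  # linear model plus Gaussian noise
dat <- data.frame(x = x, y = y)      # object used by the plotting code
head(dat)
```

With echo=FALSE the code still runs and dat is created, so later chunks can use it; only the code listing is left out of the knitted document.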

Now, I’ll show the reader a visualization of this simulated data.

ggplot(dat, aes(x,y))+
  geom_point(size=3,alpha=.75)+
  geom_smooth(method="lm")

Sometimes running an R command will output a message that we don’t need the reader to see. This often happens when loading R packages. To avoid this, we can use the chunk option message=F to keep the message out of the knitted document. I’ll demonstrate this when I load the BayesFactor package for Bayesian statistical analysis (details of this package are irrelevant for this course).

# load bayes factor package
library(BayesFactor)

Sometimes we want to show the reader a particular R command without actually running it. For example, I might want to show the reader that they should install the package dplyr to do a set of analyses. However, I already have dplyr installed, so there’s no need for me to run the command. Here, I’ll use the eval=F setting to print the command with proper R syntax highlighting, but without actually running it.

install.packages("dplyr")
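The options above are set chunk by chunk. If most chunks in a document should share the same settings, knitr also lets you set defaults once, typically in a setup chunk at the top of the document. This is a sketch of that approach rather than something this post does; it assumes the knitr package is installed (it is the package RMarkdown uses to run chunks). It spells out FALSE rather than F, since F is an ordinary variable that can be reassigned.

```r
# Set document-wide defaults; individual chunks can still override them.
knitr::opts_chunk$set(
  echo = TRUE,      # show code by default
  message = FALSE,  # hide package start-up and other messages
  eval = TRUE       # run code by default
)

# Inspect a current default.
knitr::opts_chunk$get("message")
```

A chunk that needs different behavior, such as the install.packages example above, can still set eval=FALSE in its own header.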

Other notes about RMarkdown

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Conway (2022, June 7). Data Analytics and Computational Social Science: RMarkdown Demo. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomspconway909277/

BibTeX citation

@misc{conway2022rmarkdown,
  author = {Conway, Sean},
  title = {Data Analytics and Computational Social Science: RMarkdown Demo},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomspconway909277/},
  year = {2022}
}