HW3 First Attempt
The dataset I am using for my final project comes from Kaggle and contains all of the songs and lyrics from Taylor Swift’s discography up to 2017. Because the discography spans a wide range of albums containing 20+ songs each, I will compare the lyrics of the first album, ‘Taylor Swift’, with those of the most recent album in the dataset, ‘Reputation’.
head(Swift_lyrics)
# A tibble: 6 x 7
  artist       album        track_title track_n lyric       line  year
  <chr>        <chr>        <chr>         <dbl> <chr>      <dbl> <dbl>
1 Taylor Swift Taylor Swift Tim McGraw        1 "He said ~     1  2006
2 Taylor Swift Taylor Swift Tim McGraw        1 "Put thos~     2  2006
3 Taylor Swift Taylor Swift Tim McGraw        1 "I said, ~     3  2006
4 Taylor Swift Taylor Swift Tim McGraw        1 "Just a b~     4  2006
5 Taylor Swift Taylor Swift Tim McGraw        1 "That had~     5  2006
6 Taylor Swift Taylor Swift Tim McGraw        1 "On backr~     6  2006
colnames(Swift_lyrics)
[1] "artist" "album" "track_title" "track_n"
[5] "lyric" "line" "year"
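Since the comparison is limited to two albums, a first step could be to subset the data by the album column. The following is a minimal sketch in base R, not the final analysis code. It runs on a small made-up stand-in (`toy_lyrics`, with placeholder text rather than real lyrics) that reuses the column names shown above. The exact spelling of the album names in `albums_to_compare` is an assumption and should be checked against `unique(Swift_lyrics$album)`.

```r
# Toy stand-in for Swift_lyrics (placeholder text, not real lyrics),
# using the same column names that colnames() reports.
toy_lyrics <- data.frame(
  artist = "Taylor Swift",
  album = c("Taylor Swift", "Taylor Swift", "Fearless", "Reputation"),
  track_title = c("Track A", "Track A", "Track B", "Track C"),
  track_n = c(1, 1, 1, 1),
  lyric = c("placeholder line one", "placeholder line two",
            "placeholder line three", "placeholder line four"),
  line = c(1, 2, 1, 1),
  year = c(2006, 2006, 2008, 2017),
  stringsAsFactors = FALSE
)

# Albums to compare. The spelling and capitalisation are assumptions;
# confirm them with unique() on the real album column.
albums_to_compare <- c("Taylor Swift", "Reputation")

# Keep only the rows that belong to the two chosen albums.
two_albums <- toy_lyrics[toy_lyrics$album %in% albums_to_compare, ]

# Count how many lyric lines each chosen album contributes.
lines_per_album <- table(two_albums$album)
print(lines_per_album)
```

With the real data, replacing `toy_lyrics` with `Swift_lyrics` should give the subset used for the album comparison.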
What word shows up the most overall? Are there any visible trends in words or topics in the chosen albums? How have the lyrics changed over time?
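One possible way to approach the first two questions is to split each lyric line into words and tabulate them, both overall and per album. The sketch below uses only base R and a made-up data frame (`toy`, with invented placeholder text, not real lyrics). The helper `count_words` is a hypothetical name introduced here for illustration, and the simple tokenising rule (lowercase, split on anything that is not a letter or apostrophe) is an assumption rather than a settled method.

```r
# Toy stand-in with the two relevant columns (invented text, not real lyrics).
toy <- data.frame(
  album = c("Taylor Swift", "Taylor Swift", "Reputation"),
  lyric = c("The rain, the road", "Rain on the window", "the city lights"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lowercase the text, split it into words,
# and return word counts sorted from most to least frequent.
count_words <- function(lyrics) {
  text <- tolower(paste(lyrics, collapse = " "))
  words <- unlist(strsplit(text, "[^a-z']+"))
  words <- words[words != ""]  # drop empty strings left by the split
  sort(table(words), decreasing = TRUE)
}

# Word counts across all lines.
overall <- count_words(toy$lyric)

# Word counts within each album, as a named list of tables.
by_album <- lapply(split(toy$lyric, toy$album), count_words)

print(head(overall, 3))
print(by_album)
```

In this toy example the most frequent word is "the", which suggests that common function words would likely need to be filtered out before the counts say much about topics.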
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Kruzlic (2022, March 6). Data Analytics and Computational Social Science: Kruzlic Homework 3. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscombkruzlichw3attempt1/
BibTeX citation
@misc{kruzlic2022kruzlic,
  author = {Kruzlic, Bryn},
  title = {Data Analytics and Computational Social Science: Kruzlic Homework 3},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscombkruzlichw3attempt1/},
  year = {2022}
}