library(tidyverse)
library(cld3)
library(dplyr)
library(textclean)
library(stringi)
library(stringr)
library(here)
knitr::opts_chunk$set(echo = TRUE)
Molly Hackbarth
October 1, 2022
To understand how to download Twitter and Reddit data, I looked more into APIs. Since I had heard Reddit is a bit frustrating, I decided to try Twitter first. This meant installing the R package “rtweet”. Once I was able to use the Twitter API to create a project, I went ahead and tried to download multiple tweets.
Unfortunately “rtweet” had a very similar issue to the package “RedditExtractoR”: both had a limit that made them difficult to work with. “rtweet” only allowed you to search tweets from the last 6-9 days, which makes it hard to gather a lot of data over time. “RedditExtractoR” only allowed you to have comments from 7 posts at a time. Using R packages for both sites proved to be very difficult.
I also tried to use the package “twitteR”, but it would not load properly for me and kept giving me errors. Even with a properly set up Twitter API, I was unable to get it to connect to the account. I spent over eight hours trying to get this to work (including looking at multiple pages that suggested adding more packages to make both “twitteR” and “rtweet” work) before I decided to give up on using any of these packages.
I ended up deciding to look into other ways I could download tweets and Reddit posts. While most websites offered the same API options as mentioned before, a few of them recommended using Python instead.
After a while I ended up deciding to download Python and Visual Studio Code to run Python. I had little hope and had some frustrations with installing “pip”, but was able to get it working.
After finding a YouTube video, I was able to use the Python package “snscrape” (you can watch the explanation of how it works here!) to download tweets without having to use an API. This was extremely helpful: all of the tweets I was interested in (both #loveisblindjapan and “love is blind japan”) were downloaded within a few minutes.
For the Reddit posts I used a website that explained to me how to download all the comments that were on the subreddit r/loveisblindjapan. This also only took a few minutes.
Between Reddit and Twitter I was able to download over 20k comments from users who watched the TV show.
Since I did this through python I ended up saving the data into a csv file. This allowed me to check out the data in better detail in Google Sheets. I did a few things in Google Sheets since it was easier:
Combined the two Twitter csv files (one for #loveisblindjapan and another for the phrase “love is blind japan”) and removed any duplicates between them with the “remove duplicate” function.
I noticed the Reddit csv file had time stored in a “utc” column; UTC stands for Coordinated Universal Time. The values were Unix timestamps (seconds since January 1, 1970) such as “1643382213”, which is fairly unreadable to me. Thus I used this formula to fix it: =X2/86400+DATE(1970,1,1)+time(5,30,0). Dividing by 86400 converts seconds to days, and the time(5,30,0) term shifts the result 5 hours 30 minutes ahead of UTC. This gave me 1/28/2022 20:33:33, which is easier to understand. However, to match the Twitter csv file (done in year/month/day (YMD)), I removed the time from the end and formatted it using Google Sheets’ “custom date and time” format to end up with 2022-01-28.
Since the Twitter csv file had YMD followed by the time, I split the column so it only had YMD.
I ended up merging the files together (this included a count of comments from people, the username of the person, and the actual tweet or post). I made an extra column that says whether each row came from Twitter or Reddit.
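The spreadsheet steps above could also be done in R. The following is only a minimal sketch in base R, not what I actually ran: the three small data frames are made-up stand-ins for the real csv files, and the column names `date`, `text`, and `utc` are assumptions. Unlike the spreadsheet formula, it keeps the Reddit date in UTC and does not apply the 5:30 offset.

```r
# Hypothetical stand-ins for the two Twitter exports and the Reddit export.
tweets_hashtag <- data.frame(
  date = c("2022-09-23 10:15:00", "2022-09-22 08:00:00"),
  text = c("tweet a", "tweet b"),
  stringsAsFactors = FALSE
)
tweets_phrase <- data.frame(
  date = c("2022-09-22 08:00:00", "2022-09-21 19:30:00"),
  text = c("tweet b", "tweet c"),
  stringsAsFactors = FALSE
)
reddit <- data.frame(
  utc  = 1643382213,
  text = "comment a",
  stringsAsFactors = FALSE
)

# 1. Stack the two Twitter files and drop rows that appear in both.
twitter <- unique(rbind(tweets_hashtag, tweets_phrase))

# 2. Convert Unix seconds to a year-month-day string (in UTC).
reddit$date <- format(
  as.POSIXct(reddit$utc, origin = "1970-01-01", tz = "UTC"),
  "%Y-%m-%d"
)

# 3. Keep only the year-month-day part of the Twitter timestamp.
twitter$date <- substr(twitter$date, 1, 10)

# 4. Label the source of each row, then merge into one data frame.
twitter$twitter_reddit <- "twitter"
reddit$twitter_reddit  <- "reddit"
cols <- c("date", "text", "twitter_reddit")
corpus_example <- rbind(twitter[, cols], reddit[, cols])
row.names(corpus_example) <- NULL
```

Doing these steps in code rather than by hand would make them easy to repeat if the data were downloaded again.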
While almost all Reddit posts were made in English, I noticed there were quite a few tweets that were partially or completely in a different language. This has led me to debate whether I should remove the non-English tweets entirely or leave them in.
I also noticed that tweets had more spelling errors than Reddit posts. This is likely because tweets cannot be edited; however, it may cause a problem for analysis. Additionally, tweets were more likely to use slang than Reddit posts.
From a quick glance I also noticed that tweets were often about how the show made people feel rather than about the contestants on the show. This may lead me to change my research question or decide to use only Reddit posts. Reddit posts seemed to focus on the contestants more often.
For the Reddit posts I also noticed that, unfortunately, the data does not seem to tell me whether people are replying to another comment on the post. Some of the posts will start with “I know what you mean!” This could lead to fewer examples of contestants’ names being shown, which could make my research question difficult.
Previously my research question was: Do Reddit and Twitter users differ in their views of contestants and their relationships in Love is Blind Japan?
My current research question I’m leaning towards is: How do Reddit and Twitter users feel about the show Love is Blind Japan?
Why I’m considering the change: It seems that although the contestants are important, if I want to focus on purely how viewers felt about the contestants I would need to only use Reddit posts. Additionally I will be analyzing the positive and negative sentiments of Reddit and Twitter together.
In order to check the data I’ve added my csv file to my repository. I will first check that it was added correctly.
I use the “here” package because it avoids the problems of setwd(), which hard-codes a working directory that can differ from one computer to another. here() always builds file paths starting from the project root directory.
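As a minimal sketch of how this works (the folder name `data` and the file name `love_is_blind_japan.csv` are placeholders for illustration, not the real names in my repository):

```r
library(here)

# Hypothetical folder and file name; substitute the real csv location.
csv_path <- here("data", "love_is_blind_japan.csv")

# Read the file only if it is present, so the sketch runs in any project.
if (file.exists(csv_path)) {
  corpus <- read.csv(csv_path)
  print(head(corpus))
}
```

The first rows of my actual file are printed next.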
date
1 2022-09-23
2 2022-09-23
3 2022-09-22
4 2022-09-22
5 2022-09-22
6 2022-09-22
text
1 #LoveIsBlindJapan is wayyy better than the US version
2 S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
3 Let’s be real, Kaoru just wanted to promote her music career #LoveIsBlindJapan
4 Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
5 Ayano and Shuntaro make me uncomfortable ngl #LoveIsBlindJapan
6 I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan. \nRyotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
twitter_reddit
1 twitter
2 twitter
3 twitter
4 twitter
5 twitter
6 twitter
Here we can see the data loaded in correctly, with all three of the columns I wanted!
While the data is in the correct columns, I would still like to try a bit of cleaning to see if we can remove some items. The first thing I will do is remove non-English text from all of my posts, since I am unable to analyze other languages correctly.
The first thing I thought of was removing Japanese as the show was in Japan, so I found this answer.
str_rm_jap <- function(x) {
  # we replace japanese blocks with nothing, and clean any double whitespace from this
  # reference at http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml
  x %>%
    # japanese style punctuation
    str_replace_all("[\u3000-\u303F]", "") %>%
    # katakana
    str_replace_all("[\u30A0-\u30FF]", "") %>%
    # hiragana
    str_replace_all("[\u3040-\u309F]", "") %>%
    # kanji
    str_replace_all("[\u4E00-\u9FAF]", "") %>%
    # remove excess whitespace
    str_replace_all(" +", " ") %>%
    str_trim()
}

corpus_posts <- corpus$text %>% str_rm_jap()
However I realized there were many more languages. This made it a bit more difficult. So I decided to keep looking and found this answer.
This seemed to work well! It may not be the perfect solution but it seems to have removed any tweets or posts that were not in English.
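The code for this step is not shown above. A minimal sketch of the idea, assuming the “cld3” package and its detect_language() function (which returns a language code per string, or NA when it cannot decide), might look like the following. The helper name `keep_english` and its arguments are made up for illustration and are not taken from the answer I found.

```r
# Keep only the rows whose text is detected as English.
# `detect` defaults to cld3::detect_language; any function that maps a
# character vector to language codes could be passed in instead.
keep_english <- function(df, text_col = "text",
                         detect = cld3::detect_language) {
  lang <- detect(df[[text_col]])
  # Rows with an undetermined language (NA) are dropped as well.
  df[!is.na(lang) & lang == "en", , drop = FALSE]
}

# Possible usage, assuming `corpus` has a `text` column:
# corpus_english <- keep_english(corpus)
```

Short or slang-heavy tweets may be misclassified by any language detector, so it is worth spot-checking what gets removed.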
The next package I’ll use for cleaning is “textclean”.
I’ll first check any posts or tweets (henceforth known as posts) using the check_text() function.
This takes quite a while (I didn’t actually time it, but I had enough time to watch a ton of YouTube clips!)
===========
CONTRACTION
===========
The following observations contain contractions:
1, 9, 12, 14, 16, 18, 21, 27, 33, 35...[truncated]...
This issue affected the following text:
1: S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
...[truncated]...
9: This probably seems so silly but I'm annoyed that none of the #LoveIsBlindJapan cast members have verified Instagram accounts while Netflix will make sure to get every single American reality show cast member a verified Instagram account.
...[truncated]...
12: S01 | E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan https://t.co/ls7SHX5c4u #tvtime https://t.co/m2n3p8O0xt
...[truncated]...
14: I'm crying laughing, crying and laughing again so cute #LoveIsBlindJapan
...[truncated]...
16: My favourite thing about Midori and Wataru's relationship is they reassure each other. When the other isn't sure and is feeling insecurities about the relationship the other takes charge and make their intention clear. #LoveIsBlindJapan
...[truncated]...
18: Very much in love with Midori, her earnest nature and kindness. She's courageous in the way she feels so openly and says what she's thinking - always with kindness. Lover her! #LoveIsBlindJapan
...[truncated]...
21: Why didn't anyone tell me #LoveIsBlindJapan was this good? Omg https://t.co/xY450z8soT
...[truncated]...
27: 🥰🥰 I still love the #LoveIsBlindJapan cast so much and what great news: Midori and Wataru are expecting baby Mitaru! Aw they'll be new parents next year 🥹
https://t.co/4sp722eIkz
#LoveIsBlind
#loveisblindjapan https://t.co/DkjCo9baRJ
...[truncated]...
33: I just remembered, like Pringles, you can't just just have one episode of #LoveIsBlindJapan
Dinner will now be a little late.
...[truncated]...
35: #LoveIsBlindJapan is so amazing👏its like a couples therapy for me n my bae..we discussed,argued n come terms w it def like a rollercoaster of emo we been through 🫂 we def cheered for Ryotaro & Motomi throughout d entire show,they're so genuine, def most ppl fav iktr💞 https://t.co/rcdueO8EJw
...[truncated]...
*Suggestion: Consider running `replace_contraction`
====
DATE
====
The following observations contain dates:
725, 755, 5873, 8223, 8357, 11114, 11145, 11583, 11596, 14340...[truncated]...
This issue affected the following text:
725: motomi and ryoutaro got officially married on 1/11/22 oh I didn’t think a love is blind couple would be better than cam and lauren for me BUT HERE WE ARE #LoveIsBlindJapan
...[truncated]...
755: Update as per their ig:
Motomi & Ryotaro are officially married on 2022/01/11, are still together and are each other's soulmate. ❤️🥺❤️
Midori & Wattaru are gonna register their marriage in March on her bday, are still together and Midori loves him. ❤️🥺❤️
#LoveIsBlindJapan https://t.co/yOJ6lEU5mq
...[truncated]...
5873: My netflix wanted me to watch an episode of Love is Blind:Japan. Watched until ep.10. When will i ever learn… https://t.co/UGBk5miUNQ
...[truncated]...
8223: Kaoru also said he was arrested with a woman - which implied that the family found out he was having an affair. An article I found about his sentencing said that she was his mistress and he said at his trial that he still cared about his mistress after his wife pledged to support him with getting treatment for his drug addiction, and the mistress said he slipped her the drugs. I can’t speak to the Japanese public’s reactions to drug use, but I can I understand why Kaoru was so hurt and traumatized by all this and still estranged from him. He completely betrayed their family in such a public and shameful way.
https://www.japantimes.co.jp/news/2014/09/12/national/crime-legal/pop-star-aska-gets-suspended-sentence-drug-use/
...[truncated]...
8357: https://dnbstories.com/2017/12/10-meanest-countries-on-social-media.html
Mic drop.
...[truncated]...
11114: According to participants, there is a heavy amount of interference and editing in their storylines. It's also just a bunch of ppl trying to get famous and promote their brands. All of that turned me off of watching another season. [There's also this.](https://www.nytimes.com/2020/07/17/arts/television/terrace-house-suicide.html)
...[truncated]...
11145: It says her birthday is 28/02/1994 on her Instagram, so I am under the impression she turned 28, this year
...[truncated]...
11583: I just rewatched ep 2 and realized it is Motomi too. There is a zoom in her face at 46.24.
...[truncated]...
11596: I thought Kaoru was hafu at first too, but she’s full Japanese. Her mother is Yoko Yajima, a former announcer. [Looking at Kaoru’s older pics,](https://yorozudailynews.blog.ss-blog.jp/2014-05-17-1) I’m pretty certain she’s had double eyelid surgery, giving her more of a Eurasian look
...[truncated]...
14340: I will be messaging you in 2 days on [**2022-02-20 19:08:16 UTC**](http://www.wolframalpha.com/input/?i=2022-02-20%2019:08:16%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/LoveIsBlindJapan/comments/svpc1s/what_do_actual_japanese_people_think_of_this_show/hxhdf74/?context=3)
[**CLICK THIS LINK**](https://www.reddit.com/message/compose/?to=RemindMeBot&subject=Reminder&message=%5Bhttps%3A%2F%2Fwww.reddit.com%2Fr%2FLoveIsBlindJapan%2Fcomments%2Fsvpc1s%2Fwhat_do_actual_japanese_people_think_of_this_show%2Fhxhdf74%2F%5D%0A%0ARemindMe%21%202022-02-20%2019%3A08%3A16%20UTC) to send a PM to also be reminded and to reduce spam.
^(Parent commenter can ) [^(delete this message to hide from others.)](https://www.reddit.com/message/compose/?to=RemindMeBot&subject=Delete%20Comment&message=Delete%21%20svpc1s)
*****
|[^(Info)](https://www.reddit.com/r/RemindMeBot/comments/e1bko7/remindmebot_info_v21/)|[^(Custom)](https://www.reddit.com/message/compose/?to=RemindMeBot&subject=Reminder&message=%5BLink%20or%20message%20inside%20square%20brackets%5D%0A%0ARemindMe%21%20Time%20period%20here)|[^(Your Reminders)](https://www.reddit.com/message/compose/?to=RemindMeBot&subject=List%20Of%20Reminders&message=MyReminders%21)|[^(Feedback)](https://www.reddit.com/message/compose/?to=Watchful1&subject=RemindMeBot%20Feedback)|
|-|-|-|-|
...[truncated]...
*Suggestion: Consider running `replace date`
=====
DIGIT
=====
The following observations contain digits/numbers:
1, 3, 5, 12, 21, 24, 25, 27, 28, 29...[truncated]...
This issue affected the following text:
1: S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
...[truncated]...
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan.
Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
5: Not these grown ass women crying over a 23 year of hairdresser 😩 #LoveIsBlindJapan
...[truncated]...
12: S01 | E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan https://t.co/ls7SHX5c4u #tvtime https://t.co/m2n3p8O0xt
...[truncated]...
21: Why didn't anyone tell me #LoveIsBlindJapan was this good? Omg https://t.co/xY450z8soT
...[truncated]...
24: first 20 mins in and ik this mf did not just say that the kitchen is a woman’s domain……. BOOOO TOMATO TOMATO #LoveIsBlindJapan https://t.co/LERT8R6fU5
...[truncated]...
25: #loveisblindjapan was an interesting netflix show. apparently some of the japan members are still doing streams #netflix
https://t.co/h73ZncPOox
...[truncated]...
27: 🥰🥰 I still love the #LoveIsBlindJapan cast so much and what great news: Midori and Wataru are expecting baby Mitaru! Aw they'll be new parents next year 🥹
https://t.co/4sp722eIkz
#LoveIsBlind
#loveisblindjapan https://t.co/DkjCo9baRJ
...[truncated]...
28: Continuing my Netflix journey and so impressed with #LoveIsBlindJapan which feels completely removed from Love Is Blind: USA — Japan’s spinoff is patient, emotional, sincere, vulnerable, romantic. Maybe it’s a cultural difference, here people are literally there for love. https://t.co/i8G80qGbe3
...[truncated]...
29: I voice #Nana in the #EnglishDub of #LoveIsBlindJapan 💜
A huge thank you to the marvelous Mimi for this amazing opportunity
🙏🥰
Cheers to many more 😘
#flashbackfriday https://t.co/9I9yFP606f
...[truncated]...
*Suggestion: Consider using `replace_number`
========
EMOTICON
========
The following observations contain emoticons:
1, 11, 12, 21, 24, 25, 27, 28, 29, 34...[truncated]...
This issue affected the following text:
1: S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
...[truncated]...
11: Shuntaro almost lost Ayano in the pods by not speaking up and it appears he learned nothing from that experience. The women are so direct and half of the men have misled them throughout the process. This is nuts. #LoveIsBlindJapan
...[truncated]...
12: S01 | E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan https://t.co/ls7SHX5c4u #tvtime https://t.co/m2n3p8O0xt
...[truncated]...
21: Why didn't anyone tell me #LoveIsBlindJapan was this good? Omg https://t.co/xY450z8soT
...[truncated]...
24: first 20 mins in and ik this mf did not just say that the kitchen is a woman’s domain……. BOOOO TOMATO TOMATO #LoveIsBlindJapan https://t.co/LERT8R6fU5
...[truncated]...
25: #loveisblindjapan was an interesting netflix show. apparently some of the japan members are still doing streams #netflix
https://t.co/h73ZncPOox
...[truncated]...
27: 🥰🥰 I still love the #LoveIsBlindJapan cast so much and what great news: Midori and Wataru are expecting baby Mitaru! Aw they'll be new parents next year 🥹
https://t.co/4sp722eIkz
#LoveIsBlind
#loveisblindjapan https://t.co/DkjCo9baRJ
...[truncated]...
28: Continuing my Netflix journey and so impressed with #LoveIsBlindJapan which feels completely removed from Love Is Blind: USA — Japan’s spinoff is patient, emotional, sincere, vulnerable, romantic. Maybe it’s a cultural difference, here people are literally there for love. https://t.co/i8G80qGbe3
...[truncated]...
29: I voice #Nana in the #EnglishDub of #LoveIsBlindJapan 💜
A huge thank you to the marvelous Mimi for this amazing opportunity
🙏🥰
Cheers to many more 😘
#flashbackfriday https://t.co/9I9yFP606f
...[truncated]...
34: I love this show, & I’m thrilled to be a part of it! I’m honored to be the English-speaking voice for Ryotaro in #LoveIsBlindJapan. He is such a good & kind soul. I wish the best for him & Motomi! Everyone, pls go watch their story on Netflix. Do it for love! 😄❤️😍🥰🙏 https://t.co/4QgIv5YSOk
...[truncated]...
*Suggestion: Consider using `replace_emoticons`
=======
ESCAPED
=======
The following observations contain escaped back spaced characters:
8113, 8666, 9341, 10041, 10079, 11029, 11296, 11813, 12987, 13183...[truncated]...
This issue affected the following text:
8113: For the Mori thing, I agree with others. The way Minami says things so bluntly is very unusual in Japan - at least straight off the bat towards someone you’ve known for less than a year. So when Mori said that it might be an issue, like after the second meeting or so, that was warning #1 lol also the hair thing was bad. In Japanese, you would go:
woman: points at hair\*, (you seem to be shedding a lot), are you ok?
Mori: ah, sorry. It’s a new medical thing I’m trying at the moment, you see.
woman: ah, I see. Seems difficult. Clean up must be hard, huh?
mori: ah, yes it is.
woman: ganbatte ne
mori: hai, ganbarimasu
from this convo, someone typically less direct would bring up the issue very subtly, then sympathize then tell the person to do their best and get their shit together. Hence why things keep being repeated in conversations. I think Mori felt bad for Minami when she mention how she dated Mori to change herself for the positive. However, Mori messed up when he mentions his dreams for the third time and then Minami gives her reasoning for wanting to keep her career. They were not wrong for having their dreams, but that fact that it was personal for them both and neither of them would change their dream it further proved that they wouldn’t work between them.
I think Kaori and Periya is another good example. For Kaori, the whole thing about her father and her having to point out to Misaki that he didn’t even ask her how she felt about her whole situation felt heartless to her. The fact that she had to explain this was already too much in Kaori mind. When you say super sad shit in Japanese, typically you respond with “Tai hen sou” (that sounds hard/difficult to deal with) or “kibishi sou desu ne” (that sounds difficult/harsh) and then ask the person if they are ok “daijoubu” the thing with Periya and Mizuki is that when dating in Japan, you are asking your age, family and job occupation is fairly normal. The fact that Periya had to dig and dig for answer from him was flag #1. The other flags kept appearing when he fluffed that he was an owner of a restaurant when he clearly was not. His dreams did not make sense and he felt like he was trying to be someone he was not. For her, she felt like her time was wasted or false because Mizuki was not really present during LIB.
For Periya, Mizuki felt shady vibes in the sense that his occupation was a lie. He ordered the most expensive bottle on the menu and seemed to live beyond his means (based on Periya’s observations). Now that with his dream retirement to be in another country with no idea on how to continue said business, provide for his family and be idk on his citizenship status in another country marked all the red flags in my books. Mizuki was nice, but there wasn’t an warmth in his voice lol you can easily sense tones and emotions in Japanese as things are meant to be pointed at times and you have to read between the lines.
&#x200B;
experience: MA on Japan, 7 formal years of learning Japanese lol
...[truncated]...
8666:
\* Panru = Lupin (sometimes to be cutesy, you reverse syllabic order of words/names, so Rupan/Lupin = Panru...)
Okay this is fascinating, thank you for including this in your translation!
...[truncated]...
9341: ;\_\_\_;
this made the week for me. So happy for them and in this era of superficial romances, they really are one in a million to develop from a blind dating couple to official husband and wife.
Congratulations!! (I tried to watch the US version S2 and all the personalities are so insufferable...)
...[truncated]...
10041: The funny thing is that if you look at her IG, she has blonde hair in several of her pictures lol. Though I understand there may be a different standard for men https://www.instagram.com/motomi\_228/
...[truncated]...
10079: great post!
Also i dont know if this was me but i felt like the pressure was more put on the \*women\* to keep the men intrested in them? thats the vibe i got from it at least
...[truncated]...
11029: LOL I was also obsessed. I want this shortsleeve/longsleeve convertible hoodie!\]
...[truncated]...
11296: I enjoyed the following recapper:
[https://www.youtube.com/watch?v=CbAA5V-FCRs](https://www.youtube.com/watch?v=CbAA5V-FCRs)
&
[https://www.youtube.com/watch?v=Olvv\_8gkNh8](https://www.youtube.com/watch?v=Olvv_8gkNh8)
The former did 3 long podcasts while the latter is doing episodic review and is currently on episode 5.
...[truncated]...
11813: The TH sub also found Kaoru showing up in the TH 2019 season.
https://www.reddit.com/r/terracehouse/comments/stc6wr/haruka\_cameo\_in\_love\_is\_blind/hx34p8m/?utm\_source=reddit&utm\_medium=web2x&context=3
...[truncated]...
12987: \>Her friends definitely speaks more fluent english that him.
&#x200B;
Pretty sure her friends are American. If not they have spent their college years abroad. She said they were college friends, and she went to school in the US.
...[truncated]...
13183: Anyone else got some f\*boy vibes with tattoo guy?
...[truncated]...
*Suggestion: Consider using `replace_white`
====
HASH
====
The following observations contain Twitter style hash tags (e.g., #rstats):
1, 2, 3, 4, 5, 6, 7, 8, 9, 10...[truncated]...
This issue affected the following text:
1: S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
...[truncated]...
2: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan.
Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
4: Are they all getting engaged then?! 😳 #LoveIsBlindJapan
...[truncated]...
5: Not these grown ass women crying over a 23 year of hairdresser 😩 #LoveIsBlindJapan
...[truncated]...
6: Yudai really thinks this is a game 😩 #LoveIsBlindJapan
...[truncated]...
7: Midori-chan is gone, my girl is in love 😭 She’s actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan
...[truncated]...
8: Omg the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan
...[truncated]...
9: This probably seems so silly but I'm annoyed that none of the #LoveIsBlindJapan cast members have verified Instagram accounts while Netflix will make sure to get every single American reality show cast member a verified Instagram account.
...[truncated]...
10: Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job…? This is a shambles. #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider using `qdapRegex::ex_tag' (to capture meta-data) and/or replace_hash
====
HTML
====
The following observations contain HTML markup:
34, 35, 44, 105, 127, 150, 152, 155, 190, 256...[truncated]...
This issue affected the following text:
34: I love this show, & I’m thrilled to be a part of it! I’m honored to be the English-speaking voice for Ryotaro in #LoveIsBlindJapan. He is such a good & kind soul. I wish the best for him & Motomi! Everyone, pls go watch their story on Netflix. Do it for love! 😄❤️😍🥰🙏 https://t.co/4QgIv5YSOk
...[truncated]...
35: #LoveIsBlindJapan is so amazing👏its like a couples therapy for me n my bae..we discussed,argued n come terms w it def like a rollercoaster of emo we been through 🫂 we def cheered for Ryotaro & Motomi throughout d entire show,they're so genuine, def most ppl fav iktr💞 https://t.co/rcdueO8EJw
...[truncated]...
44: Finished watching both the Seasons of the @LoveisBlindShow & #LoveIsBlindBrazil & #LoveIsBlindJapan. Waiting for their next season 💯
...[truncated]...
105: @itsamandared Just watching this episode now and that’s why I came here! I knew it couldn’t just be me seeing the gaslighting. He did a 180 on sharing the household/childcare duties & enjoying her “quirks”… 😡 Minami, you deserve better. #wegotyou #LoveIsBlindJapan 💜
...[truncated]...
127: 💇🏻♂️ Cuuuuuuute!!! Two fan favourites from #LoveIsBlindJapan 🤩 Shuntaro getting his hair cut by Ryoutaro! Motomi & Ryoutaro still appear to be happily together 🥰 https://t.co/U3Q1joWxc6
...[truncated]...
150: This girl is balancing in this balance ball like it’s nothing & super easy & her feet are like 6 inches off the floor. I would fall right off. #LoveIsBlindJapan
...[truncated]...
152: They have earphones & it seems that without the earphones they can’t hear each other at all. #LoveIsBlindJapan
...[truncated]...
155: I watched #LoveisBlindJapan and it’s so refreshing & wholesome compared to the US one🥺 Glad the 2 couples pulled through and acc got married aw
...[truncated]...
190: Although I am still in "let's cancel the Netflix acct" negotiations, #LoveIsBlindJapan was good.
I think there was a lot lost in translation re level of affection shown & exactly what happened w. Minami and Mori. Also, the best looking, most stylish guy is a fiancee's dad.
...[truncated]...
256: I finally finished #LoveIsBlindJapan and I’m SO happy my fav/cutest couple (Ryotaro and Motomi) made it to the end!!! I WAS rooting for Mori & Minami BUT I’m glad she chose her career since he just wanted a housewife💀
...[truncated]...
*Suggestion: Consider running `replace_html`
==========
INCOMPLETE
==========
The following observations contain incomplete sentences (e.g., uses ending punctuation like '...'):
35, 48, 52, 89, 97, 133, 148, 200, 203, 212...[truncated]...
This issue affected the following text:
35: #LoveIsBlindJapan is so amazing👏its like a couples therapy for me n my bae..we discussed,argued n come terms w it def like a rollercoaster of emo we been through 🫂 we def cheered for Ryotaro & Motomi throughout d entire show,they're so genuine, def most ppl fav iktr💞 https://t.co/rcdueO8EJw
...[truncated]...
48: compared to #loveisblindjapan the American version is pretty vulgar and kinda... rude
exactly what I thought it would be
all the talk about sex https://t.co/6i6t4a5Vkh
...[truncated]...
52: also.. up to this point (ep 3) i think the best and most genuine couple is Ryotaro and Motomi? like idk i just love how they are with ea/o. and the dude really seems like an honest good guy.. even around the males he always tried to turn negative into positive.
#LoveIsBlindJapan
...[truncated]...
89: If Midori does not marry Wataru after all of those aggresively intense workouts she had him doing... #loveisblind #loveisblindJapan
...[truncated]...
97: Sometimes I think about that one couple on #LoveIsBlindJapan that got together because she prepared a powerpoint presentation on why he should date her ... and he felt very convinced. Apparently, they're still married.
...[truncated]...
133: Just finished #LoveIsBlindJapan...they took just ALL the way through it!! https://t.co/3VQbNIVA92
...[truncated]...
148: Probz gonna binge watch Love is Blind Brazil after LIB Japan...
I heard there is gonna be a huge culture shock esp re PDA or skinship.
I already knew beforehand there is for sure a significant difference there 🤭
So, I am expecting it.
#LoveIsBlindJapan #LoveIsBlindBrazil
...[truncated]...
200: @BrightlyAgain #LoveAfterLockup
Indie needs to watch a few episodes of #LoveIsBlindJapan...
These women didn't come to play...first red flag, they're like... https://t.co/OghLs6LgsS
...[truncated]...
203: Yudai is so cute but... as someone who is also 23 years old I think that's too young to go on a dating/marriage show. I worry he might not be ready/mature enough. #LoveIsBlindJapan
...[truncated]...
212: I don't realize how hard it is to know the soundtrack of episode 11 of Love is Blind: Japan...
The songs in that finale episode were GOOD.....
But it is hard looking for the song titles when you barely hear the lyrics 😭
#LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider using `replace_incomplete`
====
KERN
====
The following observations contain kerning (e.g., 'The B O M B!'):
170, 801, 942, 1006, 1040, 1233, 1383, 1483, 1569, 1575...[truncated]...
This issue affected the following text:
170: Omg the 56 year olds backstory 🥺🥺 not his previous love passing away omg. HE BETTER FIND LOVE OR IMMA CRY I SWEAR #LoveIsBlindJapan
...[truncated]...
801: PRIYA IS ONE HECK OF A WOMAN !! she is fierce, straightforward, ambitious, she’s a boss lady !! homegirl owns a CBD skincare company, can train elephants and was Miss Japan 2016 pls
ain’t nobody matching her energy, she deserves a rare and good man #LoveIsBlindJapan Ep7
...[truncated]...
942: RYOTOMI SHOWING UP TO THE AFTERSHOW BOTH WITH BRIGHTLY COLORED HAIR AND RYOTARO SAYING HE REALLY APPRECIATES COMING HOME TO HER AND HER COOKING AND MOTOMI SAYING HE SAYS HER FOOD IS GOOD EVERY 3 SECONDS AND THEY PROBABLY WORKED IN A FIELD TOGETHER IN A PAST LIFE #LoveIsBlindJapan https://t.co/syIHPG7YUa
...[truncated]...
1006: #LoveIsBlindJapan BRO IM CRYING RN RYOTARO AND MOTOMI ARE SO CUTE I CANT BELIEVE SHE DYED HER HAIR BLONDE TOO
...[truncated]...
1040: THEY ARE BOTH BLONDE OH MY GOD I LOVE THEM #LoveIsBlindJapan
...[truncated]...
1233: MOTOMI DYED HER HAIR LIGHT BROWN AFTER MARRYING RYOTARO OMG
THEY MATCH CUZ HE'S BACK TO BLONDE OMG
THEYRE SO CUTE I CANT HANDLE IT #LoveIsBlindJapan
...[truncated]...
1383: NO WAY THEY DID A THREE MONTHS LATER!!!!!!! this is amazing, because my ass was about to stalk each and every instagram page of the cast😭😭😭😭 #LoveIsBlindJapan
...[truncated]...
1483: I DIDN'T EXPECT MOTOMI TO DYE HER HAIR BLONDE OMG, ITS A YES TO ME
Im happy for her and Ryotaro 😭
#LoveIsBlindJapan
...[truncated]...
1569: HOLLLYYY SHIIIITTT!!! MIDORI AND WATARU BOTH SAID "I DO" 😳😳😳😅😅😅😅
DID NOT SEE THIS COMING, HOPED FOR THIS, BUT I WAS STILL DOUBTING MIDORI WASNT GONNA SAY YES. 👏👏👏👏
#LoveIsBlindJapan
...[truncated]...
1575: THEY BOTH SAID "I DO".....THIS IS NOT A DRILL....RYOTARO AND MOTOMI BOTH SAID "I DO" 🥳🥳🥳🥳🥳❤️❤️❤️❤️❤️🥺🥺🥺
#LoveIsBlindJapan https://t.co/AOic3Tc7GW
...[truncated]...
*Suggestion: Consider using `replace_kern`
==========
MISSPELLED
==========
The following observations contain potentially misspelled words:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10...[truncated]...
This issue affected the following text:
1: S01 | E05I've <<<<ju>>s>>t wa<<tch>>ed <<<<ep>>i>>so<<de>> <<Jus>>t <<th>>e T<<wo>> of Us of L<<ove>> is Bli<<nd>>: J<<ap>>an! #<<<<l<<ove>>isbli<<nd>>>><<<<ja>>p>>an>> <<<<ht<<tp>>>>s>>://t.co/<<lTQZT>>8<<XzdX>> #<<<<tv>><<t<<im>>>>e>> <<<<ht<<tp>>>>s>>://t.co/<<NZW>>6<<<<Dk>>FR>>3O
...[truncated]...
2: W<<hy>> do I <<fe>><<el>> li<<ke>> a lot of <<th>>e<<se>> peop<<le>> <<<<ju>>s>>t wan<<te>>d a fr<<ee>> h<<ol>>i<<da>>y? #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>>
...[truncated]...
3: I d<<ec>>i<<de>>d I n<<ee>>d a p<<al>>a<<te>> c<<le>>an<<se>>r f<<<<ro>>m>> <<th>>e US L<<ove>> is Bli<<nd>> S2 be<<fo>>re I wa<<tch>> S3, so I <<hav>>e b<<<<ee>>n>> <<re<<wa<<tch>>in>>g>> p<<ar>>ts of L<<ove>> is Bli<<nd>> J<<ap>>an.
<<<<<<Ryo>>ta>><<ro>>>> a<<nd>> <<Mot<<omi>>>> <<ar>>e still <<th>>e m<<os>>t ado<<ra>>b<<le>> peop<<le>> a<<nd>> c<<ou>>p<<le>>. #<<L<<ove>>I<<sB>>li<<nd>>>> #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>>
...[truncated]...
4: Are <<th>>ey <<al>>l <<ge<<tti>>n>>g <<e<<ng>>>><<ag>>ed <<th>>en?! 😳 #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>>
...[truncated]...
5: Not <<th>>e<<se>> g<<ro>><<wn>> a<<ss>> <<wo>>men <<<<cr>>yin>>g <<ove>>r a 23 ye<<ar>> of <<h<<ai>>>>r<<dr>>e<<ss>>er 😩 #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>>
...[truncated]...
6: <<Yud<<ai>>>> <<r<<e<<al>>>>l>>y t<<hi<<nk>>>>s <<th>>is is a g<<ame>> 😩 #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>>
...[truncated]...
7: <<<<Mido>><<ri>>>>-<<chan>> is <<gon>>e, my g<<irl>> is in l<<ove>> 😭 Sh<<e<<<<’>>s>>>> ac<<tu>><<al>>ly <<th>>e best fit <<fo>>r <<<<Wat>><<ar>>u>>, I t<<hi<<nk>>>>, <<b<<ec>>>><<<<au>><<se>>>> she c<<al>>ls h<<im>> <<ou>>t on his BS, but genuin<<el>>y <<acc>>e<<pts>> h<<im>> <<fo>>r who he is #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>>
...[truncated]...
8: <<Omg>> <<th>>e b<<ri>>dge <<th>><<i<<ng>>>> is <<be<<au>>ti<<fu>>>>l, <<th>>e A<<m<<e<<ri>>>>ca>>n <<<<ver>>s>>ion co<<uld>> <<ne>><<ver>>! #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>>
...[truncated]...
9: This p<<ro>>bably s<<ee>>ms so <<si>>lly but I'm annoyed <<th>>at no<<ne>> of <<th>>e #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>> cast <<mem>>bers <<hav>>e <<ver>>i<<f<<ie>>d>> <<Insta>>g<<ra>>m <<acc>>o<<un>>ts whi<<le>> N<<et>><<flix>> will ma<<ke>> s<<ur>>e to g<<et>> e<<ver>>y s<<i<<ng>>>><<le>> A<<m<<e<<ri>>>>ca>>n r<<e<<al>>>>i<<ty>> <<sho>>w cast <<mem>>ber a <<ver>>i<<f<<ie>>d>> <<Insta>>g<<ra>>m <<acc>>o<<un>>t.
...[truncated]...
10: Mi<<sr>>e<<pre>><<se>>nt<<i<<ng>>>> y<<ou>>r person<<al>>i<<ty>>, y<<ou>>r will<<i<<ng>>>><<<<ne>><<ss>>>> to <<hav>>e child<<ren>>, <<mis>>re<<pre>><<se>>nt<<i<<ng>>>> y<<ou>>r <<acc>><<ep>>tan<<ce>> of <<hav>><<i<<ng>>>> a <<wo>><<rk>><<i<<ng>>>> w<<i<<fe>>>>, <<mis>>re<<pre>><<se>>nt<<i<<ng>>>> y<<ou>>r <<jo>>b…? This is a <<sha>>mb<<<<le>>s>>. #<<<<L<<ove>>I<<sB>>li<<nd>>>>J<<ap>>an>>
...[truncated]...
*Suggestion: Consider running `hunspell::hunspell_find` & `hunspell::hunspell_suggest`
==========
NO ENDMARK
==========
The following observations contain elements with missing ending punctuation:
1, 2, 3, 4, 5, 6, 7, 8, 10, 11...[truncated]...
This issue affected the following text:
1: S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
...[truncated]...
2: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan.
Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
4: Are they all getting engaged then?! 😳 #LoveIsBlindJapan
...[truncated]...
5: Not these grown ass women crying over a 23 year of hairdresser 😩 #LoveIsBlindJapan
...[truncated]...
6: Yudai really thinks this is a game 😩 #LoveIsBlindJapan
...[truncated]...
7: Midori-chan is gone, my girl is in love 😭 She’s actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan
...[truncated]...
8: Omg the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan
...[truncated]...
10: Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job…? This is a shambles. #LoveIsBlindJapan
...[truncated]...
11: Shuntaro almost lost Ayano in the pods by not speaking up and it appears he learned nothing from that experience. The women are so direct and half of the men have misled them throughout the process. This is nuts. #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider cleaning the raw text or running `add_missing_endmark`
====================
NO SPACE AFTER COMMA
====================
The following observations contain commas with no space afterwards:
22, 35, 180, 248, 263, 428, 1321, 1472, 1486, 2140...[truncated]...
This issue affected the following text:
22: AND FUCK YUDAI,, hes literally so fucking weird for making the most hasty ass decision like u are proposing to someone, its not easy, u never even clarified to the rest of the women u are talking to what ur decision was?? he honestly treated the show like a joke #LoveIsBlindJapan
...[truncated]...
35: #LoveIsBlindJapan is so amazing👏its like a couples therapy for me n my bae..we discussed,argued n come terms w it def like a rollercoaster of emo we been through 🫂 we def cheered for Ryotaro & Motomi throughout d entire show,they're so genuine, def most ppl fav iktr💞 https://t.co/rcdueO8EJw
...[truncated]...
180: The concept of falling in love solely by their heart without other factor don’t work in Japan.They consider everything together.2 Couples who succeed in married have same emotional level,same lifestyle,good looking,strong financially and come from proper family.#LoveIsBlindJapan
...[truncated]...
248: Girl,.. just tell him you want him or you don't want him. STOP WASTING HIS TIME. #LoveIsBlindJapan
...[truncated]...
263: ‘Forget about Kenya,’ says Misaki, after bringing Kenya into every conversation he’s had in the last 4 episodes #LoveIsBlindJapan
...[truncated]...
428: Am I really myself if I don't google or twittersearch a show I'm watching,,I think not
#LoveIsBlindJapan
...[truncated]...
1321: #LoveIsBlindJapan
I had a good time:
- Your set was magical omg.
- Having little insights on Japanese ways, norms, values…loved every bit of it.
- Points for tourism 😉👍🏾
- Your cast; great pick. ( The ladies were a unique set 😂)
Well done to you all 👏🏾 ,and “Arigato.”
...[truncated]...
1472: I Love My Sixpack So Much,I Protect It With A Layer Of Fat.
#gg #LoveIsBlindJapan https://t.co/kAQtvCQnH0
...[truncated]...
1486: I’ve just finished watching the whole season.
I LOVED this show,
I hope the married couples will stay happily together forever 😭
#LoveIsBlindJapan
...[truncated]...
2140: If anyone loves you because of #MONEY ,the person truly #loves you.#FallingInLoveWithMe #LoveIsBlind #LoveIsBlindJapan #LoveIsColorBlind #MoneyTalks #moneytwitter https://t.co/JdslD1def9
...[truncated]...
*Suggestion: Consider running `add_comma_space`
=========
NON ASCII
=========
The following observations contain non-ASCII text:
3, 4, 5, 6, 7, 10, 13, 19, 20, 23...[truncated]...
This issue affected the following text:
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan.
Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
4: Are they all getting engaged then?! 😳 #LoveIsBlindJapan
...[truncated]...
5: Not these grown ass women crying over a 23 year of hairdresser 😩 #LoveIsBlindJapan
...[truncated]...
6: Yudai really thinks this is a game 😩 #LoveIsBlindJapan
...[truncated]...
7: Midori-chan is gone, my girl is in love 😭 She’s actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan
...[truncated]...
10: Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job…? This is a shambles. #LoveIsBlindJapan
...[truncated]...
13: I dunno if it’s having to read the words that is making it sink in more but how is it that the men don’t think of the future much compared to the women? #LoveIsBlindJapan
...[truncated]...
19: Really sad and confused at why Odacchi changed so much after the pods, I was rooting for him and Nanako so much. 💔 #LoveIsBlindJapan
...[truncated]...
20: The genuine connections, the sincerity they have when they speak to each other. The romance, the LOVE. 💚 #LoveIsBlindJapan is one of the best things Netflix has ever given me.
...[truncated]...
23: what is wrong with ayano’s hair…like why is no one telling her it looks like that????? #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider running `replace_non_ascii`
==================
NON SPLIT SENTENCE
==================
The following observations contain unsplit sentences (more than one sentence per element):
1, 2, 3, 4, 8, 10, 11, 12, 13, 15...[truncated]...
This issue affected the following text:
1: S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
...[truncated]...
2: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan.
Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
4: Are they all getting engaged then?! 😳 #LoveIsBlindJapan
...[truncated]...
8: Omg the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan
...[truncated]...
10: Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job…? This is a shambles. #LoveIsBlindJapan
...[truncated]...
11: Shuntaro almost lost Ayano in the pods by not speaking up and it appears he learned nothing from that experience. The women are so direct and half of the men have misled them throughout the process. This is nuts. #LoveIsBlindJapan
...[truncated]...
12: S01 | E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan https://t.co/ls7SHX5c4u #tvtime https://t.co/m2n3p8O0xt
...[truncated]...
13: I dunno if it’s having to read the words that is making it sink in more but how is it that the men don’t think of the future much compared to the women? #LoveIsBlindJapan
...[truncated]...
15: Ryotaro and Motomi are perfect together, their love for each other is easy. #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider running `textshape::split_sentence`
===
TAG
===
The following observations contain Twitter style handle tags (e.g., @trinker):
61, 72, 77, 93, 105, 113, 185, 294, 321, 375...[truncated]...
This issue affected the following text:
61: If you like #LoveIsBlind you MUST watch #LoveIsBlindJapan it’s sooooooooo sweet 💕you will get past having to read subtitles by the end of the 1st ep. Trust me it’s unbelievable. @laurenlapkus I think u would like it! 🤗
...[truncated]...
72: @suxelamai Lord I don’t think this show for me
I know these love reality shows are a mess but I feel like they are doomed from the start
It seems the “matchmakers” purposely sabotage the marching for ratings I’m waiting for something like this but more wholesome 🥲 like #LoveisBlindJapan
...[truncated]...
77: @qwhee_in lol nah, just bc they didn’t last doesn’t mean it wasn’t real. They had a great connection but in the end they both have things to work thru. I appreciate their honesty in breaking the engagement, but I still think they could’ve worked if they tried harder #LoveIsBlindJapan
...[truncated]...
93: @teysixeight1 Minami, you don't even need a big reason as to why you want to work/keep your job. Any man that wants you to value his career over yours on the basis of gender roles does not deserve you. #LoveIsBlindJapan
...[truncated]...
105: @itsamandared Just watching this episode now and that’s why I came here! I knew it couldn’t just be me seeing the gaslighting. He did a 180 on sharing the household/childcare duties & enjoying her “quirks”… 😡 Minami, you deserve better. #wegotyou #LoveIsBlindJapan 💜
...[truncated]...
113: I hope to find someone who makes me laugh the way @odaccii made me laugh watching Love is Blind Japan, literally the best one I’ve watched! #LoveIsBlindJapan
...[truncated]...
185: It’s always a good time with @pstinny at the hosting helm! Catch me on the latest episode of The Stream Team talking abt my latest #streaming faves #LoveIsBlindJapan 🇯🇵 and #Zola! https://t.co/uHjAe1GQMm
...[truncated]...
294: Up here watching @loveisblind #Japan #LoveIsBlindJapan and this dude #Mori started doing the Rerun Stubbs after getting engaged 😎
#Poppin who would’ve thought an MD from Japan to pull that out of his pocket #DoThatAtTheWedding 😊❤️@netflix
...[truncated]...
321: Hot couple, the end 👀 @themoonforces #LoveIsBlindJapan
...[truncated]...
375: @nikillinit One of the #LoveIsBlindJapan contestants made a PPT about why this guy should marry her.
Think you’re onto something.
...[truncated]...
*Suggestion: Consider using `qdapRegex::ex_tag' (to capture meta-data) and/or `replace_tag`
====
TIME
====
The following observations contain timestamps:
722, 3696, 3879, 4156, 4542, 4712, 6331, 7286, 7489, 8178...[truncated]...
This issue affected the following text:
722: i’m almost at that part where Priya will find out Mizuki lied about his job and income, i’m already so embarrassed omg #LoveIsBlindJapan Ep9 08:33mn https://t.co/TV9yVo1Gru
...[truncated]...
3696: It’s 3:45am, it’s my Saturday, I’m watching Love is Blind: Japan, and I have a 3.5 lb. box of taquitos.
Life is good.
...[truncated]...
3879: [💬] 3:32pm KST
• cleaning, building space for monroe, watching ‘love is blind : japan’ and boYYYYYY IS IT GETTING INTERESTING
...[truncated]...
4156: @san_dogukan @Tuneflix1 if you would please help me identify the music in Love is Blind Japan season 1 ep.1 from 33:36 to 35:30 Please 🥺🙏
...[truncated]...
4542: promised myself that at 9:15 I was going to stop working and take a break and watch love is blind japan before bed so here is my public accountability tweet that I stopped very close to 9:15 and am going to shut off my computer now
...[truncated]...
4712: Stayed up until 3:30 this morning watching Love is Blind: Japan 😭
...[truncated]...
6331: @strongbags116 😂 It’s 1:30 am, I’m a bottle of wine down and watching Love is Blind Japan. What am I doing? Thankfully it’s a long weekend.
...[truncated]...
7286: watching Love Is Blind Japan at 12:27am it’s lit folks
...[truncated]...
7489: Netflix really is giving me whiplash going from Japan Love is Blind where the asshole men to kind men is a 1:10 ratio and the US Love is Blind where it’s the exact reverse
...[truncated]...
8178: Episode 2 , 52:32
Midori tells Wataru (English caption) “ You like to switch to English to show off, and it makes me want to tease you.”
...[truncated]...
*Suggestion: Consider using `replace_time`
===
URL
===
The following observations contain URLs:
1, 12, 21, 24, 25, 27, 28, 29, 34, 35...[truncated]...
This issue affected the following text:
1: S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
...[truncated]...
12: S01 | E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan https://t.co/ls7SHX5c4u #tvtime https://t.co/m2n3p8O0xt
...[truncated]...
21: Why didn't anyone tell me #LoveIsBlindJapan was this good? Omg https://t.co/xY450z8soT
...[truncated]...
24: first 20 mins in and ik this mf did not just say that the kitchen is a woman’s domain……. BOOOO TOMATO TOMATO #LoveIsBlindJapan https://t.co/LERT8R6fU5
...[truncated]...
25: #loveisblindjapan was an interesting netflix show. apparently some of the japan members are still doing streams #netflix
https://t.co/h73ZncPOox
...[truncated]...
27: 🥰🥰 I still love the #LoveIsBlindJapan cast so much and what great news: Midori and Wataru are expecting baby Mitaru! Aw they'll be new parents next year 🥹
https://t.co/4sp722eIkz
#LoveIsBlind
#loveisblindjapan https://t.co/DkjCo9baRJ
...[truncated]...
28: Continuing my Netflix journey and so impressed with #LoveIsBlindJapan which feels completely removed from Love Is Blind: USA — Japan’s spinoff is patient, emotional, sincere, vulnerable, romantic. Maybe it’s a cultural difference, here people are literally there for love. https://t.co/i8G80qGbe3
...[truncated]...
29: I voice #Nana in the #EnglishDub of #LoveIsBlindJapan 💜
A huge thank you to the marvelous Mimi for this amazing opportunity
🙏🥰
Cheers to many more 😘
#flashbackfriday https://t.co/9I9yFP606f
...[truncated]...
34: I love this show, & I’m thrilled to be a part of it! I’m honored to be the English-speaking voice for Ryotaro in #LoveIsBlindJapan. He is such a good & kind soul. I wish the best for him & Motomi! Everyone, pls go watch their story on Netflix. Do it for love! 😄❤️😍🥰🙏 https://t.co/4QgIv5YSOk
...[truncated]...
35: #LoveIsBlindJapan is so amazing👏its like a couples therapy for me n my bae..we discussed,argued n come terms w it def like a rollercoaster of emo we been through 🫂 we def cheered for Ryotaro & Motomi throughout d entire show,they're so genuine, def most ppl fav iktr💞 https://t.co/rcdueO8EJw
...[truncated]...
*Suggestion: Consider using `replace_url`
We can see here that there are multiple issues with the text that I pulled. What I like about this package is that it also suggests functions to fix each of these issues. The first thing I’ll try is the “replace_internet_slang” function.
[1] "S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O"
[2] "Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan"
[3] "I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan. \nRyotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan"
[4] "Are they all getting engaged then?! 😳 #LoveIsBlindJapan"
[5] "Not these grown ass women crying over a 23 year of hairdresser 😩 #LoveIsBlindJapan"
[6] "Yudai really thinks this is a game 😩 #LoveIsBlindJapan"
[7] "Midori-chan is gone, my girl is in love 😭 She’s actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan"
[8] "oh my god the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan"
[9] "This probably seems so silly but I'm annoyed that none of the #LoveIsBlindJapan cast members have verified Instagram accounts while Netflix will make sure to get every single American reality show cast member a verified Instagram account."
[10] "Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job…? This is a shambles. #LoveIsBlindJapan"
[11] "Shuntaro almost lost Ayano in the pods by not speaking up and it appears he learned nothing from that experience. The women are so direct and half of the men have misled them throughout the process. This is nuts. #LoveIsBlindJapan"
[12] "S01 | E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan https://t.co/ls7SHX5c4u #tvtime https://t.co/m2n3p8O0xt"
[13] "I dunno if it’s having to read the words that is making it sink in more but how is it that the men don’t think of the future much compared to the women? #LoveIsBlindJapan"
[14] "I'm crying laughing, crying and laughing again so cute #LoveIsBlindJapan"
[15] "Ryotaro and Motomi are perfect together, their love for each other is easy. #LoveIsBlindJapan"
[16] "My favourite thing about Midori and Wataru's relationship is they reassure each other. When the other isn't sure and is feeling insecurities about the relationship the other takes charge and make their intention clear. #LoveIsBlindJapan"
[17] "I really appreciate the rawness of #LoveIsBlindJapan, the difficulty in navigating living together after falling in love in the pods is so honest. The show is also directed so well, the stories are told so beautifully."
[18] "Very much in love with Midori, her earnest nature and kindness. She's courageous in the way she feels so openly and says what she's thinking - always with kindness. Lover her! #LoveIsBlindJapan"
[19] "Really sad and confused at why Odacchi changed so much after the pods, I was rooting for him and Nanako so much. 💔 #LoveIsBlindJapan"
[20] "The genuine connections, the sincerity they have when they speak to each other. The romance, the LOVE. 💚 #LoveIsBlindJapan is one of the best things Netflix has ever given me."
This worked well: it changed slang words like “ppl” to “people”, and “Omg” in tweet 8 became “oh my god”. This makes me quite happy.
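A minimal sketch of the slang step described above, assuming the tweets are held in a plain character vector. The vector `tweets` and its two sample lines are hypothetical stand-ins for the real data; `replace_internet_slang()` is the textclean function the step refers to.

```r
library(textclean)

# Hypothetical sample; the real input is the column of downloaded tweets.
tweets <- c(
  "Omg the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan",
  "Yudai really thinks this is a game #LoveIsBlindJapan"
)

# Expand internet slang using textclean's built-in lookup table.
tweets_slang <- replace_internet_slang(tweets)
tweets_slang
```

Printing the first few rows before and after, as done above, is a quick way to spot replacements that change the meaning of a tweet.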
Next I’ll run “replace_date”, “replace_kern” (which collapses manually spaced-out text, such as “A M A Z I N G”, into “AMAZING”), “replace_curly_quote”, “replace_word_elongation” (if someone writes “woooah” it changes it to “woah”), and “replace_contraction”.
Warning in as.data.table.list(x, keep.rownames = keep.rownames, check.names
= check.names, : Item 2 has 13 rows but longest item has 31; recycled with
remainder.
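The chain of replacements above could be sketched roughly as follows. This is an illustration under assumptions, not the exact code that produced the warning: the vector `tweets` and its sample lines are hypothetical, and the intermediate names (`step_date` and so on) exist only so that each step can be inspected on its own.

```r
library(textclean)

# Hypothetical sample lines, not the real data set.
tweets <- c(
  "She isn't sure, but I'm rooting for them #LoveIsBlindJapan",
  "They got married on 01/11/2022 and the finale was A M A Z I N G!"
)

# Apply one replacement at a time so a warning can be traced to its step.
step_date    <- replace_date(tweets)                # written-out dates
step_kern    <- replace_kern(step_date)             # manually spaced-out words
step_quote   <- replace_curly_quote(step_kern)      # curly quotes to straight
step_elong   <- replace_word_elongation(step_quote) # stretched-out words
tweets_clean <- replace_contraction(step_elong)     # "isn't" to "is not"

tweets_clean
```

Keeping the intermediate vectors around makes it possible to compare, for example, `step_date` with `step_kern` row by row when a step returns something unexpected.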
I also want to remove emojis. In the DACSS Slack channel I found someone who was looking for similar information and had been given an answer! Below you can see the text with the emojis removed.
[1] "Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan|Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan"
[2] "Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan"
[3] "I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan. Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan"
[4] "Are they all getting engaged then?! #LoveIsBlindJapan"
[5] "Not these grown ass women crying over a 23 year of hairdresser #LoveIsBlindJapan"
[6] "Yudai really thinks this is a game #LoveIsBlindJapan"
[7] "Midori-chan is gone, my girl is in love She's actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan"
[8] "oh my god the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan"
[9] "This probably seems so silly but I'm annoyed that none of the #LoveIsBlindJapan cast members have verified Instagram accounts while Netflix will make sure to get every single American reality show cast member a verified Instagram account."
[10] "Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job? This is a shambles. #LoveIsBlindJapan"
[11] "Shuntaro almost lost Ayano in the pods by not speaking up and it appears he learned nothing from that experience. The women are so direct and half of the men have misled them throughout the process. This is nuts. #LoveIsBlindJapan"
[12] "Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan| E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan https://t.co/ls7SHX5c4u #tvtime https://t.co/m2n3p8O0xt"
[13] "I dunno if it's having to read the words that is making it sink in more but how is it that the men don't think of the future much compared to the women? #LoveIsBlindJapan"
[14] "I'm crying laughing, crying and laughing again so cute #LoveIsBlindJapan"
[15] "Ryotaro and Motomi are perfect together, their love for each other is easy. #LoveIsBlindJapan"
[16] "My favourite thing about Midori and Wataru's relationship is they reassure each other. When the other isn't sure and is feeling insecurities about the relationship the other takes charge and make their intention clear. #LoveIsBlindJapan"
[17] "I really appreciate the rawness of #LoveIsBlindJapan, the difficulty in navigating living together after falling in love in the pods is so honest. The show is also directed so well, the stories are told so beautifully."
[18] "Very much in love with Midori, her earnest nature and kindness. She's courageous in the way she feels so openly and says what she's thinking - always with kindness. Lover her! #LoveIsBlindJapan"
[19] "Really sad and confused at why Odacchi changed so much after the pods, I was rooting for him and Nanako so much. #LoveIsBlindJapan"
[20] "The genuine connections, the sincerity they have when they speak to each other. The romance, the LOVE. #LoveIsBlindJapan is one of the best things Netflix has ever given me."
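One way to get a result like the one above is to drop every non-ASCII character. This is a sketch under that assumption and not necessarily the answer from the Slack channel; the helper `remove_non_ascii()` and the sample vector are hypothetical names. Note that this approach also removes other non-ASCII characters, such as “…”, not only emojis.

```r
# Hypothetical sample; "\U0001F633" and "\U0001F629" are emoji code points.
tweets <- c(
  "Are they all getting engaged then?! \U0001F633 #LoveIsBlindJapan",
  "Yudai really thinks this is a game \U0001F629 #LoveIsBlindJapan"
)

# Drop every character outside the ASCII range, then tidy leftover spaces.
remove_non_ascii <- function(x) {
  x <- gsub("[^\\x01-\\x7F]", "", x, perl = TRUE)
  trimws(gsub("\\s+", " ", x, perl = TRUE))
}

tweets_no_emoji <- remove_non_ascii(tweets)
tweets_no_emoji
```

The `check_text` report earlier also suggests textclean's `replace_non_ascii` for this issue, which may be worth comparing against this base R version.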
There are a few links that I believe mostly lead to YouTube clips of the show or GIFs. I would like to remove those as they don’t add to my analysis. I followed this answer.
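A minimal base R sketch of the link removal, assuming every link starts with `http://` or `https://` (true of the `t.co` links in these tweets). The helper `remove_urls()` is a hypothetical name, and the pattern is a simplification rather than a full URL parser.

```r
# Hypothetical sample built from the kind of tweets shown earlier.
tweets <- c(
  "Why didn't anyone tell me #LoveIsBlindJapan was this good? oh my god https://t.co/xY450z8soT",
  "Ryotaro and Motomi are perfect together. #LoveIsBlindJapan"
)

# Remove anything that looks like a link, then tidy leftover spaces.
remove_urls <- function(x) {
  x <- gsub("https?://\\S+", "", x, perl = TRUE)
  trimws(gsub("\\s+", " ", x, perl = TRUE))
}

tweets_no_url <- remove_urls(tweets)
tweets_no_url
```

textclean's `replace_url`, suggested in the earlier report, is an alternative. Re-running `check_text` afterwards, as in the report that follows, shows which issues remain.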
===========
CONTRACTION
===========
The following observations contain contractions:
7, 9, 12, 13, 14, 16, 18, 21, 27, 28...[truncated]...
This issue affected the following text:
7: Midori-chan is gone, my girl is in love She's actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan
...[truncated]...
9: This probably seems so silly but I'm annoyed that none of the #LoveIsBlindJapan cast members have verified Instagram accounts while Netflix will make sure to get every single American reality show cast member a verified Instagram account.
...[truncated]...
12: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan| E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan
...[truncated]...
13: I dunno if it's having to read the words that is making it sink in more but how is it that the men don't think of the future much compared to the women? #LoveIsBlindJapan
...[truncated]...
14: I'm crying laughing, crying and laughing again so cute #LoveIsBlindJapan
...[truncated]...
16: My favourite thing about Midori and Wataru's relationship is they reassure each other. When the other isn't sure and is feeling insecurities about the relationship the other takes charge and make their intention clear. #LoveIsBlindJapan
...[truncated]...
18: Very much in love with Midori, her earnest nature and kindness. She's courageous in the way she feels so openly and says what she's thinking - always with kindness. Lover her! #LoveIsBlindJapan
...[truncated]...
21: Why didn't anyone tell me #LoveIsBlindJapan was this good? oh my god
...[truncated]...
27: I still love the #LoveIsBlindJapan cast so much and what great news: Midori and Wataru are expecting baby Mitaru! Aw they'll be new parents next year
...[truncated]...
28: Continuing my Netflix journey and so impressed with #LoveIsBlindJapan which feels completely removed from Love Is Blind: USA Japan's spinoff is patient, emotional, sincere, vulnerable, romantic. Maybe it's a cultural difference, here people are literally there for love.
...[truncated]...
*Suggestion: Consider running `replace_contraction`
====
DATE
====
The following observations contain dates:
725, 5873, 11583, 23561, 23639, 31539, 31617, 39517, 39595, 47495...[truncated]...
This issue affected the following text:
725: motomi and ryoutaro got officially married on 1/11/22 oh I didn't think a love is blind couple would be better than cam and lauren for me BUT HERE WE ARE #LoveIsBlindJapan
...[truncated]...
5873: My netflix wanted me to watch an episode of Love is Blind:Japan. Watched until ep.10. When will i ever learn
...[truncated]...
11583: I just rewatched ep 2 and realized it is Motomi too. There is a zoom in her face at 46.24.
...[truncated]...
23561: motomi and ryoutaro got officially married on 1/11/22 oh I didn't think a love is blind couple would be better than cam and lauren for me BUT HERE WE ARE #LoveIsBlindJapan
...[truncated]...
23639: My netflix wanted me to watch an episode of Love is Blind:Japan. Watched until ep.10. When will i ever learn
...[truncated]...
31539: motomi and ryoutaro got officially married on 1/11/22 oh I didn't think a love is blind couple would be better than cam and lauren for me BUT HERE WE ARE #LoveIsBlindJapan
...[truncated]...
31617: My netflix wanted me to watch an episode of Love is Blind:Japan. Watched until ep.10. When will i ever learn
...[truncated]...
39517: motomi and ryoutaro got officially married on 1/11/22 oh I didn't think a love is blind couple would be better than cam and lauren for me BUT HERE WE ARE #LoveIsBlindJapan
...[truncated]...
39595: My netflix wanted me to watch an episode of Love is Blind:Japan. Watched until ep.10. When will i ever learn
...[truncated]...
47495: motomi and ryoutaro got officially married on 1/11/22 oh I didn't think a love is blind couple would be better than cam and lauren for me BUT HERE WE ARE #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider running `replace date`
=====
DIGIT
=====
The following observations contain digits/numbers:
3, 5, 12, 39, 52, 65, 67, 73, 76, 82...[truncated]...
This issue affected the following text:
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan. Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
5: Not these grown ass women crying over a 23 year of hairdresser #LoveIsBlindJapan
...[truncated]...
12: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan| E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan
...[truncated]...
39: I finished watching #LoveisBlindJapan first what a beautiful setting. 2nd no cussing from anyone.3rd when couples broke it off they were so polite so graceful. Like no yelling no screaming. Just a polite it is over. This couple is totally adorable from start to end of show.
...[truncated]...
52: also.. up to this point (ep 3) i think the best and most genuine couple is Ryotaro and Motomi? like i don't know i just love how they are with ea/o. and the dude really seems like an honest good guy.. even around the males he always tried to turn negative into positive. #LoveIsBlindJapan
...[truncated]...
65: Motomi and Ryotaro in the episode 10 just made me feel so emotional. They are just so good together #LoveIsBlindJapan
...[truncated]...
67: okay why does this 56 year old man have so much game #loveisblindjapan
...[truncated]...
73: Love Is Blind: Japan is so good! Episode 1 is like a romance novel come to life . #LoveIsBlindJapan
...[truncated]...
76: 2022 Watch List Status CW #Tomorrow 4-5 #LoveIsBlindJapan 3-4 #UnexpectedBusiness2 4-5 #ItsBeautifulNow 5-6 Completed #FishbowlWives
...[truncated]...
82: 1 episode into #LoveIsBlindJapan and Nanako and Odacchi already made me cry and them meeting was the sweetest thing ever
...[truncated]...
*Suggestion: Consider using `replace_number`
========
EMOTICON
========
The following observations contain emoticons:
11, 27, 44, 50, 64, 76, 79, 123, 125, 138...[truncated]...
This issue affected the following text:
11: Shuntaro almost lost Ayano in the pods by not speaking up and it appears he learned nothing from that experience. The women are so direct and half of the men have misled them throughout the process. This is nuts. #LoveIsBlindJapan
...[truncated]...
27: I still love the #LoveIsBlindJapan cast so much and what great news: Midori and Wataru are expecting baby Mitaru! Aw they'll be new parents next year
...[truncated]...
44: Finished watching both the Seasons of the @LoveisBlindShow & #LoveIsBlindBrazil & #LoveIsBlindJapan. Waiting for their next season
...[truncated]...
50: also her voice is cute but her laughter is a bit extra? i don't know if she's exaggerating it but it can sound a bit annoying? #LoveIsBlindJapan
...[truncated]...
64: #loveisblindjapan was actually really wholesome and not as big of a shitshow as i was expecting based on its american predecessor. i'm actually pretty floored by how sweet it was and how many tears i shed.
...[truncated]...
76: 2022 Watch List Status CW #Tomorrow 4-5 #LoveIsBlindJapan 3-4 #UnexpectedBusiness2 4-5 #ItsBeautifulNow 5-6 Completed #FishbowlWives
...[truncated]...
79: #LoveIsBlindJapan is such an honest take on the experiment
...[truncated]...
123: Not Wataru crying in the pod when Midori confesses her strong feelings for him!! I wasn't expecting that! I think, as much as he clicks with Priya, he's gonna choose Midori#LoveIsBlindJapan
...[truncated]...
125: I love the way they did #LoveIsBlindJapan Especially how they made it clear what the other should expect at the alter so as not to embarrass them in front of family
...[truncated]...
138: Currently watching #loveisblindjapan and gosh it's so wholesome. Didn't expect to cry this much
...[truncated]...
*Suggestion: Consider using `replace_emoticons`
=====
EMPTY
=====
The following observations contain empty text cells (all white space):
1
This issue affected the following text:
1: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan|Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
*Suggestion: Consider running `drop_empty_row`
=======
ESCAPED
=======
The following observations contain escaped back spaced characters:
8666, 9341, 10079, 11029, 12987, 13183, 13186, 13577, 14620
This issue affected the following text:
8666: \* Panru = Lupin (sometimes to be cutesy, you reverse syllabic order of words/names, so Rupan/Lupin = Panru...) Okay this is fascinating, thank you for including this in your translation!
9341: ;\_\_\_; this made the week for me. So happy for them and in this era of superficial romances, they really are one in a million to develop from a blind dating couple to official husband and wife. Congratulations!! (I tried to watch the US version S2 and all the personalities are so insufferable...)
10079: great post! Also i dont know if this was me but i felt like the pressure was more put on the \*women\* to keep the men intrested in them? thats the vibe i got from it at least
11029: laughing out loud I was also obsessed. I want this shortsleeve/longsleeve convertible hoodie!\]
12987: \>Her friends definitely speaks more fluent english that him. &#x200B; Pretty sure her friends are American. If not they have spent their college years abroad. She said they were college friends, and she went to school in the US.
13183: Anyone else got some f\*boy vibes with tattoo guy?
13186: Welp Wataru really dragged things on and led folks on. \*shrug
13577: You might be confusing \*polite\* with \*nice\*
14620: new couch (brown leather)\*\*\*
*Suggestion: Consider using `replace_white`
====
HASH
====
The following observations contain Twitter style hash tags (e.g., #rstats):
1, 2, 3, 4, 5, 6, 7, 8, 9, 10...[truncated]...
This issue affected the following text:
1: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan|Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
2: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan. Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
4: Are they all getting engaged then?! #LoveIsBlindJapan
...[truncated]...
5: Not these grown ass women crying over a 23 year of hairdresser #LoveIsBlindJapan
...[truncated]...
6: Yudai really thinks this is a game #LoveIsBlindJapan
...[truncated]...
7: Midori-chan is gone, my girl is in love She's actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan
...[truncated]...
8: oh my god the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan
...[truncated]...
9: This probably seems so silly but I'm annoyed that none of the #LoveIsBlindJapan cast members have verified Instagram accounts while Netflix will make sure to get every single American reality show cast member a verified Instagram account.
...[truncated]...
10: Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job? This is a shambles. #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider using `qdapRegex::ex_tag' (to capture meta-data) and/or replace_hash
====
HTML
====
The following observations contain HTML markup:
34, 35, 44, 105, 150, 152, 155, 190, 256, 262...[truncated]...
This issue affected the following text:
34: I love this show, & I'm thrilled to be a part of it! I'm honored to be the English-speaking voice for Ryotaro in #LoveIsBlindJapan. He is such a good & kind soul. I wish the best for him & Motomi! Everyone, pls go watch their story on Netflix. Do it for love!
...[truncated]...
35: #LoveIsBlindJapan is so amazingits like a couples therapy for me n my bae..we discussed,argued n come terms w it def like a rollercoaster of emo we been through we def cheered for Ryotaro & Motomi throughout d entire show,they're so genuine, def most people fav iktr
...[truncated]...
44: Finished watching both the Seasons of the @LoveisBlindShow & #LoveIsBlindBrazil & #LoveIsBlindJapan. Waiting for their next season
...[truncated]...
105: @itsamandared Just watching this episode now and that's why I came here! I knew it couldn't just be me seeing the gaslighting. He did a 180 on sharing the household/childcare duties & enjoying her "quirks" Minami, you deserve better. #wegotyou #LoveIsBlindJapan
...[truncated]...
150: This girl is balancing in this balance ball like it's nothing & super easy & her feet are like 6 inches off the floor. I would fall right off. #LoveIsBlindJapan
...[truncated]...
152: They have earphones & it seems that without the earphones they can't hear each other at all. #LoveIsBlindJapan
...[truncated]...
155: I watched #LoveisBlindJapan and it's so refreshing & wholesome compared to the US one Glad the 2 couples pulled through and acc got married aw
...[truncated]...
190: Although I am still in "let's cancel the Netflix acct" negotiations, #LoveIsBlindJapan was good. I think there was a lot lost in translation re level of affection shown & exactly what happened w. Minami and Mori. Also, the best looking, most stylish guy is a fiancee's dad.
...[truncated]...
256: I finally finished #LoveIsBlindJapan and I'm SO happy my fav/cutest couple (Ryotaro and Motomi) made it to the end!!! I WAS rooting for Mori & Minami BUT I'm glad she chose her career since he just wanted a housewife
...[truncated]...
262: The Japanese version is a times better than the American one! They have respect, they are sincere, they are more inclusive in looks, age & situation, and they just ask better questions and are more reflected. #LoveIsBlind #LoveIsBlindJapan Also the visuals!!!
...[truncated]...
*Suggestion: Consider running `replace_html`
==========
INCOMPLETE
==========
The following observations contain incomplete sentences (e.g., uses ending punctuation like '...'):
35, 48, 52, 89, 97, 133, 148, 200, 203, 212...[truncated]...
This issue affected the following text:
35: #LoveIsBlindJapan is so amazingits like a couples therapy for me n my bae..we discussed,argued n come terms w it def like a rollercoaster of emo we been through we def cheered for Ryotaro & Motomi throughout d entire show,they're so genuine, def most people fav iktr
...[truncated]...
48: compared to #loveisblindjapan the American version is pretty vulgar and kinda... rude exactly what I thought it would be all the talk about sex
...[truncated]...
52: also.. up to this point (ep 3) i think the best and most genuine couple is Ryotaro and Motomi? like i don't know i just love how they are with ea/o. and the dude really seems like an honest good guy.. even around the males he always tried to turn negative into positive. #LoveIsBlindJapan
...[truncated]...
89: If Midori does not marry Wataru after all of those aggresively intense workouts she had him doing... #loveisblind #loveisblindJapan
...[truncated]...
97: Sometimes I think about that one couple on #LoveIsBlindJapan that got together because she prepared a powerpoint presentation on why he should date her ... and he felt very convinced. Apparently, they're still married.
...[truncated]...
133: Just finished #LoveIsBlindJapan...they took just ALL the way through it!!
...[truncated]...
148: Probz gonna binge watch Love is Blind Brazil after LIB Japan... I heard there is gonna be a huge culture shock esp re PDA or skinship. I already knew beforehand there is for sure a significant difference there So, I am expecting it. #LoveIsBlindJapan #LoveIsBlindBrazil
...[truncated]...
200: @BrightlyAgain #LoveAfterLockup Indie needs to watch a few episodes of #LoveIsBlindJapan... These women didn't come to play...first red flag, they're like...
...[truncated]...
203: Yudai is so cute but... as someone who is also 23 years old I think that's too young to go on a dating/marriage show. I worry he might not be ready/mature enough. #LoveIsBlindJapan
...[truncated]...
212: I don't realize how hard it is to know the soundtrack of episode 11 of Love is Blind: Japan... The songs in that finale episode were GOOD..... But it is hard looking for the song titles when you barely hear the lyrics #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider using `replace_incomplete`
====
KERN
====
The following observations contain kerning (e.g., 'The B O M B!'):
170, 801, 942, 1006, 1040, 1233, 1383, 1483, 1575, 1930...[truncated]...
This issue affected the following text:
170: oh my god the 56 year olds backstory not his previous love passing away oh my god. HE BETTER FIND LOVE OR IMMA CRY I SWEAR #LoveIsBlindJapan
...[truncated]...
801: PRIYA IS ONE HECK OF A WOMAN !! she is fierce, straightforward, ambitious, she's a boss lady !! homegirl owns a CBD skincare company, can train elephants and was Miss Japan 2016 pls ain't nobody matching her energy, she deserves a rare and good man #LoveIsBlindJapan Ep7
...[truncated]...
942: RYOTOMI SHOWING UP TO THE AFTERSHOW BOTH WITH BRIGHTLY COLORED HAIR AND RYOTARO SAYING HE REALLY APPRECIATES COMING HOME TO HER AND HER COOKING AND MOTOMI SAYING HE SAYS HER FOOD IS GOOD EVERY 3 SECONDS AND THEY PROBABLY WORKED IN A FIELD TOGETHER IN A PAST LIFE #LoveIsBlindJapan
...[truncated]...
1006: #LoveIsBlindJapan BRO IM CRYING right now RYOTARO AND MOTOMI ARE SO CUTE I CANT BELIEVE SHE DYED HER HAIR BLONDE TOO
...[truncated]...
1040: THEY ARE BOTH BLONDE OH MY GOD I LOVE THEM #LoveIsBlindJapan
...[truncated]...
1233: MOTOMI DYED HER HAIR LIGHT BROWN AFTER MARRYING RYOTARO oh my god THEY MATCH CUZ HE'S BACK TO BLONDE oh my god THEYRE SO CUTE I CANT HANDLE IT #LoveIsBlindJapan
...[truncated]...
1383: NO WAY THEY DID A THREE MONTHS LATER!!!!!!! this is amazing, because my ass was about to stalk each and every instagram page of the cast #LoveIsBlindJapan
...[truncated]...
1483: I DIDN'T EXPECT MOTOMI TO DYE HER HAIR BLONDE oh my god, ITS A YES TO ME Im happy for her and Ryotaro #LoveIsBlindJapan
...[truncated]...
1575: THEY BOTH SAID "I DO".....THIS IS NOT A DRILL....RYOTARO AND MOTOMI BOTH SAID "I DO" #LoveIsBlindJapan
...[truncated]...
1930: IF I LOST MY SIGHT I WOULD STILL BE IN LOVR WITH YOU #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider using `replace_kern`
=============
MISSING VALUE
=============
The following observations contain missing values:
24, 26, 45, 53, 61, 95, 104, 106, 127, 132...[truncated]...
*Suggestion: Consider running `drop_NA`
==========
MISSPELLED
==========
The following observations contain potentially misspelled words:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10...[truncated]...
This issue affected the following text:
1: Why do I fe<<el>> li<<ke>> a lot of <<th>>e<<se>> peop<<le>> <<jus>>t wan<<te>>d a free h<<ol>>i<<da>>y? #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>|Why do I fe<<el>> li<<ke>> a lot of <<th>>e<<se>> peop<<le>> <<jus>>t wan<<te>>d a free h<<ol>>i<<da>>y? #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
2: Why do I fe<<el>> li<<ke>> a lot of <<th>>e<<se>> peop<<le>> <<jus>>t wan<<te>>d a free h<<ol>>i<<da>>y? #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
3: I <<de>>ci<<de>>d I <<ne>>ed a p<<al>>a<<te>> c<<le>>an<<se>>r f<<<<ro>>m>> <<th>>e US L<<ove>> is Bli<<nd>> S2 be<<fo>>re I wa<<tch>> S3, so I have b<<een>> <<re<<wa<<tch>>in>>g>> p<<ar>>ts of L<<ove>> is Bli<<nd>> Japan. <<<<<<Ryo>>ta>><<ro>>>> a<<nd>> <<Mot<<omi>>>> <<ar>>e still <<th>>e m<<os>>t ado<<ra>>b<<le>> peop<<le>> a<<nd>> c<<ou>>p<<le>>. #<<L<<ove>><<IsBli<<nd>>>>>> #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
4: Are <<th>>ey <<al>>l <<g<<et>>tin>>g <<eng>><<ag>>ed <<th>>en?! #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
5: Not <<th>>e<<se>> g<<ro>>wn a<<ss>> <<wo>>men <<cryin>>g <<ove>>r a 23 ye<<ar>> of <<h<<ai>>>>r<<dr>>e<<ss>>er #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
6: <<Yud<<ai>>>> <<r<<e<<al>>>>l>>y t<<hi<<nk>>>>s <<th>>is is a g<<ame>> #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
7: <<<<Mido>><<ri>>>>-<<c<<han>>>> is <<gon>>e, my girl is in l<<ove>> Sh<<e's>> ac<<tu>><<al>>ly <<th>>e best fit <<fo>>r <<<<Wat>><<ar>>u>>, I t<<hi<<nk>>>>, <<bec>><<<<au>><<se>>>> she c<<al>>ls h<<im>> <<ou>>t on his BS, but genuin<<el>>y <<acc>>e<<pts>> h<<im>> <<fo>>r who he is #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
8: oh my god <<th>>e b<<ri>>dge <<th>><<ing>> is <<be<<au>>ti<<fu>>>>l, <<th>>e A<<m<<e<<ri>>>>ca>>n <<<<ver>>s>>ion co<<uld>> <<ne>><<ver>>! #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
9: This p<<ro>>bably <<se>>ems so silly but I'm annoyed <<th>>at no<<ne>> of <<th>>e #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>> cast <<mem>>bers have <<ver>>i<<f<<ie>>d>> <<Insta>>g<<ra>>m <<acc>>o<<un>>ts whi<<le>> N<<et>><<flix>> will ma<<ke>> sure to g<<et>> e<<ver>>y s<<ing>><<le>> A<<m<<e<<ri>>>>ca>>n r<<e<<al>>>>i<<ty>> <<sho>>w cast <<mem>>ber a <<ver>>i<<f<<ie>>d>> <<Insta>>g<<ra>>m <<acc>>o<<un>>t.
...[truncated]...
10: Misre<<pre>><<se>>nt<<ing>> y<<ou>>r person<<al>>i<<ty>>, y<<ou>>r will<<ing>><<<<ne>><<ss>>>> to have child<<ren>>, <<mis>>re<<pre>><<se>>nt<<ing>> y<<ou>>r <<acc>><<ep>>tance of hav<<ing>> a <<wo>>rk<<ing>> wife, <<mis>>re<<pre>><<se>>nt<<ing>> y<<ou>>r <<jo>>b? This is a <<sha>>mb<<<<le>>s>>. #<<<<L<<ove>><<IsBli<<nd>>>>>>Japan>>
...[truncated]...
*Suggestion: Consider running `hunspell::hunspell_find` & `hunspell::hunspell_suggest`
========
NO ALPHA
========
The following observations contain elements with no alphabetic (a-z) letters:
3695, 3703, 3782, 4374, 4764, 5187, 5286, 6442, 7036, 7429...[truncated]...
This issue affected the following text:
3695:
...[truncated]...
3703:
...[truncated]...
3782:
...[truncated]...
4374:
...[truncated]...
4764:
...[truncated]...
5187:
...[truncated]...
5286:
...[truncated]...
6442:
...[truncated]...
7036:
...[truncated]...
7429:
...[truncated]...
*Suggestion: Consider cleaning the raw text or running `filter_row`
==========
NO ENDMARK
==========
The following observations contain elements with missing ending punctuation:
1, 2, 3, 4, 5, 6, 7, 8, 10, 11...[truncated]...
This issue affected the following text:
1: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan|Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
2: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan. Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
4: Are they all getting engaged then?! #LoveIsBlindJapan
...[truncated]...
5: Not these grown ass women crying over a 23 year of hairdresser #LoveIsBlindJapan
...[truncated]...
6: Yudai really thinks this is a game #LoveIsBlindJapan
...[truncated]...
7: Midori-chan is gone, my girl is in love She's actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan
...[truncated]...
8: oh my god the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan
...[truncated]...
10: Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job? This is a shambles. #LoveIsBlindJapan
...[truncated]...
11: Shuntaro almost lost Ayano in the pods by not speaking up and it appears he learned nothing from that experience. The women are so direct and half of the men have misled them throughout the process. This is nuts. #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider cleaning the raw text or running `add_missing_endmark`
====================
NO SPACE AFTER COMMA
====================
The following observations contain commas with no space afterwards:
22, 35, 180, 248, 263, 428, 1321, 1472, 2140, 2182...[truncated]...
This issue affected the following text:
22: AND FUCK YUDAI,, hes literally so fucking weird for making the most hasty ass decision like u are proposing to someone, its not easy, u never even clarified to the rest of the women u are talking to what your decision was?? he honestly treated the show like a joke #LoveIsBlindJapan
...[truncated]...
35: #LoveIsBlindJapan is so amazingits like a couples therapy for me n my bae..we discussed,argued n come terms w it def like a rollercoaster of emo we been through we def cheered for Ryotaro & Motomi throughout d entire show,they're so genuine, def most people fav iktr
...[truncated]...
180: The concept of falling in love solely by their heart without other factor don't work in Japan.They consider everything together.2 Couples who succeed in married have same emotional level,same lifestyle,good looking,strong financially and come from proper family.#LoveIsBlindJapan
...[truncated]...
248: Girl,.. just tell him you want him or you don't want him. STOP WASTING HIS TIME. #LoveIsBlindJapan
...[truncated]...
263: 'Forget about Kenya,' says Misaki, after bringing Kenya into every conversation he's had in the last 4 episodes #LoveIsBlindJapan
...[truncated]...
428: Am I really myself if I don't google or twittersearch a show I'm watching,,I think not #LoveIsBlindJapan
...[truncated]...
1321: #LoveIsBlindJapan I had a good time: - Your set was magical oh my god. - Having little insights on Japanese ways, norms, valuesloved every bit of it. - Points for tourism - Your cast; great pick. ( The ladies were a unique set ) Well done to you all ,and "Arigato."
...[truncated]...
1472: I Love My Sixpack So Much,I Protect It With A Layer Of Fat. #gg #LoveIsBlindJapan
...[truncated]...
2140: If anyone loves you because of #MONEY ,the person truly #loves you.#FallingInLoveWithMe #LoveIsBlind #LoveIsBlindJapan #LoveIsColorBlind #MoneyTalks #moneytwitter
...[truncated]...
2182: ".and I'm okay with that," "I know and that's part of the problem." #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider running `add_comma_space`
==================
NON SPLIT SENTENCE
==================
The following observations contain unsplit sentences (more than one sentence per element):
1, 2, 3, 4, 8, 10, 11, 12, 13, 15...[truncated]...
This issue affected the following text:
1: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan|Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
2: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
...[truncated]...
3: I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan. Ryotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
...[truncated]...
4: Are they all getting engaged then?! #LoveIsBlindJapan
...[truncated]...
8: oh my god the bridge thing is beautiful, the American version could never! #LoveIsBlindJapan
...[truncated]...
10: Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job? This is a shambles. #LoveIsBlindJapan
...[truncated]...
11: Shuntaro almost lost Ayano in the pods by not speaking up and it appears he learned nothing from that experience. The women are so direct and half of the men have misled them throughout the process. This is nuts. #LoveIsBlindJapan
...[truncated]...
12: Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan| E02I've just watched episode Nice to Meet You, My Beloved of Love is Blind: Japan! #loveisblindjapan
...[truncated]...
13: I dunno if it's having to read the words that is making it sink in more but how is it that the men don't think of the future much compared to the women? #LoveIsBlindJapan
...[truncated]...
15: Ryotaro and Motomi are perfect together, their love for each other is easy. #LoveIsBlindJapan
...[truncated]...
*Suggestion: Consider running `textshape::split_sentence`
===
TAG
===
The following observations contain Twitter style handle tags (e.g., @trinker):
72, 77, 93, 105, 113, 185, 294, 321, 375, 467...[truncated]...
This issue affected the following text:
72: @suxelamai Lord I don't think this show for me I know these love reality shows are a mess but I feel like they are doomed from the start It seems the "matchmakers" purposely sabotage the marching for ratings I'm waiting for something like this but more wholesome like #LoveisBlindJapan
...[truncated]...
77: @qwhee_in laughing out loud nah, just bc they didn't last doesn't mean it wasn't real. They had a great connection but in the end they both have things to work thru. I appreciate their honesty in breaking the engagement, but I still think they could've worked if they tried harder #LoveIsBlindJapan
...[truncated]...
93: @teysixeight1 Minami, you don't even need a big reason as to why you want to work/keep your job. Any man that wants you to value his career over yours on the basis of gender roles does not deserve you. #LoveIsBlindJapan
...[truncated]...
105: @itsamandared Just watching this episode now and that's why I came here! I knew it couldn't just be me seeing the gaslighting. He did a 180 on sharing the household/childcare duties & enjoying her "quirks" Minami, you deserve better. #wegotyou #LoveIsBlindJapan
...[truncated]...
113: I hope to find someone who makes me laugh the way @odaccii made me laugh watching Love is Blind Japan, literally the best one I've watched! #LoveIsBlindJapan
...[truncated]...
185: It's always a good time with @pstinny at the hosting helm! Catch me on the latest episode of The Stream Team talking abt my latest #streaming faves #LoveIsBlindJapan and #Zola!
...[truncated]...
294: Up here watching @loveisblind #Japan #LoveIsBlindJapan and this dude #Mori started doing the Rerun Stubbs after getting engaged #Poppin who would've thought an MD from Japan to pull that out of his pocket #DoThatAtTheWedding @netflix
...[truncated]...
321: Hot couple, the end @themoonforces #LoveIsBlindJapan
...[truncated]...
375: @nikillinit One of the #LoveIsBlindJapan contestants made a PPT about why this guy should marry her. Think you're onto something.
...[truncated]...
467: @PKhakpour @jarry Shake is in S2. Also watch #LoveIsBlindJapan and #LoveIsBlindBrazil.. the latter two are 100x better than #LoveIsBlind.
...[truncated]...
*Suggestion: Consider using `qdapRegex::ex_tag' (to capture meta-data) and/or `replace_tag`
====
TIME
====
The following observations contain timestamps:
722, 3696, 4156, 4542, 4712, 6331, 7286, 7489, 8178, 8211...[truncated]...
This issue affected the following text:
722: i'm almost at that part where Priya will find out Mizuki lied about his job and income, i'm already so embarrassed oh my god #LoveIsBlindJapan Ep9 08:33mn
...[truncated]...
3696: It's 3:45am, it's my Saturday, I'm watching Love is Blind: Japan, and I have a 3.5 lb. box of taquitos. Life is good.
...[truncated]...
4156: @san_dogukan @Tuneflix1 if you would please help me identify the music in Love is Blind Japan season 1 ep.1 from 33:36 to 35:30 Please
...[truncated]...
4542: promised myself that at 9:15 I was going to stop working and take a break and watch love is blind japan before bed so here is my public accountability tweet that I stopped very close to 9:15 and am going to shut off my computer now
...[truncated]...
4712: Stayed up until 3:30 this morning watching Love is Blind: Japan
...[truncated]...
6331: @strongbags116 It's 1:30 am, I'm a bottle of wine down and watching Love is Blind Japan. What am I doing? Thankfully it's a long weekend.
...[truncated]...
7286: watching Love Is Blind Japan at 12:27am it's lit folks
...[truncated]...
7489: Netflix really is giving me whiplash going from Japan Love is Blind where the asshole men to kind men is a 1:10 ratio and the US Love is Blind where it's the exact reverse
...[truncated]...
8178: Episode 2 , 52:32 Midori tells Wataru (English caption) " You like to switch to English to show off, and it makes me want to tease you."
...[truncated]...
8211: i don't think so, it's an English song. The song is played at 39:30 in ep 11
...[truncated]...
*Suggestion: Consider using `replace_time`
===
URL
===
The following observations contain URLs:
2004, 18448, 26426, 34404, 42382, 50360, 58338
This issue affected the following text:
2004: "What's That Saying; "In the End, Love Wins."?" #NFT #NFTs #NFTart #NFTartist #NFTdrop #NFTcommunity #nftcollector #NFTProject #nftphotography #NFTCollection #LoveIsBlind #LoveIsBlindJapan
18448: "What's That Saying; "In the End, Love Wins."?" #NFT #NFTs #NFTart #NFTartist #NFTdrop #NFTcommunity #nftcollector #NFTProject #nftphotography #NFTCollection #LoveIsBlind #LoveIsBlindJapan
26426: "What's That Saying; "In the End, Love Wins."?" #NFT #NFTs #NFTart #NFTartist #NFTdrop #NFTcommunity #nftcollector #NFTProject #nftphotography #NFTCollection #LoveIsBlind #LoveIsBlindJapan
34404: "What's That Saying; "In the End, Love Wins."?" #NFT #NFTs #NFTart #NFTartist #NFTdrop #NFTcommunity #nftcollector #NFTProject #nftphotography #NFTCollection #LoveIsBlind #LoveIsBlindJapan
42382: "What's That Saying; "In the End, Love Wins."?" #NFT #NFTs #NFTart #NFTartist #NFTdrop #NFTcommunity #nftcollector #NFTProject #nftphotography #NFTCollection #LoveIsBlind #LoveIsBlindJapan
50360: "What's That Saying; "In the End, Love Wins."?" #NFT #NFTs #NFTart #NFTartist #NFTdrop #NFTcommunity #nftcollector #NFTProject #nftphotography #NFTCollection #LoveIsBlind #LoveIsBlindJapan
58338: "What's That Saying; "In the End, Love Wins."?" #NFT #NFTs #NFTart #NFTartist #NFTdrop #NFTcommunity #nftcollector #NFTProject #nftphotography #NFTCollection #LoveIsBlind #LoveIsBlindJapan
*Suggestion: Consider using `replace_url`
Here we can see there are still a lot of problems. I’ll attempt to fix the ones I notice right away that seem easy.
Here I’m cleaning up some issues that the “textclean” replace functions don’t seem to handle.
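One thing the report hints at: the URL section lists the same tweet at observations 18448, 26426, 34404, 42382, 50360 and 58338, which are each 7978 rows apart. That could mean part of the corpus was appended more than once, although this is only a guess from the report. A minimal base R sketch for checking this, using a made-up `posts` vector rather than the real corpus:

```r
# Hypothetical sample standing in for the real corpus
posts <- c("post a", "post b", "post c", "post a", "post b")

# TRUE for every element that repeats an earlier one
dup_flag <- duplicated(posts)

# How far each repeat sits from the first copy of the same text
first_seen <- match(posts, posts)
gaps <- (seq_along(posts) - first_seen)[dup_flag]

# Keep only the first copy of each post
unique_posts <- posts[!dup_flag]

print(which(dup_flag))
print(gaps)
print(unique_posts)
```

If the gaps on the real data all come out the same, the repeats are probably a whole block that was added twice rather than retweets of the same text.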
# Expand contractions that the textclean replace functions missed.
# The pattern and replacement vectors must be the same length and in the same order.
corpus_posts <- mgsub(corpus_posts,
  c("it's", "i'm", "i've", "she's", "he's", "don't", "isn't", "didn't", "they'll", "can't", "they're", "you're", "EP02Iv'e"),
  c("it is", "i am", "i have", "she is", "he is", "do not", "is not", "did not", "they will", "cannot", "they are", "you are", "i have"))
# Spell out one specific date and one age that appear in the posts.
corpus_posts <- mgsub(corpus_posts, c("1/11/22", "23"), c("january eleventh two thousand twenty two", "twenty three"))
# Split each post into sentences; the result is a list with one element per post.
corpus_posts <- textshape::split_sentence(corpus_posts)
head(corpus_posts, 10)
[[1]]
[1] "Why do I feel like a lot of these people just wanted a free holiday?"
[2] "#LoveIsBlindJapan|Why do I feel like a lot of these people just wanted a free holiday?"
[3] "#LoveIsBlindJapan"
[[2]]
[1] "Why do I feel like a lot of these people just wanted a free holiday?"
[2] "#LoveIsBlindJapan"
[[3]]
[1] "I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan."
[2] "Ryotaro and Motomi are still the most adorable people and couple."
[3] "#LoveIsBlind #LoveIsBlindJapan"
[[4]]
[1] "Are they all getting engaged then?!" "#LoveIsBlindJapan"
[[5]]
[1] "Not these grown ass women crying over a twenty three year of hairdresser #LoveIsBlindJapan"
[[6]]
[1] "Yudai really thinks this is a game #LoveIsBlindJapan"
[[7]]
[1] "Midori-chan is gone, my girl is in love She is actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan"
[[8]]
[1] "oh my god the bridge thing is beautiful, the American version could never!"
[2] "#LoveIsBlindJapan"
[[9]]
[1] "This probably seems so silly but I'm annoyed that none of the #LoveIsBlindJapan cast members have verified Instagram accounts while Netflix will make sure to get every single American reality show cast member a verified Instagram account."
[[10]]
[1] "Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job?"
[2] "This is a shambles."
[3] "#LoveIsBlindJapan"
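In the output above, item 9 still contains "I'm". The likely reason is that the patterns are all lowercase and the matching is case sensitive. A minimal base R sketch of a case-insensitive version follows; the `posts` vector, the `key` lookup and the `expand_contractions` helper are all made up for illustration and are not part of textclean. (textclean also has a `replace_contraction` function that may be worth trying first.)

```r
# Hypothetical sample posts
posts <- c("I'm annoyed that it's over", "They're so cute and I don't know why")

# Hypothetical lookup: names are the contractions, values are the expansions
key <- c("i'm" = "I am", "it's" = "it is", "they're" = "they are", "don't" = "do not")

expand_contractions <- function(x, key) {
  for (contraction in names(key)) {
    # \\b keeps the match on whole words; ignore.case also catches "I'm" and "They're"
    pattern <- paste0("\\b", contraction, "\\b")
    x <- gsub(pattern, key[[contraction]], x, ignore.case = TRUE, perl = TRUE)
  }
  x
}

expanded <- expand_contractions(posts, key)
print(expanded)
```

This only handles straight apostrophes, and the original capitalisation is lost ("They're" becomes "they are"), which should not matter if the text is lower-cased later anyway.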
This is a test area to see if this will let me move my data back from the values category to the data category.
Attaching package: 'textshape'
The following object is masked from 'package:dplyr':
combine
The following object is masked from 'package:purrr':
flatten
The following object is masked from 'package:tibble':
column_to_rownames
Warning: package 'cleanNLP' was built under R version 4.1.2
Attaching package: 'NLP'
The following object is masked from 'package:ggplot2':
annotate
It worked somewhat! It isn’t a true data table but it is back under the data category.
The columns I originally had in the table (the date, and whether the post was a tweet or a Reddit post) are missing. This isn’t a problem currently, but if I want them back I’ll have to join the edited posts to the original table.
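A minimal sketch of that join, assuming the sentence list is still in the same order as the rows of the original table. The table, its column names (`post_id`, `date`, `source`) and the sample values are all hypothetical:

```r
# Hypothetical original table with one row per post
posts_tbl <- data.frame(
  post_id = 1:3,
  date = as.Date(c("2022-02-08", "2022-02-09", "2022-02-10")),
  source = c("tweet", "tweet", "reddit"),
  stringsAsFactors = FALSE
)

# Hypothetical cleaned text, shaped like the sentence list above:
# one list element per post, one string per sentence
cleaned <- list(
  c("Are they all getting engaged then?!", "#LoveIsBlindJapan"),
  "Yudai really thinks this is a game #LoveIsBlindJapan",
  c("great post!", "thank you")
)

# Repeat each post_id once per sentence so every sentence keeps its post
sentences <- data.frame(
  post_id = rep(posts_tbl$post_id, times = lengths(cleaned)),
  sentence = unlist(cleaned),
  stringsAsFactors = FALSE
)

# Bring the date and source columns back
joined <- merge(sentences, posts_tbl, by = "post_id")
print(joined)
```

This only works if no posts were dropped or reordered between the table and the list, so an id column is best added before any cleaning starts.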
The textclean package has been super useful! It’s helped to make a lot of the cleaning fairly easily. However it has left /n around for a few items. I’m unsure why, (as I believe it happened removing other languages) so I’ll have to remove it later.
I noticed that some posts are now “NA” I’d like to remove those from my database by the next blog post.
I would also like to change all text to lower case.
I may want to use the “tm” package to clean up my data even more. Currently I feel like a lot of the harder things have been taken out.
I’m a bit confused about creating the corpus as tokens with the “textshape” package. It seems to work, but I’m unsure how it’s working.
After taking a brief look at tutorial 5, there do seem to be some very useful tips in there. Here is what I currently have. I may be able to combine some of what I was looking at with what tutorial 5 covers!
Warning: package 'devtools' was built under R version 4.1.2
Loading required package: usethis
Warning: package 'usethis' was built under R version 4.1.2
Warning: package 'tidytext' was built under R version 4.1.2
Warning: package 'plyr' was built under R version 4.1.2
------------------------------------------------------------------------------
You have loaded plyr after dplyr - this is likely to cause problems.
If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
library(plyr); library(dplyr)
------------------------------------------------------------------------------
Attaching package: 'plyr'
The following object is masked from 'package:here':
here
The following objects are masked from 'package:dplyr':
arrange, count, desc, failwith, id, mutate, rename, summarise,
summarize
The following object is masked from 'package:purrr':
compact
Warning: package 'quanteda' was built under R version 4.1.2
Package version: 3.2.3
Unicode version: 14.0
ICU version: 70.1
Parallel computing: 4 of 4 threads used.
See https://quanteda.io for tutorials and examples.
Attaching package: 'quanteda'
The following objects are masked from 'package:NLP':
meta, meta<-
Warning in stri_extract_first_regex(string, pattern, opts_regex =
opts(pattern)): argument is not an atomic vector; coercing
Tokens consisting of 115,422 documents.
text1 :
[1] "#loveisblindjapan" "wayyy" "better"
[4] "us" "version"
text2 :
[1] "s01" "|" "e05i've" "just" "watched" "episode" "just"
[8] "two" "us" "love" "blind" "japan"
[ ... and 4 more ]
text3 :
[1] "real" "kaoru" "just"
[4] "wanted" "promote" "music"
[7] "career" "#loveisblindjapan"
text4 :
[1] "feel" "like" "lot"
[4] "people" "just" "wanted"
[7] "free" "holiday" "#loveisblindjapan"
text5 :
[1] "ayano" "shuntaro" "make"
[4] "uncomfortable" "ngl" "#loveisblindjapan"
text6 :
[1] "decided" "need" "palate" "cleanser" "us"
[6] "love" "blind" "s2" "watch" "s3"
[11] "rewatching" "parts"
[ ... and 11 more ]
[ reached max_ndoc ... 115,416 more documents ]
Error in here("posts", "loveisblind_socialmedia.csv"): unused argument ("loveisblind_socialmedia.csv")
date
1 2022-09-23
2 2022-09-23
3 2022-09-22
4 2022-09-22
5 2022-09-22
6 2022-09-22
text
1 #LoveIsBlindJapan is wayyy better than the US version
2 S01 | E05I've just watched episode Just the Two of Us of Love is Blind: Japan! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime https://t.co/NZW6DkFR3O
3 Let’s be real, Kaoru just wanted to promote her music career #LoveIsBlindJapan
4 Why do I feel like a lot of these people just wanted a free holiday? #LoveIsBlindJapan
5 Ayano and Shuntaro make me uncomfortable ngl #LoveIsBlindJapan
6 I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan. \nRyotaro and Motomi are still the most adorable people and couple. #LoveIsBlind #LoveIsBlindJapan
twitter_reddit
1 twitter
2 twitter
3 twitter
4 twitter
5 twitter
6 twitter
corpus2 <- subset(corpus, detect_language(corpus$text) == "en")
corpus_posts <- replace_internet_slang(corpus2$text)
corpus_posts <- replace_date(corpus_posts) %>%
replace_contraction(corpus_posts) %>%
replace_kern(corpus_posts) %>%
replace_curly_quote(corpus_posts) %>%
replace_word_elongation(corpus_posts) %>%
replace_white(corpus_posts) %>%
replace_html(corpus_posts) %>%
textshape::split_sentence(corpus_posts)
Warning in as.data.table.list(x, keep.rownames = keep.rownames, check.names
= check.names, : Item 2 has 13 rows but longest item has 31; recycled with
remainder.
Warning in stri_replace_all_regex(string, pattern,
fix_replacement(replacement), : argument is not an atomic vector; coercing
corpus_posts <- gsub(" ?(f|ht)(tp)(s?)(://)(.*)[.|/](.*)", "", corpus_posts)
corpus_posts <- mgsub(corpus_posts, c("it's", "i'm", "i've", "she's", "he's", "don't", "isn't", "didn't", "they'll", "can't", "they're", "you're", "EP02Iv'e"), c("it is", "i am", "i have", "she is", "he is", "does not", "is not", "did not", "they will", "cannot", "they are", "you are", "it is", "i have"))
corpus_posts <- mgsub(corpus_posts, c("1/11/22", "23"), c("january eleventh two thousand twenty two", "twenty three"))
corpus_postse <- corpus_postse %>% textshape::split_sentence(corpus_postse)
head(corpus_posts, 10)
[1] "c(\"Why do I feel like a lot of these people just wanted a free holiday?\", \"#LoveIsBlindJapan|Why do I feel like a lot of these people just wanted a free holiday?\", \"#LoveIsBlindJapan\")"
[2] "c(\"Why do I feel like a lot of these people just wanted a free holiday?\", \"#LoveIsBlindJapan\")"
[3] "c(\"I decided I need a palate cleanser from the US Love is Blind S2 before I watch S3, so I have been rewatching parts of Love is Blind Japan.\", \"Ryotaro and Motomi are still the most adorable people and couple.\", \"#LoveIsBlind #LoveIsBlindJapan\")"
[4] "c(\"Are they all getting engaged then?!\", \" #LoveIsBlindJapan\")"
[5] "Not these grown ass women crying over a twenty three year of hairdresser #LoveIsBlindJapan"
[6] "Yudai really thinks this is a game #LoveIsBlindJapan"
[7] "Midori-chan is gone, my girl is in love She is actually the best fit for Wataru, I think, because she calls him out on his BS, but genuinely accepts him for who he is #LoveIsBlindJapan"
[8] "c(\"oh my god the bridge thing is beautiful, the American version could never!\", \"#LoveIsBlindJapan\")"
[9] "This probably seems so silly but I'm annoyed that none of the #LoveIsBlindJapan cast members have verified Instagram accounts while Netflix will make sure to get every single American reality show cast member a verified Instagram account."
[10] "c(\"Misrepresenting your personality, your willingness to have children, misrepresenting your acceptance of having a working wife, misrepresenting your job?\", \"This is a shambles.\", \"#LoveIsBlindJapan\")"
---
title: "Blog Post two"
author: "Molly Hackbarth"
description: "Focusing on downloading data"
date: "10/01/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - blog posts
  - hw2
  - Molly Hackbarth
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
library(cld3)
library(dplyr)
library(textclean)
library(stringi)
library(stringr)
library(here)
knitr::opts_chunk$set(echo = TRUE)
```
## Understanding APIs and R packages
To understand how to download Twitter and Reddit data, I looked more into APIs. Since I've heard Reddit is a bit frustrating to work with, I decided to try Twitter first. This meant installing the R package "rtweet". Once I was able to use Twitter's API to create a project, I went ahead and tried to download multiple tweets.
## Frustrations with APIs and R
Unfortunately, "rtweet" had an issue very similar to the package "RedditExtractoR": both had limits that made them difficult to work with. "rtweet" only allowed me to search tweets from the last 6-9 days, which makes it hard to gather a lot of data over time. "RedditExtractoR" only allowed me to pull comments from 7 posts at a time. Using R packages for both sites proved to be very difficult.
I also tried the package "twitteR", but it would not load properly for me and kept giving me errors. Even with a properly set up Twitter API project, I was unable to get it to connect to the account. I spent over ***eight hours*** trying to get this to work (including reading multiple pages that suggested adding more packages to make both "twitteR" and "rtweet" work) before I decided to give up on all of the packages.
## Looking for a New Option
I decided to look into other ways I could download tweets and Reddit posts. While most websites offered the same API options mentioned before, a few of them recommended using Python instead.
After a while I decided to download Python, along with Visual Studio Code to run it. I had little hope and some frustrations with installing "pip", but I was able to get it working.
## Downloading Tweets and Reddit Posts through Python
After finding a YouTube video, I was able to use the Python package "[snscrape](https://github.com/mehranshakarami/AI_Spectrum/blob/main/2022/snscrape/tweets.py)" that someone had created (you can watch the explanation of how it works [here](https://www.youtube.com/watch?v=jtIMnmbnOFo)!) to download tweets without having to use an API. This was extremely helpful: all of the tweets I was interested in (both #loveisblindjapan and "love is blind japan") were downloaded within a few minutes.
For the Reddit posts I used a [website](https://medium.com/swlh/how-to-scrape-large-amounts-of-reddit-data-using-pushshift-1d33bde9286) that explained how to download all the comments on the subreddit r/loveisblindjapan. This also only took a few minutes.
Between Reddit and Twitter I was able to download over 20k comments from users who watched the TV show.
## Editing the Data in Google Sheets
Since I did this through Python, I saved the data into csv files. This let me look at the data in more detail in Google Sheets. I did a few things in Google Sheets since it was easier:
- Combined the two Twitter csv files (one for #loveisblindjapan and another for the phrase "love is blind japan") and removed any duplicates between them with the "remove duplicates" function.
- I noticed the Reddit csv file had its time column labeled "utc", which stands for Coordinated Universal Time. The values were numbers such as "1643382213" (seconds counted from January 1, 1970), which is fairly unreadable to me. I used this formula to fix it: =X2/86400+DATE(1970,1,1)+time(5,30,0). Dividing by 86400 turns seconds into days, and the time(5,30,0) term shifts the result five and a half hours ahead of UTC. This gave me 1/28/2022 20:33:33, which is easier to understand. However, to match the Twitter csv file (which uses year/month/day (YMD)), I removed the time from the end and formatted it using Google Sheets' "custom date and time" format to end up with 2022-01-28.
- Since the Twitter csv file had YMD followed by the time, I split the column so it only had YMD.
- I merged the files together (this included a count of comments from people, the username of the person, and the actual tweet or post). I added an extra column that says whether each row came from Twitter or Reddit.
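
The same spreadsheet steps could also be done in R. Below is a minimal sketch in base R, not what I actually ran. The three small data frames are made-up stand-ins for my csv files, and the column names (`date`, `text`, `utc`) and every value except the timestamp 1643382213 from above are assumptions for illustration.

```r
# Hypothetical stand-ins for the two Twitter exports (column names are assumptions)
tweets_hashtag <- data.frame(
  date = c("2022-09-23 10:15:00", "2022-09-22 08:00:00"),
  text = c("tweet one", "tweet two"),
  stringsAsFactors = FALSE
)
tweets_phrase <- data.frame(
  date = c("2022-09-22 08:00:00", "2022-09-21 19:30:00"),
  text = c("tweet two", "tweet three"),
  stringsAsFactors = FALSE
)

# Stack the two files and drop rows that appear in both
tweets <- rbind(tweets_hashtag, tweets_phrase)
tweets <- tweets[!duplicated(tweets), ]

# Keep only the year-month-day part of the timestamp
tweets$date <- as.Date(substr(tweets$date, 1, 10))
tweets$twitter_reddit <- "twitter"

# Hypothetical Reddit export with a Unix timestamp column
reddit <- data.frame(utc = 1643382213, text = "reddit comment",
                     stringsAsFactors = FALSE)

# Unix time counts seconds since 1970-01-01 00:00:00 UTC
reddit_time <- as.POSIXct(reddit$utc, origin = "1970-01-01", tz = "UTC")
reddit$date <- as.Date(reddit_time, tz = "UTC")
reddit$twitter_reddit <- "reddit"

# Combine both sources using the columns they share
all_posts <- rbind(
  tweets[, c("date", "text", "twitter_reddit")],
  reddit[, c("date", "text", "twitter_reddit")]
)
all_posts
```

In UTC the example timestamp is 2022-01-28 15:03:33. The Google Sheets formula shows 20:33:33 because of the extra five and a half hours, but the date comes out the same either way for this value.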
## Data Quality
While almost all Reddit posts were in English, I noticed quite a few tweets that were partially or completely in a different language. This has led me to debate whether I should remove the non-English tweets entirely or leave them in.
I also noticed that tweets had more spelling errors than Reddit posts. This is likely because tweets cannot be edited, but it may cause a problem. Additionally, tweets were more likely to use slang than Reddit posts.
From a quick glance, I also noticed that tweets were often about how the show made the viewer feel rather than about the contestants on the show. Reddit posts seemed to focus on the contestants more often. This may lead me to change my research question or to use only Reddit posts.
For the Reddit posts, I also noticed that the data unfortunately does not tell me whether someone is replying to another comment on the post. Some of the posts start with "I know what you mean!" This could lead to fewer mentions of contestants' names, which could make my research question difficult to answer.
## Updated Research Question
**Previously my research question was:** Do Reddit and Twitter users differ in their views of contestants and their relationships in *Love is Blind Japan*?
**The research question I'm currently leaning towards is:** How do Reddit and Twitter users feel about the show *Love is Blind Japan*?
**Why I'm considering the change:** It seems that although the contestants are important, if I want to focus on purely how viewers felt about the contestants I would need to only use Reddit posts. Additionally I will be analyzing the positive and negative sentiments of Reddit and Twitter together.
## Bringing in the Data
To check the data, I've added my csv file to my repository. I will first check that it was added correctly.
I use the "here" package because it lets me avoid setwd() and hard-coded working directories. **here() always builds a path relative to the project root directory.**
```{r Enter the Data}
#corpus <- read.csv(here("posts","loveisblind_socialmedia.csv"))
corpus <- read_csv(here("_data/loveisblind_socialmedia.csv"))
head(corpus)
```
Here we can see the data loaded in correctly and all three of the columns I wanted!
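
To make the point about here() a bit more concrete, below is a small sketch of how the path gets built. The `_data` folder and file name are the ones from the chunk above. The object names `csv_path` and `corpus_check` are just made up for this example.

```r
library(here)

# here::here() joins its arguments onto the project root, so the same line
# works no matter which folder the document is rendered from
csv_path <- here::here("_data", "loveisblind_socialmedia.csv")
csv_path

# Only try to read the file if it is really there (read.csv is base R)
if (file.exists(csv_path)) {
  corpus_check <- read.csv(csv_path)
  str(corpus_check)
}
```

Writing the call as here::here() is optional, but it keeps the line working even if another package with its own here() function (plyr has one) is loaded later.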
## Attempting to Clean the Data (a bit)
While the data is in the correct columns, I would still like to try a bit of cleaning to see if we can remove some items. The first thing I will do is remove non-English text from all of my posts, because I am unable to analyze other languages correctly.
### Remove Languages
The first thing I thought of was removing Japanese, as the show takes place in Japan, so I found this [answer](https://stackoverflow.com/questions/60181121/how-do-i-remove-japanese-characters).
```{r remove japanese}
str_rm_jap <- function(x) {
  # replace Japanese Unicode blocks with nothing, then clean up any double whitespace
  # reference: http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml
  x %>%
    # Japanese-style punctuation
    str_replace_all("[\u3000-\u303F]", "") %>%
    # katakana
    str_replace_all("[\u30A0-\u30FF]", "") %>%
    # hiragana
    str_replace_all("[\u3040-\u309F]", "") %>%
    # kanji
    str_replace_all("[\u4E00-\u9FAF]", "") %>%
    # remove excess whitespace
    str_replace_all(" +", " ") %>%
    str_trim()
}

corpus_posts <- corpus$text %>% str_rm_jap()
```
However, I realized there were many more languages, which made this approach more difficult. I kept looking and found this [answer](https://stackoverflow.com/questions/49338549/remove-languages-other-than-english-from-corpus-or-data-frame-in-r).
```{r remove more languages}
library("cld3")
corpus2 <- subset(corpus, detect_language(corpus$text) == "en")
```
This seemed to work well! It may not be the perfect solution but it seems to have removed any tweets or posts that were not in English.
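
One detail that is easy to miss with this kind of filter is what happens to posts where no language could be detected. The sketch below uses only base R. The `lang` column is written by hand as a stand-in for the detector's output (with NA meaning "could not tell"), and all of the data is made up.

```r
# Hypothetical posts with a hand-written language column standing in for
# the output of a language detector (NA = language could not be determined)
posts <- data.frame(
  text = c("english post", "japanese post", "emoji only post", "english post 2"),
  lang = c("en", "ja", NA, "en"),
  stringsAsFactors = FALSE
)

# subset() treats NA in the condition as FALSE, so the NA row is dropped
english_subset <- subset(posts, lang == "en")

# Plain bracket indexing keeps an NA condition as an all-NA row instead
english_bracket <- posts[posts$lang == "en", ]

# Wrapping the condition in which() makes bracket indexing behave like subset()
english_which <- posts[which(posts$lang == "en"), ]

nrow(english_subset)   # 2
nrow(english_bracket)  # 3, one of them an all-NA row
nrow(english_which)    # 2
```

So with subset(), posts with an undetermined language are removed along with the non-English ones, which may or may not be what is wanted for very short posts.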
### Check Package TextClean
The next package I'll use for cleaning is "textclean".
I'll first check any posts or tweets (henceforth known as posts) using the check_text() function.
This takes quite a while (I didn't actually time it, but I had enough time to watch a ton of YouTube clips!).
```{r}
check_text(corpus2$text)
```
We can see here that there are multiple issues with the text I pulled. What I like about this package is that it also suggests functions to fix these issues. The first thing I'll try is the replace_internet_slang() function.
```{r replace internet slang}
corpus_posts <- replace_internet_slang(corpus2$text)
head(corpus_posts, 20)
```
This has worked well. It has changed slang words like "ppl" to "people"! This makes me quite happy.
I'll go ahead and run "replace_date", "replace_contraction", "replace_kern" (which undoes manual spacing, such as turning "A M A Z I N G" into "AMAZING"), "replace_curly_quote", "replace_word_elongation" (if someone writes "woooah" it'll change it to "woah") and "replace_white" (which tidies up extra whitespace).
```{r continue to replace}
# the pipe hands the text to each function as its first argument,
# so corpus_posts is only named once at the start of the chain
corpus_posts <- corpus_posts %>%
  replace_date() %>%
  replace_contraction() %>%
  replace_kern() %>%
  replace_curly_quote() %>%
  replace_word_elongation() %>%
  replace_white()
```
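
When chaining cleaning functions like this, it helps to remember that a pipe already passes the left-hand side in as the first argument. The sketch below shows this with base R's native pipe `|>` (available from R 4.1), and `%>%` behaves the same way in this respect. `tag_posts()` is a made-up function for this example, not part of any package.

```r
# A hypothetical cleaning step with a second argument that has a default,
# similar in shape to many text-cleaning functions
tag_posts <- function(x, prefix = "post: ") {
  paste0(prefix, x)
}

posts <- c("so good", "wait what")

# The pipe already supplies `posts` as the first argument
piped_ok <- posts |> tag_posts()

# Repeating the object inside the call passes it a second time,
# here as `prefix`, which silently changes the result
piped_twice <- posts |> tag_posts(posts)

piped_ok     # "post: so good"   "post: wait what"
piped_twice  # "so goodso good"  "wait whatwait what"
```

In other words, `posts |> tag_posts(posts)` is the same as `tag_posts(posts, posts)`, so the second copy lands in whatever the function's second argument happens to be.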
### Removing Emojis
I also want to remove emojis. In the DACSS Slack channel I found someone who was looking for similar information and was given an answer! Below you will see the emojis removed.
```{r remove emojis}
only_ascii_regexp <- '[^\u0001-\u007F]+|<U\\+\\w+>'
corpus_posts <- corpus_posts %>%
str_replace_all(regex(only_ascii_regexp), "")
head(corpus_posts, 20)
```
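
One thing to keep in mind with an "ASCII only" pattern is that it removes every non-ASCII character, not just emojis. Here is a small base R sketch of the same idea using gsub() with perl = TRUE. The three example posts are made up.

```r
# Hypothetical posts: one with an emoji, one with an accented letter, one plain
posts <- c("so cute \U0001F60D", "caf\u00e9 date", "plain ascii")

# Drop every character outside the ASCII range, then trim leftover spaces
ascii_only <- trimws(gsub("[^\\x01-\\x7F]+", "", posts, perl = TRUE))
ascii_only
```

The emoji is gone, but so is the accented letter in the second post. For this data that also means accented letters in names or romanized Japanese words would be dropped.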
### Remove HTML links
There are a few links (URLs) that I believe mostly lead to YouTube clips of the show or gifs. I would like to remove those as they don't add to my analysis. I followed [this](https://stackoverflow.com/questions/25352448/remove-urls-from-string) answer.
```{r remove links}
# remove each link up to the next space, leaving the rest of the post intact
corpus_posts <- gsub("\\s?(f|ht)tps?://\\S+", "", corpus_posts, perl = TRUE)
```
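
It is worth checking how much text a URL pattern actually removes. A pattern that ends in `(.*)` is greedy and keeps matching to the end of the post, while one that ends in `\\S+` stops at the first space after the link. The sketch below compares the two in base R on a shortened version of one of my tweets.

```r
post <- "S01 E05 watched! #loveisblindjapan https://t.co/lTQZT8XzdX #tvtime"

# Pattern ending in (.*): greedy, so it also swallows the text after the link
greedy <- gsub(" ?(f|ht)(tp)(s?)(://)(.*)[.|/](.*)", "", post)

# Pattern ending in \\S+: stops at the first whitespace after the link
link_only <- gsub("\\s?(f|ht)tps?://\\S+", "", post, perl = TRUE)

greedy     # "S01 E05 watched! #loveisblindjapan"
link_only  # "S01 E05 watched! #loveisblindjapan #tvtime"
```

With the greedy version the "#tvtime" hashtag after the link is lost, which matters whenever a link sits in the middle of a post rather than at the end.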
### Another Look at Check_Text
```{r check text again}
check_text(corpus_posts)
```
Here we can see there are still a lot of problems. I'll attempt to fix some that I noticed right away and that seem easy.
### A Bit More Cleaning
Here I'm cleaning up some issues that the "textclean" replace functions didn't catch.
```{r cleaning text more}
# leftover contractions, paired one-to-one with their expansions
corpus_posts <- mgsub(
  corpus_posts,
  c("it's", "i'm", "i've", "she's", "he's", "don't", "isn't", "didn't", "they'll", "can't", "they're", "you're", "EP02Iv'e"),
  c("it is", "i am", "i have", "she is", "he is", "do not", "is not", "did not", "they will", "cannot", "they are", "you are", "i have")
)
# spell out a date and a number that replace_date() left behind
corpus_posts <- mgsub(corpus_posts, c("1/11/22", "23"), c("january eleventh two thousand twenty two", "twenty three"))
# split each post into sentences; the result is a list with one character vector per post
corpus_posts <- textshape::split_sentence(corpus_posts)
head(corpus_posts, 10)
```
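
After sentence splitting, the posts are no longer a plain character vector but a list, and that changes how later text functions behave. The base R sketch below shows what happens on a tiny made-up list shaped like the split output (the two example posts are taken from my data).

```r
# Hypothetical result of sentence splitting: one character vector per post
split_posts <- list(
  c("Are they all getting engaged then?!", "#LoveIsBlindJapan"),
  "Yudai really thinks this is a game #LoveIsBlindJapan"
)

# Character functions coerce a list with as.character(), which turns any
# element holding more than one sentence into the text of a c(...) call
coerced <- gsub("#LoveIsBlindJapan", "", split_posts)
coerced[1]

# Flattening first keeps every sentence as its own plain string
flat <- unlist(split_posts)
cleaned <- gsub("#LoveIsBlindJapan", "", flat)
cleaned
```

This is why it seems safer to do all the regex and replacement steps first and split into sentences last, or to call unlist() before any further cleaning.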
### Testing Reverting to Data Frame
This is a test area to see if this will allow me to put my data back from values to data.
```{r testing to make a data frame}
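library(corpus)
library(textshape)
library(cleanNLP)
library(NLP)
# as_corpus_text() converts the posts to corpus text, but the result is overwritten on the next line
corpus_postse <- as_corpus_text(corpus_posts)
# tidy_list() stacks the list of split sentences into a data frame
corpus_postse <- tidy_list(corpus_posts)
# Token_Tokenizer() comes from "NLP", not "textshape"; as far as I can tell it is meant to wrap a
# tokenizing function, so calling it on data only tags the object and does not split the text into tokens
corpus_postse <- Token_Tokenizer(corpus_postse)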
```
It worked somewhat! It isn't a true data table but it is back under the data category.
## General Notes and Future Thoughts
- The columns I originally had in the table (the date and whether the post was a tweet or a Reddit post) are missing. This isn't a problem currently, but if I want them back I'll have to join the edited posts to the original table.
- Another option is to just upload two separate csv files with one containing tweets and another containing posts.
- The textclean package has been super useful! It has made a lot of the cleaning fairly easy. However, it has left \n around in a few items. I'm unsure why (I believe it happened while removing other languages), so I'll have to remove it later.
- Although it has been very helpful, it seems to be unable to clean up everything. That's alright, but I'm a bit confused as to why.
- I noticed that some posts are now "NA". I'd like to remove those from my data by the next blog post.
- I would also like to change all text to lower case.
- I may want to use the "tm" package to clean up my data even more. Currently I feel like a lot of the harder things have been taken out.
- I'm a bit confused about creating the corpus as tokens with the "textshape" package. It seems to work, but I'm unsure how it's working.
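
A few of the notes above (missing columns, leftover line breaks, "NA" posts, lower case) could be handled together by cleaning the text inside the table instead of pulling it out as a separate vector. This is only a sketch in base R with a made-up three-row table. The column names match my data, but the rows are invented.

```r
# Hypothetical table with the three columns used in this post
posts <- data.frame(
  date = as.Date(c("2022-09-23", "2022-09-22", "2022-09-22")),
  text = c("Line one.\nLine two #LoveIsBlindJapan", NA,
           "Kaoru just wanted a FREE holiday"),
  twitter_reddit = c("twitter", "twitter", "reddit"),
  stringsAsFactors = FALSE
)

# Drop posts with no text first
posts <- posts[!is.na(posts$text), ]

# Clean inside the table so date and source stay attached to each post
posts$text <- gsub("\n", " ", posts$text, fixed = TRUE)  # line breaks to spaces
posts$text <- tolower(posts$text)                        # lower case

posts
```

Because the cleaned text never leaves the data frame, there is nothing to join back afterwards.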
## Looking Ahead at Tutorial 5
After taking a brief look at tutorial 5, there do seem to be some very useful tips in there. Here is what I currently have. I may be able to combine some of what I was looking at with what tutorial 5 covers!
```{r looking ahead}
library(devtools)
library(tidytext)
library(tidyverse)
library(quanteda)

# build a quanteda corpus from the raw post text
corpustest <- quanteda::corpus(corpus$text)
corpussummary <- summary(corpustest)
corpussummary$show <- "Love is Blind Japan"
# pull the document number out of the "text1", "text2", ... labels
corpussummary$count <- as.numeric(str_extract(corpussummary$Text, "[0-9]+"))

# tokenize, dropping punctuation and numbers
corpus_tokens <- tokens(corpustest,
                        remove_punct = TRUE,
                        remove_numbers = TRUE)
# lower-case the tokens and remove English stopwords
corpus_tokens <- tokens_tolower(corpus_tokens)
corpus_tokens <- tokens_select(corpus_tokens, pattern = stopwords("en"), selection = "remove")
# stemmed copy of the tokens, kept for later
corpus_tokens_stem <- tokens_wordstem(corpus_tokens)
print(corpus_tokens)
```
## Full Code
```{r full current code}
library(tidyverse)
library(cld3)
library(dplyr)
library(textclean)
library(stringi)
library(stringr)
library(textshape)
library(here)

# here::here() is namespaced so another package's here() cannot mask it
corpus <- read.csv(here::here("_data", "loveisblind_socialmedia.csv"))
head(corpus)

# keep only the posts detected as English
corpus2 <- subset(corpus, detect_language(corpus$text) == "en")

corpus_posts <- replace_internet_slang(corpus2$text)

# the pipe hands the text to each function as its first argument
corpus_posts <- corpus_posts %>%
  replace_date() %>%
  replace_contraction() %>%
  replace_kern() %>%
  replace_curly_quote() %>%
  replace_word_elongation() %>%
  replace_white() %>%
  replace_html()

# remove emojis and other non-ASCII characters
only_ascii_regexp <- '[^\u0001-\u007F]+|<U\\+\\w+>'
corpus_posts <- corpus_posts %>%
  str_replace_all(regex(only_ascii_regexp), "")

# remove each link up to the next space
corpus_posts <- gsub("\\s?(f|ht)tps?://\\S+", "", corpus_posts, perl = TRUE)

# leftover contractions, paired one-to-one with their expansions
corpus_posts <- mgsub(
  corpus_posts,
  c("it's", "i'm", "i've", "she's", "he's", "don't", "isn't", "didn't", "they'll", "can't", "they're", "you're", "EP02Iv'e"),
  c("it is", "i am", "i have", "she is", "he is", "do not", "is not", "did not", "they will", "cannot", "they are", "you are", "i have")
)
corpus_posts <- mgsub(corpus_posts, c("1/11/22", "23"), c("january eleventh two thousand twenty two", "twenty three"))

# split into sentences last, because this turns the character vector into a list
corpus_posts <- textshape::split_sentence(corpus_posts)
head(corpus_posts, 10)
```