Last active
June 7, 2019 03:25
This find-and-replace function was inspired by Daniel Mallory Ortberg's "Bible Verses Where 'Behold' Has Been Replaced With 'Look, Buddy'" <http://the-toast.net/2016/06/06/bible-verses-where-behold-has-been-replaced-with-look-buddy/>. It lets you create a dataframe of Bible verses with similar substitutions.
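# Load the necessary libraries. `library()` attaches one package per call.
library(scriptuRs)  # kjv_bible()
library(stringr)    # str_detect(), used in the commented-out alternative filter
library(dplyr)      # %>%, select(), filter(), mutate(), group_by(), summarize()
library(tidyr)      # drop_na()
library(tidytext)   # unnest_tokens(), stop_words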
# Defines a function that imports the King James Bible, keeps the columns needed
# for display, finds every verse containing the first string (`kjv`), and adds a
# `revText` column in which that string is replaced by the second (`nkjv`).
# Matching is case-insensitive, and `kjv` is treated as a regular expression.
# Returns a dataframe of the matching verses.
bibleEdit <- function(kjv, nkjv){
  bible <- kjv_bible() %>%
    select(chapter_number, verse_number, verse_title, book_title, text) %>%
    drop_na(text) %>%
    filter(grepl(kjv, text, ignore.case = TRUE)) %>%
    # filter(str_detect(text, kjv)) %>%   # case-sensitive stringr alternative
    mutate(revText = gsub(kjv, nkjv, text, ignore.case = TRUE)) %>%
    # capitalize a lowercase letter at the start of a verse or after ". "
    mutate(revText = gsub("(^[a-z]|\\. [a-z])", "\\U\\1", revText, perl = TRUE)) %>%
    # collapse doubled commas left behind by the substitution
    mutate(revText = gsub(",,", ",", revText)) %>%
    # prefix each verse with its title and end it with a blank line
    mutate(revText = paste(verse_title, " ", revText, "\n\n", sep = ""))
  return(bible)
}
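# A minimal sketch of the substitution and re-capitalization steps used in
# bibleEdit(), run on two made-up sentences so it works without scriptuRs.
# The names `demoText` and `demoRev` and the sample sentences are illustrative
# assumptions, not part of the original gist.
demoText <- c("thou shalt not pass. thou shalt wait here.",
              "And he said, Thou shalt go.")
# Step 1: case-insensitive replacement; the pattern is read as a regular expression.
demoRev <- gsub("thou shalt", "it'd be great if you could", demoText, ignore.case = TRUE)
# Step 2: with perl = TRUE, "\\U\\1" upper-cases the captured group, here a
# lowercase letter at the start of the string or directly after ". ".
demoRev <- gsub("(^[a-z]|\\. [a-z])", "\\U\\1", demoRev, perl = TRUE)
# Mid-sentence replacements (second element) are left in lowercase.
print(demoRev)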
# Call the function with the string to search for as the first argument and its replacement as the second. The resulting dataframe is assigned to `verses`.
verses <- bibleEdit("thou shalt", "it'd be great if you could")
# `cat` prints in a way that retains line breaks. This prints just the revised text to the console; `sep = ""` keeps `cat` from adding a space before each verse after the first.
cat(verses$revText, sep = "")
# If you are unsure which phrases might be fun or interesting to substitute, the code below counts either single words or ngrams in the full Bible text. Control the size of the ngrams with the `n =` argument.
# The `bible` object inside bibleEdit() is local to that function, so load the text again here.
bible <- kjv_bible()
bible %>%
  unnest_tokens(word, text) %>%
  # the original `stopWords` object is not defined in this gist; tidytext's built-in `stop_words` is assumed here
  anti_join(stop_words, by = "word") %>%
  group_by(word) %>%
  summarize(count = n()) %>%
  arrange(desc(count), word) %>%
  View()
bible %>%
  unnest_tokens(ngram, text, token = "ngrams", n = 3) %>%
  group_by(ngram) %>%
  summarize(count = n()) %>%
  arrange(desc(count), ngram) %>%
  View()