erinshellman/7ba5ea61d5d83aef4d35
Last active November 27, 2017 23:32
A collection of little helper functions for quick data cleaning in R
clean_headers = function(headers) {
  # Normalizes a character vector of column names.
  # Make lowercase
  headers = tolower(headers)
  # Replace symbols
  headers = gsub(' ', '', headers, fixed = TRUE)   # drop spaces
  headers = gsub('.', '_', headers, fixed = TRUE)  # dots become underscores
  headers = gsub('[^[:alnum:]_]', '', headers)     # remove all symbols except '_'
  headers = gsub('_+', '_', headers)               # collapse any run of '_' (a fixed '__' match leaves '___' as '__')
  headers = gsub('_$', '', headers)                # if last char is '_', remove
  return(headers)
}
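A minimal usage sketch for clean_headers, applied to the column names of a data frame. The function is restated compactly so the snippet runs on its own; the `sales` data frame and its column names are made up for illustration, and collapsing runs of underscores with `_+` is an assumption of this sketch.

```r
# Compact restatement of clean_headers so this snippet is self-contained.
clean_headers = function(headers) {
  headers = tolower(headers)
  headers = gsub(' ', '', headers, fixed = TRUE)
  headers = gsub('.', '_', headers, fixed = TRUE)
  headers = gsub('[^[:alnum:]_]', '', headers)
  headers = gsub('_+', '_', headers)  # assumption: collapse any run of '_'
  headers = gsub('_$', '', headers)
  return(headers)
}

# Hypothetical data frame with messy column names.
sales = data.frame(1, 2, 3)
names(sales) = c('First Name', 'Total.Sales', 'Pct.(%)')

# Clean the names in place.
names(sales) = clean_headers(names(sales))
print(names(sales))
# [1] "firstname"   "total_sales" "pct"
```

Note that spaces are removed rather than turned into underscores, so 'First Name' becomes 'firstname', while dots are the only characters that map to '_'.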
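remove_whitespace = function(df, side = 'both') {
  # Goes over each element of a df and strips out whitespace.
  # Defaults to stripping out both sides; side can also be 'left' or 'right'.
  # str_trim returns character vectors, so numeric columns come back as character.
  require(stringr); require(dplyr)
  # across() stands in for the deprecated mutate_each()/funs() pair (needs dplyr >= 1.0.0)
  df_no_whitespace = mutate(df, across(everything(), ~ str_trim(.x, side = side)))
  return(df_no_whitespace)
}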
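Because remove_whitespace runs str_trim over every column, non-character columns end up as character. Below is a base-R sketch of an alternative that trims character columns only. It is a different technique from the function above, not a drop-in copy: `trim_character_columns` is a hypothetical name, the `people` data is made up, and it uses base `trimws` instead of stringr.

```r
# Hypothetical base-R variant: trims character columns only and
# leaves numeric, logical, and other column types untouched.
trim_character_columns = function(df, side = 'both') {
  is_chr = vapply(df, is.character, logical(1))
  # trimws() takes which = 'both', 'left', or 'right',
  # the same values str_trim accepts for side.
  df[is_chr] = lapply(df[is_chr], trimws, which = side)
  return(df)
}

# Made-up example data.
people = data.frame(
  name = c('  Ada ', 'Grace  '),
  age = c(36, 45),
  stringsAsFactors = FALSE
)

trimmed = trim_character_columns(people)
print(trimmed$name)        # [1] "Ada"   "Grace"
print(class(trimmed$age))  # [1] "numeric"
```

The trade-off is that factor columns are skipped as well, so convert them to character first if their labels need trimming.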