@vanAmsterdam
Created December 4, 2019 15:44
Number of columns with less than 33% NAs, per group
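library(data.table)
library(purrr)

set.seed(12345)
# 100 standard-normal draws, half of them set to NA at random
x <- rnorm(100)
x[sample(1:100, size = 50)] <- NA
# 10 x 10 table (V1..V10) plus a grouping column: 5 groups of 2 rows each
df <- data.table(matrix(x, nrow = 10))
df[, columnA := rep(letters[1:5], 2)]
# per group: count the columns in .SD whose share of NAs is below 33%
# (with 2 rows per group the share is 0, 0.5 or 1, so this counts complete columns)
df[, list(n_cols_below_33pct_na = sum(map_lgl(.SD, ~ mean(is.na(.x)) < 0.33))), by = "columnA"]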
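
A possible variant without purrr, as a minimal sketch: it assumes the same simulated data as above and relies on `is.na()` returning a logical matrix for a data.table, so `colMeans()` gives the NA share per column. The names `res` and `n_cols` are only illustrative.

```r
library(data.table)

# same simulated data as in the snippet above
set.seed(12345)
x <- rnorm(100)
x[sample(1:100, size = 50)] <- NA
df <- data.table(matrix(x, nrow = 10))
df[, columnA := rep(letters[1:5], 2)]

# is.na(.SD) is a logical matrix; colMeans() yields the NA share of each column,
# and sum() counts the columns whose share is below the 33% threshold
res <- df[, .(n_cols = sum(colMeans(is.na(.SD)) < 0.33)), by = columnA]
print(res)
```

Because `.SD` excludes the `by` column, only V1..V10 are counted, and the threshold can be changed in one place if a different cutoff is needed.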