Skip to content

Instantly share code, notes, and snippets.

@mskyttner
Created October 26, 2022 19:44
Show Gist options
  • Save mskyttner/8cd52cad518b31d02fc83266385e3cfa to your computer and use it in GitHub Desktop.
Save mskyttner/8cd52cad518b31d02fc83266385e3cfa to your computer and use it in GitHub Desktop.
Pivot longer and wider using R
# from SO post at https://stackoverflow.com/questions/72922418/create-rows-from-part-of-column-names/72939299
library(readr)
library(tidyr)
library(dplyr)
library(knitr)
so_blurb <-
"id|Date (05/19/2020)|Type (05/19/2020)|Date (06/03/2020)|Type (06/03/2020)|Type (10/23/2020|Date (10/23/2020)|Type (10/23/2020)
10629465|null|null|06/01/2020|E
98765432|05/18/2020|B||||10/26/2020|T
"
wider <-
so_blurb |>
read_delim(
, col_types = cols(.default = col_character())
, delim = "|"
, trim_ws = TRUE
) |>
mutate(across(.fns = function(x) recode(x, .missing = "")))
wider |> kable(format = "markdown")
#|id |Date (05/19/2020) |Type (05/19/2020) |Date (06/03/2020) |Type (06/03/2020) |Type (10/23/2020 |Date (10/23/2020) |Type (10/23/2020) |
#|:--------|:-----------------|:-----------------|:-----------------|:-----------------|:----------------|:-----------------|:-----------------|
#|10629465 |null |null |06/01/2020 |E | | | |
#|98765432 |05/18/2020 |B | | | |10/26/2020 |T |
# two step pivot - longer first, then wider
longer <-
wider |>
# first step
pivot_longer(
, cols = !c("id")
, names_to = c("Column", "ActivityDate")
, names_pattern = "(.*?)\\s+[(](.*?)[)]"
, values_to = "DateOrLetter"
) |>
filter(DateOrLetter != "") |>
# second step
pivot_wider(
, id_cols = c(1, 2)
, names_from = "Column"
, values_from = "DateOrLetter"
) |>
# convert data types and nulls
mutate(
, ActivityDate = readr::parse_date(ActivityDate, "%m/%d/%Y")
, Date = readr::parse_date(Date, "%m/%d/%Y")
, Type = recode(Type, "null" = NA_character_)
)
longer |> kable(format = "markdown")
#|id |ActivityDate |Date |Type |
#|:--------|:------------|:----------|:----|
#|10629465 |2020-05-19 |NA |NA |
#|10629465 |2020-06-03 |2020-06-01 |E |
#|98765432 |2020-05-19 |2020-05-18 |B |
#|98765432 |2020-10-23 |2020-10-26 |T |
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment