Parsing timestamps in R
``` r
# Load libraries
library(lubridate)
#> Warning: package 'lubridate' was built under R version 3.6.3
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.6.3
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union

## strptime() handles numeric offsets written without a colon (+hhmm)
strptime("2020-04-06T05:45:00+0300", "%Y-%m-%dT%H:%M:%S%z") %>% as.POSIXct() %>% with_tz("UTC")
#> [1] "2020-04-06 02:45:00 UTC"

## In this R session, `strptime()` with %z does not accept an offset that contains a colon (+hh:mm),
## even though that form is standard ISO 8601. The strptime() conventions probably predate the ISO conventions.
strptime("2020-04-06T05:45:00+03:00", "%Y-%m-%dT%H:%M:%S%z")
#> [1] NA

## Standard `strptime()` can't use %z to parse the 'Z' UTC/Zulu time specifier
strptime("2020-04-06T05:45:00Z", "%Y-%m-%dT%H:%M:%S%z")
#> [1] NA

## It can match a literal 'Z' in the format, but it doesn't know that 'Z' means UTC,
## so the result is in the local time zone (CDT here).
strptime("2020-04-06T05:45:00Z", "%Y-%m-%dT%H:%M:%SZ")
#> [1] "2020-04-06 05:45:00 CDT"

## You have to set the time zone explicitly:
strptime("2020-04-06T05:45:00Z", "%Y-%m-%dT%H:%M:%SZ", tz="UTC")
#> [1] "2020-04-06 05:45:00 UTC"

## ... as.POSIXct() has similar limitations

## lubridate::parse_date_time() with the `z` order handles all three forms and returns UTC
parse_date_time("2020-04-06T05:45:00+03:00", orders="Ymd HMSz")
#> [1] "2020-04-06 02:45:00 UTC"
parse_date_time("2020-04-06T05:45:00+0300", orders="Ymd HMSz")
#> [1] "2020-04-06 02:45:00 UTC"
parse_date_time("2020-04-06T05:45:00Z", orders="Ymd HMSz")
#> [1] "2020-04-06 05:45:00 UTC"
```
<sup>Created on 2020-10-01 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup>
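
If lubridate is not available, one possible base-R workaround is to rewrite the timestamp into the `+hhmm` form that `strptime()` does accept before parsing. The sketch below is only an illustration: the helper names `normalize_offset()` and `parse_iso8601_base()` are hypothetical (they are not part of base R or lubridate), and it assumes every input ends in either `Z` or a `+hh:mm` / `+hhmm` offset.

```r
# Hypothetical helper: rewrite a trailing 'Z' or '+hh:mm' offset as '+hhmm',
# the only offset form that strptime()'s %z parsed in the session above.
normalize_offset <- function(x) {
  x <- sub("Z$", "+0000", x)                          # 'Z' (Zulu) means UTC
  sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)      # drop the colon in the offset
}

# Hypothetical helper: parse ISO 8601 timestamps with base R only.
# Setting tz = "UTC" makes strptime() report the result in UTC
# after applying the %z offset.
parse_iso8601_base <- function(x) {
  as.POSIXct(strptime(normalize_offset(x), "%Y-%m-%dT%H:%M:%S%z", tz = "UTC"))
}

stamps <- c("2020-04-06T05:45:00+03:00",
            "2020-04-06T05:45:00+0300",
            "2020-04-06T05:45:00Z")
parsed <- parse_iso8601_base(stamps)
print(parsed)
```

For the three example strings used above this should give the same UTC instants as `parse_date_time()`. Inputs with fractional seconds or with no offset at all are not covered by this sketch.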