Last active
June 11, 2020 22:21
Using the PGP API to get FASTQ URLs, MD5s and sizes
# The below uses jq (https://stedolan.github.io/jq/) and
# the PGP API v1.2 (https://www.personalgenomes.org.uk/api/v1.2/).
# Sizes are converted from bytes to GiB (bytes / 1024^3).
curl -X GET "https://www.personalgenomes.org.uk/api/v1.2/all_wgs" -H "accept: application/json" | jq -r '
  .[] | [
    .hex_id,
    (.data[]?.fastq_ftp),
    (.data[]?.fastq_md5),
    (.data[]?.fastq_bytes | split(";") | .[] | tonumber | . /1024/1024/1024)
  ] | flatten | @csv' > wgs_fastqs.csv
# Note: some of the records have three FASTQ files, so the CSV columns do not fully line up :(
# The 3 exome sequencing datasets
# Note: this endpoint is not documented, but it exists (sorry)
curl -X GET "https://www.personalgenomes.org.uk/api/v1.2/all_wxs" -H "accept: application/json" | jq -r '
  .[] | [
    .hex_id,
    (.data[]?.fastq_ftp),
    (.data[]?.fastq_md5),
    (.data[]?.fastq_bytes | split(";") | .[] | tonumber | . /1024/1024/1024)
  ] | flatten | @csv' > wxs_fastqs.csv
# Note: in the above you can also split the fastq_ftp and fastq_md5 fields,
# by swapping these lines into the jq filter:
#   (.data[]?.fastq_ftp | split(";")),
#   (.data[]?.fastq_md5 | split(";")),
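The note above about records with three FASTQ files points at the wide layout as the reason the columns drift. One possible workaround, offered only as a sketch, is to emit one CSV row per file instead of one row per record. It assumes that every `data` entry carries `fastq_ftp`, `fastq_md5` and `fastq_bytes` as semicolon-separated strings with the same number of items, which the API documentation does not confirm. The function name `fastq_rows` and the output file name in the usage comment are hypothetical, and `transpose` needs jq 1.5 or later.

```shell
# Sketch: read the API's JSON array on stdin and print one CSV row per FASTQ file:
#   hex_id, FTP path, MD5, size in GiB
# Assumption: the three fastq_* fields are ";"-separated strings of equal length.
fastq_rows() {
  jq -r '
    .[]
    | .hex_id as $id
    | .data[]?                         # records without a data array are skipped
    | [ (.fastq_ftp   | split(";")),
        (.fastq_md5   | split(";")),
        (.fastq_bytes | split(";")) ]
    | transpose[]                      # one [ftp, md5, bytes] triple per file
    | [ $id, .[0], .[1], (.[2] | tonumber / 1024 / 1024 / 1024) ]
    | @csv'
}

# Usage (needs network access, so it is left commented out):
# curl -s "https://www.personalgenomes.org.uk/api/v1.2/all_wgs" -H "accept: application/json" \
#   | fastq_rows > wgs_fastqs_long.csv
```

Because every row has exactly four columns, a record with three FASTQ files produces three rows rather than extra columns. If the three lists ever differ in length, `transpose` pads with `null` and `tonumber` will fail, so a mismatch stops the run with an error.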