@vjcitn
Created August 10, 2024 11:48
Script for filtering a GTEx resource from https://ftp.ebi.ac.uk/pub/databases/spot/eQTL/imported/GTEx_V8/ge/
#tabix Lung.tsv.gz 1:1-300000000 > lung.chr1.tsv
#tabix Lung.tsv.gz 2:1-300000000 > lung.chr2.tsv
#tabix Lung.tsv.gz 3:1-300000000 > lung.chr3.tsv
#tabix Lung.tsv.gz 4:1-300000000 > lung.chr4.tsv
#tabix Lung.tsv.gz 5:1-300000000 > lung.chr5.tsv
#tabix Lung.tsv.gz 6:1-300000000 > lung.chr6.tsv
#tabix Lung.tsv.gz 7:1-300000000 > lung.chr7.tsv
#tabix Lung.tsv.gz 8:1-300000000 > lung.chr8.tsv
#tabix Lung.tsv.gz 9:1-300000000 > lung.chr9.tsv
#tabix Lung.tsv.gz 10:1-300000000 > lung.chr10.tsv
#tabix Lung.tsv.gz 11:1-300000000 > lung.chr11.tsv
#tabix Lung.tsv.gz 12:1-300000000 > lung.chr12.tsv
#tabix Lung.tsv.gz 13:1-300000000 > lung.chr13.tsv
#tabix Lung.tsv.gz 14:1-300000000 > lung.chr14.tsv
#tabix Lung.tsv.gz 15:1-300000000 > lung.chr15.tsv
#tabix Lung.tsv.gz 16:1-300000000 > lung.chr16.tsv
#tabix Lung.tsv.gz 17:1-300000000 > lung.chr17.tsv
#tabix Lung.tsv.gz 18:1-300000000 > lung.chr18.tsv
#tabix Lung.tsv.gz 19:1-300000000 > lung.chr19.tsv
#tabix Lung.tsv.gz 20:1-300000000 > lung.chr20.tsv
#tabix Lung.tsv.gz 21:1-300000000 > lung.chr21.tsv
#tabix Lung.tsv.gz 22:1-300000000 > lung.chr22.tsv
#tabix Lung.tsv.gz X:1-300000000 > lung.chrX.tsv
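# Column names for the tabix extracts, which carry no header line.
nn = c("variant", "r2", "pvalue", "molecular_trait_object_id", "molecular_trait_id",
  "maf", "gene_id", "median_tpm", "beta", "se", "an", "ac", "chromosome",
  "position", "ref", "alt", "type", "rsid")
library(dplyr)
library(data.table)
library(arrow)
# One extract per chromosome, produced by the tabix commands above.
chrs = c(1:22, "X")
fn = sprintf("lung.chr%s.tsv", chrs)
# Read each extract, attach the column names, and keep rows with p < 0.05.
filts = lapply(fn, function(x) {
  cat(x, "\n")  # progress message
  tmp = fread(x, header = FALSE)
  names(tmp) = nn
  tmp |> dplyr::filter(pvalue < 5e-2)
})
# Stack the per-chromosome results into a single table.
lungpl05 = rbindlist(filts)
head(lungpl05)
# Save the filtered table in Parquet format.
write_parquet(lungpl05, "lungpl05.parquet")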
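
The 23 commented tabix calls at the top of the script can also be generated with a loop. The following is a minimal sketch, not part of the original gist. It assumes tabix (from htslib) is on the PATH and that Lung.tsv.gz sits next to its index file. The function name `extract_chromosomes` and the `DRY_RUN` switch are hypothetical, introduced here only for illustration.

```shell
# Sketch: loop form of the per-chromosome tabix extraction.
# extract_chromosomes and DRY_RUN are illustrative names, not part of the gist.
# Assumes tabix is installed and the input file is bgzip-compressed and indexed.
extract_chromosomes() {
  src="$1"
  for chr in $(seq 1 22) X; do
    out="lung.chr${chr}.tsv"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command instead of running it.
      echo "tabix ${src} ${chr}:1-300000000 > ${out}"
    else
      # Same region and output name as the commented commands in the script.
      tabix "${src}" "${chr}:1-300000000" > "${out}"
    fi
  done
}
```

Running `DRY_RUN=1 extract_chromosomes Lung.tsv.gz` prints the commands without executing them; `extract_chromosomes Lung.tsv.gz` writes the lung.chr*.tsv files that the R script reads.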