Created August 6, 2015 17:51
Stream large files from S3 (e.g., FASTQ)
#!/bin/bash
# Usage: pass the s3 URL as the first argument ($1)
# Set up pathing
## Drop s3://
pname=${1#*//}
## Drop bucket name, e.g., NDAR_Central*, NDAR_Results, etc.
pname=${pname#*/}
## Get text after last / (the object's file name)
fname=${1##*/}
## Get text before last / (the object's directory path)
full_path=${pname%/*}
# Create directories if they do not exist
if [ ! -e "/data/$full_path" ]
then
    mkdir -p "/data/$full_path"
fi
# Create FIFO pipe if it does not exist
if [ ! -e "/data/$pname" ]
then
    mkfifo "/data/$pname"
fi
# Create buffered file stream from s3, writing into the FIFO in the background.
# Note use of Picard Tools FifoBuffer to provide a large buffer for the stream.
s3cmd get "$1" - | java -jar /home/obenshaindw/picard-tools-1.135/picard.jar FifoBuffer > "/data/$full_path/$fname" &
# Access the FIFO object to begin streaming data from the s3 object, e.g.:
# samtools view -H /data/$pname
# bwa mem -M -t 16 ref.fa /data/fifo_for_fastq1.gz /data/fifo_for_fastq2.gz > aln.sam
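
The script relies on two mechanisms: bash parameter expansion to split the s3 URL, and a FIFO that a downstream tool reads like a regular file. The following is a minimal sketch of both, not part of the original gist. The function names `split_s3_url` and `fifo_demo`, the bucket name, and the object key are hypothetical, and a local `printf` stands in for `s3cmd get ... -` so the example runs without S3 access.

```shell
#!/bin/bash
# Illustrative sketch only; the URL and function names are hypothetical.

# Split an s3:// URL the same way the script does, printing one part per line.
split_s3_url() {
    local url=$1
    local pname=${url#*//}        # drop "s3://"
    pname=${pname#*/}             # drop the bucket name
    local fname=${url##*/}        # text after the last "/"
    local full_path=${pname%/*}   # text before the last "/"
    printf 'pname=%s\nfname=%s\nfull_path=%s\n' "$pname" "$fname" "$full_path"
}

# Show FIFO behavior with a local producer standing in for `s3cmd get ... -`.
fifo_demo() {
    local dir
    dir=$(mktemp -d)
    mkfifo "$dir/stream.fifo"
    # The producer blocks until a reader opens the pipe, so run it in the background.
    printf 'line1\nline2\n' > "$dir/stream.fifo" &
    # The consumer reads the pipe as if it were a regular file.
    cat "$dir/stream.fifo"
    wait
    rm -r "$dir"
}

split_s3_url "s3://example-bucket/submission_1/sample_R1.fastq.gz"
fifo_demo
```

One consequence of this splitting: for an object stored directly under the bucket (no `/` in the key), `${pname%/*}` leaves the name unchanged, so `full_path` equals the file name and the `mkdir -p` step would create a directory where the FIFO is meant to go. The script as written therefore assumes keys with at least one directory component.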