Ash O'Farrell aofarrel

Convert Auspice tree JSON to a data frame and Newick tree

An example script to convert an Auspice tree JSON to a data frame and Newick tree for processing by downstream analyses.
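
The conversion can be sketched with the standard library alone. This is a minimal illustration, not the script described here. It assumes the Auspice v2 layout, where the tree sits under the top-level "tree" key and each node has a name, an optional children list, and a cumulative divergence under node_attrs.div. The function names load_tree and auspice_to_newick_and_rows are hypothetical.

```python
import json


def load_tree(path):
    """Read an Auspice JSON file and return its root node (assumes a top-level "tree" key)."""
    with open(path) as handle:
        return json.load(handle)["tree"]


def auspice_to_newick_and_rows(node, parent_div=0.0, rows=None):
    """Return (newick_without_semicolon, rows) for an Auspice-style node.

    Branch lengths are the difference between a node's cumulative
    divergence and its parent's. Names are written unquoted, so this
    sketch assumes they contain no Newick special characters.
    """
    if rows is None:
        rows = []
    attrs = node.get("node_attrs", {})
    div = attrs.get("div", parent_div)  # fall back to the parent's value if missing
    children = node.get("children", [])
    rows.append({"name": node["name"], "div": div, "is_tip": not children})
    label = f"{node['name']}:{div - parent_div:g}"
    if not children:
        return label, rows
    parts = [auspice_to_newick_and_rows(child, div, rows)[0] for child in children]
    return "(" + ",".join(parts) + ")" + label, rows


# Tiny made-up tree, only to show the shape of the input and output.
example = {
    "name": "root",
    "node_attrs": {"div": 0},
    "children": [
        {"name": "A", "node_attrs": {"div": 0.1}},
        {"name": "B", "node_attrs": {"div": 0.25}},
    ],
}
newick, rows = auspice_to_newick_and_rows(example)
print(newick + ";")
```

The rows list is a list of dictionaries, one per node, which can be passed directly to pandas.DataFrame if pandas is available. The function is recursive, so very deep trees may need a higher recursion limit or an iterative rewrite.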

Setup

Install Nextstrain.

Usage

FastQC is a program that performs some basic quality checks on FASTQ files. It makes nice HTML reports for a given file, but (as far as I can tell) doesn't provide a straightforward way to compare results across files (which might represent different library preps, sequencing lanes, or samples).

Here is the (really pretty hacky) solution I came up with for aggregating these stats. It assumes you have a directory in which the report for each FASTQ file sits in its own subdirectory, with paths like ./library_name.L001.R1.fastqc/fastqc_data.txt. We then use regular expressions to match just those parts of each file we care about.

import os
import re

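# Percentage of sequences left after deduplication.
# Raw string so the regex escapes (\t, \d) reach the re module unchanged;
# \d+ also accepts values such as 100.0 or 5.3, and any number of decimals.
re_dup = re.compile(r'Total Deduplicated Percentage\t(\d+(?:\.\d+)?)')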
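
To show how a pattern like this can be applied across all reports, here is a minimal sketch. It assumes the directory layout described above (one subdirectory per FASTQ file, each containing fastqc_data.txt) and that the line label matches your FastQC version. The function name collect_dedup_percentages is hypothetical, and the pattern is redefined so the block stands on its own.

```python
import os
import re

# Same idea as the pattern above: the label, a tab, then a decimal number.
RE_DEDUP = re.compile(r"Total Deduplicated Percentage\t(\d+(?:\.\d+)?)")


def collect_dedup_percentages(root_dir):
    """Map each report directory name to its deduplicated percentage.

    Walks root_dir, reads every fastqc_data.txt it finds, and keeps the
    first match of RE_DEDUP. Reports without a matching line are skipped.
    """
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        if "fastqc_data.txt" not in filenames:
            continue
        with open(os.path.join(dirpath, "fastqc_data.txt")) as handle:
            match = RE_DEDUP.search(handle.read())
        if match:
            # Key by the subdirectory name, e.g. library_name.L001.R1.fastqc
            results[os.path.basename(dirpath)] = float(match.group(1))
    return results


if __name__ == "__main__":
    # Print one tab-separated line per report found under the current directory.
    for name, pct in sorted(collect_dedup_percentages(".").items()):
        print(f"{name}\t{pct}")
```

Keying the results by subdirectory name keeps the library, lane, and read identifiers together, so they can be split apart later if needed.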

	kill all running containers with docker kill $(docker ps -q)
	delete all stopped containers with docker rm $(docker ps -a -q)
	delete all images with docker rmi $(docker images -q)