@afonsoguerra
Forked from dceoy/read_vcf.py
Created April 27, 2023 08:07
[Python] Read VCF (variant call format) as pandas.DataFrame
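#!/usr/bin/env python
import io

import pandas as pd


def read_vcf(path):
    """Read a VCF file into a pandas.DataFrame, one row per record."""
    with open(path, 'r') as f:
        # Skip the '##' meta-information lines; the '#CHROM' header row is kept.
        lines = [l for l in f if not l.startswith('##')]
    return pd.read_csv(
        io.StringIO(''.join(lines)),
        # QUAL stays a string, so the VCF missing-value marker '.' is preserved.
        dtype={'#CHROM': str, 'POS': int, 'ID': str, 'REF': str, 'ALT': str,
               'QUAL': str, 'FILTER': str, 'INFO': str},
        sep='\t'
    ).rename(columns={'#CHROM': 'CHROM'})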
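
The function above opens the file as plain text, so it will not read a gzip-compressed `.vcf.gz` directly. The sketch below is one possible variant, not part of the original gist: `read_vcf_any`, `EXAMPLE_VCF`, and `demo` are hypothetical names, the two-record VCF is made up for illustration, and the code assumes compressed files can be recognised by a `.gz` suffix.

```python
import gzip
import io
import os
import tempfile

import pandas as pd


def read_vcf_any(path):
    """Hypothetical variant of read_vcf that also accepts gzip-compressed files."""
    # Assumption: compressed files are identified by a '.gz' suffix.
    opener = gzip.open if path.endswith('.gz') else open
    with opener(path, 'rt') as f:
        # Drop the '##' meta-information lines; keep the '#CHROM' header row.
        lines = [l for l in f if not l.startswith('##')]
    return pd.read_csv(
        io.StringIO(''.join(lines)),
        dtype={'#CHROM': str, 'POS': int, 'ID': str, 'REF': str, 'ALT': str,
               'QUAL': str, 'FILTER': str, 'INFO': str},
        sep='\t'
    ).rename(columns={'#CHROM': 'CHROM'})


# Made-up two-record VCF, used only for illustration.
EXAMPLE_VCF = (
    '##fileformat=VCFv4.2\n'
    '##source=example\n'
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    '1\t12345\trs1\tA\tG\t50\tPASS\tDP=10\n'
    'X\t67890\t.\tC\tT\t.\tq10\tDP=3\n'
)


def demo():
    """Write the example to a temporary .vcf.gz file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'example.vcf.gz')
        with gzip.open(path, 'wt') as f:
            f.write(EXAMPLE_VCF)
        return read_vcf_any(path)


if __name__ == '__main__':
    print(demo())
```

Because the columns are read with explicit dtypes, chromosome names such as `1` and `X` both come back as strings, and `POS` can be used for numeric filtering, for example `df[df['POS'] > 20000]`.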