# Derived from https://towardsdatascience.com/how-to-fine-tune-gpt-2-for-text-generation-ae2ea53bc272
import os
import pandas as pd
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import numpy as np
import random
import torch
from torch.utils.data import Dataset, DataLoader
@ConradStack
ConradStack / get_columns.sql
Created February 20, 2019 04:40
SQL server query to get the list of columns in a table along with Data types, NOT NULL, and PRIMARY KEY constraints
/*
From [this Stack Overflow post](https://stackoverflow.com/questions/2418527/sql-server-query-to-get-the-list-of-columns-in-a-table-along-with-data-types-no)
*/
SELECT
c.name 'Column Name',
t.Name 'Data type',
c.max_length 'Max Length',
c.precision ,
c.scale ,
@ConradStack
ConradStack / pyfaidx.extract_by_name.py
Created September 30, 2017 02:06
Extract sequences from a fasta file, preserving read name comments
# Create a new fasta file given a fasta file and list of sequence names
# - outputting the long_name does/did not seem to work properly in the faidx script that is packaged with pyfaidx
from pyfaidx import *
# read fasta file
fa = Fasta('test.fa')
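A dependency-free sketch of the same idea, for cases where pyfaidx is not available. It assumes a plain-text FASTA stream and a set of wanted sequence names. The function name `extract_by_name` and the sample records are hypothetical and not part of pyfaidx. Records are matched on the first whitespace-delimited token of the header, but the full header line is written out, so trailing comments survive.

```python
import io


def extract_by_name(fasta_in, wanted, fasta_out):
    """Copy records whose ID (first token of the header) is in `wanted`.

    The complete header line is written unchanged, so any comment text
    after the ID is preserved. Returns the number of records written.
    """
    keep = False
    n_written = 0
    for line in fasta_in:
        if line.startswith(">"):
            fields = line[1:].split()
            keep = bool(fields) and fields[0] in wanted
            if keep:
                n_written += 1
        if keep:
            fasta_out.write(line)
    return n_written


# Hypothetical in-memory data standing in for 'test.fa'
example = io.StringIO(
    ">seq1 first comment\nACGT\nACGT\n"
    ">seq2 second comment\nGGCC\n"
    ">seq3 third comment\nTTAA\n"
)
result = io.StringIO()
count = extract_by_name(example, {"seq1", "seq3"}, result)
extracted = result.getvalue()
```

With real files, the same function can be given two handles from `open(...)` in place of the `StringIO` objects.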
@ConradStack
ConradStack / clear_pagecache.sh
Created September 28, 2017 02:19
Clear linux PageCache
# From [this tutorial](https://www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/)
# Must be run as root: the redirect into /proc is performed by the calling shell
sync; echo 1 > /proc/sys/vm/drop_caches
@ConradStack
ConradStack / bamfilter_oneliners.md
Created February 17, 2017 19:59 — forked from davfre/bamfilter_oneliners.md
SAM and BAM filtering oneliners
@ConradStack
ConradStack / append.pl
Created January 9, 2017 23:22 — forked from jimhester/append.pl
Parsing fasta files in perl ruby python and go
#!/usr/bin/env perl
use warnings;use strict;
my ($header,$sequence);
$header = <>;
chomp $header;
while(my $line = <>){
chomp $line;
if($line =~ /^>/){
@ConradStack
ConradStack / genome_links.md
Created November 1, 2016 14:43
Links to publicly available genomes
@ConradStack
ConradStack / tar_exclude.sh
Last active July 25, 2016 18:01
Excluding files from tarball creation
#!/bin/bash
# From: http://www.cyberciti.biz/faq/exclude-certain-files-when-creating-a-tarball-using-tar-command/
tar --exclude-vcs --exclude='nohup.out' -cjf ~/tmp/whatever.tar.bz2 ./*
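For comparison, a rough Python equivalent using the standard-library `tarfile` module. This is a sketch under assumptions, not a drop-in replacement: `VCS_DIRS` lists only a few of the directories that `--exclude-vcs` covers, and the helper names (`skip_unwanted`, `make_tarball`) and the demo files are hypothetical.

```python
import os
import tarfile
import tempfile

EXCLUDED_NAMES = {"nohup.out"}
# Assumption: only a subset of what tar's --exclude-vcs actually skips
VCS_DIRS = {".git", ".svn", ".hg", "CVS"}


def skip_unwanted(tarinfo):
    """Filter for TarFile.add: return None to leave a member out."""
    parts = tarinfo.name.split("/")
    if parts[-1] in EXCLUDED_NAMES or any(p in VCS_DIRS for p in parts):
        return None
    return tarinfo


def make_tarball(src_dir, out_path):
    """Write a bzip2 tarball of src_dir's contents and return member names."""
    with tarfile.open(out_path, "w:bz2") as tar:
        for entry in sorted(os.listdir(src_dir)):
            tar.add(os.path.join(src_dir, entry), arcname=entry,
                    filter=skip_unwanted)
    with tarfile.open(out_path, "r:bz2") as tar:
        return sorted(tar.getnames())


# Demo on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, ".git"))
    for name in ("keep.txt", "nohup.out", os.path.join(".git", "config")):
        with open(os.path.join(src, name), "w") as fh:
            fh.write("x\n")
    archived = make_tarball(src, os.path.join(tmp, "whatever.tar.bz2"))
```

Returning `None` from the filter for a directory also skips everything beneath it, which is how the `.git` tree is left out here.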
@ConradStack
ConradStack / ana.py
Last active July 21, 2016 18:29
Managing python versions with Anaconda
## If python 3.5, say, has been installed but you want to use v2.7 for
## a particular project
# Create environment for different (or specific) version of python:
conda create -n py27 python=2.7 anaconda
# Check available environments (optional)
# conda info --envs
@ConradStack
ConradStack / gff3export.R
Last active July 12, 2016 20:45
Export Granges(List) to gff3 with rtracklayer
library(rtracklayer)
# Where syn is a list of GRanges objects:
test = GRangesList(syn)  # GenomicRangesList() is deprecated in recent GenomicRanges releases
export.gff(test,"~/tmp/tmp.gff3", version="3")
# ... or a single GRanges element:
export.gff(syn[[1]],"~/tmp/tmp.gff3", version="3")