mmatkinson’s gists

mmatkinson / github_resources.md

Last active November 16, 2018 22:00

github resources

Interactive Tutorials

End-to-End Tutorial

Command line tutorial

mmatkinson / postgres_queries_and_commands.sql

Created June 7, 2018 20:54 — forked from rgreenjr/postgres_queries_and_commands.sql

Useful PostgreSQL Queries and Commands

	-- show running queries (pre 9.2)
	-- age(a, b) returns a - b, so the current time goes first to get a positive duration
	SELECT procpid, age(clock_timestamp(), query_start), usename, current_query
	FROM pg_stat_activity
	WHERE current_query != '<IDLE>' AND current_query NOT ILIKE '%pg_stat_activity%'
	ORDER BY query_start desc;

	-- show running queries (9.2+)
	-- from 9.2 on, idle sessions are flagged in the state column; query holds the last statement
	SELECT pid, age(clock_timestamp(), query_start), usename, query
	FROM pg_stat_activity
	WHERE state != 'idle' AND query NOT ILIKE '%pg_stat_activity%'
	ORDER BY query_start desc;

mmatkinson / converting-jupyter-notebooks-to-various-formats

Created May 9, 2018 13:05 — forked from StevenMMortimer/converting-jupyter-notebooks-to-various-formats

	Convert .ipynb to Slides
	cd "test"
	jupyter nbconvert "test.ipynb" --to slides --reveal-prefix "http://cdn.jsdelivr.net/reveal.js/2.6.2" --post serve --config slides_config.py

	* To print the slides, append ?print-pdf to the URL and print from the browser

	Convert .ipynb to LaTeX/PDF
	jupyter nbconvert MyFirstNotebook.ipynb --to pdf

	Convert .ipynb to HTML

mmatkinson / geopandas_tour.ipynb

Created January 16, 2018 17:41 — forked from ocefpaf/geopandas_tour.ipynb

explore shapefile

mmatkinson / dplyr_to_pandas.ipynb

Created October 31, 2017 19:22 — forked from wlattner/dplyr_to_pandas.ipynb

mmatkinson / google_spreadsheets_create_update_example.py

Created November 5, 2016 02:06 — forked from pahaz/google_spreadsheets_create_update_example.py

Python Google Spreadsheets v4 API example, including access management through the Google Drive v3 API.

	"""Google spreadsheet related.
	Packages required: oauth2client, google-api-python-client
	* https://gist.github.com/miohtama/f988a5a83a301dd27469
	"""

	from oauth2client.service_account import ServiceAccountCredentials
	from googleapiclient import discovery  # current import name; apiclient is the legacy alias


	def get_credentials(scopes: list) -> ServiceAccountCredentials:

mmatkinson / df_to_ddl.py

Created November 4, 2016 13:39

Take in a DataFrame and output (Redshift) DDL for creating a table of that format.

	def df_to_ddl(df, tablename='test.mytable'):
	    data_dtypes = df.dtypes.reset_index().rename(columns={'index': 'colname', 0: 'datatype'})

	    # Map pandas datatypes into SQL
	    data_dtypes['sql_dtype'] = data_dtypes.datatype.astype(str).map(
	        {'object': 'varchar(24)',
	         'float64': 'float',
	         'int64': 'int',
	         'bool': 'boolean'})
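
The preview stops after the dtype mapping. As a rough sketch of where such a mapping could lead, the snippet below assembles a `CREATE TABLE` statement from it. The function name `sketch_df_to_ddl`, the `datetime64[ns]` entry, the `varchar(256)` fallback, and the quoting of column names are all assumptions made for illustration, not part of the original gist.

```python
import pandas as pd

# The first four entries mirror the gist preview; the datetime entry
# and the fallback type used below are assumptions.
DTYPE_TO_SQL = {
    "object": "varchar(24)",
    "float64": "float",
    "int64": "int",
    "bool": "boolean",
    "datetime64[ns]": "timestamp",
}


def sketch_df_to_ddl(df, tablename="test.mytable", fallback="varchar(256)"):
    """Return a CREATE TABLE statement with one column per DataFrame column."""
    columns = []
    for colname, dtype in df.dtypes.items():
        # Unmapped dtypes fall back to a generic varchar (assumption).
        sql_type = DTYPE_TO_SQL.get(str(dtype), fallback)
        columns.append(f'    "{colname}" {sql_type}')
    return f"CREATE TABLE {tablename} (\n" + ",\n".join(columns) + "\n);"


example = pd.DataFrame(
    {"id": [1, 2], "score": [0.5, 1.5], "name": ["a", "b"], "ok": [True, False]}
)
ddl = sketch_df_to_ddl(example)
print(ddl)
```

A fixed `varchar(24)` for text columns will truncate or reject longer strings, so in practice the width would probably be derived from the data.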

mmatkinson / table_comparison.py

Created August 5, 2016 19:52

compare two tables & all of their values

	import pandas as pd

	def df_diff(index_cols, data1, data2, lsuffix='_1'):
	    """
	    usage:
	        comparisondf = df_diff(['unique_id', 'date'], current_df, new_df, lsuffix='_curr')

	    returns:
	        single dataframe with index_cols on the index, as well as all other variables stacked on the index, and the
	        values in each dataframe along the columns.
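
The body of `df_diff` is not shown in the preview. One way the layout described in the docstring could be produced is sketched below: melt each table into long form, index by the key columns plus the variable name, join the two side by side, and keep only the rows whose values differ. The name `sketch_df_diff`, the `rsuffix` parameter, and the `value` column prefix are assumptions, and the original gist may work differently.

```python
import pandas as pd


def sketch_df_diff(index_cols, data1, data2, lsuffix="_1", rsuffix="_2"):
    """Return the cells that differ between two tables, in long form."""
    keys = list(index_cols) + ["variable"]
    # One row per (key columns, variable) pair for each table.
    left = data1.melt(
        id_vars=index_cols, var_name="variable", value_name="value" + lsuffix
    ).set_index(keys)
    right = data2.melt(
        id_vars=index_cols, var_name="variable", value_name="value" + rsuffix
    ).set_index(keys)
    both = left.join(right, how="outer")
    a, b = both.iloc[:, 0], both.iloc[:, 1]
    # Treat cells that are missing on both sides as equal.
    same = (a == b) | (a.isna() & b.isna())
    return both[~same]


current_df = pd.DataFrame(
    {
        "unique_id": [1, 2],
        "date": ["2016-01-01", "2016-01-02"],
        "price": [10.0, 20.0],
        "qty": [1, 2],
    }
)
new_df = current_df.copy()
new_df.loc[1, "price"] = 25.0

diff = sketch_df_diff(
    ["unique_id", "date"], current_df, new_df, lsuffix="_curr", rsuffix="_new"
)
print(diff)
```

The outer join keeps rows that exist in only one of the two tables, so added or dropped keys show up as differences with a missing value on one side.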

mmatkinson / lda_vec.py

Last active May 2, 2016 23:03

Helper class for using sklearn vectorizers with gensim lda.

	# For gensim
	from itertools import groupby
	import gensim


	class VectorizedCorpus(object):
	    """
	    Helper class for using sklearn vectorizers with gensim's LDA model.
	    Handles transformations between gensim corpus / bow representations and sklearn matrix
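
To illustrate the two representations the docstring mentions, here is a dependency-free sketch. A document-term count matrix has one row per document and one column per term, while a gensim-style bag-of-words corpus stores each document as a list of `(term_id, count)` pairs. Plain nested lists stand in for the scipy sparse matrix an sklearn vectorizer would return, and the function names are hypothetical; with real data, `gensim.matutils.Sparse2Corpus` and `gensim.matutils.corpus2csc` are the usual conversion helpers.

```python
def rows_to_bow(matrix):
    """Convert a document-term count matrix (one row per document) to bag-of-words lists."""
    return [
        [(term_id, count) for term_id, count in enumerate(row) if count]
        for row in matrix
    ]


def bow_to_rows(corpus, num_terms):
    """Convert bag-of-words lists back to a dense document-term count matrix."""
    rows = []
    for doc in corpus:
        row = [0] * num_terms
        for term_id, count in doc:
            row[term_id] = count
        rows.append(row)
    return rows


counts = [[2, 0, 1], [0, 3, 0]]  # two documents, three terms
corpus = rows_to_bow(counts)
restored = bow_to_rows(corpus, num_terms=3)
print(corpus)
```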

mmatkinson / useful_pandas_snippets.py

Last active April 29, 2016 00:01 — forked from bsweger/useful_pandas_snippets.md

Useful Pandas Snippets

	# List unique values in a DataFrame column
	df['column_name'].unique()

	# Convert Series datatype to numeric; non-numeric values become NaN
	# (convert_objects was removed from pandas; pd.to_numeric replaces it)
	df['col'] = pd.to_numeric(df['col'], errors='coerce')

	# Grab DataFrame rows where column has certain values
	value_list = ['value1', 'value2', 'value3']
	df = df[df.column.isin(value_list)]