All thanks to a tweet by Chris Albon
- Cut Copy (via Andrew Musselman)
- Com Truise (via Sean J. Taylor)
- Makkam (via Karissa McKelvey)
- Slow Magic (via Clare Corthell)
"""
Scraping Nick Saban's seasons as Alabama head coach
I was curious what % of his time Alabama has spent at #1
"""
from collections import Counter
from bs4 import BeautifulSoup
import requests
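The preview stops at the imports, so the scraping itself is not shown. As a rough sketch of the tallying step only, here is how `Counter` could answer the "% of time at #1" question once a list of weekly rankings exists. The function name `percent_at_rank` and the sample list are hypothetical, invented for illustration, and are not real poll data.

```python
from collections import Counter


def percent_at_rank(weekly_ranks, rank=1):
    """Return the percentage of weeks spent at `rank` (None means unranked)."""
    counts = Counter(weekly_ranks)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * counts[rank] / total


# Hypothetical rankings, one entry per poll week (not scraped data).
sample_ranks = [1, 1, 2, None, 1, 4]
print(percent_at_rank(sample_ranks))  # 50.0
```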
{
  "meta": {
    "limit": 100,
    "offset": 0,
    "total_count": 100
  },
  "objects": [
    {
      "caucus": null,
      "congress_numbers": [
id | text
---|---
805168201126518784 | @ryanisaac this is the weirdest quarter of football I’ve seen in a while
804818096942968833 | @SportsTribution I don't follow.
804816669281546240 | Looking for a weekend longread? @samhinkie's resignation letter is still one of the best things I've read in 2016 https://t.co/y7464DISgX
804759041318813696 | Have used Postico to query our Redshift cluster for the last few months and it's been great. Similar to Sequel Pro. https://t.co/NN0DvdCpa6
804699067590840320 | @jrmontag @tanehisicoates Agreed. Important book.
804690839221964801 | The Year of the Looking Glass: Building Products https://t.co/0MVAbxeSze
804469380352446464 | "So how do we build trust? The easy answer is by producing high quality work. The hard part is how you get there." https://t.co/M4MgJYU2Wm
804015210621239297 | Holywow this looks awesome. Continuously impressed by the data products the @awscloud team keeps churning out: https://t.co/jmkLqFjyn7
803734870706896896 | RT @jevnin: I'd recommend working with this guy. https://t.
[
  {
    "id": "805168201126518784",
    "text": "@ryanisaac this is the weirdest quarter of football I’ve seen in a while"
  },
  {
    "id": "804818096942968833",
    "text": "@SportsTribution I don't follow."
  },
  {
# Install/upgrade the packages from an old Python 2 requirements.txt under Python 3,
# dropping the version pins so pip3 picks current releases
sed 's/=/ /g' requirements.txt | awk '{print $1}' | xargs -n1 pip3 install --upgrade
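The one-liner above keeps only the package name from each line. A rough Python equivalent of that name-extraction step is sketched below. It is not the author's method: the helper name `package_names` is hypothetical, and unlike the `sed` version it also skips comments and option lines and splits on operators such as `>=`.

```python
import re


def package_names(requirements_text):
    """Return bare package names from requirements.txt content."""
    names = []
    for line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blanks and option lines such as "-r other.txt" or "-e ."
        if not line or line.startswith("-"):
            continue
        # The name ends at the first version operator, extras bracket,
        # environment marker, or whitespace.
        names.append(re.split(r"[=<>!~\[;\s]", line, maxsplit=1)[0])
    return names


print(package_names("requests==2.9.1\nbeautifulsoup4>=4.4\npandas\n"))
# ['requests', 'beautifulsoup4', 'pandas']
```

Each returned name could then be passed to `pip3 install --upgrade`, as the shell pipeline does.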
from concurrent.futures import ProcessPoolExecutor
import concurrent.futures
from halas.parsers import boxscore
GAMES = [ ... ]
results = []
with ProcessPoolExecutor(max_workers=4) as executor:
    future_results = {executor.submit(boxscore, game):
# for creating a column like "days since last login"
import pandas as pd
df = pd.read_clipboard(index_col=['customer_id', 'days'])
(df
 .groupby(level='customer_id')
 .did_login
 .cumsum()
 .to_frame()
 .groupby(level='customer_id')
 .apply(lambda g: g.groupby('did_login').cumcount())
"""
add grouped cumulative sum column to pandas dataframe
Add a new column to a pandas dataframe which holds the cumulative sum for a given grouped window
Desired output:
user_id,day,session_minutes,cumulative_minutes
516530,0,NaN,0
516530,1,0,0
516532,0,5,5
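Both pandas previews above are cut off, so here is a minimal, self-contained sketch of the two ideas as I read them: a per-user running total, and a per-customer count of days since the most recent login. The function names and the `login_block` helper column are hypothetical, and treating `NaN` minutes as 0 is an assumption taken from the desired output. The `sessions` rows are the three shown above. The `logins` rows are invented for illustration.

```python
import numpy as np
import pandas as pd


def add_cumulative_minutes(df):
    """Add a per-user running total of session_minutes (NaN counted as 0)."""
    out = df.sort_values(["user_id", "day"]).copy()
    out["cumulative_minutes"] = (
        out["session_minutes"].fillna(0).groupby(out["user_id"]).cumsum()
    )
    return out


def add_days_since_last_login(df):
    """Add a per-customer count of rows since the most recent login."""
    out = df.sort_values(["customer_id", "days"]).copy()
    # Each login bumps the running total, so rows share a value until the next login.
    out["login_block"] = out.groupby("customer_id")["did_login"].cumsum()
    # Position within a block: 0 on the login day, then 1, 2, ...
    out["days_since_last_login"] = out.groupby(
        ["customer_id", "login_block"]
    ).cumcount()
    return out.drop(columns="login_block")


# The three rows from the desired output above.
sessions = add_cumulative_minutes(pd.DataFrame({
    "user_id": [516530, 516530, 516532],
    "day": [0, 1, 0],
    "session_minutes": [np.nan, 0, 5],
}))

# Hypothetical login data: one row per customer per day, did_login is 0 or 1.
logins = add_days_since_last_login(pd.DataFrame({
    "customer_id": [1, 1, 1, 1, 2, 2, 2],
    "days": [0, 1, 2, 3, 0, 1, 2],
    "did_login": [1, 0, 0, 1, 1, 1, 0],
}))
```

Note that rows before a customer's first login fall into block 0 and are counted from 0 as well, so they may need separate handling.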