Maciej Skorski maciejskorski

@maciejskorski
maciejskorski / README.md
Last active August 21, 2024 15:39
log detailed results in dbt clickhouse

Summary

This gist shows how to log dbt run results to ClickHouse using SQL.

The idea is to use an on-run-end hook, which dbt executes once at the end of a run. The code logs detailed result data for each node, including invocation arguments and timestamps.

This improves upon https://gist.github.com/jeremyyeo/064106e480106b49cd337f33a765ef20.

⚠️ Note that ClickHouse has no schema level separate from databases (a dbt schema maps to a ClickHouse database), so adjust fully qualified relation paths when writing the Jinja code.
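To make the logged fields concrete, here is a minimal Python sketch that flattens dbt's `target/run_results.json` artifact into one row per node. It is not the on-run-end hook from this gist but a different route to similar data. The function name `flatten_run_results` and the choice of columns are assumptions made for illustration.

```python
import json
from pathlib import Path


def flatten_run_results(artifact):
    """Turn a parsed run_results.json dict into one flat row per node (illustrative helper)."""
    meta = artifact.get('metadata', {})
    # invocation arguments are serialized once and repeated on every row
    args = json.dumps(artifact.get('args', {}), default=str)
    rows = []
    for res in artifact.get('results', []):
        # timing is a list of phases; index it by phase name ('compile', 'execute')
        timing = {t['name']: t for t in res.get('timing', [])}
        execute = timing.get('execute', {})
        rows.append({
            'invocation_id': meta.get('invocation_id'),
            'unique_id': res.get('unique_id'),
            'status': res.get('status'),
            'execution_time': res.get('execution_time'),
            'started_at': execute.get('started_at'),
            'completed_at': execute.get('completed_at'),
            'message': res.get('message'),
            'invocation_args': args,
        })
    return rows


# usage, after a dbt run has written the artifact:
# rows = flatten_run_results(json.loads(Path('target/run_results.json').read_text()))
```

The resulting rows could then be inserted into a ClickHouse table with any client; the hook approach avoids that extra step by writing from within dbt itself.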

@maciejskorski
maciejskorski / download_OSF.sh
Created July 29, 2024 10:34
download files from OSF
# in bash; select the appropriate resource id and use the download api via wget
osf_id=cwu4m
wget --content-disposition "https://osf.io/$osf_id/download"
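The same download can be sketched in Python with the standard library only. The helper names below (`osf_download_url`, `filename_from_header`, `download`) are hypothetical; the code mirrors what `wget --content-disposition` does by taking the file name from the response header when one is present.

```python
from email.message import Message
from pathlib import Path
from urllib.request import urlopen


def osf_download_url(osf_id):
    """Build the download endpoint for an OSF resource id, as in the wget call."""
    return f'https://osf.io/{osf_id}/download'


def filename_from_header(header, default):
    """Extract the filename parameter of a Content-Disposition header, else return default."""
    if not header:
        return default
    msg = Message()
    msg['Content-Disposition'] = header
    return msg.get_filename() or default


def download(osf_id):
    """Download the resource into the current directory and return the file name used."""
    with urlopen(osf_download_url(osf_id)) as resp:
        name = filename_from_header(resp.headers.get('Content-Disposition'), osf_id)
        name = Path(name).name  # keep only the base name, never a path sent by the server
        Path(name).write_bytes(resp.read())
    return name
```

Calling `download('cwu4m')` would fetch the resource used above; it needs network access.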
@maciejskorski
maciejskorski / gist:f0b2170dd7b75d5e9d7a93f1350287de
Created July 25, 2024 13:52
nvidia torch container with current dir mounted
# select an appropriate container tag https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags
# the volume path is quoted so that directories with spaces mount correctly
docker run -it -v "${PWD}":/workspace --name torch_container --gpus all nvcr.io/nvidia/pytorch:21.07-py3
@maciejskorski
maciejskorski / get_guardian_content.py
Created June 14, 2024 16:52
How to use The Guardian API
import os
import requests
from functools import partial

# NOTE: falls back to the limited 'test' key, which is enough for demo purposes
API_KEY = os.getenv('API_KEY', 'test')
# NOTE: can request less info, e.g. replace show-blocks=all with show-fields=bodyText
QUERY_TEMPLATE = 'https://content.guardianapis.com/{section}/{year}/{month}/{day}/{title}?api-key={API_KEY}&show-blocks=all'
QUERY_FN = partial(QUERY_TEMPLATE.format, API_KEY=API_KEY)

# NOTE: put your desired article URL here
web_url = 'https://www.theguardian.com/business/2014/sep/10/thorntons-60-per-cent-profits-rise-despite-closures'
# drop the scheme and host; the remaining five parts mirror the API path
section, year, month, day, title = web_url.split('/')[3:]
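The preview above stops before the request is sent. A minimal sketch of the remaining step follows, assuming the article path on the website can be reused unchanged on the content API, as the template above suggests. It uses the standard library (`urllib`) instead of `requests`, and the names `build_query` and `fetch_content` are hypothetical.

```python
import json
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen

API_ROOT = 'https://content.guardianapis.com'


def build_query(web_url, api_key='test', **params):
    """Map a theguardian.com article URL to the matching content API URL."""
    path = urlparse(web_url).path  # e.g. '/business/2014/sep/10/<title>'
    query = urlencode({'api-key': api_key, 'show-blocks': 'all', **params})
    return f'{API_ROOT}{path}?{query}'


def fetch_content(web_url, api_key='test'):
    """Send the request and return the decoded JSON payload (needs network access)."""
    with urlopen(build_query(web_url, api_key)) as resp:
        return json.load(resp)
```

Parsing the path with `urlparse` avoids the fixed five-way split, so a URL with a trailing slash or query string does not raise an unpacking error.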
@maciejskorski
maciejskorski / abstractsyntaxtree_demo.ipynb
Last active May 9, 2023 04:20
abstractsyntaxtree_demo.ipynb
@maciejskorski
maciejskorski / clang_example.out
Created May 7, 2023 10:02
non-trivial clang usage example
print_lcp_config_options
struct netdissect_options * TypeKind.POINTER arg declared in ./tcpdump/print-ppp.c:L403,C39-L403,C59 netdissect_options declared in ./tcpdump/netdissect.h:L161
const unsigned char TypeKind.ELABORATED arg declared in ./tcpdump/print-ppp.c:L403,C61-L403,C73 u_char declared in /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk/usr/include/sys/_types/_u_char.h:L30
const unsigned int TypeKind.ELABORATED arg declared in ./tcpdump/print-ppp.c:L403,C75-L403,C86 u_int declared in /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk/usr/include/sys/_types/_u_int.h:L30
print_ipcp_config_options
struct netdissect_options * TypeKind.POINTER arg declared in ./tcpdump/print-ppp.c:L404,C40-L404,C60 netdissect_options declared in ./tcpdump/netdissect.h:L161
const unsigned char * TypeKind.POINTER arg declared in ./tcpdump/print-ppp.c:L404,C62-L404,C77 u_char declared in /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk/usr/include/sys/_types/_u_char.h:L30
unsigned int TypeKind.ELABO
@maciejskorski
maciejskorski / requirements.txt
Last active April 21, 2023 14:30
kaggle python docker packages
# gcr.io/kaggle-images/python:latest
# checked on April 21 2023
# root@9735f873cb2d:/# pip list show
Package Version Editable project location
-------------------------------------- -------------- -------------------------
absl-py 1.4.0
accelerate 0.12.0
access 1.1.9
affine 2.4.0
aiobotocore 2.5.0
@maciejskorski
maciejskorski / config
Created April 21, 2023 11:34
An example of SSH config for Google Cloud
# Read more about SSH config files: https://linux.die.net/man/5/ssh_config
Host snarto_kaggle_cpu
    # update the instance IP under console.cloud.google.com/compute/instances -> more (...) -> network details
    HostName 34.170.120.99
    # add the public part under console.cloud.google.com/compute/metadata
    IdentityFile ~/.ssh/id_rsa
    # match the name under the SSH key
    User maciej.skorski
    Port 22
@maciejskorski
maciejskorski / gist:6fb8b066145ebef53d85adb99db8a9b2
Created December 18, 2020 21:33
profiling custom blocks of TF code
import tensorflow as tf

# wrap the TF code to be profiled between the start and stop calls as below
# (data and train_step are assumed to be defined elsewhere)
tf.profiler.experimental.start('logs/profile')
for (x, y) in data:
    train_step(x, y)
tf.profiler.experimental.stop()
@maciejskorski
maciejskorski / logistic_TF
Created December 18, 2020 21:28
efficient logistic regression
import tensorflow as tf

## train a logistic regression (images 28x28 and 10 classes)
w = tf.Variable(tf.random.normal(shape=(28*28, 10), stddev=0.1), trainable=True)
optimizer = tf.optimizers.SGD(0.01)

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        all_logits = tf.matmul(x, w)  # (n_batch,n_class)
        y_logits = tf.gather(all_logits, y, batch_dims=1)  # (n_batch,)
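The preview is cut off after the true-class logits are gathered. One plausible reason for gathering them (an assumption, since the rest of the gist is not shown) is to compute the softmax cross-entropy as log-sum-exp of all logits minus the true-class logit, without building one-hot labels. The pure-Python sketch below illustrates that formula only; it is not the author's TensorFlow code.

```python
import math


def logsumexp(row):
    """Numerically stable log(sum(exp(v))) over one row of logits."""
    m = max(row)
    return m + math.log(sum(math.exp(v - m) for v in row))


def cross_entropy(all_logits, y):
    """Mean softmax cross-entropy.

    all_logits: nested list of shape (n_batch, n_class)
    y: list of n_batch integer class labels
    """
    # row[label] plays the role of the gathered true-class logit
    losses = [logsumexp(row) - row[label] for row, label in zip(all_logits, y)]
    return sum(losses) / len(losses)
```

With equal logits over k classes the loss equals log(k), which is a quick sanity check for an untrained model.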