Chris Taylor catid

catid / CMakeLists.txt
Created August 17, 2024 18:18
Monte Carlo simulation evaluating policy of when to start using Anthropic Claude caching in tokens
cmake_minimum_required(VERSION 3.10)
# Set optimization flags
set(CMAKE_CXX_FLAGS_RELEASE "-O3 -march=native -mtune=native")
# Find threads package
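The idea of such a simulation can be sketched in a few lines of Python. This is not the gist's C++ code, only a minimal illustration under stated assumptions: cache writes are taken to cost 1.25x the base input-token price, cache reads 0.10x, and a cached prefix is taken to expire 300 seconds after its last use. The exponential gap between requests and the names `session_cost` and `mean_cost` are hypothetical choices made for this example.

```python
import random

# Assumed pricing, relative to the base input-token price (not taken from the gist).
CACHE_WRITE_MULT = 1.25   # cost multiplier when a prefix is written to the cache
CACHE_READ_MULT = 0.10    # cost multiplier when a cached prefix is read
CACHE_TTL_S = 300.0       # assumed cache lifetime, refreshed on every use


def session_cost(prefix_tokens, n_requests, mean_gap_s, use_cache, rng):
    """Cost of the shared prefix over one session, in base-token-price units."""
    cost = 0.0
    for i in range(n_requests):
        if not use_cache:
            cost += prefix_tokens  # every request pays full price for the prefix
            continue
        # Hypothetical traffic model: exponentially distributed gaps between requests.
        gap = rng.expovariate(1.0 / mean_gap_s) if i > 0 else float("inf")
        hit = gap <= CACHE_TTL_S   # the first request always misses
        cost += prefix_tokens * (CACHE_READ_MULT if hit else CACHE_WRITE_MULT)
    return cost


def mean_cost(trials, seed=0, **kwargs):
    """Monte Carlo estimate of the expected session cost."""
    rng = random.Random(seed)
    return sum(session_cost(rng=rng, **kwargs) for _ in range(trials)) / trials
```

Comparing `mean_cost(..., use_cache=True)` against `mean_cost(..., use_cache=False)` over a range of `n_requests` and `mean_gap_s` values shows where caching starts to pay for its write surcharge under these assumptions.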
catid / gist:533dd0c7d4f3ee8d34a6a905155b72ae
Last active April 22, 2024 04:53
How to quantize 70B model so it will fit on 2x4090 GPUs
I tried EXL2, AutoAWQ, and SqueezeLLM and they all failed for different reasons (issues opened).
HQQ worked:
I rented a 4x GPU, 1TB RAM instance ($19/hr) on RunPod with a 1024GB container and 1024GB of workspace disk space.
I think you only need 2x GPU with 80GB VRAM and 512GB+ system RAM, so I probably overpaid.
Note you need to fill in the form to get access to the 70B Meta weights.
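A rough back-of-the-envelope check shows why quantization is needed at all. The sketch below assumes 24 GB of VRAM per RTX 4090 and counts only the weights. It ignores quantization metadata (scales and zero points), the KV cache, and activations, so real usage is higher. The bit widths listed are illustrative, since the note above does not say which one was used, and `weight_gib` is a hypothetical helper name.

```python
def weight_gib(n_params, bits_per_weight):
    """Size of the raw weights in GiB, ignoring all overhead."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 70e9      # 70B parameters
VRAM_GIB = 2 * 24    # assumption: two RTX 4090 cards at 24 GB each

# Illustrative bit widths only; the source does not state the one used.
for bits in (16, 8, 4, 3, 2):
    size = weight_gib(N_PARAMS, bits)
    print(f"{bits:>2}-bit: {size:6.1f} GiB  fits_weights_only={size < VRAM_GIB}")
```

Under these assumptions the 16-bit and 8-bit weights alone exceed the combined VRAM, while widths of 4 bits and below leave some headroom for everything else.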
catid /
Last active March 29, 2024 23:54
DBRX on 3x 3090 GPUs
# conda create -n dbrx python=3.10 -y && conda activate dbrx
# pip install torch transformers tiktoken flash_attn bitsandbytes
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("SinclairSchneider/dbrx-instruct-quantization-fixed", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("SinclairSchneider/dbrx-instruct-quantization-fixed", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True, load_in_4bit=True)
input_text = "What does it take to build a great LLM?"
catid /
Created March 23, 2024 23:33
GIVT GMM Decoder (GPT-4)
# Collaboration between Claude-3 and GPT-4 to implement
# This is just the GMM decoder part of the model they propose (which is the new thing).
# This one was mainly generated by GPT-4.
# The AIs provided two implementations of the idea and revised each other's code.
# I tested that the unit tests pass but haven't tried it in a language model yet.
import torch
import torch.nn as nn
import torch.nn.functional as F
catid /
Created March 23, 2024 23:30
GIVT GMM Decoder (Claude 3)
# Collaboration between Claude-3 and GPT-4 to implement
# This is just the GMM decoder part of the model they propose (which is the new thing).
# This one was mainly generated by Claude-3.
# The AIs provided two implementations of the idea and revised each other's code.
# I tested that the unit tests pass but haven't tried it in a language model yet.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init as init
import math
class FeedForward(torch.nn.Module):
    def __init__(self, input_features, output_features):
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.nn.functional as F
# This layer is dropped into your pre-trained PyTorch model where nn.Linear is used
class DoRALayer(nn.Module):
    def __init__(self, d_in, d_out, rank=4):
catid / z_calibration
Created April 1, 2022 03:42
# Auto Z-Calibration
probe_nozzle_x: 175.5
probe_nozzle_y: 257
# The X and Y coordinates (in mm) for clicking the nozzle on the
# Z endstop.
probe_switch_x: 169.3
catid / print_start
Created March 24, 2022 19:18
[gcode_macro PRINT_START]
M117 Print Starting...
; Make sure we are not applying stale bed mesh or Z offset
; Start heating bed
catid / printer.cfg
Created March 18, 2022 21:26
[gcode_macro PRINT_START]
M117 Print Starting...
; Make sure we are not applying stale bed mesh or Z offset
; Start heating bed