Marc Sun SunMarc

SunMarc / llama_torchao_compile.py
Last active September 16, 2024 23:01
`transformers` + `torchao` quantization + `torch.compile` on Llama3.1 8B
# REQUIRES torchao, torch nightly (or torch 2.5) and transformers
from transformers import AutoTokenizer, AutoModelForCausalLM, TorchAoConfig
from transformers import TextStreamer
import torch
from tqdm import tqdm
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false" # To prevent long warnings :)
torch.set_float32_matmul_precision('high')
SunMarc / finetune_llama_gptq.py
Last active July 26, 2024 15:37
Fine-tune a GPTQ model with peft and trl
# coding=utf-8
# Copyright 2023 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software