
Timothy Stewart tljstewart


Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

danielgross / ai-plugin.json
Created March 23, 2023 22:59
ChatGPT Plugin for Twilio
"schema_version": "v1",
"name_for_model": "twilio",
"name_for_human": "Twilio Plugin",
"description_for_model": "Plugin for integrating the Twilio API to send SMS messages and make phone calls. Use it whenever a user wants to send a text message or make a call using their Twilio account.",
"description_for_human": "Send text messages and make phone calls with Twilio.",
"auth": {
"type": "user_http",
"authorization_type": "basic"
# STEP 1: Load
# Load documents using LangChain's DocumentLoaders
# This is from
from langchain.document_loaders.csv_loader import CSVLoader
loader = CSVLoader(file_path='./example_data/mlb_teams_2012.csv')
data = loader.load()
rvanbruggen / 1-contacttracing-import.cql
Last active August 18, 2022 16:11
Contact tracing example #cypher #neo4j
//environment: Neo4j Desktop 1.2.7, Neo4j Enterprise 3.5.17, apoc, gds 1.1.0
//or: Neo4j Enterprise 4.0.3, apoc (NOT later! a bug in apoc.coll.max/apoc.coll.min needs to be resolved)
//contact tracing data import
//full spreadsheet with synthetic data
// person sheet
qpwo /
Last active September 21, 2024 14:54
Monte Carlo tree search (MCTS) minimal implementation in Python 3, with a tic-tac-toe example gameplay
A minimal implementation of Monte Carlo tree search (MCTS) in Python 3
Luke Harold Miles, July 2019, Public Domain Dedication
See also
from abc import ABC, abstractmethod
from collections import defaultdict
import math
talmo / vid2hdf5.m
Created June 14, 2018 13:32
An example of converting video (MP4/AVI/MOV/etc) to HDF5
% Clean start!
clear all, clc
%% Parameters
% Path to input file
videoPath = '..\..\leap\data\examples\072212_163153.clip.mp4';
% Path to output file
% savePath = '..\..\leap\data\examples\072212_163153.clip.h5';
savePath = 'C:\tmp\072212_163153.clip.h5';
mbinna /
Last active September 16, 2024 15:44
Effective Modern CMake


Getting Started

For a brief user-level introduction to CMake, watch C++ Weekly, Episode 78, Intro to CMake by Jason Turner. LLVM’s CMake Primer provides a good high-level introduction to the CMake syntax. Go read it now.

After that, watch Mathieu Ropert’s CppCon 2017 talk Using Modern CMake Patterns to Enforce a Good Modular Design (slides). It provides a thorough explanation of what modern CMake is and why it is so much better than “old school” CMake. The modular design ideas in this talk are based on the book Large-Scale C++ Software Design

nicolasdao /
Last active September 21, 2024 17:31
What you need to know to choose an open source license.
sid24rane / udp.js
Created July 25, 2016 08:39
Simple UDP Client and Server in Node.js ==> ( Echo Server )
var udp = require('dgram');
// --------------------creating a udp server --------------------
// creating a udp server
var server = udp.createSocket('udp4');
// emits when any error occurs
server.on('error', function (error) {
  console.log('Error: ' + error);
});
digitaljhelms / gist:4287848
Last active September 19, 2024 06:16
Git/GitHub branching standards & conventions


Quick Legend

Description, Instructions, Notes
Instance Branch