Some remarks on Large Language Models

Yoav Goldberg, January 2023

Audience: I assume you have heard of ChatGPT, maybe played with it a little, and were impressed by it (or tried very hard not to be). And that you also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective on this (and similar) models, and on where we stand with respect to language understanding.


Around 2014-2017, right within the rise of neural-network based methods for NLP, I was giving a semi-academic-semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labour costs" to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We

tintoy /
Created April 27, 2018 02:45
SSH via jump-hosts using Paramiko
#!/usr/bin/env python3
import os
import paramiko
ssh_key_filename = os.getenv('HOME') + '/.ssh/id_rsa'
jumpbox_public_addr = ''
jumpbox_private_addr = ''
target_addr = ''
sixy6e /
Last active March 4, 2024 13:08

A simple MPI logging and compute example

The simple logging example can be run via:

  • mpiexec -n 4 python

For the compute and logging example:

  • mpiexec -n 4 python
#coding: utf-8
#demo of beam search for seq2seq model
import numpy as np
import random
vocab = {
0: 'a',
1: 'b',
2: 'c',
3: 'd',
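
A minimal sketch of beam search over a toy four-symbol vocabulary like the one in this demo. The `TRANSITIONS` table is a made-up stand-in for a seq2seq decoder step (it only looks at the previous token), and `beam_search` and `decode` are hypothetical names; none of this is taken from the gist itself.

```python
import math

# Toy vocabulary mirroring the demo's index -> symbol mapping.
VOCAB = {0: 'a', 1: 'b', 2: 'c', 3: 'd'}

# Assumed next-token probabilities, keyed by the previous token
# (None means "start of sequence"). A real model would compute these.
TRANSITIONS = {
    None: [0.1, 0.5, 0.3, 0.1],
    0: [0.25, 0.25, 0.25, 0.25],
    1: [0.1, 0.1, 0.7, 0.1],
    2: [0.1, 0.1, 0.1, 0.7],
    3: [0.7, 0.1, 0.1, 0.1],
}


def beam_search(beam_size, max_len):
    """Return the beam_size best (token_tuple, log_prob) pairs of length max_len."""
    beams = [((), 0.0)]  # start with one empty hypothesis
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            last = seq[-1] if seq else None
            # Extend every surviving hypothesis by every possible token.
            for token, prob in enumerate(TRANSITIONS[last]):
                candidates.append((seq + (token,), score + math.log(prob)))
        # Keep only the highest-scoring hypotheses for the next step.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


def decode(tokens):
    """Map a tuple of token ids back to a string."""
    return ''.join(VOCAB[t] for t in tokens)


if __name__ == '__main__':
    for tokens, log_prob in beam_search(beam_size=2, max_len=3):
        print(decode(tokens), round(math.exp(log_prob), 3))
```

Scores are summed in log space so that long sequences do not underflow; with `beam_size=1` the same function reduces to greedy decoding.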
waichee /
Created February 14, 2017 10:12
Steps to create a docker container with dependencies required for compiling Tensorflow Serving
# Clone the Tensorflow Serving source
git clone
cd serving && git checkout <commit_hash>
# Build the docker image (time to go get yourself a coffee, maybe a meal as well, this will take a while.)
docker build -t some_user_namespace/tensorflow-serving:latest -f ./serving/tensorflow_serving/tools/docker/Dockerfile.devel .
# Run up the Docker container in terminal
docker run -ti some_user_namespace/tensorflow-serving:latest
waichee /
Last active January 13, 2019 05:45
Code to Build Tensorflow Serving from source within a Docker container
mkdir -p /work/
# Clone the source from Github
cd /work/ && git clone --recurse-submodules
# Pin the version of Tensorflow Serving and its submodule
cd /work/serving && git checkout $TENSOR_SERVING_COMMIT_HASH
0xnurl /
Last active October 23, 2021 17:35
Installing OpenFST Native Python Extension on MacOS

Installing OpenFst Native Python Extension on MacOS

Starting from version 1.5, OpenFst has offered a native Python module, making the use of external wrappers like PyFst unnecessary. This has been greatly helpful since PyFst doesn't support Python 3.

1. Install OpenFst

krivonogov /
Created October 31, 2016 11:23 — forked from axelborja/
Profile your python code using CProfile or Yappi and KCachegrind / QCachegrind
# 1 - Profile myfunc() from ipython
import cProfile
filename = 'myfunc.prof'  # hypothetical output path; substitute your own
cProfile.run('myfunc()', filename)
# 2 - Convert your file to a usable kcachegrind file in your shell
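
If you only want a quick look at the numbers before reaching for KCachegrind, the standard library's `pstats` module can print a summary directly. The following is a small sketch, not part of the original gist: `myfunc` here is a hypothetical workload and `profile_to_text` is an illustrative helper name.

```python
import cProfile
import io
import pstats


def myfunc():
    """Hypothetical workload standing in for the function being profiled."""
    return sum(i * i for i in range(10_000))


def profile_to_text(func, limit=5):
    """Profile func() and return the top `limit` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)  # run func under the profiler
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats('cumulative').print_stats(limit)
    return buffer.getvalue()


report = profile_to_text(myfunc)
print(report)
```

Sorting by cumulative time puts the functions that dominate the total run time, including time spent in their callees, at the top of the report.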
wojteklu /
Last active September 19, 2024 22:29
Summary of 'Clean code' by Robert C. Martin

Code is clean if it can be understood easily – by everyone on the team. Clean code can be read and enhanced by a developer other than its original author. With understandability comes readability, changeability, extensibility and maintainability.

General rules

  1. Follow standard conventions.
  2. Keep it simple stupid. Simpler is always better. Reduce complexity as much as possible.
  3. Boy scout rule. Leave the campground cleaner than you found it.
  4. Always find root cause. Always look for the root cause of a problem.

Design rules

igormq /
Created June 27, 2016 21:03 — forked from nikitakit/
Tensorflow Beam Search
import tensorflow as tf
def beam_decoder(decoder_inputs, initial_state, cell, loop_function, scope=None,
beam_size=7, done_token=0
Beam search decoder
decoder_inputs: A list of 2D Tensors [batch_size x input_size].