@matthiasr
Created July 31, 2024 20:07
What is a vector?

Abstractly, a vector is a direction and a length in some space. For example, in the space around you, a vector might be "from your face to your cup". There are multiple ways you can represent that vector with numbers. It might be 1 ft forward, 1 ft to the left, 1 ft down relative to the way you're facing, or it could be 1.4 ft north, 0 ft east, 1 ft down. Or even 45° left, about 35° down, 1.7 ft away. These are all equivalent descriptions of the same vector, assuming you're facing northeast. In the space around you, you always need exactly three numbers to describe a vector: the space is 3-dimensional. The numbers describe the vector, but the vector itself is just direction + length, no matter how you decide to put that into numbers.
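Here's a minimal Python sketch of those three equivalent descriptions, with the facing-northeast frame conversion worked out. The numbers are just the example above; the downward angle comes out to roughly 35° because the cup is 1 ft down but about 1.4 ft away horizontally:

```python
import math

# The same face-to-cup vector, written three different ways.
# You're facing northeast, so "forward" points halfway between north and east.
forward, left, down = 1.0, 1.0, 1.0  # body-relative, in feet

# Compass frame: forward and left each contribute 1/sqrt(2) ft of north,
# and their east components cancel out.
north = (forward + left) / math.sqrt(2)  # ~1.4 ft
east = (forward - left) / math.sqrt(2)   # 0 ft
# "down" is the same in both frames

# Angle-and-distance form:
distance = math.sqrt(forward**2 + left**2 + down**2)                    # ~1.7 ft
angle_left = math.degrees(math.atan2(left, forward))                    # 45°
angle_down = math.degrees(math.atan2(down, math.hypot(forward, left)))  # ~35°

print(north, east, distance, angle_left, angle_down)
```

However you write it down, the length is the same ~1.7 ft, because it's the same vector.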

Now, there are other spaces that you can also use vectors in, which need a different number of numbers to describe. For example, the "what's the sportsball doing right this instant" space is 6-dimensional: you need 3 numbers to describe where it is, and another 3 to describe in which direction and how fast it's flying. If you have a radio with two dials, you need 2 numbers to describe any possible setting of the radio, so that space is 2-dimensional.
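As a minimal sketch (all values made up for illustration):

```python
# One point in the 6-dimensional "what's the ball doing" space:
# position plus velocity, six numbers in total.
ball_state = (
    12.0, 3.0, 1.5,   # where it is, in meters
    -4.0, 0.0, 6.0,   # which way and how fast it's flying, in m/s
)

# A radio with two dials needs just 2 numbers per setting:
radio_setting = (98.5, 0.7)  # frequency dial (MHz), volume dial (0..1)

# "Dimension" is just how many numbers it takes:
print(len(ball_state), len(radio_setting))  # 6 2
```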

Machine learning (or as they now call it, AI) very often works by making up an abstract space with a lot of dimensions (thousands, up to millions) and figuring out a clever way to put the things your machine is learning, like words, into that space. This is called an embedding, because you're embedding all those words into that space. Intuitively, you can think of words being "close together" in this space as corresponding to them being "similar". Here too, you represent each word as some list of numbers, although the "direction" each individual number represents has no particular meaning.

To generate a sentence from these embeddings, the simplest models have some rules that transform one vector (the current word) into another (the next word). Training is finding the best set of rules for that. LLMs do the same, just with more dimensions, more complicated rules, and more than one starting word taken into account.
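To make "close together means similar" concrete, here's a toy sketch with a made-up 3-dimensional embedding (real models use thousands of dimensions), using cosine similarity as one common way to measure closeness:

```python
import math

# Toy embedding: each word becomes a vector. All numbers are made up.
embedding = {
    "cat":    [0.90, 0.80, 0.10],
    "kitten": [0.85, 0.75, 0.20],
    "radio":  [0.10, 0.20, 0.90],
}

def similarity(a, b):
    # Cosine similarity: 1.0 means "pointing the same way", i.e. very similar.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Similar words end up close together in the space:
print(similarity(embedding["cat"], embedding["kitten"]))  # ~1.0
print(similarity(embedding["cat"], embedding["radio"]))   # much lower
```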
