layout

title

date

author

authorAvatar

image

tags

blog

Explore Memoization

2019-07-18 16:20:16 UTC

Toan Nguyen Gia

/img/uploads/t0d8dq73q-u9hr7hu1m-706d3758d60e-512.jpeg

/img/uploads/deco-iskusstvoa-helix.jpg

Tech

Javascript

Functional programming

You may heard about the term memoization before. It's simply a technique that improves performance of an expensive function, by storing and returning the cached result if the same inputs occur. The minimal implementation of a memoize function can be as follow, in case of your targeted function only receives one argument:

const memoize = f => {
  let cache = {};
  return arg => {
    if (cache[arg]) return cache[arg];
    const result = f(arg);
    cache[arg] = result;
    return result;
  };
};

Lets try applying it to a recursive function:

// Factorial
const fac = n => {
  log(); // Keep track of number of recursions
  if (n <= 1) return 1;
  return n * fac(n - 1);
};

const memoizedFac = memoize(fac);

memoizedFac(6);
memoizedFac(6);
expect(log).toHaveBeenCalledTimes(6); // Passed

The result of 6! is cached so in the second call, there is no recursion happens. It may look ok until you try this:

memoizedFac(6);
memoizedFac(6);
memoizedFac(7);
expect(log).toHaveBeenCalledTimes(7); // Failed, it's 13

Because 6! is already cached, should the total number of recursions is 7 only? It turns out that since in our factorial implementation, we try to call the non-memoized version in its body. Let's fix that:

// Non-memoized version
const fac = n => {
  if (n <= 1) return 1;
  return n * fac(n - 1);
};

// Memoized version
const memoizedFac = memoize(n => {
  if (n <= 1) return 1;
  return n * memoizedFac(n - 1);
});

memoizedFac(6);
memoizedFac(6);
memoizedFac(7);
expect(log).toHaveBeenCalledTimes(7); // Passed

Problem solved, easy peasy!

But it has one down side, that is we are copying the logic. What if I want to maintain the logic in just one place, but also have two versions of factorial, memoized and non-memoized. Is it possible?

You can stop here for a while and try it yourself before jumping to the next section. This is where the fun begins.

TL;DR

My favorite solution is:

const fac         = y(_fac);          // Non-memoized version
const memoizedFac = y(memoize(_fac)); // Memoized version

Where y, memoize and _fac are defined as follow:

const y = f => (g => n => f(g(g))(n))(g => n => f(g(g))(n));

// Factorial logic is here
const _fac = anotherFac => n => {
  if (n <= 1) return 1;
  return n * anotherFac(n - 1);
};

// Memoize is a bit different now
const memoize = f1 => {
  let cache = {};
  return f2 => arg => {
    if (cache[arg]) return cache[arg];
    const result = f1(f2)(arg);
    cache[arg] = result;
    return result;
  };
};

The function y is called y-combinator. It may looks confusing, and you may wonder why we need it. Well, actually there are other approaches, but the beauty of this is y-combinator allows you to decorate a recursive function by composition, so you can easily implement other strategies, like trampolining (to prevent stack overflow).

If you are curious about what y-combinator is, or how I came up with this solution, you can continue to the next section. It's gonna be very long and contains a lot of codes, so prepare yourself.

Start from zero

Let's forget everything about y-combinator above and make it from scratch. Take a look back at this code snippet first:

// Non-memoized version
const fac = n => {
  if (n <= 1) return 1;
  return n * fac(n - 1);
};

// Memoized version
const memoizedFac = memoize(n => {
  if (n <= 1) return 1;
  return n * memoizedFac(n - 1);
});

The difference between those definitions is the function called inside the body, fac and memoizedFac. To avoid copying logic, we can move that function out as an parameter. Base on that idea, we can create a special function, called _fac:

const _fac = anotherFac => n => {
  log();
  if (n <= 1) return 1;
  return n * anotherFac(n - 1);
};

As you can see, _fac doesn't look like a normal factorial function. In fact, it's a not-working-yet factorial function. It requires another factorial function, in order to produce a factorial function! I will call it a factorial factory (prefix with a dash). So if I want to create a memoized factorial using _fac, I should pass anotherFac to it which should be already memoized, right?

Let's try using it to see what happens next:

const memoize = f => {
  let cache = {};
  return arg => {
    if (cache[arg]) return cache[arg];
    const result = f(arg);
    cache[arg] = result;
    return result;
  };
};

const memoizedFac = memoize(_fac(anotherMemoizedFac)); // anotherMemoizedFac is not defined

There is no definition of anotherMemoizedFac yet. We can simply create it before memoizedFac:

const anotherMemoizedFac = memoize(_fac(anotherMemoizedFac1)); // anotherMemoizedFac1 is not defined
const memoizedFac = memoize(_fac(anotherMemoizedFac));

Just keep doing and at some points, it will start running:

const anotherMemoizedFac2 = memoize(_fac()); // Fall to the n <= 1 condition
const anotherMemoizedFac1 = memoize(_fac(anotherMemoizedFac2));
const anotherMemoizedFac = memoize(_fac(anotherMemoizedFac1));
const memoizedFac = memoize(_fac(anotherMemoizedFac));

expect(memoizedFac(3)).toEqual(6);    // Passed
expect(memoizedFac(3)).toEqual(6);    // Passed
expect(memoizedFac(4)).toEqual(24);   // Passed
expect(log).toHaveBeenCalledTimes(4); // Failed, it's called 7 times

Great, no error, and some tests get passed. But seem like memoizing is not working properly. The reason is each time we call memoize, it creates a different caching storage. Keep in mind that memoize must be called just once, we can re-write the codes as follow:

const _fac = anotherFac => n => {
  log();
  if (n <= 1) return 1;
  return n * anotherFac(n - 1);
};

// Change memoize a bit so it fits the _fac factory
const memoize = f1 => {
  let cache = {};
  return f2 => arg => {
    if (cache[arg]) return cache[arg];
    const result = f1(f2)(arg);
    cache[arg] = result;
    return result;
  };
};

// This new factory return a memoized factorial function
const _memoizedFac = memoize(_fac);

const anotherMemoizedFac2 = _memoizedFac();
const anotherMemoizedFac1 = _memoizedFac(anotherMemoizedFac2);
const anotherMemoizedFac = _memoizedFac(anotherMemoizedFac1);
const memoizedFac = _memoizedFac(anotherMemoizedFac);

expect(memoizedFac(3)).toEqual(6);    // Passed
expect(memoizedFac(3)).toEqual(6);    // Passed
expect(memoizedFac(4)).toEqual(24);   // Passed
expect(log).toHaveBeenCalledTimes(4); // Passed!!!

And yes, it works! However it's not actually usable now, because it only apply for 4. If I want to calculate 6!, I have to manually add anotherMemoizedFac3 and anotherMemoizedFac4. Let's see what we can refactor in the next section.

Y-combinator

You can notice that the pattern to create memoized factorial is repeated. We can split it into a function, called generator, because I'm bad at naming things:

const _memoizedFac = memoize(_fac);

const anotherMemoizedFac2 = _memoizedFac();
const anotherMemoizedFac1 = _memoizedFac(anotherMemoizedFac2);
const anotherMemoizedFac = _memoizedFac(anotherMemoizedFac1);

const generator = () => _memoizedFac(anotherMemoizedFac);

const memoizedFac = generator();

Replace anotherMemoizedFac with generator() as well:

const _memoizedFac = memoize(_fac);

const generator = generator => _memoizedFac(generator(generator)); // Maximum call stack size exceeded

const memoizedFac = generator(generator);

We got stack overflow, since Javascript is not lazy evaluation. There is a tip to avoid this problem:

const _memoizedFac = memoize(_fac);

const generator = generator => number => _memoizedFac(generator(generator))(number);

const memoizedFac = generator(generator);

Create another function called y, to move _memoizedFac out as a parameter:

const _memoizedFac = memoize(_fac);

const y = _memoizedFac => {
  const generator = generator => number => _memoizedFac(generator(generator))(number);
  return generator(generator);
};

const memoizedFac = y(_memoizedFac);

Rename to make it shorter:

const _memoizedFac = memoize(_fac);

const y = f => {
  const g = g => n => f(g(g))(n);
  return g(g);
};

const memoizedFac = y(_memoizedFac);

Final step:

const y = f => (g => n => f(g(g))(n))(g => n => f(g(g))(n));

const memoizedFac = y(memoize(_fac));

Done! This is the completed code snippet so you can try it yourself:

const log = jest.fn();

const y = f => (g => n => f(g(g))(n))(g => n => f(g(g))(n));

const _fac = anotherFac => n => {
  log();
  if (n <= 1) return 1;
  return n * anotherFac(n - 1);
};

const memoize = f1 => {
  let cache = {};
  return f2 => arg => {
    if (cache[arg]) return cache[arg];
    const result = f1(f2)(arg);
    cache[arg] = result;
    return result;
  };
};

const fac = y(_fac);                  // Non-memoized version
const memoizedFac = y(memoize(_fac)); // Memoized version

expect(fac(3)).toEqual(6);
expect(fac(3)).toEqual(6);
expect(log).toHaveBeenCalledTimes(6);

log.mockClear();

expect(memoizedFac(3)).toEqual(6);
expect(memoizedFac(3)).toEqual(6);
expect(memoizedFac(4)).toEqual(24);
expect(log).toHaveBeenCalledTimes(4);

As you can see, _fac alone is quite useless. Y-combinator helps function which is not-recursive-yet, like _fac, become recursive. This is also why a startup accelerator names their company after it, as in the FAQs:

Why did you choose the name “Y Combinator?”

The Y combinator is one of the coolest ideas in computer science. It's also a metaphor for what we do. It's a program that runs programs; we're a company that helps start companies.

Conclusion

In this article, we have gone through the concept of memoization, its problem with recursive function, and a fancy solution for that problem using y-combinator. Although most of programming languages already support recursion, there are still cases for y-combinator to shine. There are also some good reads about this topic, On recursive function and Why Y. So, enjoy!

toan2406/explore-memoization.md

TL;DR

Start from zero

Y-combinator

Conclusion