vdeturckheim/comparing_async_local_context.md

Last active January 24, 2020 16:42

Star () You must be signed in to star a gist
Fork () You must be signed in to fork a gist

Learn more about clone URLs
Clone this repository at <script src="https://gist.github.com/vdeturckheim/5b0d2de14430c3fd4a66e81bc03eddd5.js"></script>
Save vdeturckheim/5b0d2de14430c3fd4a66e81bc03eddd5 to your computer and use it in GitHub Desktop.

Raw

DISCLAIMER: This was a working document now availabe in nodejs/TSC#807. No comments on the current gist will be accepted

Executive summary: Introducing a CLS-like API to Node.js core

Context: the 3 PRs and 1 consensus

On the TSC meeting of 2020-JAN-22, the TSC reached consensus regarding the need to have an Asynchronous Storage API in core.

Three PRs related to this topic are currently open, out of simplicity, we will refer to them by a name as of:

PR	Author	Name
#30959	@Qard	executionAsyncResource
#31016	@puzpuzpuz	AsyncLocal
#26540	@vdeturckheim	AsyncContext

The AsyncLocal proposal relies on the executionAsyncResource API. The AsyncContext proposal aims at working without executionAsyncResource, but should be rebased over executionAsyncResource when it is merged. A userland version of this API is available for testing purpose.

The rest of this document aims at comparing the AsyncLocal and the AsyncContext proposals. Both of these proposal introduce a CLS-like API to Node.js core.

Naming

Both proposals introduce a new class in the Async Hooks module. One is named AsyncContext and the other is named AsyncLocal.

Also, the name AsyncStorage has been discussed earlier.

This topic can easily be covered as a consensus on any name can be ported to any proposal.

.NET exposes an AsyncLocal class.

Interfaces

AsyncLocals and AsyncContexts expose different interfaces:

AsyncContexts

const asyncContext = new AsyncContext();
// here context.getStore() will return undefined
asyncContext.run((store) => {
    // store is a new instance of Map for each call to `run`
    // from here asyncContext.getStore() will return the same Map as store
    const item = {};
    store.set('a', item);
    asyncContext.getStore().get('a'); // returns item
    asyncContext.exit(() => {
        // from here asyncContext.getStore() will return undefined
        asyncContext.getStore(); // returns undefined
    });
});

AsyncLocal

const asyncLocal = new AsyncLocal();
const item = {};
asyncLocal.get(); // will return undefined
asyncLocal.set(item); // will populate the store
asyncLocal.get(); // returns item
asyncLocal.remove(); // disable the AsyncLocal
asyncLocal.get(); // will return undefined
asyncLocal.set(item); // will throw an exception

Synchronous vs. Asynchronous API

As the examples show, AsyncLocal exposes a synchronous API and AsyncContext exposes an asynchronous one.

The synchronous API is unopinionated and is very async/await friendly.

The asynchronous API defines a clear scope regarding which pieces of code will have access to the store and which ones will not be able to see it. Calling run is an asynchronous operation that executes the callback in a process.netxTick call. This is intended in order to have no implicit behavior that were a major issue according to the domain post mortem. It is expected that the API will be used to provide domain-like capabilities.

Eventually, a synchronous API could be added to AsyncContext when the executionAsyncResource rebase is done. In this case, documentation will clearly state that using run is the prefered method and that synchronous methods have less explicit behaviors.

Eventually, an asynchronous API could be added to AsyncLocal if there is a need for it.

Stopping propagation

AsyncContext exposes a method named exit(callback) that stops propagation of the context through the following asynchronous calls. Asynchronous operations following the callback cannot access the store.

With AsyncLocal, propagation is stopped by calling set(undefined).

Disabling

An instance of AsyncLocal can be disabled by calling remove. It can't be used anymore after this call. Underlying resources are freed when the call is made, i.e. no strong references for the value remain in AsyncLocal and the internal global async hook is disabled (unless there are more active AsyncLocal exist).

AsyncContext does not provide such method.

Store type

AsyncContext AsyncContext.prototype.getStore will return:

undefined
- if called outside the callback of run or
- inside the callback of exit
an instance of Map

AsyncLocal AsyncLocal.prototype.get will return:

undefined if AsyncLocal.prototype.set has not been called first
any value the user would have given to AsyncLocal.prototype.set

Store mutability

AsyncContext propagates it's built in mutable store which is accessible in whole async tree created.

AsyncLocal uses copy on write semantics resulting in branch of parts of the tree by setting a new value. Only mutation of the value (e.g. changing/setting a Map entry) will not branch off.

Overall philosophy

AsyncLocal is a low-level unopinionated API that aims at being used as a foundation by ecosystem packages. It will be a standard brick upon which other modules are built.

AsyncContext is a high-level user-friendly API that cans be used out of the box by Node.js users. It will be an API used directly by most users who have needs for context tracing.

Next steps

After an API (AsyncContext, AsyncLocal or another potential API) is merged, this roadmap might be followed:

Releasing the API in the current version of Node.js (as experimental)
Backporting the API to currently supported versions of Node.js (as experimental)
Defining conditions for the API to get out of experimental
Moving the API to its own core module and alias it from Async Hooks (tentatively for Node.js 14)
Move the API out of experimental (tentatively when Node.js 14 becomes LTS)

This will enable us to iterate over Async Hook and maybe bring breaking changes to it while still providing an API filling most of Node.js users need in term of tracing through a stable API.

puzpuzpuz commented Jan 24, 2020

Thanks for writing this comparison document, Vladimir! I have a couple of comments.

An instance of AsyncLocal can be disabled by calling remove. It can't be used anymore after this call.

Proposed change:
An instance of AsyncLocal can be disabled by calling remove. It can't be used anymore after this call. Underlying resources are freed when the call is made, i.e. no strong references for the value remain in AsyncLocal and the internal global async hook is disabled (unless there are more active AsyncLocal exist).

AsyncLocal does not provide a method to stop propagation.

Proposed change:
asyncLocal.set(undefined) stops context propagation. So, there is no need for a separate method in AsyncLocal.

Flarna commented Jan 24, 2020

Regarding name: Maybe add a link to .NET AsyncLocal to let TSC know that this name is already used for something similar but with significant differences.

Regarding Async/Sync API:
I think it should be noted that the async API has also it's pitfalls and I think they should be mentioned in some doc (either here or in the node docs).

In general I like APIs like domain.run(fn) or asyncResource.runInAsyncScope() which avoid the need to exception safe match enter/exit calls. If enter/exit are exposed additionally it's should be documented that run is preferred.
But there is an important difference between how this works for domain/asyncResource and AsyncContext: domain/asyncResource call the callback synchron. Consider following code:

const { AsyncContext } = require('async_hooks')
const domain = require('domain');

const d = domain.create();
const asyncContext = new AsyncContext()

function withContext(cb, ...args) {
  asyncContext.run((store) => {
    cb(...args)
  })
}

function doSomething(s) {
  console.log(s)
}

function foo() {
  doSomething("foo: first")
  doSomething("foo: second")
  doSomething("foo: last")
}

function fooWithContext() {
  doSomething("fooWithContext: first")
  withContext(doSomething, "fooWithContext: second")
  doSomething("fooWithContext: last")
}

function foWithDomain() {
  doSomething("foWithDomain: first")
  d.run(doSomething, "foWithDomain: second")
  doSomething("foWithDomain: last")
}

foo()
fooWithContext()
foWithDomain()

results in:

foo: first
foo: second
foo: last
fooWithContext: first
fooWithContext: last
foWithDomain: first
foWithDomain: second
foWithDomain: last
fooWithContext: second

As you can see sequence has changed for AsyncContext but not for Domains.

Similar if an exception is thrown in doSomething() a try/catch around withContext() or in the closure past to asyncContext.run() will not help - it will end up in and uncaught exception.

Regarding AsyncLocal does not provide a method to stop propagation.: This can be done by calling asyncLocal.set(undefined).

What I miss are the differences in propagation but maybe that's going too deep into the details:

AsyncContext propagates it's built in mutable store which is accessible in whole async tree created.
AsyncLocal uses copy on write semantics resulting in branch of parts of the tree by setting a new value. Mutation of the value (e.g. if it is a Map/Object) is not triggering cow.
AsyncContext uses trigger path, AsyncLocal uses execution path (Main problem is anyway that neither is the "always correct" approach)

Regarding next steps: I would mention also that adding this to core could improve the situation with user land modules not using AsyncResource.
Maybe it's out of scope for this discussion but I think we should also aim to get AsyncResource out of experimental - just keep the low level hooks experimental.

Author

vdeturckheim commented Jan 24, 2020

@puzpuzpuz thanks for the respones:

regarding context propagation (set(undefined)) -> I did not see it in in the PR, did I miss it?
regarding:

An instance of AsyncLocal can be disabled by calling remove. It can't be used anymore after this call. Underlying resources are freed when the call is made, i.e. no strong references for the value remain in AsyncLocal and the internal global async hook is disabled (unless there are more active AsyncLocal exist).

I think we should still highlight that the following calls to set will throw right? Can you update the paragraph with this?

@Flarna

name -> good point
sync/async

In general I like APIs like domain.run(fn) or asyncResource.runInAsyncScope() which avoid the need to exception safe match enter/exit calls. If enter/exit are exposed additionally it's should be documented that run is preferred.
This is how I see it going eventually

Regarding the order: this is my intended goal here, a call to run is asynchronous and should be handled as a call to a database or a nextTick. Do you think the document should highlight this more? One of the goal here is actually to avoid this domain behavior that created issues for a lot of users.

Regarding exceptions handling, it would still be possible to add an exception handler as we do in the current implementation of domain, this would result in "cleaner" domains.
I can add a part in the doc for this.

asyncLocal.set(undefined)

Ack, I probably missed it in the doc

Missing points:

AsyncContext propagates it's built in mutable store which is accessible in whole async tree created.
AsyncLocal uses copy on write semantics resulting in branch of parts of the tree by setting a new value. Mutation of the value (e.g. if it is a Map/Object) is not triggering cow.

Good catch. I will add a part with this. Do you mind if I used thie quote directly?

Path

AsyncContext uses trigger path, AsyncLocal uses execution path (Main problem is anyway that neither is the "always correct" approach)

I am not sure all TSC members are clear on this. Do uou want to either:

add a paragraph here
comment in the TSC issue when I create it
?
Next steps

I would mention also that adding this to core could improve the situation with user land modules not using AsyncResource.
Would this work?
Having such API in core will make context tracking standard and simpler for all ecosystem.

Maybe it's out of scope for this discussion but I think we should also aim to get AsyncResource out of experimental - just keep the low level hooks experimental.

This one is more on @Qard who is championing the experimental status of Async Hooks.

puzpuzpuz commented Jan 24, 2020

@vdeturckheim

regarding context propagation (set(undefined)) -> I did not see it in in the PR, did I miss it?

I'm talking about calling asyncLocal.set with undefined as the input value.

I think we should still highlight that the following calls to set will throw right? Can you update the paragraph with this?

The code snippet in the document already mentions this behavior. I don't think we need to describe it twice.

Author

vdeturckheim commented Jan 24, 2020

I'm talking about calling asyncLocal.set with undefined as the input value.

I meant in the documentation part of the PR ^^

puzpuzpuz commented Jan 24, 2020

@Flarna

Maybe add a link to .NET AsyncLocal to let TSC know that this name is already used for something similar but with significant differences.

Here is the link: https://docs.microsoft.com/en-us/dotnet/api/system.threading.asynclocal-1?view=netframework-4.8

What .NET has is quite similar to what's implemented in #31016 (referred as AsyncLocal in the document). What are those significant differences that you see?

I would mention also that adding this to core could improve the situation with user land modules not using AsyncResource.

Yes, that's one of the major reasons for having CLS API in core. We definitely should include it into the document.

AsyncContext uses trigger path, AsyncLocal uses execution path

As far as I understand, the plan is to migrate AsyncContext to execution path once #30959 lands. But it may be a good idea to mention the current difference.

puzpuzpuz commented Jan 24, 2020

I meant in the documentation part of the PR ^^

The PR didn't go through review, so there are some things that might be improved.

Author

vdeturckheim commented Jan 24, 2020

I would mention also that adding this to core could improve the situation with user land modules not using AsyncResource.

Yes, that's one of the major reasons for having CLS API in core. We definitely should include it into the document.

On second tought, this has already been discussed and decided by the TSC (see consensus link)

puzpuzpuz commented Jan 24, 2020

@vdeturckheim

Thanks for the updates.

With AsyncLocal, propagation is stopped byt calling set(undefined)

A couple of minor misprints here. Should be:
With AsyncLocal, propagation is stopped by calling set(undefined).

Flarna commented Jan 24, 2020

What .NET has is quite similar to what's implemented in #31016 (referred as AsyncLocal in the document). What are those significant differences that you see?

This was a comment for the general naming paragraph not specific to your implementation. I agree that your variant is close to .NET so name would be ok for me. But I think renaming the current AsyncContext API to AsyncLocal would be a bad match. My feeling is that naming may be discussed independent of implementation selected.

In general I like APIs like domain.run(fn) or asyncResource.runInAsyncScope() which avoid the need to exception safe match enter/exit calls. If enter/exit are exposed additionally it's should be documented that run is preferred.
This is how I see it going eventually

Regarding the order: this is my intended goal here, a call to run is asynchronous and should be handled as a call to a database or a nextTick. Do you think the document should highlight this more? One of the goal here is actually to avoid this domain behavior that created issues for a lot of users.

Yes, I think this should be more highlighted. At least for me it's not intuitive that wrapping a function in a runWith(cb) like API changes the program flow and not just the context seen within the callback.

Regarding exceptions handling, it would still be possible to add an exception handler as we do in the current implementation of domain, this would result in "cleaner" domains.

Maybe but I fear the complexity...
It's not only about exceptions, also the return value is "lost". Or consider that doSomething above would be an async function the ability to await it gets lost.
Most likely this can be improved in followup work and it's not up to this discussion to agree on this. My point is that this is an important difference that should be clearly noted.

Good catch. I will add a part with this. Do you mind if I used thie quote directly?

Sure, feel free to quote.

AsyncContext uses trigger path, AsyncLocal uses execution path (Main problem is anyway that neither is the "always correct" approach)

I am not sure all TSC members are clear on this.

Thinking once more about this and most likely it's better to skip this for the current discussion.

Author

vdeturckheim commented Jan 24, 2020 •

edited

Loading

@puzpuzpuz @Flarna

I updated the gist based on the 3 last comments.

Author

vdeturckheim commented Jan 24, 2020

The document is already 4 pages long. We probably should not make it any longer at this point

Flarna commented Jan 24, 2020 •

edited

Loading

Mutation of the value (e.g. if it is a Map/Object) will not impact other branches.

This is actually not correct, only setting a new value will impact other branches. What about Only mutation of the value (e.g. changing/setting a Map entry) will not branch off.?

Author

vdeturckheim commented Jan 24, 2020

@Flarna, I updated but I am not sure this is entierly clear for a non-expert reader.

Flarna commented Jan 24, 2020

Thanks!
Yes, I agree that not everything is easy to understand here. It's mostly because the topic itself is complex. This is intended as input for TSC members which are for sure no newbies.

puzpuzpuz commented Jan 24, 2020

@vdeturckheim

Eventually, a synchronous API could be added to AsyncContext when the executionAsyncResource rebase id done.

A typo: id done -> is done.

Author

vdeturckheim commented Jan 24, 2020 •

edited

Loading

If the document works for you in this state, I will submit it tonight or tomorrow and ping you in the issue.

Flarna commented Jan 24, 2020

Fine for me.

vdeturckheim/comparing_async_local_context.md

Executive summary: Introducing a CLS-like API to Node.js core

Context: the 3 PRs and 1 consensus

Naming

Interfaces

Synchronous vs. Asynchronous API

Stopping propagation

Disabling

Store type

Store mutability

Overall philosophy

Next steps

puzpuzpuz commented Jan 24, 2020

Flarna commented Jan 24, 2020

vdeturckheim commented Jan 24, 2020

puzpuzpuz commented Jan 24, 2020

vdeturckheim commented Jan 24, 2020

puzpuzpuz commented Jan 24, 2020

puzpuzpuz commented Jan 24, 2020

vdeturckheim commented Jan 24, 2020

puzpuzpuz commented Jan 24, 2020

Flarna commented Jan 24, 2020

vdeturckheim commented Jan 24, 2020 • edited Loading

vdeturckheim commented Jan 24, 2020

Flarna commented Jan 24, 2020 • edited Loading

vdeturckheim commented Jan 24, 2020

Flarna commented Jan 24, 2020

puzpuzpuz commented Jan 24, 2020

vdeturckheim commented Jan 24, 2020 • edited Loading

Flarna commented Jan 24, 2020

vdeturckheim commented Jan 24, 2020 •

edited

Loading

Flarna commented Jan 24, 2020 •

edited

Loading

vdeturckheim commented Jan 24, 2020 •

edited

Loading