Coroutines

Field	Value
DIP:	(number/id -- assigned by DIP Manager)
Author:	Richard (Rikki) Andrew Cattermole firstname@lastname.co.nz
Implementation:	(links to implementation PR if any)
Status:	Draft

Abstract

Stackless coroutines are a state machine representation of a function that enables acknowledging and blocking actions dependent on a function. When paired with advanced event loops such as IOCP it allows for a high throughput system to handle event handling.

Rationale
Prior Work
Description
Breaking Changes and Deprecations
Reference
Copyright & License
Reviews

Rationale

Implemented in druntime is a stackful coroutine a "Fiber". This has a high overhead and unclear yield points in the state machine of a function call stack. For this implementation to work it requires two distinct allocations with one acting as a guard page. This limits the number of fibers you may have in a single process well below the desired number of instances (around 32k).

Due to missing prevention of escaping of thread local resources, it is not possible to safely move a fiber to another thread currently. While this is solvable it is currently not implemented.

On the other hand, stackless coroutines which have been shown to not have the known downsides of stackful coroutines. But as a consequence of knowing the state machine requires language support.

Prior Work

Many languages have varying levels of support for stackless coroutines. They typically come under the guise of the async and await keywords. The concept of using the await keyword to wait for a condition to be signalled before continuing dates back to the 1973 paper Concurrent Programming Concepts.

In 1968 Dijkstra published Co-operating sequential processes which introduced the notion of parallel begin/end that later authors adopted by using the keywords cobegin and coend. In the C family, the begin and end markers for a block of code in a function are well defined with the help of braces, for the languages of the time they did not have such clear delineations at the function level and were instead treated as a scope statement.

With Rust, the async keyword produces a state machine in the form of a stackless coroutine that must be moveable between threads. Each coroutine is represented by a library type that they define called Future (a trait). From there you explicitly pick how to wait upon the Future to complete. Common options are to await, or to block_on which is a library function provided by the executor. Executors are a library concept that provides a thread pool to execute coroutine upon. It performs user mode scheduling and may be part of a system event loop.

Not all designs are equal when a coroutine language feature interacts with an event loop. Rust uses polling to determine if a future has been completed. This works quite well on Posix-based systems which do have a polling methodology that does not depend upon direct uploads of event handles each time you poll. However, Windows does not expose this functionality. This leads to Rust using internal undocumented behavior that tends to break, to enable their await feature to work. This behavior can be implemented alternatively with a condition variable which is a known possible solution within the D language.

In some languages, the coroutine is heavily tied to their standard library, for C# it is tied to its runtime's Task class. This is the nominal representation of a coroutine object. It can be cancelled and will be automatically created for you as long as you are annotated appropriately. Multiple returns are not supported and awaiting upon a task will result in the return value. There is protection built into the language against escaping memory.

Another approach that can be taken is to hide the task at the variable level by utilizing a syntax upon a variable declaration like with Swift's async let binding feature which was introduced by Swift 5.5.

Description

A coroutine is a function that has been converted into a descriptor "object", it need not be usable as a regular function after this has occurred. This descriptor object contains the state machine that represents the coroutine, it's state struct that contains variables that cross the yield divide and any input-output.

As part of this proposal, the consideration of what language features cannot be sliced into the state machine will not be considered. The intent is for the entire language to work as a coroutine, however this will likely not be the case. As a result, it is not in the scope of this proposal to attempt to resolve these issues. However, if a language feature is not supported in a coroutine it must be documented.

The language does not dictate which library coroutine types you must use. They can be any struct or class, this introduces some complications with templates as it results in the design being heavily designed towards introspection within the type system.

Do note it is out of the scope of this proposal to solve registration of coroutine functions automatically although the facilities on the coroutine side are provided.

Library coroutine types do have language integration, which enables non-coroutines to perform blocking waits upon a coroutine automatically.

Example Usage

Defining and returning values from your coroutine:

// acceptCo has one function parameter that accepts a coroutine descriptor

acceptCo((Param name, ...) {
	...

	// multiple returns!
	@async return value;

	// final return that completes the coroutine
	return value;
});

acceptCo(&myCo);

ReturnType myCo(Param name, ...) @async {
	...

	// multiple returns!
	@async return value;

	// final return that completes the coroutine
	return value;
}

Using a coroutine in your defined coroutine:

acceptCo(() {
	auto co = myCo(arg);
	// compiler injected yield
	return 2;
});

Using a coroutine in a function that is not a coroutine:

int main() {
	auto co = myCo(arg);
	// compiler injected blocking access
	return 0;
}

Syntactical

There are four language changes, an additional attribute and an operator overload, these are:

A new attribute is added to core.attributes (@isasync) that will enable a type when used as a UDA on a function to become a coroutine.

import core.attributes : isasync;

@isasync
struct Route {
	string path;
}

@Route("/")
void myRoute(Request, Response) {
	// This is a coroutine!
}

The four language changes are the introduction of a new attribute @async that designates a function as a coroutine.

Return myCo(Args) @async {
	return Return.init;
}

If @async is placed on a struct or class, this is designated as a library coroutine type. This is only required for non-instanceable types.

When a function is a coroutine is a method this pointer is treated as a regular function parameter that may be specified explicitly. For nested functions, the captured enclosure is treated the same although that would need further and explicit additions to make work.

The ability to check if a type is a coroutine descriptor object is(T : __descriptorco).

static foreach(m; __traits(allMembers, Context)) {
	static if (is(__traits(getMember, Context, m) : __descriptorco)) {
		pragma(msg, "Member ", m, " is a coroutine!");
	}
}

Or to verify that a template parameter is a coroutine descriptor object (T : __descriptorco, ...).

struct GenericCoroutine {
	this(CoroutineDescriptor : __descriptorco)(CoroutineDescriptor, ...);
}

Lastly, the introduction of multiple returns to enable generator coroutines @async return typeof(return).init;.

int myCo() @async {
	@async return 5;
	return 2;
}

Should a multiple return take place, the function encompassing it will infer that it is a coroutine.

The operator overload opConstructCo is an implicitly called static method, that when seen on a struct or class designates it as a library coroutine type.

struct InstantiableCoroutine(Return, Args...) {
	static InstiableCoroutine opConstructCo(CoroutineDescriptor : __descriptorco)(CoroutineDescriptor);
}

void acceptCo(InstantiableCoroutine!(void, Socket) co);

acceptCo((Socket socket) {
	...
});

It can also be used to construct a coroutine object using variable assignment or by returning a function with a typed coroutine library type on the outer function.

void main() {
	InstantiableCoroutine!(int, int, int) var = &myCo;
}

int myCo(int a, int b) @async {
	return a + b;
}

InstantiableCoroutine!(int, int, int) returnedCo() {
	return (int a, int b) {
		return a + b;
	};
}

The full grammar changes:

AtAttribute:
+    '@' "async"

TypeSpecialization:
+    "__descriptorco"

TemplateTypeParmeterSpecialization:
+    ':' "__descriptorco"

ReturnStatement:
+    '@' "async" "return" Expression|opt ';'

Template Inference

The following template inference must work:

struct InstantiableCoroutine(Return, Args...) {
	static InstiableCoroutine opConstructCo(CoroutineDescriptor : __descriptorco)(CoroutineDescriptor) {
		return InstatiableCoroutine.init;
	}
}

void acceptCo(Return, Arguments...)(InstatiableCoroutine!(Return, Arguments) co) {
	static assert(Arguments.length == 1);
	static assert(is(Arguments[0] == int));
	static assert(is(Return == string));
}

acceptCo((int value) {
	return "hello!";
});

This is described due to the introduction of the implicitly constructing operator overload function opConstructCo which may not be friendly towards template inference.

Coroutine Descriptor Type/Object

The coroutine descriptor object is accessed by "taking the pointer to" a coroutine function or by inferring it as part of a function call that takes a closure as an argument.

The tags:

0 .. int.max is reserved for functions in the state machine.
-1 is reserved for a completed coroutine.
-2 is reserved for an erroneously completed coroutine.

struct _ {
	alias ReturnType = ...;
	alias Parameters = ...;
	alias VarTypes = ...;
	sumtype ExceptionTypes = ...; // or just an alias if we don't have a sumtype
	sumtype WaitingOn = ...; // if no sumtypes this will have to be library

	static struct State {
		int tag;
		Parameters parameters;
		ExceptionTypes exception;
		
		WaitingOn waitingOnCoroutine;
		
		bool haveValue;
		ReturnType value;

		VarTypes vars;
	}

	alias Functions = [f1, f2];

	static void f1(ref State);
	static void f2(ref State);
}

Yielding

Yielding from a coroutine comes in four forms:

Completed with error.
Completed without error and may have a value.
Multiple returns of values.
Waiting on another coroutine to complete.

When a yield occurs with completion, this does not trigger a slicing of the function body. This can be implemented using a nominal function return after adjusting the state.

When multiple returns of values occur or waiting on another coroutine to complete this requires a slicing of the function body.

Like with synchronous support, yields of waiting for another coroutine to complete are batched.

acceptCo(() {
	auto co1 = ...;
	auto co2 = ...;
	// yield co1
	// yield co2
});

This is not always possible, unfortunately in some cases an object may be an argument to a gotten coroutine and used again in another, in which case batching cannot occur in full.

acceptCo(() {
	Socket so = ...;
	File f = ...;
	
	auto co1 = so.readUntil("\n");
	auto co2 = f.readUntil("\n");
	// yield co1
	// yield co2
	
	auto co3 = so.readUntil("\n");
	// yield co3
});

If the coroutine that is being yielded upon has the method asGeneric, it is called and the returned object is then what gets yielded. This is to minimize the number of library coroutine types that have to be supported as part of the coroutine scheduler. More importantly the "generic coroutine" is the most parent type that represents this coroutine object without the need for templates.

If a coroutine has a return type with multiple returns, it may indicate completed but without any value by using a return statement without an expression return;.

Synchronous Support

To support automatically blocking for completion or a value, may appear to be a valuable asset to a coroutine proposal, however, for the usage side of a coroutine such as a future, this would be true. If you consider the implementation side of an event loop or a data processing library, this will activate falsely and result in wrong behaviour.

At the time of this proposal, no integration into synchronous functions is proposed, although acknowledgement that integration would be a good addition for the future.

Such integration would include a standardised blocking function, detection of a completed block, and automatic calling into each.

Silently Creating Instances

Being able to call into a coroutine from a synchronous function enables unifying on a single paradigm. However, there are downsides to having full synchronous support. It can produce silent logic errors when event loops are at play.

Instead of performing an inline of the asynchronous function into the synchronous function and then performing the above synchronous integration, it is better to simply error out.

void main() {
	int var = someFunc(); // Error: `someFunc` is a coroutine, but has not been instantiated into an object so cannot be yielded upon.
}

int someFunc() @async {
	return 0;
}

If you did indeed want coroutine support, construct and make an instance of the coroutine.

void main() {
	InstantiatbleCoroutine!(int) descriptor = &someFunc;
	Future!int co = descriptor.makeInstance();
}

Example Descriptor Object

Earlier it was stated that we would not consider how slicing and dicing a function into a coroutine descriptor object would work. However, for completeness an example one is provided.

The example is a simple HTTP client, it is simplified down and ignores some types on the socket side for readability.

void clientCO(Socket socket) @async {
	writeln("Connection has been made");

	socket.write("GET / HTTP/1.1\r\n");
	socket.write("Accept-Encoding: identity\r\n");
	socket.write("\r\n");

	while(Future!string readLine = socket.readUntil("\n")) {
		if (!readLine.isComplete) {
			writeln("Not alive and did not get a result");
			return;
		}
		
		string result = readLine.result;
		writeln(result);

		if (result == "</html>") {
			writeln("Saw end of expected input");
			return;
		}
	}
}

The descriptor:

struct _ {
	alias ReturnType = void;
	alias Parameters = Socket;
	alias VarTypes = Tuple!(Future!string, readLine);
	sumtype ExceptionTypes = :None;
	sumtype WaitingOn = Future!string;

	static struct State {
		int tag;
		Parameters parameters;
		ExceptionTypes exception;
		
		WaitingOn waitingOnCoroutine;
		
		bool haveValue;
		ReturnType value;

		VarTypes vars;
	}

	alias Functions = [f1, f2];

	static void f1(ref State state) {
		writeln("Connection has been made");

		state.socket.write("GET / HTTP/1.1\r\n");
		state.socket.write("Accept-Encoding: identity\r\n");
		state.socket.write("\r\n");

		state.readLine = socket.readUntil("\n");
		state.tag = 1;
	}
	
	static void f2(ref State state) {
		if (!state.readLine.isComplete) {
			writeln("Not alive and did not get a result");
			state.tag = -1;
			return;
		}
		
		string result = state.readLine.result;
		writeln(result);

		if (result == "</html>") {
			writeln("Saw end of expected input");
			state.tag = -1;
			return;
		}
		
		state.readLine = state.socket.readUntil("\n");
		state.tag = 1;
	}
}

Example Prime Sieve

This example comes from a programming benchmark repository.

void main(string[] args) {
    int n = args.length < 2 ? 100 : to!int(args[1]);
    
    InstantiableCoroutine!(int) ico = &generate;
    Future!int ch = ico.makeInstance();
    ch.block;
    
    foreach(i; 0 .. n) {
        int prime = ch.result;
        writeln(prime);
        
        filter(ch, prime);
    }
}

int generate() @async {
    int i = 2;
    
    for(;;) {
        @async return i;
        
        i++;
    }
}

void filter(ref Future!int ch, int prime) {
    ch.block;
    
    while(ch.result && ch.result % prime == 0) {
	    ch.block;
    }
}

Breaking Changes and Deprecations

The attribute @async will break existing symbols that are called async. It may be desirable to limit this breakage solely to UDA's or to make accessing it only available in an edition.

Reference

Copyright & License

Licensed under Creative Commons Zero 1.0

History

The DIP Manager will supplement this section with links to forum discussions and a summary of the formal assessment.

rikkimax/DIP-coroutines.md