We will use coroutines as a form of custom hyperthreading. We want to be able to deal with memory as if it were a remote location from which we are fetching data. When a coroutine issues a prefetch instruction, we want to switch to a different coroutine while the data is on its way.
int main () {
    event_loop ev(3);
    int val1 = 1;
    int val2 = 2;
    int val3 = 3;
    for (int i = 0; i < 6; i++) {
        ev.spawn<event_loop::base_task>([&] (int i) -> event_loop::base_task {
            int v;
            printf("[%d]Prefetching val1\n", i);
            v = co_await prefetch(val1);
            printf("[%d]in cache val1 = %d, prefetching val2\n", i, v);
            v = co_await prefetch(val2);
            printf("[%d]in cache val2 = %d, prefetching val3\n", i, v);
            v = co_await prefetch(val3);
            printf("[%d]in cache val3 = %d\n", i, v);
            co_return;
        }, i);
    }
    ev.join();
}
A call to co_await prefetch(...) issues a prefetch instruction and
immediately switches to another coroutine. After this has happened 3
times, control returns to the initial coroutine, which actually uses
the variable it prefetched. Hopefully the data has reached the L1
cache by then, so it is immediately available.
A coroutine is a function that returns a task object that has the
lifetime of the coroutine. The type of this object is used by the
compiler to find task::promise_type, which is expected to implement a
particular interface. A function
([]() -> task { /* body of the coroutine */ })()
is desugared into something similar to:
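task::promise_type p;
task t = p.get_return_object();   // handed to the caller at the first suspension
try {
    co_await p.initial_suspend(); // may pass control to the caller
    // body of the coroutine
    p.return_void();              // on co_return or on reaching the end of the body
} catch (...) {
    p.unhandled_exception();
}
co_await p.final_suspend();       // may pass control to the caller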
It is clear then that the task object must be of a type similar to
struct task {
    struct promise_type {
        // suspend_always: always suspends and does not produce a value.
        suspend_always initial_suspend() { return {}; }
        // suspend_never: never suspends and does not produce a value.
        suspend_never final_suspend() noexcept { return {}; }
        void return_void() {}
        // This actually builds the task
        task get_return_object() { return {}; }
        void unhandled_exception() {}
    };
};
Moving to the actual meat of the matter, the expression
co_await expr;
is desugared into something resembling
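auto&& awaitable = expr;  // simplified: the real rules may transform expr first
if (!awaitable.await_ready()) {
    awaitable.await_suspend(coro_handle);
    // <pass control to caller and wait for resume>
}
return awaitable.await_resume();  // becomes the value of the co_await expression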
Therefore the prefetch function must be something similar to
template<typename T>
auto prefetch(T& value) {
    return prefetch_awaitable<T, task_type::promise_type>{value};
}
Now we need to define the prefetch_awaitable object. As indicated, it
needs to implement await_ready, await_suspend and await_resume.
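template<typename T, typename P>
struct prefetch_awaitable {
    T& value;
    prefetch_awaitable(T& value) : value(value) {}
    ~prefetch_awaitable() {}
    // Always suspend so that another coroutine runs while the data is fetched.
    bool await_ready () { return false; }
    coroutine_handle<P> await_suspend (coroutine_handle<P> h) {
        // _mm_prefetch is declared in <xmmintrin.h> and takes a const char*
        // plus a locality hint; _MM_HINT_T0 requests all cache levels.
        _mm_prefetch(reinterpret_cast<const char*>(std::addressof(value)),
                     _MM_HINT_T0);
        P& promise = h.promise();
        assert(promise.owner);
        auto& sch = promise.owner->scheduler;
        // Queue ourselves and continue with the coroutine at the front.
        sch.push_back(h);
        return sch.pop_front();
    }
    T& await_resume () { return value; }
};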
Note here that await_suspend could optionally return void if we
can’t immediately come up with a coroutine to continue with. In
that case control is returned to the caller.
In the case of a yielding coroutine we want something like
gen_t f() {
    printf("Yield: 1\n");
    co_yield 1;
    printf("Yield: 2\n");
    co_yield 2;
    printf("Yield: 3\n");
    co_yield 3;
}
int main() {
    for (int i : f()) {
        printf("Iter: i = %d\n", i);
    }
}
This is similar to the Python notion of what a coroutine is. The
expression co_yield expr is equivalent to
co_await promise.yield_value(expr). In practice it means that the
gen_t task will be resumed not by an external source (like the
awaiting coroutine) but rather by an incrementing iterator.