- Simple, modern networking broker
- Built-in support in libzrt
- http server
- nginx
- fastcgi
- libzrt
- 1 http frontend + 1 daemon backend
- factor some object-query functionality into this
- need good multiplexing file/stream transfer
- no discovery needed
- in the ZeroCloud case, we already know the cluster topology
- two main functions:
- register
- register(channel_id)
- unregister(channel_id)
- transfer
- send(ip:port, channel_id_src, channel_id_dest, data, size)
- recv(ip:port, channel_id_src, channel_id_dest, size) -> data
- channel_ids are just opaque identifiers, or a "label" for messages
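The two-function surface above can be sketched as a class. This is a hypothetical illustration of the shape of the interface, not the real libzrt API; all names and the in-memory dict are assumptions.

```python
class Broker:
    """Illustrative sketch of the broker's two-function interface:
    register/unregister a channel label, and send/recv over it."""

    def __init__(self):
        # channel_id -> single-slot mailbox (None means "empty")
        self.channels = {}

    def register(self, channel_id):
        # A channel_id is just an opaque label for messages.
        self.channels[channel_id] = None

    def unregister(self, channel_id):
        self.channels.pop(channel_id, None)

    def send(self, addr, channel_id_src, channel_id_dest, data):
        # addr is the (ip, port) of the remote broker; body elided here.
        ...

    def recv(self, addr, channel_id_src, channel_id_dest, size):
        ...
```

The transfer calls take the remote broker's address explicitly because, as noted below, there is no discovery: the topology is known up front.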
- register
- NOT a queue
- only contains 0..1 messages at a given time
- block receivers until message is available
- use case: MapReduce
- Mapper should not be able to fill memory with all mapped data if the reducer is not ready to consume it
- Nothing should happen until somebody wants to consume something
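The "NOT a queue" semantics above can be sketched with a condition variable: the channel holds at most one message, the sender blocks while the slot is full, and the receiver blocks until something arrives. A minimal sketch, with illustrative names:

```python
import threading

class SingleSlotChannel:
    """A channel that holds 0..1 messages; senders block while the slot
    is full, receivers block until a message is available."""

    def __init__(self):
        self._cond = threading.Condition()
        self._slot = None
        self._full = False

    def send(self, data):
        with self._cond:
            # Back-pressure: a mapper cannot run ahead and fill memory
            # if the reducer is not ready to consume.
            while self._full:
                self._cond.wait()
            self._slot = data
            self._full = True
            self._cond.notify_all()

    def recv(self):
        with self._cond:
            while not self._full:
                self._cond.wait()
            data, self._slot, self._full = self._slot, None, False
            self._cond.notify_all()
            return data
```

Because `send` blocks on a full slot, nothing happens until somebody consumes, which is exactly the MapReduce back-pressure property noted above.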
- Channels are UNIdirectional
- we are limited to this because of pipe semantics
- CSP supports bidirectional channels
- but it's not deterministic
- "select" operator
- "read from any channel in the list"
- data = select(list)
- not supported now by ZeroVM, but could be
- recv_any([ip_list], [channel_list])
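One way the select/recv_any idea could work (this is a sketch under assumptions, not anything ZeroVM supports today): each channel posts its id to a shared ready-queue when a message lands, and the selector blocks on that queue. All names here are illustrative.

```python
import queue
import threading

class Channel:
    """Single-slot channel that signals a shared ready-queue on send."""

    def __init__(self, chan_id, ready):
        self.id = chan_id
        self._msgs = queue.Queue(maxsize=1)  # 0..1 messages, as above
        self._ready = ready

    def send(self, data):
        self._msgs.put(data)      # blocks while the slot is full
        self._ready.put(self.id)  # wake anyone blocked in recv_any

    def recv(self):
        return self._msgs.get()   # blocks until a message arrives

def recv_any(ready, channels):
    """Block until any channel in the list has data; return (id, data)."""
    chan_id = ready.get()
    by_id = {ch.id: ch for ch in channels}
    return chan_id, by_id[chan_id].recv()
```

This keeps the "read from any channel in the list" behavior without polling, at the cost of sharing one ready-queue across the channels in the select set.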
- broker-to-broker communication
- http: PUT/GET /broker/channel_id
- register lets brokers learn about each other
- proxy-query sends the list of brokers and channels in each execution request
- the proxy knows this because it always has the complete cluster topology
- broker also passes the list of other brokers ip:port tuples to the ZeroVM process/thread
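Mapped onto the PUT/GET scheme above, a broker-to-broker transfer could look like the following. Only the URL layout (`/broker/channel_id`) comes from the note; the function names, headers, and error handling are assumptions.

```python
import http.client

def broker_send(ip, port, channel_id, data):
    """Push a message to a remote broker: PUT /broker/<channel_id>."""
    conn = http.client.HTTPConnection(ip, port)
    conn.request("PUT", f"/broker/{channel_id}", body=data)
    resp = conn.getresponse()
    resp.read()  # drain so the connection could be reused
    conn.close()
    return resp.status

def broker_recv(ip, port, channel_id):
    """Pull a message from a remote broker: GET /broker/<channel_id>."""
    conn = http.client.HTTPConnection(ip, port)
    conn.request("GET", f"/broker/{channel_id}")
    resp = conn.getresponse()
    data = resp.read()
    conn.close()
    return data
```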
- Interface for broker <--> ZeroVM communication
- register
- unregister
- send
- recv
- No routing
- ip:port for each remote end of channel is known on job start
- broker connects there directly, or reuses existing connection to that ip:port
- Might need multiple redundant connections
- Not for each ZeroVM instance, though
- (This would be a good way to run out of file descriptors.)
- For QoS
- Possibly one connection for each message-size class, e.g.:
- 64 bytes
- 128 bytes
- 512 bytes
- 1024 bytes
- > 1024 bytes
- needed so that long data transfers don't stall short ones
- Or, just "< 1024 bytes" and ">= 1024 bytes"
- 2 of each connection
- round-robin between them
- this is the typical strategy people implement to speed up page loads in HTTP
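The simpler two-class split with two round-robin connections per class could be sketched like this. The pool shape, the `connect` callable, and the 1024-byte threshold as the class boundary are assumptions for illustration.

```python
import itertools

SMALL, LARGE = "small", "large"

def size_class(n_bytes):
    """Two QoS classes so long transfers don't stall short messages."""
    return SMALL if n_bytes < 1024 else LARGE

class ConnectionPool:
    """Per-broker pool: N connections per size class, round-robin
    between them (the same trick browsers use to speed up page loads)."""

    def __init__(self, connect, per_class=2):
        # connect: callable returning a new connection to the remote broker
        self._pools = {
            cls: itertools.cycle([connect() for _ in range(per_class)])
            for cls in (SMALL, LARGE)
        }

    def pick(self, n_bytes):
        # Round-robin within this message's size class.
        return next(self._pools[size_class(n_bytes)])
```

Keeping the pool per remote broker rather than per ZeroVM instance matches the note above: a connection per instance would exhaust file descriptors.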
- Push vs. Pull
- We can decide based on message size
- Push:
- good for latency
- application calls write(fd, data)
- fd translated into channel_id
- channel_id translated to remote IP
- broker issues send
- when send arrives at the remote broker:
- reads channel_id
- translates it into ZeroVM process/thread ID
- if recv was issued, the send is accepted
- if not, tell the other party (sender) to wait/buffer
- Pull:
- good for long transfers
- only need a buffer with size "small_message" (see message sizes above)
- accepts first send() unconditionally
- or else a deadlock would happen often
- broker: I got a send()
- broker: Did I get a recv() for that?
- If not, buffer it.
- If no buffer, block it.
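The receiving broker's decision on an incoming send, as described in the push/pull notes above, can be sketched as a small state machine. The names, the return protocol, and the single small-message buffer size are illustrative assumptions.

```python
WAIT, ACCEPTED = "wait", "accepted"
SMALL_MESSAGE = 1024  # buffer at most one "small message" (see size classes)

class ChannelState:
    """Per-channel state on the receiving broker."""
    def __init__(self):
        self.recv_pending = False  # has the local ZeroVM issued recv()?
        self.buffer = None         # 0..1 buffered small messages

def on_send(chan, data):
    """Decide what to do when a send() arrives from a remote broker."""
    if chan.recv_pending:
        # A recv() was already issued: accept and deliver directly.
        chan.recv_pending = False
        return ACCEPTED, data
    if chan.buffer is None and len(data) <= SMALL_MESSAGE:
        # Accept the first small send unconditionally (avoids the
        # common deadlock noted above) by buffering it.
        chan.buffer = data
        return ACCEPTED, None
    # No recv pending and no buffer space: tell the sender to wait.
    return WAIT, None
```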
Other notes:
- ZeroVM must support sending channels over channels. This would fix David Holland's inter-instance communication problem without breaking determinism.
- Have a look at process calculi