Largely based on an explanation of Erlang/Elixir/BEAM’s concurrency model in The Soul of Erlang and Elixir

Prompt/example: a browser establishes a WebSocket (WS) connection with a server, and gives the server various tasks to perform over that connection.


  • The BEAM VM’s primary abstraction around a unit of execution is a process.
  • Processes are lightweight and share no state; all communication is done via message-passing.
  • Processes are multiplexed onto n OS threads (where n > 0).
  • Processes are pre-empted by the VM, so CPU-bound processes can’t monopolize execution, even on a single thread.
  • Pre-emption does not prevent overload, though: consumers can still flood the system with too many CPU-bound processes, and avoiding that needs application-level logic (as far as I know).

Rust / Tokio

  • The most basic abstraction here is a Future, which represents an asynchronous computation that may not have finished yet.
    • A Future can be pending or ready, and only makes progress when it is polled.
    • You can await a future, which:
      • Returns immediately if the future is ready
      • “Pauses” execution and returns control to the scheduler if the future is not ready.
  • Tokio’s abstraction around futures is the “task”: a future that has been handed to the runtime for scheduling.
    • Tasks are executed on a thread pool (typically one thread per core).
    • Tasks cannot be pre-empted arbitrarily, but Tokio can swap them out in three ways:
      • At await points: if the future/task is not ready, the scheduler regains control.
      • Every task is given a budget, and each operation on a Tokio resource (socket, channel, timer, and so on) decreases it.
        • When the budget hits zero, Tokio’s own resources return “pending” even if they are ready, to force the task to yield control.
        • As far as I can tell, this tracking is limited to Tokio’s own resources: code that never touches them (e.g. a pure CPU loop inside an async fn) does not consume budget and is never forced to yield.
      • The task can “cooperate” with the scheduler and yield control voluntarily at strategic points during execution (e.g. with tokio::task::yield_now().await).
  • Three approaches to architect the browser + WS prompt:
    1. The server’s WS handler could create a new task for each incoming computation, await it, and send down the response.
      • This effectively means that there’s no concurrency to be had within the context of a single connection.
    2. The server’s WS handler could directly call an async fn for each incoming computation and await it.
    3. A more robust approach is to have the WS handler create a new task for each incoming computation but not await it.
      • Instead, use an mpsc channel (or a similar mechanism) to let the connection handler know when the task is done, so it can send down a response.
      • The handler can then serve other incoming requests while tasks are executing.
      • If the task is CPU-bound, use spawn_blocking, which runs it on a separate, larger thread pool (up to 512 threads by default).
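
To make the pending/ready/poll bullets in the Future part of this section concrete, here is a minimal sketch that uses only the standard library. Countdown, NoopWake and poll_to_completion are hypothetical names invented for illustration, and the busy polling loop is a stand-in for a scheduler, not how Tokio actually drives tasks.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future: reports `Pending` a fixed number of times,
/// then becomes `Ready`.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            // Request another poll. A real future would instead hand the
            // waker to the event source it is waiting on.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for a loop that polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Stand-in for a scheduler: polls the future until it is ready and
/// returns its output plus the number of times it reported `Pending`.
fn poll_to_completion<F: Future>(future: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    let mut pending_polls = 0;
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return (output, pending_polls),
            Poll::Pending => pending_polls += 1,
        }
    }
}
```

Calling poll_to_completion(Countdown { remaining: 2 }) returns ("done", 2): two pending polls, then a ready one. Wrapping the same future in an async block and awaiting it inside behaves the same way, because each pending result from the inner future is passed up to whoever is polling the outer one.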
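
The shape of the third approach (start the work without waiting for it, and have it report back over a channel) can be sketched as follows. This is an analogy, not Tokio code: it uses std::thread and std::sync::mpsc as stand-ins so that it runs without external crates. In Tokio the same roles would be played by tokio::spawn (or tokio::task::spawn_blocking for CPU-bound work) and tokio::sync::mpsc. The names compute and handle_requests, and the (id, input) request shape, are assumptions made up for this example.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical CPU-bound computation requested by the browser.
fn compute(input: u64) -> u64 {
    (1..=input).sum()
}

/// Stand-in for the WS connection handler. It starts one worker per
/// request without waiting for it, then collects `(request_id, result)`
/// pairs in whatever order the workers finish.
fn handle_requests(requests: Vec<(u32, u64)>) -> Vec<(u32, u64)> {
    let (tx, rx) = mpsc::channel();

    for (id, input) in requests {
        let tx = tx.clone();
        // Fire and forget: the handler does not join or wait here.
        thread::spawn(move || {
            // Report completion back to the handler over the channel.
            let _ = tx.send((id, compute(input)));
        });
    }

    // Drop the handler's own sender so the receive loop ends once every
    // worker has sent its result and dropped its clone.
    drop(tx);

    // A real handler would forward each result to the browser as it
    // arrives, while also accepting new requests.
    rx.into_iter().collect()
}
```

Because results arrive in completion order rather than request order, each one carries its request id so the browser can match responses to requests.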