Capy and TooManyCooks: A Comparison

You want to write async code in C++. You’ve heard about coroutines. Two libraries are on your shortlist: Capy and TooManyCooks (TMC). Both let you write co_await. Both run on multiple threads.

One was designed for network I/O. The other was designed for compute tasks. Choosing the wrong one creates friction. This document helps you choose.

The Simple Version

Capy:

  • Built for waiting on things (network, files, timers)

  • When data arrives, your code wakes up in the right place automatically

  • Cancellation works - if you stop waiting, pending operations stop too

  • Handles data buffers natively - the bytes flowing through your program

TMC:

  • Built for doing things (calculations, parallel work)

  • Multi-threaded work pool that keeps CPUs busy

  • Priority levels so important work runs first (16 of them, to be precise)

  • Mid-coroutine executor switching for flexible work migration

  • No built-in I/O - you add that separately (via Asio integration)

If you’re building a network server, one of these is swimming upstream.

On priorities: Capy defines executors using a concept. Nothing stops you from implementing a priority-enforcing executor. You could have 24 priority levels, if 16 somehow felt insufficient.
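
As a hedged sketch only: the wrapper below orders submitted work by priority before handing it to an underlying executor. The submit/drain names, the inner.dispatch() member, and the absence of thread safety are illustrative assumptions, not Capy’s executor concept or API.

#include <functional>
#include <map>
#include <utility>

// Hypothetical sketch of a priority-enforcing executor wrapper. None of the
// names below come from Capy; thread safety is omitted for brevity.
template <class Executor>
struct priority_executor
{
    Executor inner;

    // Work keyed by priority; lower key = more urgent in this sketch.
    std::multimap<int, std::function<void()>> pending;

    void submit(int priority, std::function<void()> work)
    {
        pending.emplace(priority, std::move(work));
    }

    // Drain in priority order, forwarding each item to the wrapped executor.
    void drain()
    {
        for (auto& [priority, work] : pending)
            inner.dispatch(std::move(work));   // assumed member on the wrapped executor
        pending.clear();
    }
};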

Where Does Your Code Run?

When async code finishes waiting, it needs to resume somewhere. Where?

Capy’s answer: The same place it started. Automatically.

  • Context flows forward through await_suspend(h, ex, token) parameters

  • Your coroutine started on executor X? It resumes on executor X.

  • Child tasks can run on different executors via run(other_ex)(child_task())

TMC’s answer: Where you tell it, with flexibility to change mid-execution.

  • Context flows via tmc::detail::awaitable_traits - a traits-based injection mechanism

  • Thread-local variables track the current executor for quick access

  • Coroutines can hop executors mid-body via resume_on() and enter()/exit()

  • Works fine within TMC’s ecosystem; integrating external I/O requires the coordination headers (ex_asio.hpp, aw_asio.hpp)

Both libraries propagate executor context. They differ in mechanism and mobility.

Executor Mobility

TMC allows a coroutine to switch executors mid-body:

tmc::task<void> example() {
    // Running on executor A
    co_await tmc::resume_on(executor_b);
    // Now running on executor B - same coroutine!

    // Or scoped:
    auto scope = co_await tmc::enter(io_exec);
    // Temporarily on io_exec
    co_await scope.exit();
    // Back to original
}

This is powerful for compute workloads where work can migrate between thread pools.

Capy’s design choice: Intentionally prevent mid-coroutine executor switching. A coroutine stays on its bound executor for its entire lifetime. Child tasks can run on different executors via run(other_ex)(child_task()), but the parent never moves.
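
A hedged sketch of the shape this takes, using the run(other_ex)(child_task()) form quoted above. The task type, the helper coroutines, and awaiting the child’s result directly are illustrative assumptions rather than Capy’s documented API.

// Illustrative only: the parent coroutine never leaves the executor it
// started on; only the child runs elsewhere. Helper names are assumptions.
task<void> handle_request(any_stream& stream, executor_ref cpu_exec)
{
    // Parent: bound to its starting executor (e.g. the I/O executor).
    auto request = co_await read_request(stream);           // assumed helper

    // Child: runs on cpu_exec; the parent suspends here and resumes on its
    // own executor once the child completes.
    auto response = co_await run(cpu_exec)(build_response(request));

    co_await write_response(stream, response);              // assumed helper
}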

Why Capy prevents this: I/O objects often have invariants tied to their executor:

  • A socket may only be accessed from threads associated with a specific io_context

  • File handles on Windows IOCP must complete on the same context they were initiated on

  • Timer state is executor-specific

Allowing a coroutine holding I/O objects to hop executors mid-body would break these invariants. TMC doesn’t face this constraint because it’s a compute scheduler - work items don’t carry I/O state with executor affinity.

Stopping Things

What happens when you need to cancel an operation?

Capy: Stop tokens propagate automatically through the call chain.

  • Cancel at the top, everything below receives the signal

  • Pending I/O operations cancel at the OS level (CancelIoEx, IORING_OP_ASYNC_CANCEL)

  • Clean shutdown, no leaked resources

TMC: You manage cancellation yourself.

  • Stop tokens exist in C++20 but TMC doesn’t propagate them automatically

  • This is intentional: TMC is designed to work with various external libraries

  • Pending work completes, or you wait for it

The TMC author acknowledged that automatic cancellation propagation is an "excellent killer feature" for an integrated I/O stack like Capy.
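
For reference, the standard mechanism both libraries build on (or leave to the caller) is std::stop_token. The snippet below is plain C++20, not either library’s API: the caller signals, and the work must cooperate.

#include <chrono>
#include <stop_token>
#include <thread>

int main()
{
    // std::jthread passes a std::stop_token to its body and requests stop on join.
    std::jthread worker([](std::stop_token token) {
        while (!token.stop_requested()) {
            // ... do one slice of work, then check the token again ...
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        }
        // Clean up and return once cancellation has been requested.
    });

    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    worker.request_stop();   // cooperative: a request, not an interruption
}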

Keeping Things Orderly

Both libraries support multi-threaded execution. Sometimes you need guarantees: "these operations must not overlap."

Capy’s strand:

  • Wraps any executor

  • Coroutines dispatched through a strand never run concurrently

  • Even if one suspends (waits for I/O), ordering is preserved

  • When you resume, the world is as you left it

TMC’s ex_braid:

  • Also serializes execution

  • But: when a coroutine suspends, the lock is released

  • Another coroutine may enter and begin executing

  • When you resume, the state may have changed

TMC’s documentation describes this as "optimized for higher throughput with many serialized tasks." This is a design choice. Whether it matches your mental model is a separate question.

Neither library prevents the caller from initiating multiple concurrent I/O operations on the same object - that’s always the caller’s responsibility. Both provide mutual exclusion for coroutine/handler execution only, not I/O operation queuing.
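
To make the behavioral difference concrete, consider a serialized coroutine that reads shared state before a suspension point and writes it afterwards. The task type and fetch_value() helper below are illustrative, not either library’s API.

// Illustrative only: shared state touched on both sides of a suspension.
task<void> update(shared_state& s)
{
    int before = s.counter;        // read while holding the serialization

    co_await fetch_value(s);       // suspends while I/O is pending

    // Under Capy's strand, no other serialized coroutine ran in between, so
    // `before` still reflects the current value.
    // Under TMC's ex_braid, the braid lock was released at the suspension
    // point, so another coroutine may have changed s.counter already.
    s.counter = before + 1;
}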

Working with Data

Network code moves bytes around. A lot of bytes. Efficiently.

Capy provides:

  • Buffer sequences (scatter/gather I/O without copying)

  • Algorithms: slice, copy, concatenate, consume

  • Dynamic buffers that grow as needed

  • Type-erased streams: write code once, use with any stream type

TMC provides:

  • Nothing. TMC is not an I/O library.

  • You use Asio’s buffers through the integration layer.

Memory Allocation Control

HALO (Heap Allocation Lowering Optimization) lets compilers eliminate coroutine frame allocations when the frame’s lifetime doesn’t escape the caller. But I/O operations always escape - the awaitable must live until the kernel/reactor completes the operation.

Capy provides:

  • Custom allocator propagation via run_async(ex, allocator) and run(allocator)

  • Per-connection arena allocation

  • Memory isolation between connections

  • Instant reclamation on connection close

std::pmr::monotonic_buffer_resource arena;
run_async(ex, &arena)(handle_connection(socket));
// On disconnect: entire arena reclaimed instantly

TMC provides:

  • Global ::operator new (with cache-line padding)

  • Recommends tcmalloc for improved performance

  • No per-operation allocator control

For I/O workloads where HALO cannot apply, allocator control is essential, not optional.
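
For readers new to std::pmr, the reclamation behavior the run_async(ex, &arena) snippet above relies on is standard-library behavior: a monotonic_buffer_resource hands out memory from a region and releases it all at once. The demo below is plain standard C++, independent of Capy.

#include <memory_resource>
#include <string>
#include <vector>

void demo()
{
    std::pmr::monotonic_buffer_resource arena;   // grows as needed, never frees per object

    {
        std::pmr::vector<std::pmr::string> lines(&arena);
        lines.emplace_back("allocated from the arena");
        lines.emplace_back("so is this one");
    }   // destructors run, but the memory stays inside the arena

    arena.release();   // the whole region is reclaimed in one step
}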

Getting Technical: The IoAwaitable Protocol

When you write co_await something, what happens?

Standard C++20:

void await_suspend(std::coroutine_handle<> h);
// or
bool await_suspend(std::coroutine_handle<> h);
// or
std::coroutine_handle<> await_suspend(std::coroutine_handle<> h);

The awaitable receives a handle to resume. That’s all. No information about where to resume, no cancellation mechanism.
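
As a plain-C++20 baseline (not either library’s code), a minimal awaitable looks like this. Note that await_suspend sees nothing but the handle.

#include <coroutine>

// Minimal standard awaitable: the only context available is the handle.
struct baseline_awaitable
{
    bool await_ready() const noexcept { return false; }

    void await_suspend(std::coroutine_handle<> h)
    {
        // Where to resume, and whether the wait was cancelled, must be
        // communicated through some side channel - the protocol offers nothing.
        h.resume();   // this toy example just resumes immediately
    }

    void await_resume() const noexcept {}
};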

Capy extends this:

auto await_suspend(coro h, executor_ref ex, std::stop_token token);

The awaitable receives:

  • h - The handle (for resumption)

  • ex - The executor (where to resume)

  • token - A stop token (for cancellation)
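
A hedged sketch of an awaitable implementing this extended signature. Only the three-parameter await_suspend mirrors the protocol above; executor_ref is the library’s type as described, and everything else (members, helpers, the read semantics) is an assumption for illustration.

#include <coroutine>
#include <cstddef>
#include <stop_token>

// Illustrative sketch only; the body and helper names are assumptions.
struct read_awaitable
{
    bool await_ready() const noexcept { return false; }

    void await_suspend(std::coroutine_handle<> h, executor_ref ex, std::stop_token token)
    {
        handle_ = h;                 // where to resume
        executor_ = ex;              // which executor to resume on
        token_ = std::move(token);   // observed for cancellation (e.g. via std::stop_callback)

        start_pending_io();          // assumed helper: register the operation with the reactor
    }

    std::size_t await_resume() const { return bytes_transferred_; }

private:
    void start_pending_io();         // assumed: issues the syscall / arms the reactor

    std::coroutine_handle<> handle_;
    executor_ref executor_;          // executor_ref as described in the protocol above
    std::stop_token token_;
    std::size_t bytes_transferred_ = 0;
};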

TMC’s approach:

Standard signature, plus traits-based context injection:

// TMC propagates context via awaitable_traits<T>
awaitable_traits<T>::set_continuation(awaitable, continuation);
awaitable_traits<T>::set_continuation_executor(awaitable, executor);

TMC also tracks this_thread::executor and this_task.prio in thread-local variables for quick access.

Both approaches achieve context propagation. Neither is compatible with arbitrary third-party awaitables without explicit support.

Protocol Strictness

What happens when you co_await an awaitable that doesn’t implement the extended protocol?

Capy: Compile-time error.

// From task.hpp transform_awaitable()
else
{
    static_assert(sizeof(A) == 0, "requires IoAwaitable");
}

TMC: Wrap in a trampoline that captures current context.

// From task.hpp await_transform()
return tmc::detail::safe_wrap(std::forward<Awaitable>(awaitable));

Trade-offs:

| Aspect | Capy | TMC |
| --- | --- | --- |
| Unknown awaitables | Compilation failure | safe_wrap() trampoline |
| Context propagation | Required by protocol | Lost for wrapped awaitables |
| Integration flexibility | Requires protocol adoption | More permissive interop |

Capy makes the conscious decision that silent degradation is worse than compilation failure. If an awaitable doesn’t carry context forward, the code doesn’t compile. This prevents subtle bugs where cancellation or executor affinity silently stops working.

TMC’s approach is more flexible for incremental adoption but risks silent context loss when mixing TMC with non-TMC awaitables.

Integration Approaches

| Aspect | TMC | Capy |
| --- | --- | --- |
| External adapter | Traits specialization (non-intrusive) | Member function (intrusive) |
| Unknown awaitables | safe_wrap() trampoline | static_assert failure |
| Context mechanism | Traits + TLS capture | Parameter passing |

Both require explicit support from awaitables. TMC’s traits are external specializations, making it theoretically easier to build adapters for third-party libraries without modifying them. Capy’s member function signature requires the awaitable itself to implement the protocol.

Practically, both require cooperation from awaitable authors for full functionality.
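
A hedged sketch of what such a non-intrusive adapter might look like, built on the set_continuation/set_continuation_executor hooks quoted earlier. The third_party::op type, its fields, and the exact parameter types of the traits functions are assumptions, not TMC’s actual interface.

// Illustrative sketch only: adapting a hypothetical third-party awaitable to
// TMC by specializing its traits externally. Field and parameter types are
// assumptions made for illustration.
namespace tmc::detail {

template <>
struct awaitable_traits<third_party::op>
{
    static void set_continuation(third_party::op& aw, std::coroutine_handle<> h)
    {
        aw.continuation = h;               // assumed field on the third-party type
    }

    static void set_continuation_executor(third_party::op& aw, tmc::ex_any* ex)
    {
        aw.continuation_executor = ex;     // assumed field on the third-party type
    }
};

} // namespace tmc::detail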

I/O Performance: Native vs Integration

TMC integrates with Asio via aw_asio.hpp/ex_asio.hpp. Corosio provides native I/O objects built on Capy’s protocol.

TMC + Asio call chain for socket.async_read_some(buf, tmc::aw_asio):

  1. async_result<aw_asio_t>::initiate() - creates awaitable, stores initiation + args in std::tuple

  2. operator co_await() returns aw_asio_impl

  3. await_suspend() calls async_initiate() → initiate_await(callback) - virtual call

  4. std::apply unpacks tuple, invokes Asio initiation

  5. Asio type-erases handler into internal storage

  6. On completion: callback stores result, calls resume_continuation()

  7. resume_continuation() checks executor/priority, posts if different

Corosio native call chain for socket.read_some(buf):

  1. Returns read_some_awaitable (stack object)

  2. await_suspend(h, ex, token) calls impl_.read_some() - virtual call to platform impl

  3. Platform impl issues direct syscall (recv/WSARecv)

  4. Registers with reactor

  5. On completion: ex.dispatch(h) - inline resume when on io_context executor

Overhead comparison:

| Aspect | TMC + Asio | Corosio Native |
| --- | --- | --- |
| Virtual calls | 1 (initiate_await) | 1 (platform impl) |
| Type erasure | Asio handler + ex_any | executor_ref only |
| Tuple packing | Yes (init args) | No |
| Handler storage | Asio internal (likely heap) | Operation slot in socket |
| Completion dispatch | Checks executor/priority, posts if different | dispatch() call, inline resume on io_context |
| Lambda wrapper | Yes (ex_asio::post) | No |

The critical-path difference is in completion handling. TMC+Asio goes through resume_continuation(), which checks executor/priority and often posts via asio::post(). Corosio’s dispatch() can resume the coroutine inline when already on the io_context executor, avoiding the post overhead.
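
The dispatch-versus-post distinction in generic form (names are not from either library’s source): dispatch may run the continuation inline when the caller is already on the right thread, while post always takes a queue round-trip.

#include <coroutine>

// Generic illustration of dispatch vs post; not either library's code.
struct toy_executor
{
    bool running_in_this_thread() const;        // assumed query
    void enqueue(std::coroutine_handle<> h);    // assumed: push onto the run queue

    // post: always defer - one queue round-trip per completion.
    void post(std::coroutine_handle<> h) { enqueue(h); }

    // dispatch: resume inline when already on this executor's thread,
    // skipping the queue round-trip on the completion hot path.
    void dispatch(std::coroutine_handle<> h)
    {
        if (running_in_this_thread())
            h.resume();
        else
            enqueue(h);
    }
};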

Type Erasure

Capy:

  • any_stream, any_read_stream, any_write_stream

  • Write a function taking any_stream& - it compiles once (see the sketch after this list)

  • One virtual call per I/O operation

  • Clean ABI boundaries
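
The "compiles once" point, sketched below. any_stream is the documented name; the read_some member, the buffer type, the task type, and the end-of-stream convention are assumptions for illustration.

// Illustrative sketch: one non-template function usable with any wrapped stream.
task<std::size_t> drain(any_stream& stream, mutable_buffer scratch)
{
    std::size_t total = 0;
    for (;;) {
        // One virtual call per I/O operation through the type-erased stream.
        std::size_t n = co_await stream.read_some(scratch);
        if (n == 0)
            break;      // assumed end-of-stream convention
        total += n;
    }
    co_return total;
}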

TMC:

  • Traits-based: executor_traits<T> specializations

  • Type-erased executor: ex_any (function pointers, not virtuals)

  • No stream abstractions (not an I/O library)

Different Positions in the Tree of Need

TMC and Capy occupy different architectural positions. Rather than competing, they serve different needs:

TMC sits above I/O:

  • Compute scheduler designed for CPU-bound parallel work

  • Integrates with existing I/O solutions (Asio)

  • Flexible executor mobility for work migration

  • Permissive interop via safe_wrap() for gradual adoption

Capy sits below compute:

  • I/O foundation designed for network/file operations

  • Strict protocol enforcement prevents silent failures

  • Executor stability protects I/O object invariants

  • Allocator control where HALO cannot apply

Neither is "more fundamental." If you’re building a network server, Capy’s constraints exist to protect you. If you’re parallelizing CPU work, TMC’s flexibility is valuable.

Corosio: Proof It Works

Capy is a foundation. Corosio builds real networking on it:

  • TCP sockets, acceptors

  • TLS streams (WolfSSL)

  • Timers, DNS resolution, signal handling

  • Native backends: IOCP (Windows), epoll (Linux), io_uring (planned)

All built on Capy’s IoAwaitable protocol. Coroutines only. No callbacks.

When to Use Each

Choose TMC if:

  • CPU-bound parallel algorithms

  • Compute workloads needing work-stealing or priority scheduling (1-16 levels)

  • Work that benefits from mid-coroutine executor migration

  • You’re already using Asio and want a scheduler on top

  • Gradual adoption with mixed awaitable sources

Choose Capy if:

  • Network servers or clients

  • Protocol implementations

  • I/O-bound workloads

  • You want cancellation that propagates automatically

  • You want buffers and streams as first-class concepts

  • You need per-connection allocator control

  • You prefer strict compile-time protocol enforcement

Or use both:

TMC for compute scheduling, Capy/Corosio for I/O. They can coexist at different layers of your application.

Summary

| Aspect | Capy | TooManyCooks |
| --- | --- | --- |
| Primary purpose | I/O foundation | Compute scheduling |
| Threading | Multi-threaded (thread_pool) | Multi-threaded (work-stealing) |
| Executor mobility | Fixed per coroutine | Mid-body switching (resume_on) |
| Serialization | strand (ordering preserved across suspend) | ex_braid (lock released on suspend) |
| Context propagation | await_suspend parameters | awaitable_traits + TLS |
| Unknown awaitables | static_assert failure | safe_wrap() trampoline |
| Cancellation | Automatic propagation | Manual |
| Allocator control | Per-task (std::pmr) | Global (::operator new) |
| Buffer sequences | Yes | No (use Asio) |
| Stream concepts | Yes (ReadStream, WriteStream, etc.) | No |
| Type-erased streams | Yes (any_stream) | No |
| I/O support | Via Corosio (native IOCP/epoll/io_uring) | Via Asio integration headers |
| Priority scheduling | Implement your own | Built-in (1-16 levels) |
| Work-stealing | No | Yes |
| Executor model | Concept-based (user-extensible) | Traits-based (executor_traits<T>) |

Revision History

| Date | Changes |
| --- | --- |
| 2026-02-04 | Revised to correct inaccuracies regarding TMC’s context propagation mechanism. The author of TooManyCooks provided feedback clarifying that TMC implements executor affinity via tmc::detail::awaitable_traits, not just thread-local state. Reframed comparison to acknowledge both libraries as complementary solutions for different architectural positions rather than competitors. |