- Systems programming doesn’t necessarily have to be magical and opaque.
- “Cloud-native” abstractions have a cost; a modest, regular machine can go surprisingly far.
- It does not take “that much machine” to serve a fair number of clients.
What I have is a server which starts up and kicks off a “listener” thread to do nothing but watch for incoming traffic. When a new connection arrives, it spawns a “serviceworker” thread to handle it. The “listener” thread owns all of the file descriptors (listeners and clients both), and manages a single epoll set to watch over them.
When a client fd goes active, the listener pokes that serviceworker thread with a condvar, and it wakes up and does a read from the network. If there’s a complete and usable message, it dispatches it to my wacky little RPC situation which is sitting behind it. Then it pushes the result back out to the network and goes back to waiting for another wakeup.
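The shape of that listener/serviceworker handoff can be sketched roughly like this. The original is presumably native code; this is a minimal Python stand-in (Linux-only, since it uses `select.epoll`), and every name in it (`Conn`, `serve`, `poke`) is mine, not from the real server. A production version would likely use `EPOLLONESHOT` and re-arm after each read instead of this plain level-triggered setup.

```python
import select
import socket
import threading

class Conn:
    """One worker thread per connection, woken by the listener via a condvar."""
    def __init__(self, sock):
        self.sock = sock
        self.cv = threading.Condition()
        self.ready = False          # set by the listener when the fd is readable
        threading.Thread(target=self.run, daemon=True).start()

    def poke(self):
        # Called from the listener thread when epoll reports this fd active.
        with self.cv:
            self.ready = True
            self.cv.notify()

    def run(self):
        while True:
            with self.cv:
                while not self.ready:
                    self.cv.wait()
                self.ready = False
            data = self.sock.recv(4096)   # the fd is known to be readable
            if not data:
                self.sock.close()
                return
            # "Dispatch": here we just echo; the real server hands the
            # message to its RPC layer and writes the result back.
            self.sock.sendall(data)

def serve(listener):
    """Listener thread: owns every fd and the single epoll set."""
    conns = {}
    ep = select.epoll()
    ep.register(listener.fileno(), select.EPOLLIN)
    while True:
        for fd, _event in ep.poll():
            if fd == listener.fileno():
                sock, _addr = listener.accept()
                conns[sock.fileno()] = Conn(sock)
                ep.register(sock.fileno(), select.EPOLLIN)
            else:
                conns[fd].poke()
```

The key property is the one described above: the listener never reads or writes client sockets itself, it only watches fds and wakes the per-connection thread that does the actual I/O.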
If you’re keeping score, this means I spawn a whole OS-level thread every time I get a connection. I designed this thing around the notion of “what if we weren’t afraid to create threads”, and this is the result.
The server process has about 90 MB of RSS – that is, physical memory. That’s not bad for something with a hair over 2000 threads! Now, sure, the virtual size (VSZ) of the process is something like 20 GB, but who cares? That’s the address-space overhead of having that many thread stacks reserved, but the threads never touch most of those pages, so they never become physical memory. It works.
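A quick back-of-the-envelope check on where that virtual size comes from, assuming the common glibc default of an 8 MB stack reservation per thread (an assumption on my part; the real server may set its own stack size):

```python
threads = 2000
stack_mb = 8                      # assumed glibc default per-thread stack reservation
vsz_gb = threads * stack_mb / 1024
print(vsz_gb)                     # 15.625 GB of address space for stacks alone
```

That lands in the same ballpark as the ~20 GB observed, with the rest going to the heap, code, and other mappings.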
So, my dumb little box from 2011 can handle 10000 persistent connections each firing a request down the pipe 5 times a second to do something truly stupid with random numbers. It does this with maybe half a gig of physical memory used, and about 75% of the CPU power available. I haven’t done anything special in terms of balancing IRQs from the NIC, splitting up the accept/epoll/dispatch load, or pipelining requests. It’s about as stupid as you can imagine.
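For scale, the numbers in that paragraph work out like this (the “half a gig” figure is the author’s rough estimate, so the per-connection number is equally rough):

```python
conns = 10_000
reqs_per_sec_each = 5
total_rps = conns * reqs_per_sec_each
print(total_rps)                  # 50000 requests per second, aggregate

rss_bytes = 512 * 1024 * 1024     # "maybe half a gig" of physical memory
per_conn_kb = rss_bytes / conns / 1024
print(round(per_conn_kb, 1))      # ~52.4 KB of RSS per connection
```

50,000 requests per second and around 50 KB of resident memory per idle-ish connection, on commodity 2011 hardware, with no tuning at all.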
This is just one example of what can be done. It’s not magic. It’s just engineering.