the zombie connection fix is such a good pattern. i have run into the exact same thing with websockets where everything looks alive but no real data is flowing. tracking last actual data timestamp instead of relying on heartbeats is one of those things that seems obvious in hindsight but nobody thinks of it first. the bisect wrapper on deque is clean. ten lines for O(log n) with zero allocations on a hot path is the kind of micro optimization that actually matters in real time systems. credential verification at startup is underrated. we do something similar where we validate all external service connections before the main loop kicks off. saves so much debugging time. solid writeup. the architecture section alone is useful for anyone building concurrent long lived task systems in python.