WebSocket: LocalProtocolError: Pong … cannot be sent in state LOCAL_CLOSING after remote reset (WinError 10054) causes actor cascade & zombie cleanup #47
Summary
On Windows, the Binance data feed occasionally drops with WinError 10054 (remote reset). During shutdown, the WebSocket stack (wsproto/trio-websocket) tries to send a Pong while the connection is already closing, raising LocalProtocolError. The feed task fails; internal actor/IPC channels are torn down; the supervisor cleans up “zombie” sub-actors. Reconnect then races with close, resulting in repeated failures.
Environment
OS: Windows 10/11 (x64)
Runtime: Python 3.12.x (venv)
Packages (observed): piker, trio, trio-websocket, wsproto, tractor
Endpoint: wss://stream.binance.com:9443 (public stream)
Steps to Reproduce
Start pikerd with Binance feed enabled (chart/EMS optional).
Let it run in a typical Windows network environment (Wi-Fi/VPN/consumer firewall).
Trigger or wait for a transient network hiccup (remote RST).
Observe the reconnect loop and the eventual crash.
Expected Behavior
On remote close/reset: stop ping/pong traffic, finish the close handshake where the transport still allows it, and reconnect with backoff, without raising LocalProtocolError or tearing down unrelated actors.
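The reconnect-with-backoff behavior described above could look roughly like the following. This is a minimal, synchronous sketch using only the standard library; `backoff_delays`, `run_with_reconnect`, the `session` callable, and all default values are hypothetical names and numbers chosen for illustration, not part of piker, tractor, or trio-websocket. It assumes `session` opens the socket, streams data, and fully closes the socket before it returns or raises, so a new attempt never overlaps the previous teardown.

```python
import random
from typing import Callable, Iterator


def backoff_delays(
    base: float = 0.5,
    factor: float = 2.0,
    cap: float = 30.0,
    jitter: Callable[[], float] = random.random,
) -> Iterator[float]:
    """Yield capped exponential delays with full jitter (hypothetical helper)."""
    ceiling = base
    while True:
        # Random delay in [0, ceiling] so clients do not retry in lockstep.
        yield jitter() * ceiling
        ceiling = min(cap, ceiling * factor)


def run_with_reconnect(
    session: Callable[[], None],
    sleep: Callable[[float], None],
    max_attempts: int = 5,
) -> int:
    """Run `session` until it returns cleanly, retrying on connection errors.

    Returns the number of attempts used. Only ConnectionError is retried;
    anything else propagates so real bugs are not hidden.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            session()
            return attempt
        except ConnectionError:
            # ConnectionResetError (WinError 10054) is a ConnectionError subclass.
            if attempt == max_attempts:
                raise
            sleep(next(delays))
    raise ValueError("max_attempts must be at least 1")
```

In a trio-based feed task the same loop would presumably be an `async` function that awaits `trio.sleep(delay)` instead of calling a blocking `sleep`; the delay schedule itself would be unchanged.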
Actual Behavior
Socket error: WinError 10054 (connection forcibly closed by remote host).
Immediately after, LocalProtocolError: Event Pong(…) cannot be sent in state ConnectionState.LOCAL_CLOSING.
Actor nursery cancellations and IPC channel closures; “zombie” processes cleaned by supervisor.
The reconnect attempt races with shutdown, leading to additional errors.
Representative Logs / Trace
Impact
Intermittent feed loss on Windows networks (Wi-Fi/VPN/middleboxes).
The crash cascades and requires a manual restart; chart/EMS sessions are interrupted.
Root Cause Analysis (RCA)
After a close/EOF (or remote RST) is detected, the local side appears to begin closing the WebSocket, which leaves the wsproto connection in LOCAL_CLOSING (the state named in the error).
A background ping task or on-ping callback still attempts to send Pong, which wsproto disallows during closing → LocalProtocolError.
The exception bubbles out of the feed task and cancels the actor nursery. Reconnect logic then competes with shutdown, compounding the failure.
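The failure mode and one possible guard can be illustrated with a toy model. The classes below are simplified stand-ins written for this sketch (`State`, `ProtocolError`, `FakeConnection`, and `reply_to_ping` are all hypothetical); they are not the wsproto or trio-websocket API. They only mirror the rule described above, that a Pong is rejected once the connection has left the open state.

```python
from enum import Enum, auto


class State(Enum):
    """Simplified stand-in for wsproto's ConnectionState (not the real class)."""
    OPEN = auto()
    LOCAL_CLOSING = auto()
    CLOSED = auto()


class ProtocolError(Exception):
    """Stand-in for wsproto's LocalProtocolError."""


class FakeConnection:
    """Toy model of the send rule: a Pong is only legal while OPEN."""

    def __init__(self) -> None:
        self.state = State.OPEN
        self.sent: list[str] = []

    def send_pong(self, payload: bytes) -> None:
        if self.state is not State.OPEN:
            raise ProtocolError(f"Pong cannot be sent in state {self.state}")
        self.sent.append(f"pong:{payload!r}")

    def start_close(self) -> None:
        # Models the local side initiating the close.
        self.state = State.LOCAL_CLOSING


def reply_to_ping(conn: FakeConnection, payload: bytes) -> bool:
    """Send a Pong only while the connection is open; report whether it was sent."""
    if conn.state is not State.OPEN:
        return False  # closing or closed: drop the reply instead of raising
    conn.send_pong(payload)
    return True
```

In this toy the check and the send run back to back. In real async code a state check followed by an awaited send could still race with a concurrent close, so catching the protocol error around the send and treating it as "connection already closing" is likely a useful complement to the check.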