tractor/ai/conc-anal
Gud Boi d0121960b9 Add `subint_forkserver` test-cancellation leak doc
New `ai/conc-anal/subint_forkserver_test_cancellation_leak_issue.md`
captures a descendant-leak surfaced while wiring
`subint_forkserver` into the full test matrix:
running `tests/test_cancellation.py` under
`--spawn-backend=subint_forkserver` reproducibly
leaks **exactly 5** `subint-forkserv` comm-named
child processes that survive session exit, each
holding a `LISTEN` on `:1616` (the tractor default
registry addr) — and therefore poisons every
subsequent test session that defaults to that addr.

Deats,
- TL;DR + ruled-out checks confirming the procs are
  ours (not piker / other tractor-embedding apps) —
  `/proc/$pid/cmdline` + cwd both resolve to this
  repo's `py314/` venv
- root cause: `_ForkedProc.kill()` is PID-scoped
  (plain `os.kill(pid, SIGKILL)` to the direct child),
  not tree-scoped, so grandchildren spawned during a
  multi-level cancel test survive it, get reparented
  to init and keep holding the inherited registry
  listen socket
- proposed fix directions ranked: (1) put each
  forkserver-spawned subactor in its own process-
  group (`os.setpgrp()` in fork-child) + tree-kill
  via `os.killpg(pgid, SIGKILL)` on teardown,
  (2) `PR_SET_CHILD_SUBREAPER` on root, (3) explicit
  `/proc/<pid>/task/*/children` walk. Vote: (1) —
  POSIX-standard, aligns w/ `start_new_session=True`
  semantics in `subprocess.Popen` / trio's
  `open_process`
- inline reproducer + cleanup recipe scoped to
  `$(pwd)/py314/bin/python.*pytest.*spawn-backend=subint_forkserver`
  so cleanup doesn't false-flag unrelated tractor
  procs (consistent w/ `run-tests` skill's
  zombie-check guidance)
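Fix direction (1) can be sketched with nothing but the stdlib. This is a minimal, Linux-only illustration and not the actual `_ForkedProc` code: `spawn_tree_in_own_pgroup()`, `tree_kill()` and `is_alive()` are hypothetical names, and the sleeping child/grandchild merely stand in for a subactor and its descendant.

```python
import os
import signal
import time


def spawn_tree_in_own_pgroup() -> tuple[int, int]:
    """Fork a child that leads its own process group and itself
    forks a grandchild; return `(child_pid, grandchild_pid)`."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:  # fork-child (stand-in for a subactor)
        try:
            os.close(r)
            os.setpgrp()  # new group, pgid == this child's pid
            gpid = os.fork()
            if gpid == 0:  # grandchild, inherits the child's pgid
                os.close(w)
                while True:
                    time.sleep(1)
            # report the grandchild pid only after setpgrp() is done
            os.write(w, str(gpid).encode())
            os.close(w)
            while True:
                time.sleep(1)
        finally:
            os._exit(1)  # never fall back into the caller's code
    os.close(w)
    gpid = int(os.read(r, 32).decode())
    os.close(r)
    return pid, gpid


def tree_kill(pgid: int) -> None:
    """Tree-scoped teardown: signal every member of the group."""
    os.killpg(pgid, signal.SIGKILL)
    os.waitpid(pgid, 0)  # reap the direct child (the group leader)


def is_alive(pid: int) -> bool:
    """True unless the pid is gone or already a zombie/dead."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            # format is `pid (comm) state ...`; comm may contain `)`
            state = f.read().rsplit(")", 1)[1].split()[0]
    except (FileNotFoundError, ProcessLookupError):
        return False
    return state not in ("Z", "X")
```

With a plain `os.kill(child_pid, SIGKILL)` the grandchild would outlive the child; `os.killpg()` reaches it because it never left the group created by `os.setpgrp()`. A descendant that calls `setpgrp()`/`setsid()` itself would escape this, which is where options (2) and (3) would still matter.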

Stopgap hygiene fix (wiring `reg_addr` through the 5
leaky tests in `test_cancellation.py`) is incoming as
a follow-up; that one contains the blast radius, but
zombies still accumulate per-run until the real
tree-kill fix lands.
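
Until then, leftover procs can be spotted with a small `/proc` scan. A rough, Linux-only sketch: `find_procs_by_comm()` and its `cmdline_substr` filter are hypothetical helpers, not part of tractor or of the doc's own cleanup recipe.

```python
import os
from pathlib import Path


def find_procs_by_comm(comm: str, cmdline_substr: str = "") -> list[int]:
    """Return sorted pids whose `/proc/<pid>/comm` equals `comm` and
    whose cmdline contains `cmdline_substr` (if one is given)."""
    pids = []
    for entry in Path("/proc").iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries
        try:
            name = (entry / "comm").read_text(errors="replace").strip()
            if name != comm:
                continue
            # cmdline args are NUL-separated; join them with spaces
            raw = (entry / "cmdline").read_bytes().replace(b"\0", b" ")
        except OSError:
            continue  # proc exited mid-scan or is not readable
        if cmdline_substr and cmdline_substr not in raw.decode(errors="replace"):
            continue
        pids.append(int(entry.name))
    return sorted(pids)
```

Assuming it runs from the repo root, `find_procs_by_comm("subint-forkserv", os.path.join(os.getcwd(), "py314"))` would list only leaked procs launched from this repo's venv, in line with the scoping above. Linux caps comm names at 15 characters, so the match string has to be the truncated `subint-forkserv`.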

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-23 13:58:42 -04:00
..
subint_cancel_delivery_hang_issue.md Doc `subint` backend hang classes + arm `dump_on_hang` 2026-04-20 16:41:07 -04:00
subint_fork_blocked_by_cpython_post_fork_issue.md Doc `subint_fork` as blocked by CPython post-fork 2026-04-22 16:02:01 -04:00
subint_fork_from_main_thread_smoketest.py Add trio-parent tests for `_subint_forkserver` 2026-04-22 18:00:06 -04:00
subint_forkserver_orphan_sigint_hang_issue.md Refine `subint_forkserver` orphan-SIGINT diagnosis 2026-04-23 09:31:32 -04:00
subint_forkserver_test_cancellation_leak_issue.md Add `subint_forkserver` test-cancellation leak doc 2026-04-23 13:58:42 -04:00
subint_forkserver_thread_constraints_on_pep684_issue.md Add `subint_forkserver` PEP 684 audit-plan doc 2026-04-22 18:18:30 -04:00
subint_sigint_starvation_issue.md Expand `subint` sigint-starvation hang catalog 2026-04-21 17:42:37 -04:00